From R.M.Everson at exeter.ac.uk Fri Mar 1 01:42:05 2002
From: R.M.Everson at exeter.ac.uk (R.M.Everson)
Date: Fri Mar 1 01:42:05 2002
Subject: [Numpy-discussion] numarray interface and performance issues (for dot product and transpose)
In-Reply-To:
References:
Message-ID:

Hi,

On 28 Feb 2002, Travis Oliphant wrote:

> On 28 Feb 2002, A.Schmolck wrote:
>> Two essential matrix operations (matrix-multiplication and
>> transposition (which is what I am mainly using)) are both considerably
>>
>> a) less efficient and
>> b) less notationally elegant
>
> You are not alone in your concerns. The developers of SciPy are quite
> concerned about speed, hence the required linking to ATLAS. The
> question of notational elegance is stickier because we just can't add
> new operators.
>
> The solution I see is to use other classes.

At the moment, I agree this is probably the best solution, although it
would be nice if core Python were able to add operators :)

>> The following Matlab fragment
>>
>>     M * (C' * C) * V' * u
>
> This becomes (using SciPy, which defines Mat = Matrix.Matrix and could
> later redefine it to use the ATLAS libraries for matrix
> multiplication):
>
>     C, V, u, M = apply(Mat, (C, V, u, M))
>     M * (C.H * C) * V.H * u

Yes, much better.

> not bad.. and with a Mat class that uses the ATLAS BLAS (not a very
> hard thing to do now), this could be made as fast as MATLAB.
>
> Perhaps, as a start, we could look at how you make the current Numeric
> use the BLAS, if it is installed, to do dot on real and complex arrays
> (I know you can get rid of lapack_lite and use your own LAPACK), but
> the dot function is defined in multiarray and would have to be modified
> to use the BLAS instead of its own homegrown algorithm.

This is precisely what Alex and I have done. Please see the patch to
Numeric and timings on

    http://www.dcs.ex.ac.uk/~aschmolc/Numeric/

It's not beautiful, but it is about 40 times faster on 1000 by 1000
matrix multiplies. I'll attempt to provide a similar patch for numarray
over the next week or so.

Many thanks for your comments.

Richard.

From focke at SLAC.Stanford.EDU Fri Mar 1 17:53:10 2002
From: focke at SLAC.Stanford.EDU (Warren Focke)
Date: Fri Mar 1 17:53:10 2002
Subject: [Numpy-discussion] numarray interface and performance issues (for dot product and transpose)
In-Reply-To: <198c01c1c09b$4e981c60$74460344@cx781526b>
Message-ID:

On Thu, 28 Feb 2002, Tim Hochberg wrote:

>> not going to do you any good). The above C' * C actually creates,
>> AFAIK, _3_ versions of C, 2 of them transposed (prior to 20.3;
>
> I think you're a little off track here. The transpose operation doesn't
> normally make a copy, it just creates a new object that points to the
> same data, but with different stride values. So the transpose shouldn't
> be slow or take up more space.

Numeric.transpose quickly returns an object which takes up little space.
But in many cases, when you actually use the object returned, a
contiguous copy gets created. Glancing over the 21.0b3 sources, it looks
like this might not happen as often as it used to, but there are still
plenty of calls to PyArray_ContiguousFromObject in there, so
transposition is not always as cheap as it might seem. Especially if
you say

    Aprime = Numeric.transpose(A)

and then use Aprime several times, you could end up repeatedly creating
and discarding temporary transposed copies.

Warren Focke
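A minimal sketch of this view-then-copy behavior (assuming Numeric 20.x-era semantics; iscontiguous and dot are standard Numeric calls):

    # Sketch: Numeric.transpose returns a view, but C-level routines
    # that require contiguous input may copy it on every call.
    import Numeric

    A = Numeric.zeros((3, 4), 'd')
    At = Numeric.transpose(A)   # cheap: new header, same data buffer
    At[0, 1] = 7.0
    print A[1, 0]               # 7.0 -- data is shared, nothing copied yet
    print At.iscontiguous()     # 0  -- only the strides differ

    # Routines that call PyArray_ContiguousFromObject on their arguments
    # can silently make a contiguous copy of At each time it is passed in.
    B = Numeric.dot(At, A)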
From huaiyu_zhu at yahoo.com Fri Mar 1 21:09:02 2002
From: huaiyu_zhu at yahoo.com (Huaiyu Zhu)
Date: Fri Mar 1 21:09:02 2002
Subject: [Numpy-discussion] Python 2.2 seriously crippled for numerical computation?
Message-ID:

There appears to be a serious bug in Python 2.2 that severely limits its
usefulness for numerical computation:

# Python 1.5.2 - 2.1

>>> 1e200**2
inf
>>> 1e-200**2
0.0

# Python 2.2

>>> 1e-200**2
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: (34, 'Numerical result out of range')
>>> 1e200**2
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: (34, 'Numerical result out of range')

This produces the following serious effect: after hours of numerical
computation, just as the error is converging to zero, the whole thing
suddenly unravels. Note that try/except is completely useless for this
purpose. I hope this is unintended behavior and that there is an easy
fix. Have any of you experienced this?

Huaiyu

From tim.one at comcast.net Fri Mar 1 21:43:12 2002
From: tim.one at comcast.net (Tim Peters)
Date: Fri Mar 1 21:43:12 2002
Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation?
In-Reply-To:
Message-ID:

[Huaiyu Zhu]
> There appears to be a serious bug in Python 2.2 that severely limits
> its usefulness for numerical computation:
>
> # Python 1.5.2 - 2.1
>
> >>> 1e200**2
> inf

A platform-dependent accident, there.

> >>> 1e-200**2
> 0.0
>
> # Python 2.2
>
> >>> 1e-200**2
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> OverflowError: (34, 'Numerical result out of range')

That one is surprising and definitely not intended: it suggests your
platform libm is setting errno to ERANGE for pow(1e-200, 2.0), or that
your platform C headers define INFINITY but incorrectly, or that your
platform C headers define HUGE_VAL but incorrectly, or that your platform
C compiler generates bad code, or optimizes incorrectly, for negating
and/or comparing against its definition of HUGE_VAL or INFINITY. Python
intends silent underflow to 0 in this case, and I haven't heard of
underflows raising OverflowError before. Please file a bug report with
full details about which operating system, Python version, compiler and
C libraries you're using (then it's going to take a wizard with access
to all that stuff to trace into it and determine the true cause).

> >>> 1e200**2
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> OverflowError: (34, 'Numerical result out of range')

That one is intended; see

    http://sf.net/tracker/?group_id=5470&atid=105470&func=detail&aid=496104

for discussion.

> This produces the following serious effect: after hours of numerical
> computation, just as the error is converging to zero, the whole thing
> suddenly unravels.

It depends on how you write your code, of course.

> Note that try/except is completely useless for this purpose.

Ditto. If your platform C lets you get away with it, you may still be
able to get an infinity out of 1e200 * 1e200.

> I hope this is unintended behavior

Half intended, half both unintended and never before reported.

> and that there is an easy fix.

Sorry, "no" to either.
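To make the platform dependence concrete, a small sketch of the behavior discussed in this thread (the comments summarize the semantics described above; exact results vary by platform and Python version):

    big, tiny = 1e200, 1e-200

    try:
        y = big ** 2        # raises OverflowError under 2.2 (intended)
    except OverflowError:
        y = big * big       # may still give inf if the platform C allows it

    try:
        z = tiny ** 2       # intended to underflow silently to 0.0 ...
    except OverflowError:   # ... but raises on some glibc-based builds
        z = 0.0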
From paul at pfdubois.com Sat Mar 2 07:56:15 2002
From: paul at pfdubois.com (Paul F Dubois)
Date: Sat Mar 2 07:56:15 2002
Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation?
In-Reply-To:
Message-ID: <000001c1c202$7996be90$1001a8c0@NICKLEBY>

I also see the underflow problem on my Linux box (2.4.2-2). This is
certainly untenable. However, I am able to catch OverflowError in both
cases. I had a user complain about this just yesterday, so I think it is
a new behavior in Python 2.2, which I was just rolling out.

A small Fortran test problem did not exhibit the underflow bug, and
caught the overflow bug at COMPILE TIME (!).

There are two states for IEEE underflow: one in which the hardware sets
the result to zero, and another in which the hardware signals the OS, and
you can tell the OS to set the result to zero. There is no standard for
the interface to this facility that I am aware of. (Usually I have had to
figure out how to make sure the underflow was handled in hardware,
because the sheer cost of letting it turn into a system call was
prohibitive.) I speculate that on machines where the OS call is the
default, Python 2.2 is catching the signal when it should let it go by.
I have not looked at this lately, so something may have changed.

You can use the kinds package that comes with Numeric to test for maximum
and minimum exponents. kinds.default_float_kind.MAX_10_EXP (equal to 308
on my Linux box, for example) tells you how big an exponent a floating
point number can have. MIN_10_EXP (-307 for me) is also there. As a
workaround for your convergence test: instead of testing x**2, you might
test log10(x) against a constant or some expression involving
kinds.default_float_kind.MIN_10_EXP.
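A sketch of that workaround (the attribute names are from the kinds module shipped with Numeric; the margin of 10 decades is an arbitrary illustrative choice):

    import math
    import kinds

    MIN_EXP = kinds.default_float_kind.MIN_10_EXP   # e.g. -307 on x86 Linux

    def converged(x, margin=10):
        # Compare exponents instead of squaring x, so x**2 never underflows.
        if x == 0.0:
            return 1
        return math.log10(abs(x)) < MIN_EXP + margin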
From andymac at bullseye.apana.org.au Sat Mar 2 10:18:22 2002
From: andymac at bullseye.apana.org.au (Andrew MacIntyre)
Date: Sat Mar 2 10:18:22 2002
Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation?
In-Reply-To:
Message-ID:

On 2 Mar 2002, Konrad Hinsen wrote:

> Tim Peters writes:
>
> > > # Python 2.2
> > >
> > > >>> 1e-200**2
> > > Traceback (most recent call last):
> > >   File "<stdin>", line 1, in ?
> > > OverflowError: (34, 'Numerical result out of range')
> >
> > That one is surprising and definitely not intended: it suggests your
> > platform libm is setting errno to ERANGE for pow(1e-200, 2.0), or
> > that your platform C headers define INFINITY but incorrectly, or that
> > your platform C headers define HUGE_VAL but incorrectly, or that your
> > platform C compiler generates bad code, or optimizes incorrectly, for
> > negating and/or comparing
>
> I just tested and found the same behaviour, on RedHat Linux 7.1 running
> on a Pentium machine. Python 2.1, compiled and running on the same
> machine, returns 0. So does the Python 1.5.2 that comes with the RedHat
> installation. Although there might certainly be something wrong with
> the C compiler and/or header files, something has likely changed in
> Python as well in going to 2.2; the only other explanation I see would
> be a compiler optimization bug that didn't have an effect with earlier
> Python releases.

Other examples... FreeBSD 4.4:

Python 2.1.1 (#1, Sep 13 2001, 18:12:15)
[GCC 2.95.3 20010315 (release) [FreeBSD]] on freebsd4
Type "copyright", "credits" or "license" for more information.
>>> 1e-200**2
0.0
>>> 1e200**2
Inf

Python 2.3a0 (#1, Mar 1 2002, 00:00:52)
[GCC 2.95.3 20010315 (release) [FreeBSD]] on freebsd4
Type "help", "copyright", "credits" or "license" for more information.
>>> 1e-200**2
0.0
>>> 1e200**2
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: (34, 'Result too large')

Both builds were built with "./configure --with-fpectl", although the
optimisation settings for the 2.3a0 (CVS of yesterday) build were tweaked
(2.1.1: "-g -O3"; 2.3a0: "-s -m486 -Os"). My 2.2 OS/2 EMX build (which
uses gcc 2.8.1 -O2) produces exactly the same result as 2.3a0 on FreeBSD.

--
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: andymac at bullseye.apana.org.au | Snail: PO Box 370
        andymac at pcug.org.au            |        Belconnen ACT 2616
Web: http://www.andymac.org/           |        Australia

From pearu at cens.ioc.ee Sun Mar 3 04:41:15 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Sun Mar 3 04:41:15 2002
Subject: [Numpy-discussion] How can CDOUBLE_to_CDOUBLE work correctly?
Message-ID:

Hi!

I am trying to copy a 2-d complex array to another 2-d complex array in
an extension module. Both arrays may be noncontiguous. But using the same
routine (namely Travis's copy_ND_array; you can find it at the end of
this message) as for real arrays seems not to work.
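For concreteness, this is the kind of array at issue (a sketch; Numeric.Complex and step slicing are standard Numeric features):

    import Numeric

    a = Numeric.zeros((4, 6), Numeric.Complex)
    b = a[:, ::2]             # every second column: a noncontiguous view
    print b.iscontiguous()    # 0 -- strided, although each element's real
                              #      and imaginary parts remain adjacent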
After some playing around and reading the docs about strides and such, I
found that the reason might be in how Numeric (20.3) defines the
CDOUBLE_to_CDOUBLE function:

static void CDOUBLE_to_CDOUBLE(double *ip, int ipstep,
                               double *op, int opstep, int n)
{
    int i;
    for (i = 0; i < 2*n; i++, ip += ipstep, op += opstep) {
        *op = (double)*ip;
    }
}

It does not take into account that the real and imaginary parts are
always contiguous in memory, even if the array itself is not. Actually, I
don't understand how this code can work at all (unless some magic is done
in the places where it is used). I would have expected the code for
CDOUBLE_to_CDOUBLE to be analogous to the corresponding functions for
real data. For example, DOUBLE_to_DOUBLE is defined as

static void DOUBLE_to_DOUBLE(double *ip, int ipstep,
                             double *op, int opstep, int n)
{
    int i;
    for (i = 0; i < n; i++, ip += ipstep, op += opstep) {
        *op = (double)*ip;
    }
}

So, I would have expected CDOUBLE_to_CDOUBLE to be

static void CDOUBLE_to_CDOUBLE(double *ip, int ipstep,
                               double *op, int opstep, int n)
{
    int i;
    for (i = 0; i < n; i++, ip += ipstep, op += opstep) {
        *op = (double)*ip;          /* copy real part */
        *(op+1) = (double)*(ip+1);  /* copy imaginary part that always
                                       follows the real part in memory */
    }
}

For reference, here is the copy_ND_array routine:

#define INCREMENT(ret_ind, nd, max_ind) \
{ \
    int k = (nd) - 1; \
    if (++(ret_ind)[k] >= (max_ind)[k]) { \
        while (k >= 0 && ((ret_ind)[k] >= (max_ind)[k]-1)) \
            (ret_ind)[k--] = 0; \
        if (k >= 0) (ret_ind)[k]++; \
        else (ret_ind)[0] = (max_ind)[0]; \
    } \
}

#define CALCINDEX(indx, nd_index, strides, ndim) \
{ \
    int i; \
    indx = 0; \
    for (i = 0; i < (ndim); i++) \
        indx += nd_index[i]*strides[i]; \
}

extern int copy_ND_array(const PyArrayObject *in, PyArrayObject *out)
{
    /* This routine copies an N-D array in to an N-D array out where both
       can be discontiguous. An appropriate (raw) cast is made on the
       data. */
    /* It works by using an N-1 length vector to hold the N-1 first
       indices into the array. This counter is looped through, copying
       (and casting) the entire last dimension at a time. */
    int *nd_index, indx1;
    int indx2, last_dim;
    int instep, outstep;

    if (0 == in->nd) {
        in->descr->cast[out->descr->type_num]((void *)in->data, 1,
                                              (void *)out->data, 1, 1);
        return 0;
    }
    if (1 == in->nd) {
        in->descr->cast[out->descr->type_num]((void *)in->data, 1,
                                              (void *)out->data, 1,
                                              in->dimensions[0]);
        return 0;
    }
    nd_index = (int *)calloc(in->nd-1, sizeof(int));
    last_dim = in->nd - 1;
    instep = in->strides[last_dim] / in->descr->elsize;
    outstep = out->strides[last_dim] / out->descr->elsize;
    if (NULL == nd_index) {
        fprintf(stderr, "Could not allocate memory for index array.\n");
        return -1;
    }
    while (nd_index[0] != in->dimensions[0]) {
        CALCINDEX(indx1, nd_index, in->strides, in->nd-1);
        CALCINDEX(indx2, nd_index, out->strides, out->nd-1);
        /* Copy (with an appropriate cast) the last dimension of the
           array */
        (in->descr->cast[out->descr->type_num])((void *)(in->data+indx1),
                                                instep,
                                                (void *)(out->data+indx2),
                                                outstep,
                                                in->dimensions[last_dim]);
        INCREMENT(nd_index, in->nd-1, in->dimensions);
    }
    free(nd_index);
    return 0;
} /* EOF copy_ND_array */

From pearu at cens.ioc.ee Sun Mar 3 07:10:08 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Sun Mar 3 07:10:08 2002
Subject: [Numpy-discussion] How can CDOUBLE_to_CDOUBLE work correctly?
In-Reply-To:
Message-ID:

Hi again,

On Sun, 3 Mar 2002, Pearu Peterson wrote:

> So, I would have expected CDOUBLE_to_CDOUBLE to be
>
> static void CDOUBLE_to_CDOUBLE(double *ip, int ipstep,
>                                double *op, int opstep, int n)
> {
>     int i;
>     for (i = 0; i < n; i++, ip += ipstep, op += opstep) {
>         *op = (double)*ip;          /* copy real part */
>         *(op+1) = (double)*(ip+1);  /* copy imaginary part that always
>                                        follows the real part in memory */
>     }
> }

After some testing I found that CDOUBLE_to_CDOUBLE should be

static void CDOUBLE_to_CDOUBLE(double *ip, int ipstep,
                               double *op, int opstep, int n)
{
    int i;
    for (i = 0; i < n; i++, ip += 2*ipstep, op += 2*opstep) {
        *op = (double)*ip;          /* copy real part */
        *(op+1) = (double)*(ip+1);  /* copy imaginary part */
    }
}

because the step arguments are given in units of (complex) array
elements, not in doubles.

Pearu
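A quick way to probe for the mis-copy from Python (a sketch; whether astype actually routes through the cast function above may depend on the Numeric version):

    import Numeric

    a = Numeric.reshape(Numeric.arange(8) * (1 + 1j), (2, 4))
    b = Numeric.transpose(a)       # noncontiguous complex view
    c = b.astype(Numeric.Complex)  # forces an element-wise cast/copy
    print Numeric.sum(Numeric.ravel(abs(b - c)))   # 0.0 on a correct build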
From tim.one at comcast.net Sun Mar 3 2002
From: tim.one at comcast.net (Tim Peters)
Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation?
In-Reply-To:
Message-ID:

A lot of this speculation should have been cut short by my first msg.
Yes, something changed in 2.2; follow the referenced link:

    http://sf.net/tracker/?group_id=5470&atid=105470&func=detail&aid=496104

For the rest of it, it looks like the "1e-200**2 raises OverflowError"
glitch is unique to platforms using glibc. What isn't clear is whether it
depends on which version of glibc, or on whether Python is linked with
-lieee, or both. Unfortunately, the C standard (neither one) isn't a lick
of help here -- error reporting from C math functions is a x-platform
crapshoot.

Can someone who sees this problem confirm or deny that they link with
-lieee? If they see this problem and don't link with -lieee, also please
try linking with -lieee and see whether the problem goes away then.

On boxes with this problem, I'm also curious what

    import math
    print math.pow(1e-200, 2.0)

does under 2.1.

One probably-relevant thing that changed between 2.1 and 2.2 is that
float**int calls the platform pow(float, int) in 2.2. 2.1 did it with
repeated multiplication instead, but screwed up endcases. An example
under 2.1:

>>> x = -1.
>>> import sys
>>> x**(-sys.maxint-1L)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: negative number cannot be raised to a fractional power
>>>

The same thing under 2.2 returns 1.0, provided your platform pow() isn't
braindead. Repeated multiplication is also less accurate than a
decent-quality pow().

From paul at pfdubois.com Mon Mar 4 08:21:18 2002
From: paul at pfdubois.com (Paul F Dubois)
Date: Mon Mar 4 08:21:18 2002
Subject: [Numpy-discussion] Matrix.py change breaks LinearAlgebra
Message-ID: <000201c1c398$8c1a8c80$1001a8c0@NICKLEBY>

Your change to Matrix.py has a fatal flaw:

Python 2.2 (#1, Mar 1 2002, 11:11:28)
[GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-81)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import LinearAlgebra
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/home/dubois/linux/lib/python2.2/site-packages/Numeric/LinearAlgebra.py", line 10, in ?
    import MLab
  File "/home/dubois/linux/lib/python2.2/site-packages/Numeric/MLab.py", line 10, in ?
    import Matrix
  File "/home/dubois/linux/lib/python2.2/site-packages/Numeric/Matrix.py", line 8, in ?
    from LinearAlgebra import inverse
ImportError: cannot import name inverse
>>>

From paul at pfdubois.com Mon Mar 4 11:55:03 2002
From: paul at pfdubois.com (Paul F Dubois)
Date: Mon Mar 4 11:55:03 2002
Subject: [Numpy-discussion] RE: Matrix.py change breaks LinearAlgebra
In-Reply-To:
Message-ID: <000101c1c3b6$60369aa0$1001a8c0@NICKLEBY>

Travis fixed the error he accidentally made when improving MLab.py.

This incident points out that our test suite does not include any tests
for the non-core modules. The recent discussion over the meaning of cov
shows this too: I didn't even have a test that showed what the INTENDED
answer is. We need test suites for FFT, LinearAlgebra, MLab, etc. If you
have the subject competence to make a test file for us, please volunteer.
I'd like them to be like the ones in Test/test.py, but as a separate
file, so that I can test the modules separately.
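A hedged sketch of what such a standalone test file might look like (the specific test case is illustrative only, not an existing file):

    import Numeric
    import LinearAlgebra

    def test_inverse():
        a = Numeric.array([[2.0, 1.0], [1.0, 3.0]])
        ainv = LinearAlgebra.inverse(a)
        # A times its inverse should be the identity, to within roundoff.
        assert Numeric.allclose(Numeric.dot(a, ainv), Numeric.identity(2))

    if __name__ == '__main__':
        test_inverse()
        print 'LinearAlgebra: all tests pass'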
From bsder at allcaps.org Mon Mar 4 13:12:23 2002
From: bsder at allcaps.org (Andrew P. Lentvorski)
Date: Mon Mar 4 13:12:23 2002
Subject: [Numpy-discussion] Interface for numpy C-API <-> simple C++ matrix class
In-Reply-To: <20020215211700.GA22189@gibbs.physik.uni-konstanz.de>
Message-ID: <20020304131041.G51226-100000@mail.allcaps.org>

You might want to check out the Boost Python Library. It is peer reviewed
and seems to get most things correct. It should make writing wrappers a
lot easier.

-a

From bsder at allcaps.org Mon Mar 4 13:17:15 2002
From: bsder at allcaps.org (Andrew P. Lentvorski)
Date: Mon Mar 4 13:17:15 2002
Subject: [Numpy-discussion] Interface for numpy C-API <-> simple C++ matrix class
In-Reply-To: <20020304131041.G51226-100000@mail.allcaps.org>
Message-ID: <20020304131616.P51226-100000@mail.allcaps.org>

Ummmm, it helps if I include the URL. Sorry.

    http://www.boost.org/libs/python/doc/

-a

On Mon, 4 Mar 2002, Andrew P. Lentvorski wrote:
> You might want to check out the Boost Python Library. It is peer
> reviewed and seems to get most things correct.
>
> It should make writing wrappers a lot easier.
>
> -a

From tim.one at comcast.net Tue Mar 5 16:51:03 2002
From: tim.one at comcast.net (Tim Peters)
Date: Tue Mar 5 16:51:03 2002
Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation?
In-Reply-To:
Message-ID:

[Huaiyu Zhu]
> ...
> 1. The -lieee is indeed the most direct cure.

On the specific platform tried. The libm errno rules changed between C89
and C99, and I'm afraid there's no errno behavior Python can rely on
anymore. So I expect more changes will be needed in Python, regardless of
how things turn out on this specific platform.

> ...
> 2. Is there a configure option to guarantee -lieee?

If anyone can answer this question, please don't answer it here: it will
just get lost. Attach it to Huaiyu's bug report instead. Thanks.

> ...
> 3. errno 34 (or -lieee) may not be the sole reason.
>
> On a RedHat 6.1 upgraded to 7.1 (both gcc and glibc), errno 34 is
> indeed raised in a C program linked without -lieee, and Python is
> indeed compiled without -lieee, but Python does not raise
> OverflowError.

I expect you're missing something. Skip posted the Python code before,
and if errno gets set, Python *does* raise OverflowError:

    errno = 0;  /* Skip forgot to post this line, and it's important */
    ...
    ix = pow(iv, iw);
    ...
    if (errno != 0) {
        /* XXX could it be another type of error? */
        PyErr_SetFromErrno(PyExc_OverflowError);
        return NULL;

If you can read C, believe your eyes <wink>. What you may be missing is
what an utter mess C is here. How libm behaves may depend on compiler
options, linker options, global system flags, and even options set for
other libraries you link with.

> ...
> 4. Is there an easier way to debug such problems?

The cause was obvious to the first person (Skip) who stepped into Python
to see what the code did on a platform where it failed. It's not going to
be obvious to someone who doesn't.

> 5. How is 1e200**2 handled?

It goes through exactly the same code.

> Since both 1e-200**2 and 1e200**2 produce the same errno all the time,
> but Python still raises OverflowError for 1e200**2 when linked with
> -lieee, there must be a separate mechanism at work.

You're speculating from a false base: if the platform pow(x, y) sets
errno to any non-zero value, Python x**y raises OverflowError. What
differs is when the platform pow(x, y) does not set errno. In that case,
Python synthesizes errno=ERANGE if the pow() result equals +- the
platform HUGE_VAL.

> What is that and how can I override it?

Sorry, you can't override it.

> I know this is by design, but I think the design is dumb (to put it
> politely). I won't get into an argument here. I'll write up my
> rationale against this when I have some time.
I'm afraid a rationale won't do any good. I'm in favor of supplying full
754 compatibility myself, but:

A) Getting from here to there requires volunteers to design, implement,
document, and test the code. Given the atrocious state of C's 754 story,
and the subtlety of 754's requirements, this needs volunteers who are
both 754 experts and platform C experts. That combination is rare.

B) Python's floating-point users will never agree it's a good thing, so
such a change requires careful backward compatibility work too. This
isn't likely to get done by someone who characterizes the other side as
"dumb (to put it politely)" <0.9 wink>.

Note that the only significant floating-point code ever contributed to
the Python core was the fpectl module, and its purpose is to *break* 754
"non-stop" exception semantics in the places Python happens to let them
sneak through.

> I do remember there being several discussions in the past, but I don't
> remember any convincing argument for the current decision. Any URL
> would be greatly appreciated, beside the one pointed to by Tim.

Which "current decision" do you have in mind? There is no design doc for
Python's numerics, if that's what you're looking for. As the text at the
URL I gave you said, much of Python's fp behavior is accidental,
inherited from platform C quirks.

From newton at admin.kias.re.kr Tue Mar 5 16:52:19 2002
From: newton at admin.kias.re.kr (Kee-Hyoung Joo)
Date: Tue Mar 5 16:52:19 2002
Subject: [Numpy-discussion] [Q] RandomArray has a problem ??
Message-ID: <001701c1c4a8$d135d340$e31d62d2@kias.re.kr>

Hi,

Does RandomArray have a problem? I use Python 2.0 and NumPy 20.3. A
simple source program is:

#!/usr/bin/env python
import math
import Numeric
import RandomArray
import sys

RandomArray.seed(1234, 5678)
i = 0L
while 1:
    i = i + 1
    a = RandomArray.randint(0, 100)
    if a == 100:
        print 'i=', i, 'a=', a

and the result is:

i= 70164640 a= 100
i= 152242967 a= 100
i= 159619195 a= 100
i= 173219763 a= 100
i= 200933959 a= 100
i= 233219191 a= 100
i= 276114822 a= 100
i= 313589319 a= 100
i= 340689813 a= 100
i= 402397265 a= 100
i= 456099215 a= 100
i= 506078935 a= 100
i= 547758957 a= 100
i= 559163554 a= 100
i= 570211180 a= 100
..........

RandomArray.randint(0, 100) should have the range
0 <= randint(0, 100) < 100, but sometimes a == 100 arises. I upgraded to
Python 2.2 and NumPy 21b3 and met the same problem. I then changed the OS
from Mandrake 8.0 to RedHat 7.2 -- the same problem again. I don't know
what my mistake is... Please help me.

Kee-Hyoung Joo
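Pending a diagnosis, a defensive workaround is easy (a sketch; it simply re-draws whenever randint returns its exclusive upper bound):

    import RandomArray

    def safe_randint(low, high):
        # Re-draw on the (rare) out-of-range result reported above.
        while 1:
            r = RandomArray.randint(low, high)
            if low <= r < high:
                return r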
From oliphant.travis at ieee.org Tue Mar 5 20:36:02 2002
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Mar 5 20:36:02 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
Message-ID:

Recently there has been discussion on the list about the awkwardness of
matrix syntax when using Numeric Python. Matrix expressions can be
awkward to express in Numeric, which is a negative mark on an otherwise
excellent computing environment.
Currently part of the problem can be solved by working with Matrix
objects explicitly:

    a = Matrix.Matrix("[1 2 3; 4 5 6]")   # Notice the strings.

However, most operations return arrays, which have to be recast to
matrices, at best by using a one-letter name with parentheses:

    M = Matrix.Matrix
    M(sin(a)) * M(cos(a)).T

The suggestion was made to add ".M" as an attribute of arrays which
returns a matrix. Thus, the code above can be written:

    sin(a).M * cos(a).M.T

While some aesthetic simplicity is obtained, the big advantage is in
consistency. Somebody else may decide that P = Matrix.Matrix is a better
choice, but if we establish that .M always returns a matrix for arrays
< 2d, then we gain consistency.

I've made this change and am ready to commit it to the Numeric tree,
unless there are strong objections. I know some people do not like the
proliferation of attributes, but in this case the notational convenience
it affords to otherwise overly burdened syntax, and the consistency it
allows Numeric in dealing with Matrix equations, may be worth it.

What do you think?

-Travis Oliphant

From oliphant.travis at ieee.org Tue Mar 5 20:44:14 2002
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Mar 5 20:44:14 2002
Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking
Message-ID:

I have not heard any feedback on my proposal to add a final object to
the extended slice syntax of current Numeric to allow for unambiguous
index and mask-array access.

As a modification to the proposal, suppose we just check to see if the
last argument (of at least two) is a 0d array of type signed byte
(currently this is illegal and will raise an error). This number would
be a flag indicating how to interpret the previous objects. Of course
these numbers would be hidden from the user, who would write:

    a[index_array, _I] = <values>
    b = a[index_array, _I]

or

    a[mask_array, _M] = <values>
    b = a[mask_array, _M]

where _M is a 0d signed byte array indicating that the mask_array should
be interpreted as a mask, while _I is a 0d signed byte array indicating
that the index_array should be interpreted as integers into the
flattened version of a.

Other indexing schemes could be envisioned as well:

    a[a1, a2, a3, _X]

could be the cross product of the integer arrays a1, a2, and a3, for
example, or

    a[a1, a2, a3, _Z]

could select elements from a by "zipping" the sequences a1, a2, and a3
together to form a list of tuples to grab from a.

Comments?
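For comparison, today's spelling of the two proposed access modes (a sketch; _I and _M themselves exist in no released Numeric):

    import Numeric

    a = Numeric.arange(10) * 1.0
    idx = Numeric.array([1, 3, 5])

    b = Numeric.take(a, idx)            # the proposed b = a[idx, _I]
    Numeric.put(a, idx, [0., 0., 0.])   # the proposed a[idx, _I] = <values>

    mask = Numeric.greater(a, 4.0)
    c = Numeric.compress(mask, a)       # the proposed c = a[mask, _M]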
From pearu at cens.ioc.ee Tue Mar 5 23:46:02 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Tue Mar 5 23:46:02 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To:
Message-ID:

Hi!

On Tue, 5 Mar 2002, Travis Oliphant wrote:

> The suggestion was made to add ".M" as an attribute of arrays which
> returns a matrix. Thus, the code above can be written:
>
>     sin(a).M * cos(a).M.T
>
> While some aesthetic simplicity is obtained, the big advantage is in
> consistency.
> I've made this change and am ready to commit it to the Numeric tree,
> unless there are strong objections. ...
>
> What do you think?

Would it be possible to use one's own Matrix class instead of what is in
Matrix.py? I gather there would have to be some setter method in Numeric
for that:

    Numeric.set_matrix_factory(MyMatrixClass)

with the requirement that MyMatrixClass be a subclass of Matrix.Matrix
(a sketch of such a hook appears below). I think it would be a very
important feature, as users could define their own matrix operations --
for example, using their own BLAS routines to speed up operations with
matrices (yes, I am thinking of a SciPy-specific Matrix class).

Thanks,
Pearu

From hinsen at cnrs-orleans.fr Wed Mar 6 00:50:04 2002
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Wed Mar 6 00:50:04 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
References:
Message-ID: <200203060813.g268DeA09890@chinon.cnrs-orleans.fr>

Travis Oliphant writes:

> I've made this change and am ready to commit it to the Numeric tree,
> unless there are strong objections. I know some people do not like the
> proliferation of attributes, but in this case the notational
> convenience it

At the risk of sounding unconstructively negative, I think this is a
misuse of attributes. For someone used to reading standard Python code,
where attributes are, well, attributes, code using this notation is just
weird. Personally, consistent notation is more important to me than
short notation.

The Pythonesque solution to this problem, in my opinion, is separate
matrix and array objects (which can and should of course share
implementation code) plus explicit constructors to convert between the
two.

I am a bit worried that kludges such as fake attributes set bad
precedents for the future. One of the main reasons why I like Python is
its clean syntax and its simple object model. This kind of notation
messes up both of them.

Konrad.
--
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
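Returning to Pearu's factory suggestion above, a sketch of the proposed hook (set_matrix_factory does not exist in Numeric; the names and semantics here are illustrative only):

    import Matrix

    _matrix_factory = Matrix.Matrix

    def set_matrix_factory(cls):
        # Install a Matrix.Matrix subclass for array-to-matrix conversions.
        global _matrix_factory
        if not issubclass(cls, Matrix.Matrix):
            raise TypeError, 'factory must be a subclass of Matrix.Matrix'
        _matrix_factory = cls

    def asmatrix(data):
        # What an array's .M attribute could call internally.
        return _matrix_factory(data)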
From geus at inf.ethz.ch Wed Mar 6 00:51:10 2002
From: geus at inf.ethz.ch (Roman Geus)
Date: Wed Mar 6 00:51:10 2002
Subject: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines
Message-ID: <3C85D86D.EDCCB1EE@inf.ethz.ch>

Dear Numerical Python users and developers,

I ran into the following problem: the Python application I'm developing
uses Numerical Python and other C modules that call LAPACK. My
application runs well on 32-bit architectures. When I tried to run it on
a 64-bit HP-UX machine, the application produced bus errors. After a long
debugging session I found out that Fortran integers are still 32 bits
wide on this machine, so the HP LAPACK library has to be called using
32-bit integers. Numerical Python, however, codes Fortran integers as C
'long int' variables, which are 64 bits wide on this machine.

To make my application run on the 64-bit HP-UX machine, I had to change
all 'long int' variables to 'int' in Src/lapack_litemodule.c, which is a
rather painful hack (see the end of this message for an example).

My question is: should Fortran integers not be coded as 'int' instead of
'long int' in Numerical Python? This would still work on all 32-bit
machines and also on the 64-bit machines I know of. Would this work on
all 64-bit machines?

Thanks for your comments/help.

-- Roman Geus

E.g. the lapack_lite_dgetrf() function now looks like this:

static PyObject *lapack_lite_dgetrf(PyObject *self, PyObject *args)
{
    int lapack_lite_status__;
    int m;
    int n;
    PyObject *a;
    int lda;
    PyObject *ipiv;
    int info;
    int i;
    int *ipiv_int;
    int ipiv_len;

    TRY(PyArg_ParseTuple(args, "iiOiOi", &m, &n, &a, &lda, &ipiv, &info));
    TRY(lapack_lite_CheckObject(a, PyArray_DOUBLE,
                                "a", "PyArray_DOUBLE", "dgetrf"));
    TRY(lapack_lite_CheckObject(ipiv, PyArray_LONG,
                                "ipiv", "PyArray_LONG", "dgetrf"));

    /* Copy the 64-bit 'long' pivot data into a temporary 32-bit 'int'
       array that the Fortran routine can write into. */
    ipiv_len = m < n ? m : n;
    ipiv_int = (int *)malloc(ipiv_len * sizeof(int));
    assert(ipiv_int);
    for (i = 0; i < ipiv_len; i++)
        ipiv_int[i] = LDATA(ipiv)[i];

#if defined(NO_APPEND_FORTRAN)
    lapack_lite_status__ = dgetrf(&m, &n, DDATA(a), &lda, ipiv_int, &info);
#else
    lapack_lite_status__ = dgetrf_(&m, &n, DDATA(a), &lda, ipiv_int, &info);
#endif

    /* Copy the pivots back and release the temporary. */
    for (i = 0; i < ipiv_len; i++)
        LDATA(ipiv)[i] = ipiv_int[i];
    free(ipiv_int);

    return Py_BuildValue("{s:l,s:l,s:l,s:l,s:l}",
                         "dgetrf_", (long)lapack_lite_status__,
                         "m", (long)m, "n", (long)n,
                         "lda", (long)lda, "info", (long)info);
}

From Roy.Dragseth at cc.uit.no Wed Mar 6 01:03:03 2002
From: Roy.Dragseth at cc.uit.no (Roy Dragseth)
Date: Wed Mar 6 01:03:03 2002
Subject: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines
In-Reply-To: <3C85D86D.EDCCB1EE@inf.ethz.ch>
References: <3C85D86D.EDCCB1EE@inf.ethz.ch>
Message-ID: <200203060902.g2692Df12551@newton.cc.uit.no>

On Wednesday 06 March 2002 09:50 am, Roman Geus wrote:
> After a long debugging session I found out that Fortran integers are
> still 32bit wide on this machine. Therefore also the HP LAPACK library
> has to be called using 32bit integers. Numerical Python however codes
> Fortran integers as C 'long int' variables, which are 64bit wide on
> this machine.

Have you tried the +i8 flag for the HP Fortran compiler? It converts all
Fortran integers to 8-byte entities.

Regards,
Roy.

From geus at inf.ethz.ch Wed Mar 6 01:27:03 2002
From: geus at inf.ethz.ch (Roman Geus)
Date: Wed Mar 6 01:27:03 2002
Subject: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines
References: <3C85D86D.EDCCB1EE@inf.ethz.ch> <200203060902.g2692Df12551@newton.cc.uit.no>
Message-ID: <3C85E0DE.F7F3ADC7@inf.ethz.ch>

Roy Dragseth wrote:
> Have you tried the +i8 flag for the HP fortran compiler? It converts
> all fortran integers to 8-byte entities.

I think this wouldn't help.
The optimized BLAS/LAPACK library supplied by HP expects 32-bit integers,
and other software incorporated into my Python application (e.g. SuperLU)
calls the BLAS/LAPACK library using 32-bit integers (the C 'int' type).
So what really needs to change (at least for this machine) is how
Numerical Python calls BLAS/LAPACK: it also needs to use 32-bit integers,
which means using 'int' instead of 'long int'.

Regards,
Roman

From pearu at cens.ioc.ee Wed Mar 6 01:43:01 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Wed Mar 6 01:43:01 2002
Subject: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines
In-Reply-To: <3C85E0DE.F7F3ADC7@inf.ethz.ch>
Message-ID:

Hi,

On Wed, 6 Mar 2002, Roman Geus wrote:
> So, what really needs to be changed (at least for this machine) is how
> Numerical Python calls BLAS/LAPACK. It also needs to use 32bit
> integers. So this means using 'int' instead of 'long int'.

Having wrapped a lot of Fortran codes for Python, I agree that Numerical
Python should use 'int' instead of 'long'. I have little influence on
making this change happen in Numeric, but I am agreeing with you.

Pearu

From hinsen at cnrs-orleans.fr Wed Mar 6 03:10:05 2002
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Wed Mar 6 03:10:05 2002
Subject: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines
In-Reply-To:
References:
Message-ID:

Pearu Peterson writes:

> Having wrapped a lot of Fortran codes for Python, I agree that
> Numerical Python should use 'int' instead of 'long'.

Even without the Fortran aspect, I'd prefer 'int' for integer arrays in
general. There may be applications that need 64-bit integers, but any
portable application wouldn't rely on them anyway. 64-bit arrays take up
more memory, and when you pickle them you cannot read those files on
32-bit machines.

Konrad.
--
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
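The size difference is easy to check from Python (a sketch; the printed values assume a typical 32-bit versus LP64 platform split):

    import Numeric

    # Default integer arrays use PyArray_LONG, i.e. C long:
    print Numeric.array([1]).itemsize()        # 4 on 32-bit boxes, 8 on LP64
    # Typecode 'i' forces C int, which stays 32 bits on both:
    print Numeric.array([1], 'i').itemsize()   # 4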
From gpk at bell-labs.com Wed Mar 6 05:28:13 2002
From: gpk at bell-labs.com (Greg Kochanski)
Date: Wed Mar 6 05:28:13 2002
Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking
In-Reply-To:
Message-ID:

Please, no. There are many short names you could use that would avoid
overloading the [] operator. Especially in Python, where one cannot
trivially decide the type of a variable, the behavior should change as
little as possible as the type of each variable changes. Here, the
indexing operation changes completely if you change the last index from
an int to an array. That means you have to execute the code to
understand it -- one can't just look and assume from local syntax.

Besides, you know some idiot is going to eventually write code that
looks like this:

    def access(a, b, x):
        return a[b, x]   # I think that a must be a 2-D array...

    # 1000 lines later...
    access(a, _I)        # Whoops, all my assumptions were wrong...

> From: Travis Oliphant
> Subject: [Numpy-discussion] Adding a flag to allow integer array
> access and masking
>
> I have not heard any feedback on my proposal to add a final object to
> the extended slice syntax of current Numeric to allow for unambiguous
> index and mask-array access.
> ...hidden from the user, who would write:
>
>     a[index_array, _I] = <values>
>     b = a[index_array, _I]
>
> or
>
>     a[mask_array, _M] = <values>
>     b = a[mask_array, _M]
>
> where _M is a 0d signed byte array indicating that the mask_array
> should be interpreted as a mask, while _I is a 0d signed byte array
> indicating that the index_array should be interpreted as integers into
> the flattened version of a.
>
> Other indexing schemes could be envisioned as well...

From perry at stsci.edu Wed Mar 6 09:05:02 2002
From: perry at stsci.edu (Perry Greenfield)
Date: Wed Mar 6 09:05:02 2002
Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking
In-Reply-To:
Message-ID:

Travis Oliphant writes:
> I have not heard any feedback on my proposal to add a final object to
> the extended slice syntax of current Numeric to allow for unambiguous
> index and mask-array access.
> ...
> Comments?

Like Greg, I'm wary of having many different interpretations for
indexing behavior. (I'm not even that crazy about having numarray handle
boolean index arrays differently from the others -- something we haven't
implemented yet, and perhaps we shouldn't.) Before discussing the merits
of this, shouldn't we take the attitude that absence of feedback is not
necessarily equivalent to approval, particularly for something that
affects the public interface of the module? I would feel better about
this if I saw several people affirming the need for such features rather
than a few openly opposing it.

But if one were to do something like this, I would use a different kind
of object than 0d arrays, e.g., an instance of a class defined for just
that purpose. You would really want to make sure that no data could
mistakenly be interpreted as a flag, even if the chances were remote. I
would also not use an underscore at the beginning of the name. Maybe I'm
wrong about this, but I've come to take that to mean a private variable
that should not be used by users of the module, and this usage would
confuse that. Finally, the name of the flag should be descriptive (e.g.
MaskInd). But there could be better alternatives.
As an example,

    x[nonzero(maskarray)]

instead of

    x[maskarray, MaskInd]

(Yes, it does generate a temporary, so that is a drawback.)

Perry

From jwpark at aeroguy.snu.ac.kr Wed Mar 6 10:03:16 2002
From: jwpark at aeroguy.snu.ac.kr (Jin Woo Park)
Date: Wed Mar 6 10:03:16 2002
Subject: [Numpy-discussion] problem with integer type array object
Message-ID: <1015437799.22162.23.camel@Maestro>

I was working on an external module built in C where a function returns
a 2-D integer array. However, when I imported the module, I found
strange behavior concerning the type of the elements of the array. This
is basically what happens:

static PyObject* foo(PyObject* arg, PyObject* args)
{
    PyObject* a;
    int dims[2] = {2, 2};
    a = PyArray_FromDims(2, dims, PyArray_INT);
    return a;
}

in python,

>>> a = foo()
>>> type(a[0,0])
<type 'array'>

I expected the type 'int'. If you make a NumPy array in Python, then you
do get the expected type of 'int':

>>> b = array([[0,0],[0,0]])
>>> type(b[0,0])
<type 'int'>

Is this somehow intended?

Thanks for any info,
--
+-----------------------------------------+
| Jin Woo Park (jwpark at aeroguy.snu.ac.kr) |
| Research Assistant, Dept. Aerospace Eng.|
| Seoul National University, Korea        |
+-----------------------------------------+

From jwpark at aeroguy.snu.ac.kr Wed Mar 6 10:44:22 2002
From: jwpark at aeroguy.snu.ac.kr (Jin Woo Park)
Date: Wed Mar 6 10:44:22 2002
Subject: [Numpy-discussion] Re: problem with integer type array object
In-Reply-To: <1015437799.22162.23.camel@Maestro>
References: <1015437799.22162.23.camel@Maestro>
Message-ID: <1015440300.22162.29.camel@Maestro>

Guess I didn't read the doc carefully. I just found out that a Python
int is equivalent to a C long, and PyArray_INT doesn't have a
corresponding Python scalar type. Sorry for a 'dumb' question.
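The typecode correspondence in one place (a sketch following the explanation above):

    import Numeric

    b = Numeric.zeros((2, 2))      # default typecode 'l': C long,
                                   # the type behind Python ints
    print type(b[0, 0])            # <type 'int'>

    c = Numeric.zeros((2, 2), 'i') # C int, as with
                                   # PyArray_FromDims(..., PyArray_INT)
    print type(c[0, 0])            # <type 'array'> -- a rank-0 array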
From perry at stsci.edu Wed Mar 6 10:49:05 2002
From: perry at stsci.edu (Perry Greenfield)
Date: Wed Mar 6 10:49:05 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To:
Message-ID:

Travis Oliphant writes:
> ... if we establish that .M always returns a matrix for arrays < 2d,
> then we gain consistency.
>
> I've made this change and am ready to commit it to the Numeric tree,
> unless there are strong objections.
> ...
> What do you think?
>
> -Travis Oliphant

I'd have to agree with Konrad and Paul on this one. While it makes
simple matrix expressions clearer, it opens a whole can of worms that
were discussed (and never resolved) a couple of years ago. Suppose I do
this:

    x = a.M * libfunc(b.M, c.M)

where libfunc is a 3rd-party module written in Python that was written
assuming that operators were elementwise operators. It may silently do a
matrix multiplication (depending on the shapes) instead of the intended
elementwise multiplication. Yet the usage above looks just as legitimate
as

    x = a.M * b.M

In other words, it raises the issue of having incompatible modules, some
written with Numeric objects in mind, others with Matrix objects in
mind. Conceivably there will be modules useful for both kinds of
objects. Do we need to support two kinds? How do we deal with this?

This is still a problem if we don't allow the .M attribute but still
have widespread usage of an array object with different behavior for
operations. Unlike masked arrays, whose basic behavior is unchanged for
"good" data, the behavior for identical data is completely different.

I wish I had a good answer for this. I don't remember all of the past
suggestions, but it looks like one of the following solutions is needed:

1) Campaign for new operators in Python (there have been various
proposals to this effect). This is probably best from the Numeric point
of view (maybe not from Python's in general, though).

2) Allow different array classes with different behavior, but come up
with conventions and utilities for library developers to produce
versions of arrays compatible with the convention assumed by the module
(and convert back to the input type for output values). This doesn't
prevent all opportunities for confusion and errors, however, and it puts
a stronger burden on library developers.

3) Do nothing and deal with the resulting mess. Perhaps the two camps
have little need for each other's tools and it won't be much of a
problem. Do option 2 retroactively if it is a problem.

Other suggestions?

Perry

From oliphant at ee.byu.edu Wed Mar 6 10:50:08 2002
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Wed Mar 6 10:50:08 2002
Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking
In-Reply-To:
Message-ID:

> Please, no. There are many short names you could use that would avoid
> overloading the [] operator. Especially in Python, where one cannot
> trivially decide the type of a variable, the behavior should change as
> little as possible as the type of each variable changes.

Right now, what I'm suggesting fails with an error. Everyone talks about
things not changing when types change, but this is actually almost never
the case: there is almost always different behavior if the objects have
different types. Are you opposed to anything going inside the []
operator to help indicate how the objects inside should be interpreted?

> Here, the indexing operation changes completely if you change the last
> index from an int to an array. That means you have to execute the code
> to understand it -- one can't just look and assume from local syntax.

No, only a very specific kind of array. Currently, such a change gives
you an error. And if the array were in the wrong place, it would also
give you an error. Your concerns seem motivated by not really
understanding the suggested change. Could you provide me with other
examples that show precisely what you mean?

> Besides, you know some idiot is going to eventually write code that
> looks like this:
>
>     def access(a, b, x):
>         return a[b, x]   # I think that a must be a 2-D array...
>
>     # 1000 lines later...
>     access(a, _I)        # Whoops, all my assumptions were wrong...

I have no idea what your concern is here. This would result in an error
both currently and under the scheme I suggested.

-Travis

From tpitts at accentopto.com Wed Mar 6 11:21:37 2002
From: tpitts at accentopto.com (Todd Alan Pitts, Ph.D.)
Date: Wed Mar 6 11:21:37 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To: ; from perry@stsci.edu on Wed, Mar 06, 2002 at 01:48:15PM -0500
Message-ID: <20020306122211.A32137@fermi.accentopto.com>

I often (perhaps inappropriately) fall into the "silent user" category.
However, many of those in this conversation have put significant effort
into Python development, and the least I can do is offer a comment from
the standpoint of someone who uses Python and Numeric extensively.
Perhaps I am stepping into the middle of a conversation here -- I hope I
have read all the relevant material.

People may "like" Matlab syntax because it requires less typing or
because it pleases them aesthetically. I personally feel that explicit
function-based operators (like transpose()) are very clear and
unambiguous. While I understand the desire to have the code and the
"math" look similar, I think that in general this leads to the same kind
of difficulty one has with notation in mathematics -- notation that
works well in some fields is extremely cumbersome in others. I don't
expect the code to look like an equation. I find orderly, predictable
behavior that doesn't send me to the source code too often to figure out
what is happening very helpful.

Treating 1d or 2d arrays as matrices is admittedly *very* useful in some
applications but cumbersome in others. This problem is reminiscent of
the "clash" between the PIL and Numeric modules, or between the
C-language row-major matrix storage format and the (in my opinion)
better-thought-out FORTRAN column-major matrix storage format. These
differences place limitations on the potential for synergistic profits
in the project. It is my personal experience/opinion that "convenience"
methods are best added in a specific application that is not intended to
be released generally.

-Todd
> This is probably best from the Numeric point of view (maybe not from > Python in general though). > 2) Allow different array classes with different behavior, but come up with > conventions and utilities for library developers to produce versions of > arrays compatible with the convention assumed for the module (and convert > back to the input type for output values). This doesn't prevent all > opportunities for confusion and errors however. It also puts a stronger > burden on library developers. > 3) Do nothing and deal with the resulting mess. Perhaps the two camps have > little need for each other's tools and it won't be much of a problem. > Do option 2 retroactively if it is a problem. > > Other suggestions? > > Perry > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From hinsen at cnrs-orleans.fr Wed Mar 6 11:25:07 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed Mar 6 11:25:07 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: References: Message-ID: "Perry Greenfield" writes: > discussed (and never resolved) a couple years ago. Suppose I do this: > > x = a.M * libfunc(b.M, c.M) > > where libfunc is a 3rd party module written in Python that was written > assuming that operators were elementwise operators. It may silently Then you are calling a routine with the wrong arguments - that can happen in Python all the time. From my point of view, arrays and matrices are two entirely different things. A function written for matrix objects cannot be expected to work with array objects, and vice versa. Matrix operations should return matrix objects, and array operations should return array objects. What arrays and matrices have in common is not semantics, but implementation. That is something that implementors should profit from, but users shouldn't even need to know about. The discussion about matrices has focused on matrix multiplication as the main difference between the two objects. I suppose this was motivated by comparisons to Matlab and similar environments, which do not have the notion of data types and thus cannot properly distinguish between matrices and arrays. I don't see why we should follow this limited approach. A matrix object should not only do matrix multiplication properly, but also provide methods such as diagonalization, application of functions as matrix functions, etc. That would be much more than syntactic sugar, it would be a real implementation of the mathematical concept "matrix". Seen from this point of view, it is not at all clear why an array should have an attribute that is an "equivalent" matrix, as no such equivalence exists in general (only for 2D arrays). > Conceivably there will be modules useful for both kinds of objects. Do I don't think so. The only analogous operations between arrays and matrices are addition, subtraction, negation, and multiplication with a scalar, and those would use the same syntax anyway. Konrad.
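For illustration, a minimal sketch of the separation Konrad describes -- two types sharing an array implementation but refusing to mix implicitly. It assumes Numeric is installed; the Matrix class and its toarray method are hypothetical, not anything shipped with Numeric:

import Numeric

class Matrix:
    """Matrix semantics on top of an array implementation (sketch only)."""
    def __init__(self, data):
        self.data = Numeric.array(data)      # shared implementation: a 2-d array
    def __mul__(self, other):
        if not isinstance(other, Matrix):    # no implicit mixing with plain arrays
            raise TypeError, "can only multiply Matrix by Matrix"
        return Matrix(Numeric.dot(self.data, other.data))
    def toarray(self):                       # the explicit conversion facility
        return self.data

a = Matrix([[1., 2.], [3., 4.]])
b = Matrix([[0., 1.], [1., 0.]])
c = a * b                  # matrix multiplication
d = c.toarray() * 2.0      # explicit conversion back to array semantics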
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From paul at pfdubois.com Wed Mar 6 11:38:51 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Wed Mar 6 11:38:51 2002 Subject: [Numpy-discussion] Final prep for 21.0 Message-ID: <000001c1c546$45df76e0$1001a8c0@NICKLEBY> I have committed to CVS that which I expect to become 21.0. The final set of changes is a revision to the way we use distutils to make RPMs. I am unqualified to test these but the submitter (Vermeulen) did. Please update your CVS and test this version. Developers, please make no further commits until I make the release. I will make the release this weekend unless I receive advice to the contrary from testers. From paul at pfdubois.com Wed Mar 6 11:49:22 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Wed Mar 6 11:49:22 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: <000801c1c547$dcfdb950$1001a8c0@NICKLEBY> I believe the correct solution is a major upgrade to Matrix.py along the lines of what is done in MA; that is, to craft an object that uses Numeric for its implementation but which defines all its own operators in a manner that is semantically sensible for the type of object it is. Such an upgrade could subsequently be improved by using different underlying software for various operations, or even more sophisticated changes such as using a transposed attribute to lazily evaluate transposes in a cleaner way than Numeric does it. Also an upgrade to Numarray would then be virtually painless. If you have never looked at MA, please examine source file Packages/MA/Lib/MA.py before commenting. This file is fairly complex and the required changes to Matrix.py would be considerably simpler; but you can verify that it is fairly straightforward to do. On my project we have done something similar to create a "climate data variable" object. Such a design includes an "exit" function to allow the instance to cheaply view itself as the underlying Numeric array. (In MA, this is "filled", which makes a Numeric array by replacing missing values, but if there are no missing values returns the underlying Numeric array). I'm willing to do this for the community but it would have a side effect; if anyone has been doing "from Matrix import *" they would suddenly get a lot more names imported that would conflict with any imported from Numeric. From pearu at cens.ioc.ee Wed Mar 6 12:00:31 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Wed Mar 6 12:00:31 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: On Wed, 6 Mar 2002, Perry Greenfield wrote: > Other suggestions? Here is one suggestion that is based on the observation that all we need is an easy way to tell that the following operation should be applied as a matrix operation. So, the suggestion is to provide an attribute or a member function that returns an array (note, not a Matrix instance) that has a bit, call it asmatrix, set true but _only_ temporarily. The bit is cleared on every operation.
And before applying an operation, the corresponding method (currently there seem to be only four relevant methods: __mul__, __pow__ and their r-versions) checks if either of the operands has the asmatrix bit true, then performs the corresponding matrix operation, otherwise the default element-wise operation. And before returning, it clears the asmatrix bit. For the sake of an example, let .m be a Numeric array attribute that when accessed sets asmatrix=1 and returns the array. Examples:

a * b - is element-wise multiplication
a.m * b, a * b.m - are matrix multiplications; the resulting array, as well as a and b, have asmatrix=0
a.m ** -1 - is matrix inverse
sin(a) - element-wise sin
sin(a.m) - matrix sin

To summarize the main ideas:
* array has an asmatrix bit that most of the time is false.
* there is a way to set the asmatrix bit true, either by .m or .M attributes or .m(), .M(), .. methods that return the same array.
* __mul__, __pow__, etc. methods check if either operand has asmatrix true, then perform the corresponding matrix operation, otherwise the corresponding element-wise operation.
* all operations clear the asmatrix bit.

So, what do you think? Pearu From perry at stsci.edu Wed Mar 6 12:43:59 2002 From: perry at stsci.edu (Perry Greenfield) Date: Wed Mar 6 12:43:59 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: Pearu writes: > > Here is one suggestion that is based on the observation that all we need > is an easy way to tell that the following operation should be applied > as a matrix operation. So, the suggestion is to provide an attribute or a > member function that returns an array (note, not a Matrix instance) that > has a bit, call it asmatrix, set true but _only_ temporarily. The bit is > cleared on every operation. And before applying an operation, the > corresponding method (currently there seem to be only four relevant > methods: __mul__, __pow__ and their r-versions) checks if either of > the operands has the asmatrix bit true, then performs the corresponding matrix > operation, otherwise the default element-wise operation. And before > returning, it clears the asmatrix bit. > > For the sake of an example, let .m be a Numeric array attribute that when > accessed sets asmatrix=1 and returns the array. > Examples: > > a * b - is element-wise multiplication > a.m * b, a * b.m - are matrix multiplications; the resulting > array, as well as a and b, have asmatrix=0 > a.m ** -1 - is matrix inverse > sin(a) - element-wise sin > sin(a.m) - matrix sin > > To summarize the main ideas: > * array has an asmatrix bit that most of the time is false. > * there is a way to set the asmatrix bit true, either by .m or .M > attributes or .m(), .M(), .. methods that return the same array. > * __mul__, __pow__, etc. methods check if either operand has asmatrix > true, then perform the corresponding matrix operation, otherwise > the corresponding element-wise operation. > * all operations clear the asmatrix bit. > > So, what do you think? > > Pearu > This is a clever idea that reminds me of something we were considering for something else (exactly what I can't quite remember :-). But like all such schemes it still does produce an object, and a user might infer (reasonably or unreasonably) that if they can type x = a.m * b then they can treat a.m as an array, e.g., x = a.m In that case, the special-case behavior still becomes camouflaged when x is used later, albeit only to bite you once.
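To make that worry concrete, here is a hypothetical sketch of how the stale bit could surface. This assumes Pearu's proposed design, which exists nowhere in Numeric:

x = a.m          # sets the asmatrix bit on the array and returns it
# ... no operation runs here, so nothing clears the bit ...
y = x * b        # silently a matrix multiply: the stale bit fires now,
                 # is cleared, and element-wise behavior resumes afterwards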
It is clear what the operation does when the attribute is used in the expression, but if it isn't, there is still room for confusion. I like the idea, but I'll have to think about whether the downside outweighs the benefits. Perry From perry at stsci.edu Wed Mar 6 12:49:06 2002 From: perry at stsci.edu (Perry Greenfield) Date: Wed Mar 6 12:49:06 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: Konrad Hinsen writes: > > discussed (and never resolved) a couple years ago. Suppose I do this: > > > > x = a.M * libfunc(b.M, c.M) > > > > where libfunc is a 3rd party module written in Python that was written > > assuming that operators were elementwise operators. It may silently > > Then you are calling a routine with wrong arguments - that can happen > in Python all the time. > > From my point of view, arrays and matrices are two entirely different > things. A function written for matrix objects cannot be expected to > work with array objects, and vice versa. Matrix operations should > return matrix objects, and array operations should return array > objects. > > What arrays and matrices have in common is not semantics, but > implementation. That is something that implementors should profit > from, but users shouldn't even need to know about. > > The discussion about matrices has focused on matrix multiplication as > the main difference between the two objects. I suppose this was > motivated by comparisons to Matlab and similar environments, which do > not have the notion of data types and thus cannot properly distinguish > between matrices and arrays. I don't see why should follow this > limited approach. > > A matrix object should not only do matrix multiplication properly, but > also provide methods such as diagonalization, application of functions > as matrix functions, etc. That would be much more than syntactic > sugar, it would be a real implementation of the mathematical concept > "matrix". > > Seen from this point of view, it is not at all clear why an array > should have an attribute that is an "equivalent" matrix, as no such > equivalence exists in general (only for 2D arrays). > Not an unreasonable position. Are you also arguing that the two types should know about each other and raise an exception if there is an attempt to mix them in operations? Perry From bsder at allcaps.org Wed Mar 6 12:50:18 2002 From: bsder at allcaps.org (Andrew P. Lentvorski) Date: Wed Mar 6 12:50:18 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: <20020306123153.U55509-100000@mail.allcaps.org> Well, I believe that it solves the wrong problem. What I really want are Matrix objects that stay as Matrix objects even through their associated functions. And Array objects that stay as array objects. Why add any characters or casts at all when the objects can stay in their original type? Please correct me if I'm missing something here. -a On Tue, 5 Mar 2002, Travis Oliphant wrote: > > Recently there has been discussion on the list about the awkwardness of > matrix syntax when using Numeric Python. > > Matrix expressions can be awkard to express in Numeric which is a negative > mark on an otherwise excellent computing environment. > > Currently part of the problem can be solved by working with Matrix objects > explicitly: > > a = Matrix.Matrix("[1 2 3; 4 5 6]") # Notice the strings. 
> > However, most operations return arrays which have to be recast to matrices > using at best a character with parenthesis: > > M = Matrix.Matrix > > M(sin(a)) * M(cos(a)).T > > The suggestion was made to add ".M" as an attribute of arrays which returns a > matrix. Thus, the code above can be written: > > sin(a).M * cos(a).M.T > > While some aesthestic simplicity is obtained, the big advantage is in > consistency. Somebody else may decide that > > P = Matrix.Matrix is a better choice. But, if we establish that > > .M always returns a matrix for arrays < 2d, then we gain consistency. > > I've made this change and am ready to commit the change to the Numeric tree, > unless there are strong objections. I know some people do not like the > proliferation of attributes, but in this case the notational convenience it > affords to otherwise overly burdened syntax and the consistency it allows > Numeric to deal with Matrix equations may be worth it. > > What do you think? > > -Travis Oliphant > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From oliphant at ee.byu.edu Wed Mar 6 13:01:39 2002 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Mar 6 13:01:39 2002 Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking In-Reply-To: Message-ID: > Like Greg I'm wary of having many different interpretations > for indexing behavior (I'm not even that crazy about having > numarray handle boolean index arrays differently than the others > --something we haven't implemented yet, and perhaps we shouldn't). > You may be wary, but there are already multiple ways people think about using integers to index arrays. I'm trying to suggest a facility that allows several different interpretations of array access. > Before discussing the merits of this, shouldn't we take the attitude > that absence of feedback is not necessarily equivalent to approval, > particularly for something that affects the public interface of > the module? I would feel better about this if I saw several > affirming the need for such features rather than few openly > opposing it. > I do have this view. I'm not changing anything, right now. Well, I affirm that this is one of the drawbacks of Numeric as compared with other array-oriented environments. We definitely need a way to index an array using integers and masks. I guess if nobody else feels this way, then I'm alone in my discomfort. > But if one were to do something like this, I would use a different kind > of object than 0d arrays, e.g., an instance of a class defined for just > that purpose. We could do that as well. > You would really want to make sure that no data could > mistakenly be interpreted as a flag, even if the chances were remote. > I would also not use an underscore as the beginning of the name. I'm not particularly wedded to _I notation, it was just a start. > Maybe > I'm wrong about this, but I've come to take that to mean its a private > variable that should not be used by users of the module, and that usage > would confuse that. Finally, the name of the flag should be descriptive > (e.g. MaskInd). > > But there could be better alternatives. As an example, > > x[nonzero(maskarray)] instead of x[maskarray, MaskInd] I've thought about that, too, it would work if nonzero returned some class that stored away (but didn't copy) the maskarray info. 
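For reference, the two access styles under discussion can already be approximated with Numeric's take and compress functions -- a small sketch, with arbitrary example data:

import Numeric

a = Numeric.array([10, 20, 30, 40, 50])
idx = Numeric.array([1, 3])              # integer-array access
mask = Numeric.array([1, 0, 1, 0, 1])    # mask-style access

print Numeric.take(a, idx)        # -> [20 40]
print Numeric.compress(mask, a)   # -> [10 30 50]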
-Travis From pearu at cens.ioc.ee Wed Mar 6 13:03:40 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Wed Mar 6 13:03:40 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: On Wed, 6 Mar 2002, Perry Greenfield wrote: > This is a clever idea that reminds me of something we were considering > for something else (exactly what I can't quite remember :-). But like > all such schemes it still does produce an object, and a user might > infer (reasonably or unreasonably) that if they can type > > x = a.m * b > > then they can treat a.m as an array, e.g., > > x = a.m Yes, I was also thinking about the same issue. If we could somehow convince users that the .m attribute comes only with operations, e.g. a.m * b, then it should be safe... Note that in order to use the .m feature, a user must read about it somewhere, say, from a tutorial. And it should be noted explicitly where _not_ to use the .m feature, that is, in assignments and in function arguments, in order to avoid any unwanted side effects. Actually, library functions probably use asarray for arguments; this function could also clear the asmatrix bit. Pearu From pearu at cens.ioc.ee Wed Mar 6 13:05:57 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Wed Mar 6 13:05:57 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: On Wed, 6 Mar 2002, Pearu Peterson wrote: > Actually, library functions probably use asarray for arguments; this > function could also clear the asmatrix bit. Ok, ignore this remark. Library functions actually would use this bit for triggering different operations, e.g. sin(a), sin(a.m). Pearu From oliphant at ee.byu.edu Wed Mar 6 13:06:50 2002 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Mar 6 13:06:50 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: > > Other suggestions? > > Here is one suggestion that is based on the observation that all we need > is an easy way to tell that the following operation should be applied > as a matrix operation. So, the suggestion is to provide an attribute or a > member function that returns an array (note, not a Matrix instance) that > has a bit, call it asmatrix, set true but _only_ temporarily. The bit is > cleared on every operation. And before applying an operation, the > corresponding method (currently there seem to be only four relevant > methods: __mul__, __pow__ and their r-versions) checks if either of > the operands has the asmatrix bit true, then performs the corresponding matrix > operation, otherwise the default element-wise operation. And before > returning, it clears the asmatrix bit. > Frankly, I like this kind of proposal. I disagree with Konrad about the separation between arrays and matrices. From my discussions with other people, it sounds like this is actually a point of disagreement for many in the broader community. To me, matrices are just arrays of rank <=2 which should be interpreted with their specific algebra. > For the sake of an example, let .m be a Numeric array attribute that when > accessed sets asmatrix=1 and returns the array. > Examples: > > a * b - is element-wise multiplication > a.m * b, a * b.m - are matrix multiplications; the resulting > array, as well as a and b, have asmatrix=0 > a.m ** -1 - is matrix inverse > sin(a) - element-wise sin > sin(a.m) - matrix sin > > To summarize the main ideas: > * array has an asmatrix bit that most of the time is false.
> * there is a way to set the asmatrix bit true, either by .m or .M > attributes or .m(), .M(), .. methods that return the same array. > * __mul__, __pow__, etc. methods check if either operand has asmatrix > true, then perform the corresponding matrix operation, otherwise > the corresponding element-wise operation. > * all operations clear the asmatrix bit. > Again, I wouldn't mind it, but I suspect the more aesthetically critical on the list will dislike it because it blurs the (currently clumsy) distinction between arrays and Matrices that I'm beginning to see people actually like. -Travis From pearu at cens.ioc.ee Wed Mar 6 13:25:25 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Wed Mar 6 13:25:25 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: <000801c1c547$dcfdb950$1001a8c0@NICKLEBY> Message-ID: On Wed, 6 Mar 2002, Paul F Dubois wrote: > If you have never looked at MA, please examine source file > Packages/MA/Lib/MA.py before commenting. This file is fairly complex and You mean those who never used MA should take a day off to read 2,000 lines of code in order to understand the implications of using MA and give a comment? ;-) Indeed, I have never used MA, but at first look it does not look too promising regarding performance: e.g. there seems to be lots of Python code involved to apply a simple multiplication of arrays. Could someone more familiar with MA give a comment on performance issues, especially keeping in mind number crunchers? Pearu From paul at pfdubois.com Wed Mar 6 13:52:07 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Wed Mar 6 13:52:07 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: <000101c1c559$1ce27950$1001a8c0@NICKLEBY> Travis wrote: To me, matrices are just arrays of rank <=2 which should be interpreted with their specific algebra. -- If a class is roughly data plus behaviors, a matrix is not simply an array of rank <=2. You can express the concept of a matrix most cleanly as a separate class. Adding an argumentless member function .M to "convert" from one class to the other, and not make the other class explicit, is a bit weird. But if the other class "Matrix" is explicit, you needn't give it a privileged status with respect to Numeric.array by having a member function in Numeric.array that amounts to a Matrix constructor. The only real motivation for that seems to me to be the feeling that M(x) is somehow less clear than x.M. Note that except for a tricky property behavior, you really ought to have to write the latter as x.M(). As I said, I think we can beef up Matrix to make the linear algebra freaks happy, even to making things like transpose(A)*(B) optimized operations. From gpk at bell-labs.com Wed Mar 6 14:19:01 2002 From: gpk at bell-labs.com (Greg Kochanski) Date: Wed Mar 6 14:19:01 2002 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #406 - 13 msgs References: Message-ID: <3C869538.202DC99F@bell-labs.com> > Are you opposed to anything going inside the [] operator to help indicate > how the objects inside should be interpreted? I'm opposed to overloading any operator so that it becomes confusing. Confusion is generated when a look at a small part of code (a line or two) does not tell you what the code is doing. This is a real problem for languages like Python and Perl, where any variable can contain any data, IF the code behaves in different ways, dependent on the data type.
"Different" is defined by the user, and it roughly translates into the amount of coffee you have to drink to fix the code, if the input data changes it's type. > > Besides, you know some idiot is going to eventually write > > code that looks like this: > > > > def access(a, b, x): > > return a[b, x] # I think that a must be a 2-D array... > > > > # 1000 lines later... > > access(a, _I) # Whoops all my assumptions were wrong... > > I have no idea, what your concern is here. This would result in an error > currently and under the scheme I suggested. Then, perhaps you should explain again. I assumed you were proposing a magic second argument to [] that would cause the first argument to be interpreted differently. If so, I think that's a bad idea from a human-interface point of view, because (a) To a person who only uses Numpy occasionally, it is not obvious that an argument is "magic". That makes the code less readable. (b) It is possible to write code where one can pass in the "magic" value in a variable, and no simple inspection of the code will tell if it is magic or not. Using an explicit function or method call fixes (a) by telling the naive user that "this is not normal array access here." It also fixes (b) by making it more obvious that fancy stuff is going on. From oliphant at ee.byu.edu Wed Mar 6 17:42:03 2002 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Mar 6 17:42:03 2002 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #406 - 13 msgs In-Reply-To: <3C869538.202DC99F@bell-labs.com> Message-ID: > > > # 1000 lines later... > > > access(a, _I) # Whoops all my assumptions were wrong... > > > > I have no idea, what your concern is here. This would result in an error > > currently and under the scheme I suggested. > > > Then, perhaps you should explain again. No you grasp it, I think your example contained errors (you only called access with two arguments rather than the three expected by the interface, for example). Thank you for explaining your concerns in more detail, below. > > If so, I think that's a bad idea from a human-interface point of view, > because > > (a) To a person who only uses Numpy occasionally, it is not obvious > that an argument is "magic". That makes the code less readable. That can be a "problem", but it is a "problem" in many languages that currently are in wide-spread use in numerical computing. Apparently, the convenience outweighs the perceived concern. Currently, whenever you use variables to index arrays you know that something is going on that you can't see by just looking at it (i.e. something fancy). For example, you can currently write a[b] and have this do different things depending on whether b is a sequence or a slice object or an integer. I don't see how the addition of another check drastically changes the current state. I actually think the flexiblility is a good thing and it makes Python very powerful. It comes down to trusting people to write code you can understand (if you have any reason to interface with them in the first place). My programming philosophy definitely leans toward empowering people, even if it means they can do something stupid later. > > (b) It is possible to write code where one can pass in the "magic" value > in a variable, and no simple inspection of the code will tell if it is > magic > or not. This is already possible (and frequently used), now. > Using an explicit function or method call fixes (a) by telling the naive > user > that "this is not normal array access here." 
> You are focusing on the naive user at the expense of convenience for the power user. I think this is appropriate sometimes, but not when we are talking about a language that somebody will use constantly for many years to implement their daily work. I think it makes the code much more readable and therefore understandable and maintainable to overload the [] operator rather than use a method call. MATLAB's big advantage over Numeric Python right now is that it allows this sort of indexing already, which Python users currently have to implement using a function call. > It also fixes (b) by making it more obvious that fancy stuff is going > on. > Whenever you see a[b] instead of a[1:3,4] you already know that something fancy is going on... Nothing would change here. From eric at enthought.com Wed Mar 6 22:05:02 2002 From: eric at enthought.com (eric) Date: Wed Mar 6 22:05:02 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. References: Message-ID: <020601c1c595$24bd76c0$6b01a8c0@ericlaptop> Boy did this one get a rise! Nice to hear so many voices. I also feel we need a more compact notation for linear algebra and would like to be able to do it without explicitly casting arrays to Matrix.Matrix objects. This attribute approach will work, but I wonder if trying the "adding an operator to Python" approach one more time would be worthwhile. At Python10 developer's day, Guido explicitly mentioned the linear algebra operator in a short comment saying something to the effect that, if the numeric community could agree on an appropriate operator, he would strongly consider the addition. He also mentioned the strangeness of the two PEPs proposed on the topic at a coffee break... I noticed the status of both PEPs is "deferred." http://python.sourceforge.net/peps/pep-0211.html This one proposes the @ operator for outer products. http://python.sourceforge.net/peps/pep-0225.html This one proposes decorating the current binary ops with some symbols to indicate that they have different behavior than the standard binary ops. This is similar to Matlab's use of * for matrix multiplication and .* for element-wise multiplication or to R's use of * for element-wise multiplication and %*% for "object-wise" multiplication. It proposes prepending ~ to operators to change their behavior so that ~* would become matrix multiply. The PEP is a little more general, but this gives the flavor. My hunch is that some form of the second (perhaps drastically reduced) would meet with more success. The suggested ~* or even the %*% operator are both palatable. Such details can be decided later. The question is whether there is sufficient interest to try and push the operator idea through? It would take much longer than choosing something we can do ourselves (like .M), but the operator solution seems more desirable to me. eric ----- Original Message ----- From: "Travis Oliphant" To: Sent: Tuesday, March 05, 2002 11:44 PM Subject: [Numpy-discussion] adding a .M attribute to the array. > > Recently there has been discussion on the list about the awkwardness of > matrix syntax when using Numeric Python. > > Matrix expressions can be awkward to express in Numeric which is a negative > mark on an otherwise excellent computing environment. > > Currently part of the problem can be solved by working with Matrix objects > explicitly: > > a = Matrix.Matrix("[1 2 3; 4 5 6]") # Notice the strings.
> > However, most operations return arrays which have to be recast to matrices > using at best a character with parenthesis: > > M = Matrix.Matrix > > M(sin(a)) * M(cos(a)).T > > The suggestion was made to add ".M" as an attribute of arrays which returns a > matrix. Thus, the code above can be written: > > sin(a).M * cos(a).M.T > > While some aesthetic simplicity is obtained, the big advantage is in > consistency. Somebody else may decide that > > P = Matrix.Matrix is a better choice. But, if we establish that > > .M always returns a matrix for arrays < 2d, then we gain consistency. > > I've made this change and am ready to commit the change to the Numeric tree, > unless there are strong objections. I know some people do not like the > proliferation of attributes, but in this case the notational convenience it > affords to otherwise overly burdened syntax and the consistency it allows > Numeric to deal with Matrix equations may be worth it. > > What do you think? > > -Travis Oliphant > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From eric at enthought.com Wed Mar 6 23:27:05 2002 From: eric at enthought.com (eric) Date: Wed Mar 6 23:27:05 2002 Subject: [Numpy-discussion] big picture? Message-ID: <022f01c1c5a0$a5364100$6b01a8c0@ericlaptop> While we're discussing several big issues, it would be worthwhile to step back and get an idea of what people feel is still missing for numeric computation in the Python language and also what is missing from Numeric itself. I'm not talking about libraries here, but issues with notation and array functionality that you run into in day-to-day programming. Things are pretty dang good right now, but there are some areas (array indexing, matrix multiply, etc.) that some people see as sub-optimal or better implemented in other languages. A list will provide a "big picture" of where we want to go (maybe we are there...) and also help us pick our battles on language changes, etc. So, what mathematical expressions are commonly used and yet difficult to write in Python? I don't mean integrals, divergence, etc. I mean things like matrix multiply and transpose. Here is a beginning to the list:

1. Matrix Multiply -- should we ask for ~*?
2. Transpose -- In a perfect world, we'd have an operator for this.
3. complex conjugate -- An operator for this would also be welcomed.
4. Others??

These three have all been discussed on this list or on the SciPy list in the last month, so they are obvious. I don't think there is a solution for 2 and 3 besides using the current function or method calls (but they are still on my list). As I mentioned in my last post, 1 might be fixable. As far as core Numeric functionality:

1. Array indexing with arrays.
2. .M attributes -- an alternative to (1) in language changes.

And I'll add a third that I'd like:

3. tensor notation indexing as in the Blitz++ array library http://www.oonumerics.org/blitz/manual/blitz03.html#l75

NewAxis and Ellipses allow for the same functionality, but the tensor notation is much easier to read. This requires yet more indexing trickery though... Please limit this thread to language changes or Numeric enhancements. Desired changes to the current behavior or interface of Numeric should be saved for a different discussion. thanks, eric -- Eric Jones Enthought, Inc.
[www.enthought.com and www.scipy.org] (512) 536-1057 From nwagner at mecha.uni-stuttgart.de Thu Mar 7 00:23:02 2002 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Thu Mar 7 00:23:02 2002 Subject: [Numpy-discussion] OverflowError: math range error Message-ID: <3C873258.C238E33D@mecha.uni-stuttgart.de> Hi, >gdb /usr/bin/python GNU gdb 20010316 Copyright 2001 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-suse-linux"...(no debugging symbols found)... (gdb) run fredholm.py Starting program: /usr/bin/python fredholm.py (no debugging symbols found)...[New Thread 1024 (LWP 13596)] (no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)... (no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)... (no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)... Traceback (most recent call last): File "fredholm.py", line 2, in ? from LinearAlgebra import * File "/usr/lib/python2.1/site-packages/Numeric/LinearAlgebra.py", line 10, in ? import MLab File "/usr/lib/python2.1/site-packages/Numeric/MLab.py", line 17, in ? import RandomArray File "/usr/lib/python2.1/site-packages/Numeric/RandomArray.py", line 30, in ? seed() File "/usr/lib/python2.1/site-packages/Numeric/RandomArray.py", line 24, in seed ndigits = int(math.log10(t)) OverflowError: math range error (no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)... Program exited with code 01. (gdb) Any idea ? Thanks in advance. Nils From hinsen at cnrs-orleans.fr Thu Mar 7 00:59:04 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Mar 7 00:59:04 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. References: Message-ID: <200203070858.g278wJJ14269@chinon.cnrs-orleans.fr> "Perry Greenfield" writes: > Not an unreasonable position. Are you also arguing that the two types > should know about each other and raise an exception if there is an > attempt to mix them in operations? No need to know about each other, they'd be different types and therefore by default incompatible. There should of course be some explicit conversion facility. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Thu Mar 7 01:12:05 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Mar 7 01:12:05 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. References: <020601c1c595$24bd76c0$6b01a8c0@ericlaptop> Message-ID: <200203070910.g279AlE14306@chinon.cnrs-orleans.fr> "eric" writes: > Matrix.Matrix objects. This attribute approach will work, but I > wonder if trying the "adding an operator to Python" approach one > more time would be worth while. 
> At Python10 developer's day, Guido If it were only one operator, perhaps, although I might even give up on Python completely if it starts to use Perlish notations like ~@!. But if you really want to have a short-hand syntax for the common matrix operations, you'd need multiplication, division (shorthand for multiplying by inverse), power, transpose and Hermitian transpose. If you want to go the "operator way", the goal should rather be something like APL, with composite operators. Matrix multiplication would then be a special case of a reduction operator that uses multiplication and addition (in APL this is written as "+.x"). Note that I am *not* suggesting this, my opinion is still that matrices and arrays should be semantically different types. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From geus at inf.ethz.ch Thu Mar 7 01:21:04 2002 From: geus at inf.ethz.ch (Roman Geus) Date: Thu Mar 7 01:21:04 2002 Subject: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines References: Message-ID: <3C8730E9.293490E8@inf.ethz.ch> Hello, Pearu Peterson wrote: > > Hi, > > On Wed, 6 Mar 2002, Roman Geus wrote: > > > So, what really needs to be changed (at least for this machine) is how > > Numerical Python calls BLAS/LAPACK. It also needs to use 32bit integers. > > So this means using 'int' instead of 'long int'. > > Having wrapped a lot of Fortran codes to Python, I agree, that Numerical > Python should use 'int' instead of, 'long'. Though I have little > influence to make this change to happen in Numeric but just agreeing with > you. > > Pearu What would be the best way to convince the NumPy developers to use 'int' instead of 'long' for Fortran integers? I would be willing to help make the necessary changes. -- Roman From hinsen at cnrs-orleans.fr Thu Mar 7 01:22:04 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Mar 7 01:22:04 2002 Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking References: Message-ID: <200203070915.g279Fhl14309@chinon.cnrs-orleans.fr> Travis Oliphant writes: > Well, I affirm that this is one of the drawbacks of Numeric as compared > with other array-oriented environments. We definitely need a way to index > an array using integers and masks. > > I guess if nobody else feels this way, then I'm alone in my discomfort. No, I basically agree, I just don't have that need immediately and therefore am less motivated to work on it. My preferred solution would be to use special objects (in the spirit of the slice object) for special indexing methods, rather than special cases of existing objects. The advantage is that any number of those can be added over time as the need arises, and there is never a risk of changing the meaning of existing code. However, I do think that this should be thought out and discussed carefully, but unfortunately I won't be able to help much due to lack of time. Konrad.
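A rough sketch of what such a slice-like index object might look like; the MaskIndex class and the subscript function are hypothetical, standing in for dispatch an array's __getitem__ could perform:

import Numeric

class MaskIndex:
    """Carries its own meaning, in the spirit of the slice object (sketch)."""
    def __init__(self, mask):
        self.mask = Numeric.array(mask)

def subscript(a, index):
    # the kind of dispatch an array's __getitem__ could do internally
    if isinstance(index, MaskIndex):
        return Numeric.compress(index.mask, a)
    return a[index]                 # ordinary indexing otherwise

a = Numeric.array([1, 2, 3, 4])
print subscript(a, MaskIndex([1, 0, 1, 0]))   # -> [1 3]
print subscript(a, 2)                         # -> 3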
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Thu Mar 7 01:24:01 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Mar 7 01:24:01 2002 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #406 - 13 msgs References: Message-ID: <200203070922.g279MUl14314@chinon.cnrs-orleans.fr> Travis Oliphant writes: > > (a) To a person who only uses Numpy occasionally, it is not obvious > > that an argument is "magic". That makes the code less readable. > > That can be a "problem", but it is a "problem" in many languages that > currently are in wide-spread use in numerical computing. > Apparently, the convenience outweighs the perceived concern. Currently, But some people, like me, prefer Python to other languages for exactly that reason. I'll give up shortness for clarity any time. So I am certainly agains "magic" objects, but different kinds of indexing objects, provided that they can be inspected/printed in the code, are nothing magic to me. At the moment, each axis index can be an integer, a range, and a slice. Adding a "boolean mask" to this seems like a natural extension. And even "reduction" operations such as Paul mentioned make perfect sense. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From bsder at allcaps.org Thu Mar 7 01:32:03 2002 From: bsder at allcaps.org (Andrew P. Lentvorski) Date: Thu Mar 7 01:32:03 2002 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #406 - 13 msgs In-Reply-To: <200203070922.g279MUl14314@chinon.cnrs-orleans.fr> Message-ID: <20020307012721.T56438-100000@mail.allcaps.org> On Thu, 7 Mar 2002, Konrad Hinsen wrote: > So I am certainly agains "magic" objects, but different kinds of > indexing objects, provided that they can be inspected/printed in the How do the new Iterator mechanisms now active in Python 2.2 play into these ideas of indexing objects? Can we get these kinds of slices by providing appropriate iterator objects and references? -a From bsder at allcaps.org Thu Mar 7 02:31:04 2002 From: bsder at allcaps.org (Andrew P. Lentvorski) Date: Thu Mar 7 02:31:04 2002 Subject: [Numpy-discussion] big picture? In-Reply-To: <022f01c1c5a0$a5364100$6b01a8c0@ericlaptop> Message-ID: <20020307014154.N56438-100000@mail.allcaps.org> On Thu, 7 Mar 2002, eric wrote: > 1. Matrix Multiply -- should we ask for ~*? > 2. Transpose -- In a perfect world, we'd have an operator for this. > 3. complex conjugate -- An operator for this would also be welcome I, personally, don't find the arguments particularly compelling for extra operators for numeric stuff. While extra operators may make code more "math-like", "MATLAB-like" or "Fortran-like", it won't help with efficiency. If I have code to compute A*x+B*y+C, I'm going to have to call out the A*x+Z and Z=B*y+C primitives as functions anyway. 
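A hedged illustration of that point: the operator form below allocates a fresh temporary at each step, while the function-call form can reuse buffers through the optional output argument that Numeric's ufuncs accept (sketch only; the variable names are arbitrary):

import Numeric

x = Numeric.arange(5.0)
y = Numeric.arange(5.0)
A, B, C = 2.0, 3.0, 4.0

r = A*x + B*y + C             # operator form: temporaries for A*x, B*y, their sum

r = Numeric.multiply(A, x)    # function form: build the result in place
t = Numeric.multiply(B, y)
Numeric.add(r, t, r)          # r = r + t, written into r (output argument)
Numeric.add(r, C, r)          # r = r + C, no further temporaries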
No set of binary operators will work out that optimization. Requesting domain specific operators actually scares me. The main problem is that it is *impossible* to remove them if your choices later turn out to be confusing or wrong. If operators must be added, I would rather see a generic operator mechanism in place in Python. Choice 1 would be a fixed set of operators getting allocated (~* ~+ ~- etc.) which the core language *does not use*. Then any domain can override with their special meaning without collapsing the base language under the weight of domain specific extensions. Now the specific domains can make their changes and only break their own users rather than the Python community at large. Choice 2 would be for a way for Python to actually adjust the interpretation semantics and introduce new operators from inside code. This is significantly trickier and more troublesome, but has the potential of being a much more generally useful solution (far beyond the realm of numerics). Furthermore, it allows people to make code look like whatever they choose. -a From hinsen at cnrs-orleans.fr Thu Mar 7 02:42:02 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Mar 7 02:42:02 2002 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #406 - 13 msgs In-Reply-To: <20020307012721.T56438-100000@mail.allcaps.org> (bsder@allcaps.org) References: <20020307012721.T56438-100000@mail.allcaps.org> Message-ID: <200203071040.g27AewT14812@chinon.cnrs-orleans.fr> > How do the new Iterator mechanisms now active in Python 2.2 play into > these ideas of indexing objects? Can we get these kinds of slices by > providing appropriate iterator objects and references? An iterator needs to be called for each element, which is probably too slow for a general "extended indexing" solution. But an iterator yielding boolean values could perhaps be a useful class of index object in some cases. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Thu Mar 7 03:23:02 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Mar 7 03:23:02 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: <000801c1c547$dcfdb950$1001a8c0@NICKLEBY> (paul@pfdubois.com) References: <000801c1c547$dcfdb950$1001a8c0@NICKLEBY> Message-ID: <200203071121.g27BLe714885@chinon.cnrs-orleans.fr> > I believe the correct solution is a major upgrade to Matrix.py along the > lines of what is done in MA; that is, to craft an object that uses > Numeric for its implementation but which defines all its own operators > in a manner that is semantically sensible for the type of object it is. That is exactly my idea as well. However, from a quick glance at MA, it seems that this solution could suffer from performance problems when done in Python. What are the real-life experiences with MA in that respect? I suppose the new type inheritance mechanisms in Python 2.2 could help to make this more efficient, but I haven't used them for anything yet. Konrad. 
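One crude way to measure that concern -- a hypothetical timing sketch comparing a raw Numeric multiply against the same multiply routed through a Python-level wrapper class (the Wrapped class is a stand-in, not MA itself):

import time
import Numeric

class Wrapped:
    # stand-in for an MA/Matrix-style wrapper: every operator is Python code
    def __init__(self, data):
        self.data = data
    def __mul__(self, other):
        return Wrapped(self.data * other.data)

a = Numeric.ones((10, 10), 'd')
w = Wrapped(a)

n = 10000
t0 = time.clock()
for i in range(n): r = a * a
t1 = time.clock()
for i in range(n): r = w * w
t2 = time.clock()
print "raw: %.2fs   wrapped: %.2fs" % (t1 - t0, t2 - t1)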
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From paul at pfdubois.com Thu Mar 7 07:57:03 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Thu Mar 7 07:57:03 2002 Subject: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines In-Reply-To: <3C8730E9.293490E8@inf.ethz.ch> Message-ID: <000001c1c5f0$8ca17920$1001a8c0@NICKLEBY> If someone is going to make the change they should change the source to use FortranInt or some similar typedef so that one ifdef could be used to change it. I believe the current lapack/blas were made by an automatic conversion tool. It is easy to make a case that they shouldn't even be in the distribution, that rather a user should install their own library. However, this is a problem on Windows, where many users do not have a development environment, and in general, because it makes the instructions for installing more complicated. So we have sort of felt stuck with it. I have no real way of convincing myself that the proposed change won't break some other platform, although it seems unlikely. -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Roman Geus Sent: Thursday, March 07, 2002 1:21 AM To: numpy-discussion at lists.sourceforge.net Subject: Re: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines Hello, Pearu Peterson wrote: > > Hi, > > On Wed, 6 Mar 2002, Roman Geus wrote: > > > So, what really needs to be changed (at least for this machine) is > > how Numerical Python calls BLAS/LAPACK. It also needs to use 32bit > > integers. So this means using 'int' instead of 'long int'. > > Having wrapped a lot of Fortran codes to Python, I agree, that > Numerical Python should use 'int' instead of, 'long'. Though I have > little influence to make this change to happen in Numeric but just > agreeing with you. > > Pearu What would be the best way to convince the NumPy developers to use 'int' instead 'long' for Fortran integers? I would be willing to help making the necessary changes. -- Roman _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From oliphant at ee.byu.edu Thu Mar 7 09:11:08 2002 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Mar 7 09:11:08 2002 Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking In-Reply-To: <200203070915.g279Fhl14309@chinon.cnrs-orleans.fr> Message-ID: > No, I basically agree, I just don't have that need immediately and > therefore am less motivated to work on it. > > My preferred solution would be to use special objects (in the spirit > of the slice object) for special indexing methods, rather than special > cases of existing objects. The advantage is that any number of those > can be added over time as the need arises, and there is never a risk > of changing the meaning of existing code. > Thanks for the comments you have made. I always appreciate them. Are you suggesting something like: b = IndexArray([1,3,10,100]) a[b]? This is really not much different than. 
a[[1,3,10,100],IndexArray] which is essentially what I've suggested (I was looking for shortcuts), but in principle IndexArray could be a class with a method that the code in Numeric interfaces with. > However, I do think that this should be thought out and discussed > carefully, but unfortunately I won't be able to help much due to lack > of time. > Thanks for participating thus far. -Travis From hinsen at cnrs-orleans.fr Thu Mar 7 10:32:04 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Mar 7 10:32:04 2002 Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking In-Reply-To: (message from Travis Oliphant on Thu, 7 Mar 2002 10:13:05 -0500 (EST)) References: Message-ID: <200203071830.g27IUxA16088@chinon.cnrs-orleans.fr> > Are you suggesting something like: > > b = IndexArray([1,3,10,100]) > > a[b]? Exactly. With IndexArray being some special object (if only a thin wrapper), that prints differently from a simple array and can be type-tested. > This is really not much different than. > > a[[1,3,10,100],IndexArray] Except that in the first case, there is exactly one indexing object per axis, the operation can be a different one along each axis, and the index object carries its own meaning. But the effect is the same, of course. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From perry at stsci.edu Thu Mar 7 17:29:02 2002 From: perry at stsci.edu (Perry Greenfield) Date: Thu Mar 7 17:29:02 2002 Subject: [Numpy-discussion] big picture? One proposal In-Reply-To: <022f01c1c5a0$a5364100$6b01a8c0@ericlaptop> Message-ID: Eric makes a good point about stepping back and thinking about these issues in a broader context. Along these lines I'd like to make a proposal and see what people think. I think Konrad made a very good point about matrix vs array representation. If we made it illegal to combine them in expressions without explicit conversions, we could prevent much confusion about what kind of operations would be performed. An attempt to use one kind in place of the other would trigger an exception and thus users would always know when that was a problem. Implementing this behavior in numarray would be simple, as would having both share the same implementation for common operations (without any extra performance penalty). That still leaves the question of how to do the conversions, i.e., one of the following options:

matrix(a) * b # matrix multiply of array (a) with matrix (b)
a.M * b
a.M() * b

likewise:

a * array(b) # element-wise multiply of array (a) with matrix (b)
a * b.A
a * b.A()

I strongly prefer the first (functional) form. Rick White has also convinced me that this alone isn't sufficient. There are numerous occasions where people would like to use matrix multiply, even in a predominantly "array" context, enough so that this would justify a special operator for matrix multiplication. If the Numeric community is united on this, I think Guido would be receptive. We might suggest a particular operator symbol or pair (triple) but leave him some room to choose alternatives he feels are better for Python (he could well come up with a better one).
It would be nice if it were a single character (such as @) but I'd be happy with many of the other operator suggestions (~*, (*), etc.) Note that this does not imply we don't need a separate matrix object. I think it is clear that simply providing a matrix multiply operator is not going to answer all their needs. As to the other related issues that Eric raises, in particular: operators for transpose and complex conjugate, I guess I don't see these as so important. Both of these are unary operators, and as such either of the following options does not seem to be notationally much worse (whereas using binary functions in place of binary operators is much less readable):

transpose(x) conjugate(x)
x.transpose() x.conjugate()
x.T() x.C()
x.T x.C

(Personally, I prefer the first two) Perry From hinsen at cnrs-orleans.fr Fri Mar 8 00:06:04 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Mar 8 00:06:04 2002 Subject: [Numpy-discussion] big picture? One proposal References: Message-ID: <200203080804.g2884uY16995@chinon.cnrs-orleans.fr> "Perry Greenfield" writes: > That still leaves the question of how to do the conversions, i.e., one > of the following options ... > I strongly prefer the first (functional) form. Me too. I wouldn't call it "functional" though, it's exactly the way object constructors are written. > Rick White has also convinced me that this alone isn't sufficient. > There are numerous occasions where people would like to use matrix > multiply, even in a predominantly "array" context, enough so that this > would justify a special operator for matrix multiplication. If the Could you summarize those reasons please? I know that there are applications of matrix multiplication in array processing, but in my experience they are rare enough that writing dot(a, b) is not a major distraction. Maybe we need to take another step back as well: Python is a general-purpose language, with several specialized subcommunities such as ours, some of them even more important in size. Most likely they are having similar discussions. Perhaps the database guys are discussing why they need two more special operators for searching and concatenating databases. I don't think such requests are reasonable. It is tempting to think that it doesn't matter: if you don't need that operator, you just don't use it. But a big advantage of Python is readability. If we get our (well, *yours*, I don't want it ;-) matrix multiply operator, a month later someone will decide that it's just great for his database application, and the database community will have to get used to it as well. > Numeric community is united on this, I think Guido would be receptive. > We might suggest a particular operator symbol or pair (triple) but Actually I feel quite safe: there might be a majority for another operator, but I don't expect we'd ever agree on a symbol :-) Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From pearu at cens.ioc.ee Fri Mar 8 02:27:13 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Fri Mar 8 02:27:13 2002 Subject: [Numpy-discussion] big picture?
In-Reply-To: <200203080804.g2884uY16995@chinon.cnrs-orleans.fr>
Message-ID:

Hi,

On Fri, 8 Mar 2002, Konrad Hinsen wrote:

> > Numeric community is united on this, I think Guido would be receptive.
> > We might suggest a particular operator symbol or pair (triple) but
>
> Actually I feel quite safe: there might be a majority for another
> operator, but I don't expect we'd ever agree on a symbol :-)

Thanks Konrad for this excellent point. Seeing all these proposals about solving our matrix-array issues also makes me feel safer with the current situation. My general point is that a _good_ solution is simple; if a solution is not simple, then it is probably a bad solution. I find separating array and matrix instances (in the sense of raising an exception when doing <matrix> <op> <array>) not a very simple solution: New concepts are introduced that actually do not solve the simplicity problem of representing matrix operations. As I see it, they only introduce restrictions and the main assumption behind the rationale is that "users are dumb and they don't know what is best for them". This is how I interpret the raised exception as behind the scenes matrix and array are the same (in the sense of data representation).

Let me remind at least to myself that the discussion started from a proposal that aimed to write the matrix multiplication of arrays, Numeric.dot(a,b), in somewhat simpler form. The current solution is to use the Matrix class that, being just a wrapper of arrays, redefines the __mul__ method; after one has defined

    a = Matrix.Matrix(a)

the matrix multiplication of arrays looks simple:

    a * b

Travis's proposal was to reduce the first step and have it inside an expression in a short form:

    a.M * b

(there have been two implementation approaches proposed for this: (i) a.M returns a Matrix instance, (ii) a.M returns the same array with a temporarily set bit saying that the following operation is somehow special). To me, this looks like a safe solution. Though it is a hack, at least it is simple and understandable anywhere it is used (having a * b where b can be either matrix or array, it is not predictable from just looking at the code what the result will be -- not very pythonic indeed). The main objection to this proposal seems to be that it deviates from a good pythonic style (ie don't mess with attributes in this way). I'd say that if python does not provide a good solution to our problem, then we are entitled to deviate from a general style. After all, in doing numerics the efficiency issue has a rather high weight. And a generally good style of Python cannot always support that. I guess what I am missing here is the goal of getting Numeric or numarray joined to core Python. With this perspective the only efficient solution seems to be introducing a new operator (or new operators). Few candidates have been proposed:

    a ~* b   - BTW, to me this looks like dot(conjugate(a),b).
    a (*) b  - note the in situ version of it: a (*)= b
    a (**) b - looks ugly enough? ;-)

Actually, why not a [*] b, a {*} b for direct products of matrices (BTW (*) seems more appropriate here). So, my ideal preference would be:

    a .* b   - element-wise multiplication of arrays, 2nd pref.: a * b
    a * b    - matrix multiplication of arrays, 2nd preference: a [*] b
    a (*) b  - direct matrix multiplication (also known as the tensor product) of arrays
    a~       - conjugate of arrays
    a`       - transpose of arrays

This looks great but requires many new features to Python (new operators, the concept of right-hand unary operator).
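For reference, everything in this wish list is already expressible today, just verbosely; a sketch in plain Numeric, with the wished-for spellings in the comments:

    import Numeric

    a = Numeric.array([[1., 2.], [3., 4.]])
    b = Numeric.array([[5., 6.], [7., 8.]])

    elementwise = a * b                         # wished: a .* b (today's * is already element-wise)
    matmul      = Numeric.dot(a, b)             # wished: a * b (or a [*] b)
    tensor      = Numeric.multiply.outer(a, b)  # wished: a (*) b (a Kronecker layout would need a reshape)
    conj        = Numeric.conjugate(a)          # wished: a~
    transp      = Numeric.transpose(a)          # wished: a`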
I don't think that Python should introduce these new operators just because of the Numeric community. It is fine if they get used in other fields as well that suffer from the lack of operators.

About unary operations: transpose and conjugate. BTW, in complex linear algebra their composition is an equally frequent operation. Let me propose the following solution: to have

    a ** T    for Numeric.transpose(a)
    a ** H    for Numeric.transpose(Numeric.conjugate(a))

define

    T = TransposeOp()
    H = TransposeOp(conjugate=1)

where

    class TransposeOp:
        def __init__(self, conjugate=0):
            self.conjugate = conjugate
        def __rpow__(self, arr):
            if self.conjugate:
                return Numeric.transpose(Numeric.conjugate(arr))
            return Numeric.transpose(arr)

Looks Pythonic to me;-)
Regards,
Pearu

From hinsen at cnrs-orleans.fr Fri Mar 8 03:38:18 2002
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Fri Mar 8 03:38:18 2002
Subject: [Numpy-discussion] big picture? One proposal
Message-ID: <200203081137.g28BbKF17817@chinon.cnrs-orleans.fr>

> situation. My general point is that a _good_ solution is simple;
> if a solution is not simple, then it is probably a bad solution.

Agreed. So we just need to agree on what is "simple".

> I find separating array and matrix instances (in the sense of raising
> an exception when doing <matrix> <op> <array>) not a very simple solution:
> New concepts are introduced that actually do not solve the simplicity
> problem of representing matrix operations. As I see it, they

I disagree there: separating the concepts of matrix and array *does* solve the simplicity problem in my opinion. Matrices use operators for common matrix operations, and arrays use the same operators for common array operations.

> only introduce restrictions and the main assumption behind the rationale
> is that "users are dumb and they don't know what is best for them".

No, not at all. On the contrary, the rationale is "users are smart and know that arrays and matrices are different" ;-)

Unfortunately, I have the impression that there are two schools of thought in collision here (and not just when it comes to programming). There is the "mathematical" school that defines matrices and arrays as abstract entities with certain properties and associated operations. And there is the "engineering" school that sees arrays as a convenient data structure to express certain operations, of which "matrix operations" are a subset. As a student, I had a friend who studied mechanical engineering, and his math exercises made me go mad more than once. When I read "...the vector of the masses...", I just had to scream ;-) Many engineering textbooks have the same effect on me. Now obviously I belong to the "mathematical" school, but I don't expect to convert everyone else to it. So my arguments will remain pythonic and pragmatic: the "mathematical" approach solves the problem without asking for new operators, and thus has a better chance of getting realized.

> This is how I interpret the raised exception as behind the scenes matrix
> and array are the same (in the sense of data representation).

But data representation and data semantics are two different things. Readability of code depends on semantics, not on internal representations or even implementation. Using the same representation merely implies that conversion should be efficient, but not necessarily implicit.

> The main objection to this proposal seems to be that it deviates from a
> good pythonic style (ie don't mess with attributes in this way).
> I'd say that if python does not provide a good solution to our problem,
> then we are entitled to deviate from a general style. After all, in doing

That's another point where I disagree. I use Python for many different uses, numerics is only one of them (though the most important one). Uniformity of style is an important value for me. Moreover, I claim that Python *does* provide a good solution, it is merely a very different one.

> numerics the efficiency issue has a rather high weight. And a generally
> good style of Python cannot always support that.

Computational efficiency is not the issue here. If that's all you want, call a BLAS routine for matrix multiplication with two array arguments - doable today, without any modification whatsoever. Even Fortran programmers do that, instead of suggesting that Fortran 2002 should add a "multiply-by-calling-BLAS" operator.

> define
>
> T = TransposeOp()
> H = TransposeOp(conjugate=1)

Does that work? I'd expect that a**T would first call a.__pow__(T) which quite probably crashes... (Not that it matters to me, I find this almost as abusive as the matrix attributes.)

Konrad.

From peterson at math.utwente.nl Fri Mar 8 05:05:06 2002
From: peterson at math.utwente.nl (Pearu Peterson)
Date: Fri Mar 8 05:05:06 2002
Subject: [Numpy-discussion] big picture? One proposal
In-Reply-To: <200203081117.g28BHXK17770@chinon.cnrs-orleans.fr>
Message-ID:

On Fri, 8 Mar 2002, Konrad Hinsen wrote:

> Unfortunately, I have the impression that there are two schools of
> thought in collision here (and not just when it comes to programming).
> There is the "mathematical" school that defines matrices and arrays
> as abstract entities with certain properties and associated operations.
> And there is the "engineering" school that sees arrays as a convenient
> data structure to express certain operations, of which "matrix operations"
> are a subset.

I see arrays as a convenient data structure (being implemented in computer programs) to hold matrices (being members of a mathematical concept). I guess that my views are narrow-minded (but I am willing to widen them) regarding to consider arrays as a mathematical concept too. Just in mathematics I never (need to) use arrays in that way (my fields are mathematical analysis, integrable systems, and not computer science nor engineering). So, I also belong to the school of "mathematics", but maybe into a different one.

> That's another point where I disagree. I use Python for many different
> uses, numerics is only one of them (though the most important one).
> Uniformity of style is an important value for me.

Me too. It's just that I am not too crazy about constant style; I care more about whether something can be accomplished efficiently. To be honest, I don't like programming in Python because it has a nice style, but because I can accomplish a lot with it in a very efficient way (and not only by using efficient algorithms). Writing, for example, Numeric.transpose(a) instead of a**T, a.T, a`, or whatever just reduces this efficiency.
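A small least-squares computation shows the contrast (a sketch; the commented spelling is the hypothetical operator notation under discussion and does not run today):

    import Numeric
    import LinearAlgebra

    A = Numeric.array([[1., 0.], [1., 1.], [1., 2.]])
    b = Numeric.array([0.9, 2.1, 2.9])

    # today: the normal equations spelled with functions only
    At = Numeric.transpose(A)
    x = Numeric.dot(LinearAlgebra.inverse(Numeric.dot(At, A)),
                    Numeric.dot(At, b))

    # hypothetical operator notation (illustration only, not valid Python):
    #   x = inverse((A ** T) * A) * (A ** T) * b    # with * as matrix multiply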
I also realize and respect that for computer scientists (which I presume the developers of Python are) it is crucial to have a consistent style, for their own reasons. Sometimes this style makes some site-specific simple tasks too verbose to follow.

> Moreover, I claim that Python *does* provide a good solution, it is
> merely a very different one.

So, what is it?

> Does that work? I'd expect that a**T would first call a.__pow__(T)
> which quite probably crashes... (Not that it matters to me, I find
> this almost as abusive as the matrix attributes.)

Yes, it works:

    >>> from Numeric import *
    >>> class T: __rpow__ = lambda s,o: transpose(o)
    ...
    >>> print array([[1,2],[3,4]]) ** T()
    [[1 3]
     [2 4]]

And I don't understand why it is abusive (because it is a different approach?). It's just an idea.

Pearu

From rlw at stsci.edu Fri Mar 8 05:22:15 2002
From: rlw at stsci.edu (Rick White)
Date: Fri Mar 8 05:22:15 2002
Subject: [Numpy-discussion] big picture? One proposal
In-Reply-To: <200203080804.g2884uY16995@chinon.cnrs-orleans.fr>
Message-ID:

On Fri, 8 Mar 2002, Konrad Hinsen wrote:

> "Perry Greenfield" writes:
>
> > Rick White has also convinced me that this alone isn't sufficient.
> > There are numerous occasions where people would like to use matrix
> > multiply, even in a predominately "array" context, enough so that this
> > would justify a special operator for matrix multiplication. If the
>
> Could you summarize those reasons please? I know that there are
> applications of matrix multiplication in array processing, but in my
> experience they are rare enough that writing dot(a, b) is not a major
> distraction.

A couple of quick examples: I do lots of image processing (e.g. deconvolution) using arrays. It is often helpful to take the outer product of two 1-D vectors; e.g. if there is a separable function f(x,y) = g(x)*h(y), you can compute separate g & h vectors and then combine them with the outer product (a special case of matrix multiply) to get the desired 2-D image. Another example: when I'm working with either 2-D images or 1-D vectors, it is helpful to be able to compute projections using a set of basis vectors (e.g. for singular value decomposition, eigenvectors, etc.) This is most easily expressed using matrix multiplies - but most uses of the data still treat them as simple arrays instead of matrices. Being able to group these operations together is helpful both for readability of the code and for efficiency of execution.

Having said that, I think I actually agree with Konrad that these sorts of operations are rare enough (in the data processing context) that it is no great burden to write them using function calls instead of operators. If we could agree on a matrix-multiply operator, that would be nice -- but if we can't, I can live with that too. For my purposes, I certainly don't see the need to add special operations to do things like transpose. Those should be limited to a separate matrix class as Konrad proposes and should be available as function calls for arrays.

Rick
------------------------------------------------------------------
Richard L. White rlw at stsci.edu http://sundog.stsci.edu/rick/
Space Telescope Science Institute Baltimore, MD

From hinsen at cnrs-orleans.fr Fri Mar 8 06:52:15 2002
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Fri Mar 8 06:52:15 2002
Subject: [Numpy-discussion] big picture? One proposal
In-Reply-To: (message from Pearu Peterson on Fri, 8 Mar 2002 14:04:15 +0100 (CET))
References: Message-ID: <200203081449.g28EnET18086@chinon.cnrs-orleans.fr>

> regarding to consider arrays as a mathematical concept too. Just in
> mathematics I never (need to) use arrays in that way (my fields are
> mathematical analysis, integrable systems, and not computer science nor

I meant "mathematical" as a school of thought (going from the abstract to the concrete), not as a domain of research. I don't know any area of mathematics either that uses the array concept, but it is definitely common in computer science (as a structured collection of similar data). Image data is a good example.

> something can be accomplished efficiently. To be honest, I don't like
> programming in Python because it has a nice style, but because I can
> accomplish a lot with it in a very efficient way (and not only by using

I want both :-)

> > Moreover, I claim that Python *does* provide a good solution, it is
> > merely a very different one.
>
> So, what is it?

Separate matrix and array objects, with computationally efficient but explicit (verbose) interconversion.

> Yes, it works:
> >>> from Numeric import *
> >>> class T: __rpow__ = lambda s,o: transpose(o)
> ...
> >>> print array([[1,2],[3,4]]) ** T()
> [[1 3]
> [2 4]]

Right, it works as long as the left argument doesn't try to do the power operation itself.

> And I don't understand why it is abusive (because it is a different
> approach?). It's just an idea.

For me, "power" is a shorthand for repeated multiplication, with certain properties attached to it. I have no problem with using the ** operator for something else, but then on different data types. The idea that a**b could be completely different operations for the same a as a function of b is not very appealing to me. In fact, the idea that an operand instead of the operator defines the operation is not very appealing to me. There's also a more pragmatic objection which is purely technical: I like to stay away from playing tricks with the binary operator type coercion system in Python. Sooner or later it always bites back. And the details have changed over Python releases, which is a compatibility nightmare.

Konrad.

From perry at stsci.edu Fri Mar 8 07:12:11 2002
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Mar 8 07:12:11 2002
Subject: [Numpy-discussion] big picture? One proposal
In-Reply-To: Message-ID:

Pearu Peterson writes: [...]

> I find separating array and matrix instances (in the sense of raising
> an exception when doing <matrix> <op> <array>) not a very simple solution:
> New concepts are introduced that actually do not solve the simplicity
> problem of representing matrix operations. As I see it, they
> only introduce restrictions and the main assumption behind the rationale
> is that "users are dumb and they don't know what is best for them".
> This is how I interpret the raised exception as behind the scenes matrix
> and array are the same (in the sense of data representation).
>

I don't think the issue is whether users are "dumb" but rather, will it be more or less transparent to them what is supposed to happen. Remember, this particular proposal affects in no way the notational convenience when operands are of the same type. It doesn't even affect the notational convenience of most of the examples presented (e.g., a.M * b.M) as long as the resulting operands are of the same type. It only affects cases involving mixed types. Do we really want * (yes, it would be possible to have one dominate over the other always regardless of order) to mean two different things, for example? Will a user always be aware that a module function returns arrays rather than matrices? Yes, users ought to check the documentation, but they often don't or they misremember. The more I think about it the more I come to think it really is better to be safer in this case. It will not be hard for users to explicitly convert, nor should it be notationally cumbersome. E.g. (just to use one of the proposed options):

    matrix(a) * b
    a * array(b)

I don't see this as a big burden. I would rather do it this way myself for my own code. [...]

> (there have been two implementation approaches proposed for this: (i) a.M
> returns a Matrix instance, (ii) a.M returns the same array with a
> temporarily set bit saying that the following operation is somehow
> special).
> To me, this looks like a safe solution. Though it is a hack, at least it
> is simple and understandable anywhere it is used (having a * b
> where b can be either matrix or array, it is not predictable from just
> looking at the code what the result will be -- not very pythonic indeed).
>

It's safer, but it isn't safe. Besides, one could still do this and raise exceptions on mixed types. Is the issue that people strongly want to do (if both a and b are arrays)

    a.M * b (or matrix(a) * b)

instead of

    a.M * b.M (or matrix(a) * matrix(b))

to get matrix behavior?

Perry

From pearu at cens.ioc.ee Fri Mar 8 07:38:04 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Fri Mar 8 07:38:04 2002
Subject: [Numpy-discussion] big picture? One proposal
In-Reply-To: Message-ID:

On Fri, 8 Mar 2002, Perry Greenfield wrote:

> It's safer, but it isn't safe. Besides, one could still do this and raise
> exceptions on mixed types. Is the issue that people strongly want to do
> (if both a and b are arrays)
>
> a.M * b (or matrix(a) * b)
>
> instead of
>
> a.M * b.M (or matrix(a) * matrix(b))
>
> to get matrix behavior?

Just to be clear, my suggestion consisted of the following points:

1) not to introduce any new type or concept such as matrix
2) to forget the current Matrix class
3) all objects are arrays, including a.M
4) a.M has a temporary bit set only for the following operation

So, with this setup there is no issue with mixed types at all and it is easy to implement. But if there will be introduced a new type, matrix, then this setup does not work. The reason why I proposed the above setup was exactly because I didn't like that a.M would return a different object type in the middle of an expression.

Pearu

From dfb at mrao.cam.ac.uk Fri Mar 8 08:05:29 2002
From: dfb at mrao.cam.ac.uk (David Buscher)
Date: Fri Mar 8 08:05:29 2002
Subject: [Numpy-discussion] big picture? One proposal
In-Reply-To: <200203081449.g28EnET18086@chinon.cnrs-orleans.fr>
Message-ID:

On Fri, 8 Mar 2002, Konrad Hinsen wrote:

> > regarding to consider arrays as a mathematical concept too. Just in
> > mathematics I never (need to) use arrays in that way (my fields are
> > mathematical analysis, integrable systems, and not computer science nor
>
> I meant "mathematical" as a school of thought (going from the abstract
> to the concrete), not as a domain of research. I don't know any area
> of mathematics either that uses the array concept, but it is
> definitely common in computer science (as a structured collection of
> similar data). Image data is a good example.

Just my 2c worth: I count myself in the "mathematical" school despite being a physicist. I look at matrices as having a specific algebra which, for instance, cannot be easily made to apply to higher-dimensional arrays. Therefore they are not just arrays looked at in a different way. For object-oriented thinkers this means they are different objects. They may "inherit" a lot of attributes from arrays but are not arrays.

Another point to note is that a specific complaint earlier in the thread was the computational inefficiency of using numpy arrays for matrix-intensive operations. It seems to me that it would be far easier to write an optimised set of code for matrices if they were known to be a separate class. An example (which is probably not useful, but serves for illustration) is that one could "cache" or delay transposes etc, knowing that a matrix-multiply was likely to be about to come up. This sort of thing would be more difficult if the result of the transpose would have to be sensible when followed by a generic array operation.

David

From paul at pfdubois.com Fri Mar 8 08:25:06 2002
From: paul at pfdubois.com (Paul F Dubois)
Date: Fri Mar 8 08:25:06 2002
Subject: [Numpy-discussion] An historical precedent for matrix operation symbols
In-Reply-To: <200203080804.g2884uY16995@chinon.cnrs-orleans.fr>
Message-ID: <000001c1c6bd$97902750$1001a8c0@NICKLEBY>

To sum up my previous postings: I prefer the "constructor" notation, not the funny-attribute notation. However, I believe an efficient matrix class can be done in Python on top of a numeric array object. Shadow classing just doesn't seem to be an overhead problem in most cases. The coding for MA is very complicated because its semantics are so different; for Matrix it would be much less complicated.

When I designed Basis (1984) I was faced with the operator issue. The vast majority of the operations were going to be elementwise but some users would also want matrix multiply and the solution of linear systems. (Aside: a/b meaning the solution x of bx = a, is best calculated not as (b**-1) * a but by solving the system without forming the inverse.) My choice was: matrix multiply and divide were *!, /!. This was successful in two senses: the users found it easy to remember, and you could implement it in the tokenizer or just in the grammar. I chose to make it a token so as to forbid internal spaces, but for Python it could be done without touching the tokenizer, and it doesn't use up any new symbols. Symbols are precious; when I designed Basis I had a keyboard map and I would cross out the keys I had "used up". If I were Guido I would be very reluctant to give up anything valuable like @ for our purposes.

One should not have any illusions: putting such operators on the array class is just expediency, a way to give the arrays a bit of a dual life. But a real matrix facility would have an abstract base class, be restricted to <= 2 dimensions, have realizations including symmetric, Hermitian, sparse, tridiagonal, yada yada yada.
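A sketch of the shape such a facility might take (hypothetical classes, illustrative only, not a design anyone has agreed to):

    import Numeric

    class AbstractMatrix:
        """Hypothetical abstract base; realizations choose their own storage."""
        def __mul__(self, other):
            return self.matmul(other)
        def matmul(self, other):
            raise NotImplementedError

    class DenseMatrix(AbstractMatrix):
        def __init__(self, data):
            self.data = Numeric.asarray(data)
        def matmul(self, other):
            # other is assumed dense here; a real facility would dispatch on type
            return DenseMatrix(Numeric.dot(self.data, other.data))

    class DiagonalMatrix(AbstractMatrix):
        """Stores only the diagonal: O(n) storage instead of O(n**2)."""
        def __init__(self, diag):
            self.diag = Numeric.asarray(diag)
        def matmul(self, other):
            # scaling rows beats materializing the full diagonal matrix
            return DenseMatrix(self.diag[:, Numeric.NewAxis] * other.data)

    D = DiagonalMatrix([1., 2., 3.])
    A = DenseMatrix([[1., 1.], [1., 1.], [1., 1.]])
    B = D * A   # multiply without ever forming the 3x3 diagonal matrix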
Another aside: There are mathematical operations whose output type depends not just on the input type but also on some of these other considerations. This led me to believe that the Numpy approach is essentially correct, that the type of the elements be variable rather than having separate classes for each type.

From paul at pfdubois.com Fri Mar 8 08:46:04 2002
From: paul at pfdubois.com (Paul F Dubois)
Date: Fri Mar 8 08:46:04 2002
Subject: [Numpy-discussion] How the a**T trick works
Message-ID: <000801c1c6c0$9df53d80$1001a8c0@NICKLEBY>

I confess I admire the a**T suggestion for a notation for transpose(a). The question was raised as to whether this really works and an empirical proof was offered that it does. Here is how it works.

In a**T the first thing tried is to ask a.__pow__ if it can manage with T as an argument. The array says, hmm, T is an instance of a class that I never heard of, and it doesn't have an __array__ attribute to call in order to make it an array. I pass. Then T's class' __rpow__ is given a turn, and asked if it can do the job with a as an argument, which it can. This is how 2**a works, or 2 - a, too. The int type disavows any knowledge of what to do, so the other operand gets a chance to save the day. So this is a "trick" we use every day.

From perry at stsci.edu Fri Mar 8 08:48:09 2002
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Mar 8 08:48:09 2002
Subject: [Numpy-discussion] RE: An historical precedent for matrix operation symbols
In-Reply-To: <000001c1c6bd$97902750$1001a8c0@NICKLEBY>
Message-ID:

Paul Dubois writes: [...] Paul puts this very well and I agree with virtually everything he says.

> One should not have any illusions: putting such operators on the array
> class is just expediency, a way to give the arrays a bit of a dual life.
> But a real matrix facility would have an abstract base class, be
> restricted to <= 2 dimensions, have realizations including symmetric,
> Hermitian, sparse, tridiagonal, yada yada yada.
>

A few comments on this point. I do think that this is correct. If a matrix is stored as an array intrinsically, then constructing an array representation (e.g., array(b) where b is a matrix) would be very efficient, since creating a new array object just creates a new object that still uses the same underlying data buffer the matrix object does. There is no significant increase in memory. On the other hand, if a matrix class does use other representations, particularly such as sparse or tridiagonal, then there naturally would be some cost to creating an array object since a new copy of the data would be required. Having such various matrix representations would certainly be useful (particularly sparse) but will require real work to support (not something we (STScI) can sign up to do, but I hope someone else is willing).

Perry

From hinsen at cnrs-orleans.fr Fri Mar 8 09:56:06 2002
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Fri Mar 8 09:56:06 2002
Subject: [Numpy-discussion] big picture? One proposal
In-Reply-To: <20020308104153.A145426@oakland.edu> (message from Jon Moody on Fri, 8 Mar 2002 10:41:53 -0500)
References: <200203080804.g2884uY16995@chinon.cnrs-orleans.fr> <20020308104153.A145426@oakland.edu>
Message-ID: <200203081754.g28HshQ19721@chinon.cnrs-orleans.fr>

> The Python core has long had at least 2 examples of operators which
> act as object constructors: 'j' which performs complex() and 'L' which
> performs long() (you can't get much more `pythonic' than a built-in
> type).
Those are suffixes for constants, not operators. If they were operators, you could apply them to variables - which you can't. More importantly, the L suffix wouldn't even work as an operator, as the preceding number might exceed the range of integers before it has a chance of being converted to a long integer.

> I would venture to say that the numeric community is pretty high up
> there in importance if not size, given the early appearance of the
> complex number type and strong math capacity not to mention GvR's

The complex type was introduced for the benefit of NumPy (I remember it all too well, as I did the initial implementation), but after a long discussion on the Python list, with many expressing disapproval because of its special-need status. I'd say it shows the limits of what one can get accepted.

Konrad.

From jjl at pobox.com Fri Mar 8 13:42:07 2002
From: jjl at pobox.com (John J. Lee)
Date: Fri Mar 8 13:42:07 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To: <200203070910.g279AlE14306@chinon.cnrs-orleans.fr>
Message-ID:

On Thu, 7 Mar 2002, Konrad Hinsen wrote:

> "eric" writes:
>
> > Matrix.Matrix objects. This attribute approach will work, but I
> > wonder if trying the "adding an operator to Python" approach one
> > more time would be worth while. At Python10 developer's day, Guido
[...]
> If you want to go the "operator way", the goal should rather be
> something like APL, with composite operators. Matrix multiplication
[...]

How about general operator-function equivalence, as explained here by Alex Martelli? The change is large in one sense, but it is conceptually very simple:

http://groups.google.com/groups?q=operator+Martelli+Haskell+group:comp.lang.python&hl=en&selm=8t4dl301a4%40news2.newsguy.com&rnum=1

> 2 div 3
> or
> div(2,3)
> or
> 2 `div 3
> [Haskell-ishly syntax-sugar note: Haskell lets you
> use any 2-operand function as an infix operator by
> just enclosing its name in ``; in Py3K, I think a
> single leading ` would suffice -- far nicer than the
> silly current use of ` for the rare need of repr --
> and we might also, with pleasing symmetry, let any
> operator be used as a normal function a la
> `+(a,b)
> i.e., the ` marker could lexically switch functions
> to operators and operators to functions, without
> needing to 'import operator' and recall what the
> operator-name for a given operator IS...!-). The
> priority and associativity of these infinitely
> many "new operators" could be fixed ones...].

Since GvR seems to have given up the idea of 'Py3K' in favour of gradual changes, perhaps this is a real possibility? Travis'

    r = a.M * b.M

would then be written as

    M = Numeric.matrixmultiply
    r = a `M b

(Konrad also complains about Perl's nasty syntax. This is frequently complained about, but do you really think the syntax is the problem -- surely it's Perl's horribly complicated semantics that is the real issue? The syntax is just inconvenient, in comparison at least. Sorry, a bit OT...)

John

From bsder at allcaps.org Fri Mar 8 15:36:02 2002
From: bsder at allcaps.org (Andrew P. Lentvorski)
Date: Fri Mar 8 15:36:02 2002
Subject: [Numpy-discussion] big picture? One proposal
In-Reply-To: Message-ID: <20020308144903.L2621-100000@mail.allcaps.org>

On Fri, 8 Mar 2002, Pearu Peterson wrote:

> This is how I interpret the raised exception as behind the scenes matrix
> and array are the same (in the sense of data representation).

Matrices can have many different storage formats. Sparse, Banded, Dense, Triangular, etc. Arrays and Matrices are the same behind the scenes *for the moment*. For examples of matrix storage formats, check the BLAS Technical Forum documentation at netlib. Unfortunately, www.netlib.org appears to be down right now. Locking the assumption that Matrix and Array are "the same behind the scenes" into the main Python specification is not a good idea.

-a

From tim.one at comcast.net Fri Mar 8 21:18:01 2002
From: tim.one at comcast.net (Tim Peters)
Date: Fri Mar 8 21:18:01 2002
Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation?
In-Reply-To: Message-ID:

I hope the bogus underflow problem is fixed in CVS Python now. Since my platform didn't have the problem (and still doesn't, of course), we won't know for sure until people try it and report back. Python used to link with -lieee, but somebody changed it for a reason history may have lost. See comments attached to Huaiyu Zhu's bug report for more on that:

If someone on a box that had the problem can build current CVS Python with and without -lieee (I'm told it defaults to "without" today, although that appears to be mixed up with whether or not __fpu_control is found outside of libieee (search for "ieee" in configure.in)), please try the following in a shell:

    import math
    x = 1e200
    y = 1/x
    math.pow(x, 2)   # expect OverflowError
    math.pow(y, 2)   # expect 0.
    pow(x, 2)        # expect OverflowError
    pow(y, 2)        # expect 0.
    x**2             # expect OverflowError
    y**2             # expect 0.

If those all work as annotated on a box that had the problem, please report the success *as a comment to the bug report* (link above). If one or more still fail, likewise please give box details and paste the failures into a comment on the bug report. Thanks!

From hinsen at cnrs-orleans.fr Sat Mar 9 14:38:02 2002
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Sat Mar 9 14:38:02 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
References: Message-ID: <200203092236.g29Matv21585@chinon.cnrs-orleans.fr>

"John J. Lee" writes:

> (Konrad also complains about Perl's nasty syntax. This is frequently
> complained about, but do you really think the syntax is the problem --
> surely it's Perl's horribly complicated semantics that is the real issue?
> The syntax is just inconvenient, in comparison at least. Sorry, a bit
> OT...)

It's both, of course. I don't really wish to decide which is worse, especially not because I'd have to read more Perl code to reach such a decision ;-)

But syntax is an issue for readability. There are some symbols that are generally used as operators in computer languages, and I think Python uses all of them already. Moreover, the general semantics are quite uniform as well: * stands for multiplication, for example, although the details of what multiplication means can vary. Symbols like @ are not operators everywhere, and where they are there is no uniform meaning attached to them, so they create confusion. As a test, take a Python program and replace all * by @. It does look weird.

Konrad.
From jmiller at stsci.edu Tue Mar 12 14:21:02 2002
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Mar 12 14:21:02 2002
Subject: [Numpy-discussion] ANN: numarray-0.3
Message-ID: <3C8E7F39.7090100@stsci.edu>

Numarray 0.3
------------

Numarray is a Numeric replacement which features c-code generated from python template scripts, the capacity to operate directly on arrays in files, and improved type promotion semantics.

Numarray-0.3 incorporates safety checks to prevent crashing Python when a user accidentally changes private variables in numarray. The new safety checks ensure that:

1. Numarray C-functions are called with properly sized buffers.
2. Numarray C-functions are called with properly aligned buffers.
3. Parameters match the C-function in count and i/o direction.
4. The correct generic function wrapper is used to call each C-function.
5. All indices implied by the array strides are valid.

Failed checks result in python exceptions.

A new memory object fixes an unfortunate limitation of the python buffer object, namely the lack of guaranteed double aligned storage.

The largest generated source module, _ufuncmodule.c, has been partitioned by data type into several smaller, more gcc-friendly modules, e.g. _ufuncFloat64module.c.

The sort and argsort functions are fixed. The dot function is fixed for 1D arrays. Transpose, swapaxes, and reshape once again return views.

WHERE
-----

Numarray-0.3 windows executable installers and the source code tar ball are here:

http://sourceforge.net/project/showfiles.php?group_id=1369

Numarray is hosted by Source Forge in the same project which hosts Numeric:

http://sourceforge.net/projects/numpy/

The web page for Numarray information is at:

http://stsdas.stsci.edu/numarray/index.html

Trackers for Numarray Bugs, Feature Requests, Support, and Patches are at the Source Forge project for NumPy at:

http://sourceforge.net/tracker/?group_id=1369

REQUIREMENTS
------------

numarray-0.3 requires Python 2.0 or greater.

AUTHORS, LICENSE
----------------

Numarray was written by Perry Greenfield, Rick White, Todd Miller, JC Hsu, Paul Barrett, and Phil Hodge at the Space Telescope Science Institute. Numarray is made available under a BSD-style License. See LICENSE.txt in the source distribution for details.

--
Todd Miller jmiller at stsci.edu
STSCI / SSG (410) 338 4576

From paul at pfdubois.com Wed Mar 13 15:02:06 2002
From: paul at pfdubois.com (Paul F Dubois)
Date: Wed Mar 13 15:02:06 2002
Subject: [Numpy-discussion] Numerical Python 21.0 released
Message-ID: <000001c1cae3$0b5aede0$0e01a8c0@NICKLEBY>

Version 21.0 March 13, 2002

Fixed bugs:
[ #482603 ] Memory leak in MA/Numeric/Python
    Reported by Reggie Dugard. Turned out to be *two* memory leaks in one case in a routine in Numeric, array_objectype. (Dubois)
[ none ] If vals was a null array (array([])), putmask and put would crash. Fixed with check.
[ #469951 ] n = n1[0] gives array which shares dimension of n1 array.
    This causes bugs if shape of n1 is changed (n didn't use to have its own dimensions array). (Travis Oliphant)
[ #514588 ] MLab.cov(x,x) != MLab.cov(x) (Travis Oliphant)
[ #518702 ] segfault when invalid typecode for asarray (Travis Oliphant)
[ #497530 ] MA __getitem__ prevents 0 len arrays (Reggie Duggard)
[ #508363 ] outerproduct of noncontiguous arrays (Martin Wiechert)
[ #513010 ] memory leak in comparisons (Byran Nollett)
[ #512223 ] Character typecode not defined (Jochen Kupper)
[ #500784 ] MLab.py diff error (anonymous, fixed by Dubois)
[ #503741 ] accuracy of MLab.std(x) (Katsunori Waragai)
[ #507568 ] overlapping copy a[2:5] = a[3:6]
    Changed uses of memcpy to memmove, which allows overlaps.
[ numpy-Patches-499722 ] size of buffer created from array is bad (Michel Sanner).
[ #502186 ] a BUG in RandomArray.normal (introduced by last bug fix in 20.3) (Katsunori Waragai).

Fixed errors for Mac (Jack Jensen). Make rpm's properly, better Windows installers. (Gerard Vermeulen) Added files setup.cfg; setup calculates rpm_install.sh to use current Python. New setup.py, eliminate setup_all.py. Use os.path.join everywhere. Revision in b6 added file README.RPM, further improvements.

Implement true division operations for Python 2.2. (Bruce Sherwood) Note: true division of all integer types results in an array of floats, not doubles. This decision is arbitrary and there are arguments either way, so users of this new feature should be aware that the decision may change in the future.

New functions in Numeric; they work on any sequence a that can be converted to a Numeric array. Similar change to average in MA. (Dubois)

    def rank (a):
        "Get the rank of a (the number of dimensions, not a matrix rank)"

    def shape (a):
        "Get the shape of a"

    def size (a, axis=None):
        "Get the number of elements in a, or along a certain axis."

    def average (a, axis=0, weights=None, returned = 0):
        """average(a, axis=0, weights=None)
        Computes average along indicated axis. If axis is None, average over the entire array. Inputs can be integer or floating types; result is type Float. If weights are given, result is sum(a*weights)/(sum(weights)); weights must have a's shape or be 1-d with length equal to the size of a in the given axis. Integer weights are converted to Float. Not supplying weights is equivalent to supplying weights that are all 1. If returned, return a tuple: the result and the sum of the weights or count of values. The shape of these two results will be the same. Raises ZeroDivisionError if appropriate when result is scalar. (The version in MA does not -- it returns masked values.)
        """

From amundt at pvv.org Thu Mar 14 05:36:10 2002
From: amundt at pvv.org (Amund Tveit)
Date: Thu Mar 14 05:36:10 2002
Subject: [Numpy-discussion] Plans for RandomArray support in numarray?
Message-ID: <03b401c1cb5d$27272b40$4132f181@AMUND>

Are there any plans for supporting RandomArray in numarray type arrays?

Amund
http://www.idi.ntnu.no/~amundt/

From perry at stsci.edu Thu Mar 14 06:30:16 2002
From: perry at stsci.edu (Perry Greenfield)
Date: Thu Mar 14 06:30:16 2002
Subject: [Numpy-discussion] Plans for RandomArray support in numarray?
In-Reply-To: <03b401c1cb5d$27272b40$4132f181@AMUND>
Message-ID:

> Are there any plans for supporting RandomArray in numarray type arrays?
>
> Amund
> http://www.idi.ntnu.no/~amundt/

Yes, we plan to support that and some version of the other existing libraries available (e.g. linear algebra, fft...).
Now that we are done (mostly anyway) with the changes needed to add safety checks to numarray, our short-term plans are:

1) Further changes to make numarray more backward compatible. (This is fairly minor work and probably involves less than a week's work.)
2) Documenting the C-API by:
   a) Adding the appropriate chapters to the numarray manual.
   b) Providing examples of various kinds of C functions, e.g.,
      i) how to add a ufunc.
      ii) how to add simple C functions...
      iii) how to add more sophisticated C functions...
      iv) (possibly) how to use SWIG with numarray
3) To help us do 2) better, add some existing Numeric libraries. In particular:
   a) FFT
   b) RandomArray
   c) linear algebra

Probably in that order. Items 2) or 3) may or may not require new releases of numarray. If no basic changes to numarray are required, we will probably try to release the work piecemeal (e.g., updates to the manual, examples, or libraries as add-ons). Hopefully, you will begin to see results here starting in 2 to 4 weeks.

Perry

From paul at pfdubois.com Thu Mar 14 11:43:15 2002
From: paul at pfdubois.com (Paul F Dubois)
Date: Thu Mar 14 11:43:15 2002
Subject: [Numpy-discussion] Numerical Python 21.0 released
In-Reply-To: <02031419303100.10325@taco.polycnrs-gre.fr>
Message-ID: <000001c1cb90$64e2d140$0e01a8c0@NICKLEBY>

Gerard Vermeulen asked me to remove the RPMs from SourceForge, and I have done so. Apparently what is made by default is not useful except for me.

I absolutely refuse to learn about RPMs; it is not useful for my job. If the RPM-cult wants this stuff to work then I need patches to setup.py that ALWAYS produce a suitable RPM, or a volunteer who will make and install such files for each release. I will run an automated test if supplied but it can't interfere with my system or require superuser privs, because I don't have that.

I appreciate people trying to help and I'm sorry if it wasn't clear just how incompetent I intend to be on this.

From gvermeul at polycnrs-gre.fr Thu Mar 14 23:41:11 2002
From: gvermeul at polycnrs-gre.fr (Gerard Vermeulen)
Date: Thu Mar 14 23:41:11 2002
Subject: [Numpy-discussion] Numerical Python 21.0 released
In-Reply-To: <000001c1cb90$64e2d140$0e01a8c0@NICKLEBY>
References: <000001c1cb90$64e2d140$0e01a8c0@NICKLEBY>
Message-ID: <02031508403700.11584@taco.polycnrs-gre.fr>

Because all RPM-based Linux distributions have subtle incompatibilities, it is impossible to write a setup.py script that produces RPMs that will work on all those distributions. Normally, RPMs built on a particular version of a particular distribution will work on other systems with exactly the same version of the same distribution (Paul's setup is not "normal", because his Python interpreter lives in his home directory). If anybody wants to provide RPMs, please code distribution+version in the name of the RPM. The RPM.README in the tar.gz explains how to do this.

Gerard

On Thursday 14 March 2002 20:42, Paul F Dubois wrote:
> Gerard Vermeulen asked me to remove the RPMs from SourceForge, and I
> have done so. Apparently what is made by default is not useful except
> for me.
>
> I absolutely refuse to learn about RPMs; it is not useful for my job. If
> the RPM-cult wants this stuff to work then I need patches to setup.py
> that ALWAYS produce a suitable RPM, or a volunteer who will make and
> install such files for each release. I
> will run an automated test if supplied but it can't interfere with my
> system or require superuser privs, because I don't have that.
>
> I appreciate people trying to help and I'm sorry if it wasn't clear just
> how incompetent I intend to be on this.
>

From hinsen at cnrs-orleans.fr Fri Mar 15 01:35:06 2002
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Fri Mar 15 01:35:06 2002
Subject: [Numpy-discussion] Numerical Python 21.0 released
In-Reply-To: <02031508403700.11584@taco.polycnrs-gre.fr>
References: <000001c1cb90$64e2d140$0e01a8c0@NICKLEBY> <02031508403700.11584@taco.polycnrs-gre.fr>
Message-ID:

Gerard Vermeulen writes:

> Because all RPM based Linux distributions have subtle incompatibilities, it
> is impossible to write a setup.py script that produces RPMs that will work on
> all those distributions.

Or at least not simple. I like the idea of providing RPMs for as many distributions as possible, and I volunteer to participate in the effort. In fact, I always make RPMs of all my Python-related packages already for my own use (seven machines), for RedHat 7.x systems.

However, I have had difficulties with uploading to SourceForge for almost a full year now, and I don't expect it to get better soon. While building RPMs is a small job for me, uploading the result costs me an enormous effort each time, and I am not willing to waste time on that. I don't know how many people are in a similar situation, but perhaps we could get more RPM packaging volunteers by opening an RPM archive elsewhere.

Konrad.

From karshi.hasanov at utoronto.ca Fri Mar 15 20:21:02 2002
From: karshi.hasanov at utoronto.ca (Karshi)
Date: Fri Mar 15 20:21:02 2002
Subject: [Numpy-discussion] fft_help
Message-ID: <20020316042020Z234697-26166+4@bureau8.utcc.utoronto.ca>

Hi all,

How do I use the Matlab-like "fft_shift" in NumPy?
Thanks

From huaiyu_zhu at yahoo.com Mon Mar 18 00:40:05 2002
From: huaiyu_zhu at yahoo.com (Huaiyu Zhu)
Date: Mon Mar 18 00:40:05 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To: <020601c1c595$24bd76c0$6b01a8c0@ericlaptop>
Message-ID:

I'm a little late to this discussion, but it gives me a chance to read all the existing comments. I'd like to offer some background to one possible solution.

On Thu, 7 Mar 2002, eric wrote:

> > http://python.sourceforge.net/peps/pep-0225.html
>
> This one proposes decorating the current binary ops with some
> symbols to indicate that they have different behavior than
> the standard binary ops. This is similar to Matlab's use of
> * for matrix multiplication and .* for element-wise multiplication
> or to R's use of * for element-wise multiplication and %*% for
> "object-wise" multiplication.
>
> It proposes prepending ~ to operators to change their behavior so
> that ~* would become matrix multiply.
>
> The PEP is a little more general, but this gives the flavor.
>
> My hunch is that some form of the second (perhaps drastically reduced) would
> meet with more success. The suggested ~* or even the %*% operator are both
> palatable. Such details can be decided later. The question is whether there is
> sufficient interest to try and push the operator idea through? It would take
> much longer than choosing something we can do ourselves (like .M), but the
> operator solution seems more desirable to me.
>

I'm one of the coauthors of this PEP. I'm very glad to see additional interest in this proposal. It is not just a proposal - there was actually a patch made by Gregory Lielens for the ~op operators for Python 2.0. It redefines ~ so that it can be combined with + - * / ** to form new operators; a rough sketch of the intended use follows.
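Under the patched interpreter a matrix class could hook the new operators roughly like this (a sketch only: ~* and the __tmul__ hook exist only with the patch applied, not in stock Python):

    import Numeric

    class Matrix:
        def __init__(self, data):
            self.data = Numeric.asarray(data)
        def __mul__(self, other):    # a * b   -> matrix multiplication
            return Matrix(Numeric.dot(self.data, other.data))
        def __tmul__(self, other):   # a ~* b  -> element-wise multiplication
            return Matrix(self.data * other.data)

    a = Matrix([[1., 2.], [3., 4.]])
    b = Matrix([[5., 6.], [7., 8.]])
    c = a * b      # matrix product
    # c = a ~* b   # element-wise product, only with the patch applied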
It is quite ingenious in that all the original bitwise operations on ~ alone are still valid. The new operators can be assigned any semantics with hooks like __tmul__ and __rtmul__. The idea is that a matrix class would define __mul__ so that * is matrix multiplication and define __tmul__ so that ~* is the elementwise operation. There is a test implementation on the MatPy homepage (matpy.sourceforge.net).

So what was holding it back? Well, last time around when this was discussed, it appears that most of the heavyweights in the Numeric community favored either keeping the status quo, or using the ~* symbol for arrays. We hoped to use the MatPy package as a test case to show that it is possible to have two entirely different kinds of objects, where the meanings of * and ~* are switched. However, for various reasons I was not able to act upon it for months, and Python evolved into 2.1 and 2.2. I never had much time to update the patch, and felt the attempt was futile as 1) Python was evolving quite fast, 2) I did not hear much about this issue since then. I often feel guilty about the lapse.

Now it might be a good time to revive this proposal, as the idea of having matrices and arrays with independent semantics but possibly related implementation appears to be gaining some additional acceptance. Some ancillary issues that hindered the implementation at that time have also been solved. For example, using .I for inverse, .T for transpose, etc, was costly because of the need to override __getattr__ and __coerce__, making a matrix class less attractive in practice. These can now be implemented efficiently using the new set/get mechanism.

I'd like to hear any suggestions on how to proceed. My own favorite would be to have separate array and matrix classes with easy but explicit conversions between them. Without conversions, arrays and matrices would be completely independent semantically. In other words, I'm mostly in favor of Konrad Hinsen's position, with the addition of using ~ operators for elementwise operations for matrix-like classes. The PEP itself also discussed ideas of extending the meaning of ~ to other parts of Python for elementwise operations on aggregate types, but my impression of people's impressions is that it has a better chance without that part.

Huaiyu

From a.schmolck at gmx.net Mon Mar 18 06:55:20 2002
From: a.schmolck at gmx.net (A.Schmolck)
Date: Mon Mar 18 06:55:20 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To: References: Message-ID:

[Sorry about the crossposting, but it also seemed relevant to both scipy and numpy...]

Huaiyu Zhu writes: [...]

> I'd like to hear any suggestions on how to proceed. My own favorite would
> be to have separate array and matrix classes with easy but explicit
> conversions between them. Without conversions, arrays and matrices would
> be completely independent semantically. In other words, I'm mostly in
> favor of Konrad Hinsen's position, with the addition of using ~ operators
> for elementwise operations for matrix-like classes. The PEP itself also
> discussed ideas of extending the meaning of ~ to other parts of Python for
> elementwise operations on aggregate types, but my impression of people's
> impressions is that it has a better chance without that part.
>

Well, from my impression of the previous discussions, the situation (both for numpy and scipy) seems to boil down to me as follows: Either `array` currently is too much of a matrix, or too little: Linear algebra functionality is currently exclusively provided by `array` and libraries that operate on and return `array`s, but the computational and notational efficiency leaves something to be desired (compared to e.g. Matlab) in some areas, importantly matrix multiplications (which are up to 40 times slower) and really awkward to write (and much more importantly, decipher afterwards).

So I think what one should really do is discuss the advantages and disadvantages of the two possible ways out of this situation, namely providing:

1) a new (efficient) `matrix` class/type (and appropriate libraries that operate on it) [The Matrix class that comes with Numeric is more of a syntactic sugar wrapper -- AFAIK it's not used as a return type or argument in any of the functions that only make sense for arrays that are matrices].
2) the additional functionality that is needed for linear algebra in `array` and the libraries that operate on it.

(see [1] below for what I feel is currently missing and could be done either in way 1) or 2))

I think it might be helpful to investigate these "macro"-issues before one gets bogged down in discussions about operators (I admit these are not entirely unrelated -- given that one of the reasons for the creation of a Matrix type would be that '*' is already taken in 'array's and there is no way to add a new operator without modifying the python core -- just for the record and ignoring my own advice, _iff_ there is a chance of getting '~*' into the language, I'd rather have '*' do the same for both matrices and arrays).

My impression is that the best path also very much depends on the what the feature aspirations and divisions of labor of numpy/numarray and scipy are going to be. For example, scipy is really aimed at scientific users, who need performance, and are willing to buy it with inconvenience (like the necessity to install other libraries on one's machine, most prominently atlas and blas). The `array` type and the functions in `Numeric`, on the other hand, potentially target a much wider community -- the efficient storage and indexing facilities (rich comparisons, strides, the take, choose etc. functions) make it highly useful for code that is not necessarily numeric (as an example I'm currently using it for feature selection algorithms, without doing any numerical computations on the arrays).

So maybe (a subset of) numpy should make it into the python core (or an as yet non-existent `sumo-distribution`) [BTW, I also wonder whether the python-core array module could be superseded/merged with numpy's `array`? One potential show stopper seems to be that it is e.g. `pop`able]. In such a scenario, where numpy remains relatively general (and might even aim at incorporation into the core), it would be a no-no to bloat it with too much code aimed at improving efficiency (calling blas when possible, sparse storage etc.). On the other hand people who want to do serious numerical work will need this -- and the scipy community already requires atlas etc. and targets a more specialized audience.
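As an illustration of that non-numerical use, selecting features by score needs no explicit Python loop (made-up data; compress, take and the comparison ufuncs are existing Numeric functions):

    import Numeric

    scores = Numeric.array([0.1, 0.9, 0.4, 0.8, 0.2])
    mask = Numeric.greater(scores, 0.5)                  # rich comparison -> 0/1 mask
    chosen = Numeric.compress(mask, Numeric.arange(5))   # indices of the selected features
    subset = Numeric.take(scores, chosen)                # gather them in one call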
Under this consideration (scipy catering for the hard-core numerical users) it might be an attractive solution to incorporate good matrix functionality (and possibly other improvements for hard core number crunchers) in scipy only (or at least limit the efficient _implementation_ of matrices to scipy, providing only a pure python class or so in numpy). I'm not suggesting, BTW, to necessarily put all of [1] into a single class -- it seems sensible to have a couple of subclasses (for masked, sparse representations etc.) to `matrix` (maybe the parent-class should even be a relatively naïve Numpy implementation, with the good stuff as subclasses in scipy...). In any event, creating a new matrix class/type would also mean that matrix functionality in libraries should use and return this class (existing libraries should presumably largely still operate on arrays for backwards-compatibility (or both -- after a typecheck), and some matrix operations are so useful that it makes sense to provide array versions for them (e.g. dot) -- but on the whole it makes little sense to have a computationally and space efficient matrix type if one has to cast it around all the time).

A `matrix` class is more specialized than an `array` and since the operations one will often do on it are consequently more limited, I think it should provide most important functionality as methods (rather than as external functions; see [2] for a list of suggestions). Approach 2) on the other hand would have the advantage that the current interface would stay pretty much the same, and as long as 2D arrays can just be regarded as matrices, there is no absolutely compelling reason not to stuff everything into array (at least the scipy-version thereof).

Another important question to ask before deciding what to change how and if, is obviously how many people in the scipy/numpy community do lots of linear algebra (and how many defectors from matlab etc. one could hope to win if one spiced things up a bit for them...), but I would suppose there must be quite a few (but I'm certainly biased ;).

Unfortunately, I've really got to do some work again now, but before I return to number-crunching I'd like to say that I'd be happy to help with the implementation of a matrix class/type in python (I guess a .py-prototype would be helpful to start with, but ultimately a (subclassable) C(++)-type will be called for, at least in scipy).

--alex

Footnotes:
[1] The required improvements for serious linear algebra seem to be:
- optional use of (atlas) blas routines for real and complex matrix, matrix `dot`s if atlas is available on the build machine (see http://www.scipy.org/Members/aschmolc for a patch -- it produces speedups of more than factor 40 for big matrices; I'd be willing to provide an equivalent patch for the scipy distribution if there is interest)
- making sure that no unnecessary copies are created (e.g. when transposing a matrix to use it in `dot` -- AFAIK although the transpose itself only creates a new view, using it for dot results in a copy (but I might be wrong here))
- allowing more space efficient storage forms for special cases (e.g. sparse matrices, upper triangular etc.). IO libraries that can save and load such representations are also needed (methods and static methods might be a good choice to keep things transparent to the user).
- providing a convenient and above all legible notation for common matrix operations (better than `dot(transpose(A),B)` etc.
-- possibilities include A * B.T or A ~* B.T or A * B ** T (by overloading __rpow__ as suggested in a previous post))

- (in the case of a new `matrix` class): indexing functionality (e.g. `where`, `choose` etc. should be available without having to cast; e.g. for the common case that I want to set everything under a certain threshold to 0., I don't want to have to cast my sparse matrix to an array etc.)

[2] What should a matrix class contain?

- a dot operator (certainly eventually, but if there is a good chance to get ~* into python, maybe '*' should remain unimplemented till this can be decided)
- most or all of what scipy's linalg module does
- possibly IO (reading as a static method)
- indexing (the likes of take, choose etc. (some should maybe be functions or static methods))

--
Alexander Schmolck     Postgraduate Research Student
                       Department of Computer Science
                       University of Exeter
A.Schmolck at gmx.net     http://www.dcs.ex.ac.uk/people/aschmolc/

From hinsen at cnrs-orleans.fr  Mon Mar 18 08:00:08 2002
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Mon Mar 18 08:00:08 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To: References: Message-ID:

a.schmolck at gmx.net (A.Schmolck) writes:

> Linear algebra functionality is currently exclusively provided by `array` and
> libraries that operate on and return `array`s, but the computational and
> notational efficiency leaves something to be desired (compared to e.g. Matlab)
> in some areas, importantly matrix multiplications (which are up to 40 times slower)
> and really awkward to write (and much more importantly, decipher afterwards).

Computational and notational efficiency are rather well separated, fortunately. Both the current dot function and a hypothetical matrix multiply operator could be implemented in straightforward C code or using a high-performance library such as Atlas. In fact, this should even be an installation choice in my opinion, as installing Atlas isn't trivial on all machines (e.g. with some gcc versions), and I consider it important for fundamental libraries that they work everywhere easily, even if not optimally.

> My impression is that the best path also very much depends on what the
> feature aspirations and divisions of labor of numpy/numarray and scipy are
> going to be. For example, scipy is really aimed at scientific users, who
> need performance, and are willing to buy it with inconvenience (like the

I see the main difference in distribution philosophy. NumPy is an add-on package to Python, which is in turn used by other add-on packages in a modular way. SciPy is rather a monolithic super-distribution for scientific users.

Personally I strongly favour the modular package approach, and in fact I haven't installed SciPy on my system for that reason, although I would be interested in some of its components.

> algorithms, without doing any numerical computations on the arrays). So maybe
> (a subset of) numpy should make it into the python core (or an as yet

This has been discussed already, and it might well happen one day, but not with the current NumPy implementation. Numarray looks like a much better candidate, but isn't ready yet.

> In such a scenario, where numpy remains relatively general (and
> might even aim at incorporation into the core), it would be a no-no
> to bloat it with too much code aimed at improving efficiency
> (calling blas when possible, sparse storage etc.).
> On the other hand

The same approach as for XML could be used: a slim-line version in the standard distribution that could be replaced by a high-efficiency extended version for those who care.

> attractive solution to incorporate good matrix functionality (and
> possibly other improvements for hard core number crunchers) in scipy
> only (or at least limit the efficient _implementation_ of matrices
> to scipy, providing only a pure python class or so in numpy). I'm

I'd love to have efficient matrices without having to install the whole SciPy package!

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From pearu at cens.ioc.ee  Mon Mar 18 12:35:24 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Mon Mar 18 12:35:24 2002
Subject: [SciPy-dev] Re: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To: Message-ID:

Off topic warning

On 18 Mar 2002, Konrad Hinsen wrote:

> I see the main difference in distribution philosophy. NumPy is an
> add-on package to Python, which is in turn used by other add-on
> packages in a modular way. SciPy is rather a monolithic
> super-distribution for scientific users.
>
> Personally I strongly favour the modular package approach, and in fact
> I haven't installed SciPy on my system for that reason, although I
> would be interested in some of its components.

Me too. In what I have contributed to SciPy, I have tried to follow this modularity approach. Modularity is also an important property from the development point of view: it minimizes possible interference with other unrelated modules and their bugs. What I am trying to say here is that SciPy can (and should? +1 from me) provide its components separately, though currently only a few of its components seem to be available in that way without some changes.

Pearu

From a.schmolck at gmx.net  Mon Mar 18 14:33:05 2002
From: a.schmolck at gmx.net (A.Schmolck)
Date: Mon Mar 18 14:33:05 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To: References: Message-ID:

Konrad Hinsen writes:

> Computational and notational efficiency are rather well separated,
> fortunately. Both the current dot function and a hypothetical matrix

Yes, the only thing they have in common is that both are currently unsatisfactory (for matrix operations) in numpy, at least for my needs. Although I've solved my most pressing performance problems by patching Numeric [1], I'm obviously interested in a more official solution (i.e. one that is maintained by others :)

[...] [order changed by me]

> a.schmolck at gmx.net (A.Schmolck) writes:
> > My impression is that the best path also very much depends on what the
> > feature aspirations and divisions of labor of numpy/numarray and scipy are
      ^^^^^^^

Darn, I made a confusing mistake -- this should read _future_.

> > going to be. For example, scipy is really aimed at scientific users, who
> > need performance, and are willing to buy it with inconvenience (like the

> I see the main difference in distribution philosophy. NumPy is an
> add-on package to Python, which is in turn used by other add-on
> packages in a modular way. SciPy is rather a monolithic
> super-distribution for scientific users.
>
> Personally I strongly favour the modular package approach, and in fact
> I haven't installed SciPy on my system for that reason, although I
> would be interested in some of its components.

[...]

> The same approach as for XML could be used: a slim-line version in the
> standard distribution that could be replaced by a high-efficiency
> extended version for those who care.

[...]

I personally agree with all your above points -- if you have a look at our "dotblas"-patch mentioned earlier (see [1]), you will find that it aims to provide exactly that -- have dot run anywhere without a hassle, but run (much) faster if the user is willing to install atlas.

My main concern was that the argument should shift away a bit from syntactic and implementation details to what audiences and what needs numpy/numarray and scipy are supposed to address and, in this light, how to best strike the balance between convenience for users and maintainers, speed and bloat, generality and efficiency etc.

As an example, adding the dotblas patch [1] to Numeric is, I think, more convenient for the users (granting a few assumptions (like that it actually works :) for the sake of the argument) -- it gives users who have atlas better performance, and those who don't won't (or at least shouldn't) notice. It is however inconvenient for the maintainers. Whether one should bother including it in this or some other way depends, among the obvious question of whether there is a better way to achieve what it does for both groups (like creating a dedicated Matrix class), also on what numpy is really supposed to achieve. I'm not entirely clear on that. For example, I don't know how many numpy users deeply care about their matrix multiplications for big (1000x1000) matrices being 40 times faster.

The monolithic approach is not entirely without its charms (remember python's "batteries included" jingle?). Apart from convenience factors, it also has the not inconsiderable advantage that people use _one_ standard module for a certain thing -- rather than 20 different solutions. This certainly helps to improve code quality. Not least because someone goes through the trouble of deciding what merits inclusion in the "Big Thing", possibly urging changes, but at least almost certainly taking more time for evaluation than an individual programmer who just wants to get a certain job done. It also makes life easier for module writers -- they can rely on certain stuff being around (and don't have to reinvent the wheel, another potential improvement to code quality). As such it makes life easier for maintainers, as does the scipy commandment that you have to install atlas/lapack, full stop (and if it doesn't run on your machine -- well, at least it works fast for some people, and that might well be better than working slow for everyone in this context).

So, I think what's good really depends on what you're aiming at; that's why I'd like to know what users and developers think about these matters. My points regarding scipy and numpy/numarray were just one attempt at interpreting what these respective libraries try to/should/could attempt to be or become. Now, not being a developer for either of them (I've only submitted a few minor patches to scipy), I'm not in a particularly good position to venture such interpretations, but I hoped that it would provoke other and more knowledgeable people to share their opinions and insights on this matter (as indeed you did).

> I'd love to have efficient matrices without having to install the
> whole SciPy package!
Welcome to the linear algebra lobby group ;) Yep, that would be nice, but my impression was that the scipy folks are currently more concerned about performance issues than the numpy/numarray folks, and I could live with either package providing what I want.

Ideally, I'd like to see a slim core numarray, without any frills (and more streamlined to behave like standard python containers (e.g. indexing and type/casts behavior)) for the python core; something more enabled and efficient for numerics (including matrices!) as a separate package (like the XML example you quote). And then maybe a bigger pre-bundled collection of (ideally rather modular) numerical libraries for really hard-core scientific users (maybe in the spirit of xemacs-packages and sumo-tar-balls -- no bloat if you don't need it, plenty of features in an instant if you do).

Anyway, is there at least general agreement that there should be some new and wonderful matrix class (plus supporting libraries) somewhere (rather than souping up array)?

alex

Footnotes:
[1] patch for faster dot product in Numeric
    http://www.scipy.org/Members/aschmolck

--
Alexander Schmolck     Postgraduate Research Student
                       Department of Computer Science
                       University of Exeter
A.Schmolck at gmx.net     http://www.dcs.ex.ac.uk/people/aschmolc/

From hinsen at cnrs-orleans.fr  Tue Mar 19 03:09:02 2002
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Tue Mar 19 03:09:02 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To: References: Message-ID:

a.schmolck at gmx.net (A.Schmolck) writes:

> > > feature aspirations and divisions of labor of numpy/numarray and scipy are
>       ^^^^^^^
> Darn, I made a confusing mistake -- this should read _future_.

Or perhaps __future__ ;-)

> I personally agree with all your above points -- if you have a look at our
> "dotblas"-patch mentioned earlier (see [1]), you will find that it aims to

And I didn't even know about this...

> It is however inconvenient for the maintainers. Whether one should bother
> including it in this or some other way depends, among the obvious question of

There could be two teams, one maintaining a standard portable implementation, and another one taking care of optimization add-ons. From the user's point of view, what matters most is a single entry point for finding everything that is available.

> The monolithic approach is not entirely without its charms (remember
> python's "batteries included" jingle?). Apart from convenience

Sure, but... That's the standard library. Everybody has it, in identical form, and its consistency and portability is taken care of by the Python development team. There can be only *one* standard library that works like this.

I see no problem either with providing a larger integrated distribution for specific user communities. But such distribution and packaging strategies should be distinct from development projects. If I can get a certain package only as part of a huge distribution that I can't or don't want to install, then that package is effectively lost for me. Worse, if one package comes with its personalized version of another package (SciPy with NumPy), then I end up having to worry about internal conflicts within my installation.

On the other hand, package interdependencies are a big problem in the Open Source community at large, and I have personally been bitten more than once by an incompatible change in NumPy that broke my modules. But I don't see any other solution than better communication between development teams.

Konrad.
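The import-time fallback implied by such a two-team split fits in a few lines; a minimal sketch, in which `_dotblas` is a made-up module name standing in for whatever the optimized add-on would be called:

    try:
        from _dotblas import dot    # hypothetical ATLAS-backed extension module
    except ImportError:
        from Numeric import dot     # slim, portable fallback

    # Callers just use dot(); which implementation they get becomes an
    # installation choice, much as PyXML can replace the core xml package.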
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From DavidA at ActiveState.com  Tue Mar 19 14:05:03 2002
From: DavidA at ActiveState.com (David Ascher)
Date: Tue Mar 19 14:05:03 2002
Subject: [Numpy-discussion] ANN: numarray-0.3
References: <3C8E7F39.7090100@stsci.edu> Message-ID: <3C97B5C8.5045F50C@activestate.com>

Can I suggest that the generated files not be included in the source tarball? I'm making sure that numarray and Numeric will be available in our PPM repository, and it's confusing the build script when the setup.py tries to overwrite files that are under source code control.

--david ascher

PS: PPMs for numarray 0.3 and Numeric 21 coming up soon. =)

From jmiller at stsci.edu  Tue Mar 19 14:18:01 2002
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Mar 19 14:18:01 2002
Subject: [Numpy-discussion] ANN: numarray-0.3
References: <3C8E7F39.7090100@stsci.edu> <3C97B5C8.5045F50C@activestate.com> Message-ID: <3C97B8CB.3090403@stsci.edu>

David Ascher wrote:

> Can I suggest that the generated files not be included in the source
> tarball? I'm making sure that numarray and Numeric will be available in
> our PPM repository, and it's confusing the build script when the
> setup.py tries to overwrite files that are under source code control.
>
> --david ascher
>
> PS: PPMs for numarray 0.3 and Numeric 21 coming up soon. =)

Yes. I'll tighten up the MANIFEST.in for the next release.

Todd

--
Todd Miller             jmiller at stsci.edu
STSCI / SSG             (410) 338 4576

From jochen at unc.edu  Tue Mar 19 17:10:22 2002
From: jochen at unc.edu (Jochen Küpper)
Date: Tue Mar 19 17:10:22 2002
Subject: [Numpy-discussion] Re: Python 2.2 seriously crippled for numerical computation?
In-Reply-To: References: Message-ID:

The following message is a courtesy copy of an article that has been posted to comp.lang.python as well.

On Sat, 09 Mar 2002 00:17:40 -0500 Tim Peters wrote:

Tim> I hope the bogus underflow problem is fixed in CVS Python now.
Tim> Since my platform didn't have the problem (and still doesn't, of
Tim> course), we won't know for sure until people try it and report
Tim> back.

Ok, looking at SourceForge and google this seems to be fixed in cvs HEAD. Would it be possible to put the same patch into the cvs python-2.2 branch, please? [1]

Greetings,
Jochen

Footnotes:
[1] If it is in there, it doesn't work for me with current python cvs branch release22-maint. I still have to manually add -lieee. (RedHat-7.0 with current updates.)
--
University of North Carolina                   phone: +1-919-962-4403
Department of Chemistry                        phone: +1-919-962-1579
Venable Hall CB#3290 (Kenan C148)              fax:   +1-919-843-6041
Chapel Hill, NC 27599, USA                     GnuPG key: 44BCCD8E

From tim.one at comcast.net  Tue Mar 19 17:21:09 2002
From: tim.one at comcast.net (Tim Peters)
Date: Tue Mar 19 17:21:09 2002
Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation?
In-Reply-To: Message-ID:

[jochen at bock.chem.unc.edu]
> Ok, looking at SourceForge and google this seems to be fixed in cvs
> HEAD. Would it be possible to put the same patch into the cvs
> python-2.2 branch, please? [1]
>
> Footnotes:
> [1] If it is in there, it doesn't work for me with current python cvs
> branch release22-maint. I still have to manually add -lieee.
> (RedHat-7.0 with current updates.)

I don't know what "current" meant to you at the time you wrote this. Michael Hudson did backport the patch into 2.2.1c1, which was released yesterday. So please try 2.2.1c1, and if you still have a problem, file a bug report about it on SourceForge. 2.2.1 final is expected in about a week.

From jochen at unc.edu  Tue Mar 19 17:54:02 2002
From: jochen at unc.edu (Jochen Küpper)
Date: Tue Mar 19 17:54:02 2002
Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation?
In-Reply-To: References: Message-ID:

On Tue, 19 Mar 2002 20:20:36 -0500 Tim Peters wrote:

Thanks for the quick answer. The problem is resolved.

Tim> [jochen at bock.chem.unc.edu]
>> [1] If it is in there, it doesn't work for me with current python cvs
>> branch release22-maint. I still have to manually add -lieee.
>> (RedHat-7.0 with current updates.)

Tim> I don't know what "current" meant to you at the time you wrote this.

About 20:00 (8:00pm) EST today, March 19.

Tim> Michael Hudson did backport the patch into 2.2.1c1, which
Tim> was released yesterday. So please try 2.2.1c1, and if you
Tim> still have a problem, file a bug report about it on
Tim> SourceForge. 2.2.1 final is expected in about a week.

Well, changing cvs from release22-maint to r221c1 helps. That is, everything seems to work fine with the cvs sources tagged r221c1.

Then, is it really necessary to mess up the cvs tags so much? Why isn't it possible to have a single python-2.2 branch which one could follow to get all the stuff that's incorporated into that version? [1] There are huge differences between release22-maint and r221c1, it seems, judging from the number of patches applied when going from one to the other. But then some files are in the same (non-main) branch. ???

Thanks for all your work, and thank you for the quick help again.
Greetings,
Jochen

--
University of North Carolina                   phone: +1-919-962-4403
Department of Chemistry                        phone: +1-919-962-1579
Venable Hall CB#3290 (Kenan C148)              fax:   +1-919-843-6041
Chapel Hill, NC 27599, USA                     GnuPG key: 44BCCD8E

From tim.one at comcast.net  Tue Mar 19 18:21:03 2002
From: tim.one at comcast.net (Tim Peters)
Date: Tue Mar 19 18:21:03 2002
Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation?
In-Reply-To: Message-ID:

[jochen at bock.chem.unc.edu]
> Thanks for the quick answer. The problem is resolved.

Cool! Glad to hear it.

> Well, changing cvs from
>     release22-maint
> to
>     r221c1
> helps. That is, everything seems to work fine with the cvs sources
> tagged r221c1.

That shouldn't have made any difference -- r221c1 is merely a tag on the release22-maint branch. Now I can spend a lot of time trying to guess why your checkout is screwed up (probably stale sticky flags, if it is), or you can try blowing away your checkout and starting over. I know which one gets my vote <wink>. CVS branches and tags are a nightmare: when in any doubt, kill the beast and start over.

> Then, is it really necessary to mess up the cvs tags so much? Why
> isn't it possible to have a single python-2.2 branch which one could
> follow to get all the stuff that's incorporated into that version?

That's what the release22-maint branch is supposed to be (and, AFAIK, is).

> There are huge differences between release22-maint and r221c1,

What makes you think so? I just did

    cvs diff -r release22-maint -r r221c1

and it turned up expected differences in the handful of files that indeed *have* changed since r221c1 was tagged, mostly in the docs and under the Mac subdirectory:

Index: Doc/lib/libcopyreg.tex
Index: Doc/lib/libthreading.tex
Index: Lib/urllib.py
Index: Mac/_checkversion.py
Index: Mac/Build/PythonCore.mcp
Index: Mac/Distributions/(vise)/Python 2.2.vct
Index: Mac/Include/macbuildno.h
Index: Mac/Modules/macfsmodule.c
Index: Mac/Modules/macmodule.c
Index: Misc/NEWS
Index: PCbuild/BUILDno.txt

> ...
> Thanks for all your work, and thank you for the quick help again.

And thanks for checking that your problem is fixed in 221c1. Had anyone tried this stuff in 22a1 or 22a2 or 22a3 or 22a4 or 22b1 or 22b2 or 22c1 (yes, we actually cut 7 full prerelease distributions for 2.2!), it would have worked in 2.2 out of the box. Keep that in mind when 2.3a1 comes out <wink>.

From prabhu at aero.iitm.ernet.in  Wed Mar 20 10:22:14 2002
From: prabhu at aero.iitm.ernet.in (Prabhu Ramachandran)
Date: Wed Mar 20 10:22:14 2002
Subject: [SciPy-dev] Re: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To: References: Message-ID: <15512.54075.312312.299325@monster.linux.in>

hi,

I'm sorry I haven't been following the discussion too closely and this post might be completely unrelated.

>>>>> "AS" == A Schmolck writes:

AS> Ideally, I'd like to see a slim core numarray, without any
AS> frills (and more streamlined to behave like standard python
AS> containers (e.g. indexing and type/casts behavior)) for the
AS> python core, something more enabled and efficient for numerics
AS> (including matrices!) as a separate package (like the XML
AS> example you quote).
AS> And then maybe a bigger pre-bundled
AS> collection of (ideally rather modular) numerical libraries for
AS> really hard-core scientific users (maybe in the spirit of
AS> xemacs-packages and sumo-tar-balls -- no bloat if you don't
AS> need it, plenty of features in an instant if you do).

AS> Anyway, is there at least general agreement that there should
AS> be some new and wonderful matrix class (plus supporting
AS> libraries) somewhere (rather than souping up array)?

Ideally, I'd like something that also has a reasonably easy-to-use interface from C/C++. The idea is that it should be easy (and natural) for someone to use the same library from C/C++ when performance is desired. This would be really nice and very useful.

prabhu

From Aureli.Soria_Frisch at ipk.fhg.de  Wed Mar 20 10:48:04 2002
From: Aureli.Soria_Frisch at ipk.fhg.de (Aureli Soria Frisch)
Date: Wed Mar 20 10:48:04 2002
Subject: [Numpy-discussion] Different behaviour logical_and/and
In-Reply-To: References: Message-ID:

Hi all,

In the version of Numeric with MacPython2.2, the functions "Numeric.logical_and" and "and" behave differently, although according to the on-line documentation they should behave the same:

>>> Numeric.logical_and(a,b)
array([0, 1, 0, 0, 0, 0, 1, 0, 0, 0])
>>> a and b
array([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])

for arrays:

>>> a
array([0, 1, 0, 0, 1, 0, 1, 0, 0, 1])
>>> b
array([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])

or have I misunderstood something...?

Regards,

Aureli

#################################
Aureli Soria Frisch
Fraunhofer IPK
Dept. Pattern Recognition

post: Pascalstr. 8-9, 10587 Berlin, Germany
e-mail: aureli at ipk.fhg.de
fon: +49 30 39006-150
fax: +49 30 3917517
web: http://vision.fhg.de/~aureli/web-aureli_en.html
#################################

From paul at pfdubois.com  Wed Mar 20 13:14:10 2002
From: paul at pfdubois.com (Paul Dubois)
Date: Wed Mar 20 13:14:10 2002
Subject: [Numpy-discussion] [ANN] Pyfort 7.0 Extending Numpy with C
Message-ID: <000701c1d053$c501d860$09860cc0@CLENHAM>

A beta version of Pyfort 7.0 is available at pyfortran.sf.net. The documentation is not yet upgraded to this version.

Pyfort 7.0 adds the ability to wrap Fortran-like C coding to extend Numpy. Dramatically illustrating the virtue of open source software, credit for this improvement goes to:

Michiel de Hoon
Human Genome Center
University of Tokyo

For example, if you have this C code:

double mydot(int n, double* x, double *y) {
    int i;
    double d;
    d = 0.0;
    for (i=0; i < n; ++i) d += x[i] * y[i];
    return d;
}

Then you can create a Pyfort input file mymod.pyf:

function mydot (n, x, y)
    integer:: n=size(x)
    doubleprecision x(n), y(n)
    doubleprecision mydot
end

Compile mydot.c into a library libmydot.a. Then:

pyfort -c cc -i -L. -l mydot mymod.pyf

builds and installs the module mymod containing function mydot, which you can use from Python:

import Numeric, mymod
x = Numeric.array([1., 2., 3.])
y = Numeric.array([5., -1., -1.])
print mymod.mydot(x, y)

Note that by wrapping mydot in this way, Pyfort takes care of problems like converting input arrays of the wrong type, such as integer; making sure that x and y have the same length; and making sure x and y are contiguous.

I added directory testc that contains an example like this and one where an array is output.

Mr. de Hoon explained his patch as follows.

"I have modified fortran_compiler.py to add gcc as a compiler. This enables pyfort to be used for C code instead of Fortran code only. To use this option, call pyfort with the -cgcc option to specify gcc as the compiler.
In order to switch off the default TRANSPOSE and MIRROR options, some small modifications were needed in generator.py also. [Editor's note: both -c gcc and -c cc will work]

Before writing this addition to pyfort, I tried to use swig to generate the wrapper for my C code. However, pyfort was easier to use in my case because it is integrated with numpy. I wasn't able to get swig to use numpy arrays. In addition, I am writing extension code both in fortran and C, so it is easier having to use only one tool (pyfort) for both.

In a sense, it is easier to extend python with C than with fortran because you don't have to worry about transposing the array. I tried to be minimally intrusive on the existing pyfort code to switch off transposing arrays; there may be prettier ways to do this than what I have done. With this modification, I was able to pass one- and two-dimensional numpy arrays from Python to C and back without problems, as well as scalar variables with intent(in) and intent(out). I have also used the modified Pyfort on some Fortran routines to make sure I didn't break something in the fortran part of Pyfort. I haven't done an extensive test of this addition, but I haven't found any problems with it so far. I hope this patch will be useful to other people trying to extend Python/numpy with C routines."

Michiel de Hoon
Human Genome Center
University of Tokyo
mdehoon at ims.u-tokyo.ac.jp

From jochen at jochen-kuepper.de  Wed Mar 20 19:06:03 2002
From: jochen at jochen-kuepper.de (Jochen Küpper)
Date: Wed Mar 20 19:06:03 2002
Subject: [Numpy-discussion] Different behaviour logical_and/and
In-Reply-To: References: Message-ID:

On Wed, 20 Mar 2002 19:43:03 +0100 Aureli Soria Frisch wrote:

>>>> a
Aureli> array([0, 1, 0, 0, 1, 0, 1, 0, 0, 1])
>>>> b
Aureli> array([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])

>>>> Numeric.logical_and(a,b)
Aureli> array([0, 1, 0, 0, 0, 0, 1, 0, 0, 0])

This looks correct...

>>>> a and b
Aureli> array([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])

... and this suspiciously like b.

Greetings,
Jochen

--
Einigkeit und Recht und Freiheit                http://www.Jochen-Kuepper.de
Liberté, Égalité, Fraternité                    GnuPG key: 44BCCD8E
Sex, drugs and rock-n-roll

From pearu at cens.ioc.ee  Wed Mar 20 23:58:24 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Wed Mar 20 23:58:24 2002
Subject: [Numpy-discussion] Different behaviour logical_and/and
In-Reply-To: Message-ID:

On 20 Mar 2002, Jochen Küpper wrote:

> >>>> a
> Aureli> array([0, 1, 0, 0, 1, 0, 1, 0, 0, 1])
> >>>> b
> Aureli> array([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])
>
> >>>> Numeric.logical_and(a,b)
> Aureli> array([0, 1, 0, 0, 0, 0, 1, 0, 0, 0])
>
> This looks correct...
>
> >>>> a and b
> Aureli> array([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])
>
> ... and this suspiciously like b.

.. and also correct. It is the default behaviour of the Python `and' operation. From the Python Language Reference:

    The expression `x and y' first evaluates `x'; if `x' is false, its value
    is returned; otherwise, `y' is evaluated and the resulting value is
    returned.

So, the `and' operation is an "object" operation (unless redefined to something else), while logical_and is an "elementwise" operation.
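A short session makes the difference concrete (a sketch; the first two lines are plain Python, and only Numeric is assumed for the last):

    >>> 5 and 7     # 5 is true, so the second operand is returned
    7
    >>> 0 and 7     # 0 is false, so it is returned itself
    0
    >>> import Numeric
    >>> Numeric.logical_and([5, 0, 7], [1, 1, 0])   # elementwise instead
    array([1, 0, 0])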
Pearu

From huaiyu_zhu at yahoo.com  Sat Mar 23 00:32:02 2002
From: huaiyu_zhu at yahoo.com (Huaiyu Zhu)
Date: Sat Mar 23 00:32:02 2002
Subject: [Numpy-discussion] Different behaviour logical_and/and
In-Reply-To: Message-ID:

On Thu, 21 Mar 2002, Pearu Peterson wrote:

> On 20 Mar 2002, Jochen Küpper wrote:
> > >>>> a
> > Aureli> array([0, 1, 0, 0, 1, 0, 1, 0, 0, 1])
> > >>>> b
> > Aureli> array([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])
> >
> > >>>> Numeric.logical_and(a,b)
> > Aureli> array([0, 1, 0, 0, 0, 0, 1, 0, 0, 0])
> >
> > This looks correct...
> >
> > >>>> a and b
> > Aureli> array([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])
> >
> > ... and this suspiciously like b.
>
> .. and also correct. It is the default behaviour of the Python `and'
> operation. From the Python Language Reference:
>
>     The expression `x and y' first evaluates `x'; if `x' is false, its value
>     is returned; otherwise, `y' is evaluated and the resulting value is
>     returned.
>
> So, the `and' operation is an "object" operation (unless redefined to
> something else), while logical_and is an "elementwise" operation.

There is a section in PEP-225 for elementwise/objectwise operators to extend the meaning of ~ to an "elementizer", so that

    [1, 0, 1, 0]  and [0, 1, 1, 0]  =>  [0, 1, 1, 0]
    [1, 0, 1, 0] ~and [0, 1, 1, 0]  =>  [0, 0, 1, 0]

There are several other places, entirely unrelated to numerical computation, where elementization of an operator makes sense.

Huaiyu

From paul at pfdubois.com  Sun Mar 24 08:49:05 2002
From: paul at pfdubois.com (Paul F Dubois)
Date: Sun Mar 24 08:49:05 2002
Subject: [Numpy-discussion] [ANN] Pyfort-7.0b2 -- extending Numeric with C routines
Message-ID: <000001c1d353$c816d390$0f01a8c0@NICKLEBY>

Pyfort 7.0b2 is now available at sf.net/projects/pyfortran.

Nummies who do not use Fortran may be interested in using Pyfort; with Michiel de Hoon's "C" option, it is now extremely easy to wrap a simple kind of C code for processing Numeric's arrays, like this:

double ctry(int n, double* x, double* y)
{
    int i;
    double d;
    d = 0.0;
    for (i=0; i < n; ++i) {
        d += x[i] * y[i];
    }
    return d;
}

void cout(int n, double* x, double* y)
{
    int i;
    for (i = 0; i < n; ++i) {
        y[i] = 1.414159 * x[i];
    }
}

double c2(int n, int m, double x[n][m])
{
    double sum = 0.0;
    int i, j;
    for (i=0; i < n; i++) {
        for (j=0; j < m; j++) {
            sum += x[i][j];
        }
    }
    return sum;
}

and then call it from Python like this:

import testc, Numeric
x = Numeric.array([1., 2., 3.])
y = Numeric.array([6., -1., -1.])
z = Numeric.arange(6) * 1.0
z.shape = (3, 2)
print "Should be 1.0:", testc.ctry(x, y)
print "Should be sqrt(2) * [1,2,3]:", testc.cout(x)
print "Should be 15.0:", testc.c2(z)
z.shape = (2, 3)
print "Should be 15.0:", testc.c2(z)

----------- notes

Somehow 7.0b1 was missing the C examples. I apparently lost the changes I had made to the MANIFEST and did not realize my CVS commits had failed. Bad day at the office, I guess. I added testc back in, and added a 2-dimensional example. I also eliminated some warning errors in the generated code, and fixed an error in that case.

My thanks to Michiel de Hoon and J.S. Whitaker.

From pearu at cens.ioc.ee  Sun Mar 24 09:59:04 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Sun Mar 24 09:59:04 2002
Subject: f2py comparison Re: [Numpy-discussion] [ANN] Pyfort 7.0 Extending Numpy with C
In-Reply-To: <000701c1d053$c501d860$09860cc0@CLENHAM> Message-ID:

Hi,

Nummies might be interested in how to wrap C codes with f2py.
On Wed, 20 Mar 2002, Paul Dubois wrote:

> For example, if you have this C code:
>
> double mydot(int n, double* x, double *y) {
>     int i;
>     double d;
>     d = 0.0;
>     for (i=0; i < n; ++i) d += x[i] * y[i];
>     return d;
> }
>
> Then you can create a Pyfort input file mymod.pyf:
>
> function mydot (n, x, y)
>     integer:: n=size(x)
>     doubleprecision x(n), y(n)
>     doubleprecision mydot
> end

Different from pyfort, f2py needs the following signature file:

python module mymod
  interface
    function mydot (n, x, y)
      intent(c) mydot
      integer intent(c):: n=size(x)
      doubleprecision x(n), y(n)
      doubleprecision mydot
    end
  end interface
end python module

> Compile mydot.c into a library libmydot.a. Then:
>
> pyfort -c cc -i -L. -l mydot mymod.pyf
>
> builds and installs the module mymod containing function mydot,

With f2py the above is equivalent to

f2py -c mydot.c mymod.pyf

This compiles mydot.c and builds the module mymod into the current directory.

> which you can use from Python:
>
> import Numeric, mymod
> x = Numeric.array([1., 2., 3.])
> y = Numeric.array([5., -1., -1.])
> print mymod.mydot(x,y)

Python session with the f2py-generated mymod:

>>> import mymod
>>> print mymod.mydot([1,2,3],[1,2,4.])
17.0

Regards,
Pearu

From oliphant.travis at ieee.org  Sun Mar 24 16:49:03 2002
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sun Mar 24 16:49:03 2002
Subject: [Numpy-discussion] Re: [SciPy-dev] accuracy problem in filterdesign.py
In-Reply-To: <000b01c1d37c$4a64a5c0$e5db9e3e@arrow> References: <000b01c1d37c$4a64a5c0$e5db9e3e@arrow> Message-ID:

On Sunday 24 March 2002 02:38 pm, you wrote:
> Hello,
>
> while writing a test driver for a minimum phase calculation routine I came
> across the following problem. It is causing asymmetries in the output of
>
> >>> N=512
> >>> lastpoint=2*pi
> >>> w1=arange(0,lastpoint,lastpoint/N)
> >>> w2=arange(0,N)*(lastpoint/N)
> >>> lastpoint-w1[511]-w1[1]
> -6.3546390371982397e-014
> >>> lastpoint-w2[511]-w2[1]
> 4.0245584642661925e-016
> >>> w1[511]
> 6.2709134608765646
> >>> w2[511]
> 6.2709134608765007
> >>> w2[511]-w1[511]
> -6.3948846218409017e-014

I just fixed this in Numeric. The arange in Numeric used to increment the value by the step amount. It now computes the value using

    value = start + i*step

which fixes the problem. Thanks for pointing this out.

From magnus at hetland.org  Wed Mar 27 11:23:19 2002
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Wed Mar 27 11:23:19 2002
Subject: [Numpy-discussion] Lazy video array?
Message-ID: <20020327202138.A4132@idi.ntnu.no>

Just something I've been thinking about for a few years (and never gotten around to doing anything about)...

How realistic would it be to wrap a video file as a type of three-dimensional (assuming grayscale) array object and then use e.g. numarray to manipulate it? And how easy would it be to make this sort of thing "lazy", so that only the parts needed for what you actually access (for display or whatever) are processed? E.g. (silly example):

>>> a = videoarray('somefile.mpg')
>>> b = sin(a)                # No real computation here
>>> for frame in b:
...     displayFrame(frame)   # Computation performed here...

Or something... Or maybe I'm just bonkers ;)

By the way: Is there any documentation of the numarray C API anywhere yet?
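For what it's worth, the deferred part of this can be sketched in a few lines without any real video decoding -- everything below is hypothetical, and `decoder` stands for any indexable object that yields 2-D frames:

    import Numeric

    class LazyFrames:
        # Sketch only: record elementwise operations, apply them per frame on access.
        def __init__(self, source, ops=None):
            self.source = source           # e.g. a (hypothetical) MPEG frame reader
            self.ops = ops or []           # pending elementwise functions
        def map(self, f):
            # Returns a new lazy view; still no computation here.
            return LazyFrames(self.source, self.ops + [f])
        def __getitem__(self, i):
            frame = Numeric.asarray(self.source[i])
            for f in self.ops:             # the work happens only here, one frame at a time
                frame = f(frame)
            return frame

    # b = LazyFrames(decoder).map(Numeric.sin)   # cheap: nothing computed yet
    # displayFrame(b[0])                         # frame 0 decoded and transformed here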
--
Magnus Lie Hetland                                  The Anygui Project
http://hetland.org                                  http://anygui.org

From jochen at unc.edu  Wed Mar 27 13:09:00 2002
From: jochen at unc.edu (Jochen Küpper)
Date: Wed Mar 27 13:09:00 2002
Subject: [Numpy-discussion] Lazy video array?
In-Reply-To: <20020327202138.A4132@idi.ntnu.no> References: <20020327202138.A4132@idi.ntnu.no> Message-ID:

On Wed, 27 Mar 2002 20:21:38 +0100 Magnus Lie Hetland wrote:

Magnus> By the way: Is there any documentation of the numarray C API
Magnus> anywhere yet?

I think

,----
| http://stsdas.stsci.edu/numarray/DesignOverview.html
`----

is all there is for now. And it's probably not exactly right any more...

Greetings,
Jochen

--
University of North Carolina                   phone: +1-919-962-4403
Department of Chemistry                        phone: +1-919-962-1579
Venable Hall CB#3290 (Kenan C148)              fax:   +1-919-843-6041
Chapel Hill, NC 27599, USA                     GnuPG key: 44BCCD8E

From amitha_linux at yahoo.com  Thu Mar 28 06:59:10 2002
From: amitha_linux at yahoo.com (Amitha P)
Date: Thu Mar 28 06:59:10 2002
Subject: [Numpy-discussion] Returning a 2-D PyArrayObject
Message-ID: <20020328145811.66461.qmail@web13403.mail.yahoo.com>

Hi all,

I am new to Python/Python extension programming. I am writing a C extension to Python which takes a 3-dimensional array object and returns a 2-dimensional array object. The C function returns the 2-D array and converts it into a Python object, but the 2-D Python array object does not contain the correct values; it's printing out some irrelevant values. I could successfully return a single-dimension Python array object, but I am not able to return a 2-D Python array object. I am attaching the code. Please look at it and point out the errors. Thank you very much.

-------------------------------------------------------
CMATRIXMUL16.c - This takes the array (the python 3D array) and returns the 2D array
------------------------------------------------------

#include #include

float* cmatrixmul16(float *array, float *paddarr, int n, int r, int col, float **store)
{
    int i,j,k;
    int devices;
    int pathpoints;
    int column;
    int len;
    float sum;
    float **a, **b, **c, **d;
    float *temp;
    static float **tracematrix;

    tracematrix = store;
    pathpoints = r;
    column = col;
    devices = n;
    len = pathpoints*2;
    temp = (float *)malloc( len * sizeof(float));
    sum = 0.0;
    a = (float **)malloc(devices * sizeof(float));
    b = (float **)malloc(devices * sizeof(float));
    c = (float **)malloc(devices * sizeof(float));
    d = (float **)malloc(devices * sizeof(float));
    for(i=0;i

#include #include

static PyObject * Py_arraytest1 (PyObject *, PyObject *);

static char _tests17_module_doc[] = "tests11: module documentation";
static char arraytest1__doc__[] = "mytest: function documentation";

/******************************** Python symbol table ****************************************/
static PyMethodDef _tests17_methods[] = {
    {"arraytest1", (PyCFunction)Py_arraytest1, METH_VARARGS, arraytest1__doc__ },
    {NULL, (PyCFunction)NULL, 0, NULL }   /* terminates the list of methods */
};
/********************************* End of Symbol table ***************************************/

void init_tests17()
{
    /* We will be using C-functions defined in the array module. So we
     * need to be sure that the shared library defining these functions
     * is loaded.
     */
    import_array();
    (void) Py_InitModule4(
        "_tests17",           /* module name */
        _tests17_methods,     /* structure containing python symbol info */
        _tests17_module_doc,  /* module documentation string */
        (PyObject *) NULL,
        PYTHON_API_VERSION);
}

/* function to calculate the product of two arrays */
static PyObject * Py_arraytest1(PyObject *self, PyObject *args)
{
    PyArrayObject *array, *paddarr, *product;
    char *c;
    int n,r,col,i,j,k;
    int dimensions[2];
    float **store;

    if (!PyArg_ParseTuple(args, "O!O!", &PyArray_Type, &array, &PyArray_Type, &paddarr))
        return NULL;
    /* The arguments are the 3-D array and the 1-D paddarr */
    n = (long) array->dimensions[0];
    r = (long) array->dimensions[1];
    col = (long) array->dimensions[2];
    store = (float **)malloc(n*sizeof(float));
    for(i=0;idata,(float *)paddarr->data,n,r,col,store);
    dimensions[0] = n;
    dimensions[1] = 2*r;
    product = (PyArrayObject *)PyArray_FromDimsAndData(2, dimensions, PyArray_FLOAT, c);
    PyArray_Return(product);
}

From hinsen at cnrs-orleans.fr  Thu Mar 28 08:28:09 2002
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Thu Mar 28 08:28:09 2002
Subject: [Numpy-discussion] Returning a 2-D PyArrayObject
In-Reply-To: <20020328145811.66461.qmail@web13403.mail.yahoo.com> References: <20020328145811.66461.qmail@web13403.mail.yahoo.com> Message-ID:

Amitha P writes:

> irrelevant values. I could successfully return a single-
> dimension Python array object, but I am not able to
> return a 2-D Python array object. I am attaching the
> code. Please look at it and point out the errors.

> product = (PyArrayObject *)PyArray_FromDimsAndData(2,
>                                                    dimensions,
>                                                    PyArray_FLOAT, c);

The variable c in this call must point to a storage area which holds the elements of the matrix, i.e. a one-dimensional float array. What you pass in your code is a list of pointers to the rows of the matrix.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From johanfo at ohman.no  Fri Mar 29 12:03:06 2002
From: johanfo at ohman.no (Johan Fredrik Øhman)
Date: Fri Mar 29 12:03:06 2002
Subject: [Numpy-discussion] Right behavior
Message-ID: <000d01c1d75c$9bfe1e50$c167f081@matrisen>

This code generates very non-random numbers, even when the seed value is reinitialized. Take a look at the first number in each run!

Is this right?

--
JFØ
#!/usr/local/bin/python2.1

# Python Virtual clock
import RandomArray

print "Seed", RandomArray.get_seed()
for i in range(1000000,10000000,1000000):
    print "Clock at time:", i/1000000, ":", RandomArray.normal(10,2)

[root at blekkulf /root]# ./t2.py
Seed (101743, 1951)
Clock at time: 1 : 7.98493051529
Clock at time: 2 : 10.8439420462
Clock at time: 3 : 7.59234881401
Clock at time: 4 : 7.32021093369
Clock at time: 5 : 10.9444898367
Clock at time: 6 : 10.1128772199
Clock at time: 7 : 13.1178274155
Clock at time: 8 : 11.779414773
Clock at time: 9 : 10.7529922128
[root at blekkulf /root]# ./t2.py
Seed (101743, 1953)
Clock at time: 1 : 7.98525762558
Clock at time: 2 : 9.38142818213
Clock at time: 3 : 7.11979293823
Clock at time: 4 : 10.867649436
Clock at time: 5 : 9.62882992625
Clock at time: 6 : 12.1940765381
Clock at time: 7 : 6.84895467758
Clock at time: 8 : 8.13472533226
Clock at time: 9 : 8.15638375282
[root at blekkulf /root]# ./t2.py
Seed (101743, 1959)
Clock at time: 1 : 7.98623776436
Clock at time: 2 : 14.5040078163
Clock at time: 3 : 11.3408681154
Clock at time: 4 : 6.32757425308
Clock at time: 5 : 8.94617521763
Clock at time: 6 : 12.1802353859
Clock at time: 7 : 12.0685124397
Clock at time: 8 : 10.5330892205
Clock at time: 9 : 10.9744755626

From tchur at optushome.com.au  Fri Mar 29 13:01:15 2002
From: tchur at optushome.com.au (Tim Churches)
Date: Fri Mar 29 13:01:15 2002
Subject: [Numpy-discussion] Right behavior
References: <000d01c1d75c$9bfe1e50$c167f081@matrisen> Message-ID: <3CA4D3BB.617F2883@optushome.com.au>

Johan Fredrik Øhman wrote:
>
> This code generates very non-random numbers, even when the seed value
> is reinitialized.
> Take a look at the first number in each run !

The first numbers in each of your three runs are 7.98493051529, 7.98525762558 and 7.98623776436. They look like different numbers to me. If you want the difference between initial values to be greater, you need to make the difference in your seeds greater. For example, if I run your code now, I get 8.29225027561, 8.29484963417 and 8.29744851589, but setting the seed to (1,2) gives an initial value of 5.69397783279. Remember, these are only pseudorandom numbers.

Tim C

> Is this right ?
>
> --
> JFØ
> #!/usr/local/bin/python2.1
>
> # Python Virtual clock
> import RandomArray
>
> print "Seed", RandomArray.get_seed()
> for i in range(1000000,10000000,1000000):
>     print "Clock at time:", i/1000000, ":", RandomArray.normal(10,2)
>
> [root at blekkulf /root]# ./t2.py
> Seed (101743, 1951)
> Clock at time: 1 : 7.98493051529
> Clock at time: 2 : 10.8439420462
> Clock at time: 3 : 7.59234881401
> Clock at time: 4 : 7.32021093369
> Clock at time: 5 : 10.9444898367
> Clock at time: 6 : 10.1128772199
> Clock at time: 7 : 13.1178274155
> Clock at time: 8 : 11.779414773
> Clock at time: 9 : 10.7529922128
> [root at blekkulf /root]# ./t2.py
> Seed (101743, 1953)
> Clock at time: 1 : 7.98525762558
> Clock at time: 2 : 9.38142818213
> Clock at time: 3 : 7.11979293823
> Clock at time: 4 : 10.867649436
> Clock at time: 5 : 9.62882992625
> Clock at time: 6 : 12.1940765381
> Clock at time: 7 : 6.84895467758
> Clock at time: 8 : 8.13472533226
> Clock at time: 9 : 8.15638375282
> [root at blekkulf /root]# ./t2.py
> Seed (101743, 1959)
> Clock at time: 1 : 7.98623776436
> Clock at time: 2 : 14.5040078163
> Clock at time: 3 : 11.3408681154
> Clock at time: 4 : 6.32757425308
> Clock at time: 5 : 8.94617521763
> Clock at time: 6 : 12.1802353859
> Clock at time: 7 : 12.0685124397
> Clock at time: 8 : 10.5330892205
> Clock at time: 9 : 10.9744755626

From johanfo at ohman.no  Fri Mar 29 13:28:11 2002
From: johanfo at ohman.no (Johan Fredrik Øhman)
Date: Fri Mar 29 13:28:11 2002
Subject: [Numpy-discussion] Right behavior
References: <000d01c1d75c$9bfe1e50$c167f081@matrisen> <3CA4D3BB.617F2883@optushome.com.au> Message-ID: <002e01c1d768$86ef39c0$c167f081@matrisen>

> The first numbers in each of your three runs are 7.98493051529,
> 7.98525762558 and 7.98623776436. They look like different numbers to me.

First, thanks for your answer Tim. I do agree, they are different. But I wouldn't call it random. I didn't expect that the small difference in the initial seed would affect the first number with so little. Usually the seed numbers I have experienced other places have much more dramatic effect on the numbers, if you see what I mean...

> If you want the difference between initial values to be greater, you need
> to make the difference in your seeds greater. For example, if I run your
> code now, I get 8.29225027561, 8.29484963417 and 8.29744851589, but
> setting the seed to (1,2) gives an initial value of 5.69397783279.
> Remember, these are only pseudorandom numbers.

Yes, they are pseudorandom and that is OK. What I just want is some more initial difference between the runs without setting the seed number manually. But now I know this is not a flaw in the RNG, but "it's the way it is supposed to be".

Thanks

--
Johan Fredrik Øhman

From tchur at optushome.com.au  Fri Mar 29 14:51:03 2002
From: tchur at optushome.com.au (Tim Churches)
Date: Fri Mar 29 14:51:03 2002
Subject: [Numpy-discussion] Right behavior
References: <000d01c1d75c$9bfe1e50$c167f081@matrisen> <3CA4D3BB.617F2883@optushome.com.au> <002e01c1d768$86ef39c0$c167f081@matrisen> Message-ID: <3CA4ED89.14AE55B0@optushome.com.au>

Johan Fredrik Øhman wrote:
>
> > The first numbers in each of your three runs are 7.98493051529,
> > 7.98525762558 and 7.98623776436. They look like different numbers to me.
>
> First, thanks for your answer Tim. I do agree, they are different. But I
> wouldn't call it random. I didn't expect that the small difference in the
> initial seed would affect the first number with so little.
> Usually the seed numbers I have experienced other places have much more
> dramatic effect on the numbers, if you see what I mean...

OK, you need to use Konrad Hinsen's excellent RNG module which comes with Numeric Python:

#################################
# Python Virtual clock
import RNG

dist = RNG.NormalDistribution(10, 2)
rng = RNG.CreateGenerator(0, dist)
for i in range(1000000,10000000,1000000):
    print "Clock at time:", i/1000000, ":", rng.ranf()
##################################

The above code gives 8.46183655136, 7.29889782477 and 5.58243682462 as the first values in three successive runs on my system.

Hope this helps,

Tim C

> > If you want the difference
> > between initial values to be greater, you need to make the
> > difference in your seeds greater. For example, if I run your code now, I
> > get 8.29225027561, 8.29484963417 and 8.29744851589, but setting the seed
> > to (1,2) gives an initial value of 5.69397783279. Remember, these are
> > only pseudorandom numbers.
>
> Yes, they are pseudorandom and that is OK. What I just want is some more
> initial difference between the runs without setting the seed number manually.
> But now I know this is not a flaw in the RNG, but "it's the way it is
> supposed to be".
>
> Thanks
>
> --
> Johan Fredrik Øhman

From cnetzer at mail.arc.nasa.gov  Fri Mar 29 18:01:02 2002
From: cnetzer at mail.arc.nasa.gov (Chad Netzer)
Date: Fri Mar 29 18:01:02 2002
Subject: [Numpy-discussion] RandomArray question
Message-ID: <200203300200.SAA13173@mail.arc.nasa.gov>

Johan Fredrik Øhman writes:
> Take a look at the first number in each run !

That is because the random starting seed is (probably, I haven't looked at the code) set from the clock, and doesn't change all that much from run to run. You'll see similar results when you substitute:

    print "Clock at time:", i, ":", RandomArray.random_integers(10)

or

    print "Clock at time:", i, ":", RandomArray.uniform(1, 10)

into your code. The part before the decimal point is always the same on the first call of each run (assuming you run them at roughly the same time).

Note that the 'seed' is really the internal state of the RNG and changes at each call. You could call the random function a few dozen times before using results, or hash the first result and use that as a new seed, etc. But basically, the generator will produce similar initial results (i.e. one call) for similar seeds, which is what the time value is causing.

I'd propose that the implementation, when setting the seed from the time, generate at least one dummy RNG generation before returning results.

--
Chad Netzer
chad.netzer at stanfordalumni.org

From edcjones at erols.com  Sat Mar 30 07:55:18 2002
From: edcjones at erols.com (Edward C. Jones)
Date: Sat Mar 30 07:55:18 2002
Subject: [Numpy-discussion] IM = Numeric + PIL + OpenCV
Message-ID: <3CA5E432.4040503@erols.com>

IM (pronounced with a long I) is a Python module that makes it easy to use Numeric and PIL together in programs. Typical functions in IM are:

Open: Opens an image file using PIL and converts it to Numeric, PIL, or OpenCV formats.
Save: Converts an array to PIL and saves it to a file.
Array_ToArrayCast: Converts images between formats and between pixel types.

In addition to Numeric and PIL, IM works with the Intel OpenCV computer vision system (http://www.intel.com/research/mrl/research/opencv/).
OpenCV is available for Linux at the OpenCV Yahoo Group (http://groups.yahoo.com/group/OpenCV/).

IM currently runs under Linux only. It should not be too difficult to port the basic IM system to Windows or Mac. The OpenCV wrapper is large and complex and uses SWIG. It will be harder to port. The IM system appears to be pretty stable. On the other hand, the OpenCV wrapper is probably very buggy.

To download the software go to http://members.tripod.com/~edcjones/pycode.html and download "PyCV.032502.tgz".

Edward C. Jones
edcjones at hotmail.com

From rob at pythonemproject.com  Sat Mar 30 08:01:15 2002
From: rob at pythonemproject.com (rob)
Date: Sat Mar 30 08:01:15 2002
Subject: [Numpy-discussion] Where is SciPy?
Message-ID: <3CA5E06E.BCE32F8D@pythonemproject.com>

Their site is dead here. If anyone has the latest copy of Weave tar'd up that they can send me, I finally have a bug report to make. For some reason, Weave still can't find libstdc++ on FreeBSD. Rob.

--
-----------------------------
The Numeric Python EM Project

www.pythonemproject.com

From rob at pythonemproject.com  Sat Mar 30 08:30:07 2002
From: rob at pythonemproject.com (rob)
Date: Sat Mar 30 08:30:07 2002
Subject: [Numpy-discussion] need help with simple conjugate gradient Laplace solver
Message-ID: <3CA5E712.5516AD93@pythonemproject.com>

I'm experimenting with electrostatics now. I have an iterative Jacobi Laplace solver working, but it can be slow. It creates a beautiful 3D Animabob image. So I decided to try out a conjugate-gradient solver, which should be an order of magnitude better. It runs but doesn't converge. One thing I am wondering: where is the conjugate? In my FEM code, the solver really does use a conjugate, while this one here, which I pieced together from several other programs, does not. Why is it called conjugate gradient without a conjugate? :) Here is the code:

from math import *
from Numeric import *

#
#*** ENTER DATA
filename = "out"
#
bobfile = open(filename+".bob","w")
print "\n"*30

NX=30                # number of cells
NY=30
NZ=30
resmax=1e-3          # conj-grad tolerance

#allocate arrays
##Ex=zeros((NX+2,NY+2,NZ+2),Float)
##Ey=zeros((NX+2,NY+2,NZ+2),Float)
##Ez=zeros((NX+2,NY+2,NZ+2),Float)
p=zeros((NX+1,NY+1,NZ+1),Float)
q=zeros((NX+1,NY+1,NZ+1),Float)
r=zeros((NX+1,NY+1,NZ+1),Float)
u=zeros((NX+1,NY+1,NZ+1),Float)

u[0:30,0:30,0]=0     # box at 1V with one side at 0V
u[0:30,0,0:30]=1
u[0,0:30,0:30]=1
u[0:30,0:30,30]=1
u[0:30,30,0:30]=1
u[30,0:30,0:30]=1

r[1:NX-1,1:NY-1,1:NZ-1]=(u[1:NX-1,0:NY-2,1:NZ-1]+   #initialize r matrix
                         u[1:NX-1,2:NY,1:NZ-1]+
                         u[0:NX-2,1:NY-1,1:NZ-1]+
                         u[2:NX,1:NY-1,1:NZ-1]+
                         u[1:NX-1,1:NY-1,0:NZ-2]+
                         u[1:NX-1,1:NY-1,2:NZ])
p[...]=r[...]
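# (An aside on the question above -- "where is the conjugate?": the loop
#  that follows is the standard conjugate gradient recurrence,
#      alpha = (r.p)/(p.q),  u <- u + alpha*p,  r <- r - alpha*q,
#      beta  = (r.q)/(p.q),  p <- r - beta*p,       with q = A*p,
#  where A is the finite-difference operator applied in the q update.
#  "Conjugate" refers to successive search directions p being made
#  A-conjugate to each other (p_i . A p_j = 0 for i != j), not to complex
#  conjugation -- for a real, symmetric operator like the discrete
#  Laplacian there is nothing to conjugate elementwise.)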
#initialize p matrix
#
#**** START ITERATIONS
#
N=(NX-2)*(NY-2)*(NZ-2)   # left over from Jacobi solution, not used
KK=0                     # iteration counter
res=resmax=0.0           #set residuals=0

while(1):
    q[1:NX-1,1:NY-1,1:NZ-1]=(p[1:NX-1,0:NY-2,1:NZ-1]+   # finite difference eq
                             p[1:NX-1,2:NY,1:NZ-1]+
                             p[0:NX-2,1:NY-1,1:NZ-1]+
                             p[2:NX,1:NY-1,1:NZ-1]+
                             p[1:NX-1,1:NY-1,0:NZ-2]+
                             p[1:NX-1,1:NY-1,2:NZ])

    # Calculate r dot p and p dot q
    rdotp = 0.0
    pdotq = 0.0
    rdotp = add.reduce(ravel( r[1:NX-1,1:NY-1,1:NZ-1] * p[1:NX-1,1:NY-1,1:NZ-1]))
    pdotq = add.reduce(ravel( p[1:NX-1,1:NY-1,1:NZ-1] * q[1:NX-1,1:NY-1,1:NZ-1]))

    # Set alpha value
    alpha = rdotp/pdotq

    # Update solution and residual
    u[1:NX-1,1:NY-1,1:NZ-1] += alpha*p[1:NX-1,1:NY-1,1:NZ-1]
    r[1:NX-1,1:NY-1,1:NZ-1] += - alpha*q[1:NX-1,1:NY-1,1:NZ-1]

    # calculate beta
    rdotq = 0.0
    rdotq = add.reduce(ravel(r[1:NX-1,1:NY-1,1:NZ-1]*q[1:NX-1,1:NY-1,1:NZ-1]))
    beta = rdotq/pdotq

    # Set the new search direction
    p[1:NX-1,1:NY-1,1:NZ-1] = r[1:NX-1,1:NY-1,1:NZ-1] - beta*p[1:NX-1,1:NY-1,1:NZ-1]

    res = sort(ravel(r[1:NX-1,1:NY-1,1:NZ-1]))[-1]   #find largest residual
    # resmax = max(resmax,abs(res))
    KK+=1
    # print "Iteration Number %d Residual %1.2e" %(KK,abs(res))
    if (abs(res)<=resmax): break   # if residual is small enough break out

print "Number of Iterations ",KK

--
-----------------------------
The Numeric Python EM Project

www.pythonemproject.com

From rob at pythonemproject.com  Sat Mar 30 09:00:03 2002
From: rob at pythonemproject.com (rob)
Date: Sat Mar 30 09:00:03 2002
Subject: [Numpy-discussion] need help with simple conjugate gradient Laplace solver
References: <3CA5E712.5516AD93@pythonemproject.com> Message-ID: <3CA5EE17.3C5CED2D@pythonemproject.com>

After I post, I always see the dumb error. I am not including the 6x term in my finite difference equation. It now converges, but I get a weird-looking V map. Rob.

Here is the fixed code:

from math import *
from Numeric import *

#
#*** ENTER DATA
filename = "out"
#
bobfile = open(filename+".bob","w")
print "\n"*30

NX=30                # number of cells
NY=30
NZ=30
N=30                 # size of box
resmax=1e-3          # conj-grad tolerance

#allocate arrays
##Ex=zeros((NX+2,NY+2,NZ+2),Float)
##Ey=zeros((NX+2,NY+2,NZ+2),Float)
##Ez=zeros((NX+2,NY+2,NZ+2),Float)
p=zeros((NX+1,NY+1,NZ+1),Float)
q=zeros((NX+1,NY+1,NZ+1),Float)
r=zeros((NX+1,NY+1,NZ+1),Float)
u=zeros((NX+1,NY+1,NZ+1),Float)

u[0:N,0:N,0]=0       # box at 1V with one side at 0V
u[0:N,0,0:N]=1
u[0,0:N,0:N]=1
u[0:N,0:N,N]=1
u[0:N,N,0:N]=1
u[N,0:N,0:N]=1

r[1:NX-1,1:NY-1,1:NZ-1]=(u[1:NX-1,0:NY-2,1:NZ-1]+   #initialize r matrix
                         u[1:NX-1,2:NY,1:NZ-1]+
                         u[0:NX-2,1:NY-1,1:NZ-1]+
                         u[2:NX,1:NY-1,1:NZ-1]+
                         u[1:NX-1,1:NY-1,0:NZ-2]+
                         u[1:NX-1,1:NY-1,2:NZ]-
                         6*u[1:NX-1,1:NY-1,1:NZ-1])
p[...]=r[...]
#initialize p matrix
#
#**** START ITERATIONS
#
N=(NX-2)*(NY-2)*(NZ-2)      # left over from Jacobi solution, not used
KK=0                        # iteration counter
res=0.0;                    #set residuals=0

while(1):
    q[1:NX-1,1:NY-1,1:NZ-1]=(6*p[1:NX-1,1:NY-1,1:NZ-1]-    # finite difference eq
                             p[1:NX-1,0:NY-2,1:NZ-1]-
                             p[1:NX-1,2:NY,1:NZ-1]-
                             p[0:NX-2,1:NY-1,1:NZ-1]-
                             p[2:NX,1:NY-1,1:NZ-1]-
                             p[1:NX-1,1:NY-1,0:NZ-2]-
                             p[1:NX-1,1:NY-1,2:NZ])

    # Calculate r dot p and p dot q
    rdotp = 0.0
    pdotq = 0.0
    rdotp = add.reduce(ravel( r[1:NX-1,1:NY-1,1:NZ-1] * p[1:NX-1,1:NY-1,1:NZ-1]))
    pdotq = add.reduce(ravel( p[1:NX-1,1:NY-1,1:NZ-1] * q[1:NX-1,1:NY-1,1:NZ-1]))

    # Set alpha value
    alpha = rdotp/pdotq

    # Update solution and residual
    u[1:NX-1,1:NY-1,1:NZ-1] += alpha*p[1:NX-1,1:NY-1,1:NZ-1]
    r[1:NX-1,1:NY-1,1:NZ-1] += - alpha*q[1:NX-1,1:NY-1,1:NZ-1]

    # calculate beta
    rdotq = 0.0
    rdotq = add.reduce(ravel(r[1:NX-1,1:NY-1,1:NZ-1]*q[1:NX-1,1:NY-1,1:NZ-1]))
    beta = rdotq/pdotq

    # Set the new search direction
    p[1:NX-1,1:NY-1,1:NZ-1] = r[1:NX-1,1:NY-1,1:NZ-1] - beta*p[1:NX-1,1:NY-1,1:NZ-1]

    res = sort(ravel(r[1:NX-1,1:NY-1,1:NZ-1]))[-1]      #find largest residual
    # resmax = max(resmax,abs(res))
    KK+=1
    # print "Iteration Number %d Residual %1.2e" %(KK,abs(res))
    if (abs(res)<=resmax): break    # if residual is small enough break out

print "Number of Iterations ",KK

From paul at pfdubois.com Sat Mar 30 19:02:04 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Sat Mar 30 19:02:04 2002 Subject: [Numpy-discussion] Right behavior In-Reply-To: <3CA4D3BB.617F2883@optushome.com.au> Message-ID: <000001c1d860$58bfcf30$0f01a8c0@NICKLEBY>

About random number generation with Numeric:

a. IMHO, RNG is the right choice if you are picky about the quality of the generator. This generator has a long history of heavy use. RandomArray is in the core because someone put it there early, not because it is the best. I do not claim to be an authority on this but that is my understanding.

b. The suggestion made by one correspondent, that a generator should generate and throw away one value when the seed is set, sounds correct if viewed from the point of view of the initial seeding of a single stream. But many users need multiple streams that are independent and reproducible. This is done by saving the state of the generator and then restoring it later. It is important that this save/restore not change the results compared to not doing it. The presence or absence of another computation, or the frequency of dump/restarts, that require a save/restore must not affect the result. Thus a decision to throw away a result must come from the application level.
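A minimal sketch of the save/restore idiom Paul describes in point (b), using RandomArray from Numeric. It assumes RandomArray.get_seed() returns the generator's current pair of seed integers, as in the Numeric releases discussed in this thread; the seed values themselves are arbitrary examples.

import RandomArray

RandomArray.seed(1234, 5678)            # stream A
a1 = RandomArray.random(5)
saved = RandomArray.get_seed()          # save stream A's current state

RandomArray.seed(4321, 8765)            # an unrelated stream B
b1 = RandomArray.random(5)

RandomArray.seed(saved[0], saved[1])    # restore stream A where it left off
a2 = RandomArray.random(5)              # unaffected by stream B's activity

Per Paul's point, the restore must reproduce exactly what stream A would have produced had stream B never run; a throw-away-one-value policy built into seed() would break that guarantee.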
From paul at pfdubois.com Sat Mar 2 07:56:15 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Sat Mar 2 07:56:15 2002 Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation? In-Reply-To: Message-ID: <000001c1c202$7996be90$1001a8c0@NICKLEBY>

I also see the underflow problem on my Linux box 2.4.2-2. This is certainly untenable. However, I am able to catch OverflowError in both cases. I had a user complain about this just yesterday, so I think it is a new behavior in Python 2.2, which I was just rolling out. A small Fortran test problem did not exhibit the underflow bug, and caught the overflow bug at COMPILE TIME (!). There are two states for IEEE underflow: one in which the hardware sets the result to zero, and the other in which the hardware signals the OS and you can tell the OS to set it to zero. There is no standard for the interface to this facility that I am aware of. (Usually I have had to figure out how to make sure the underflow was handled in hardware because the sheer cost of letting it turn into a system call was prohibitive.) I speculate that on machines where the OS call is the default, Python 2.2 is catching the signal when it should let it go by. I have not looked at this lately so something may have changed.
You can use the kinds package that comes with Numeric to test for maximum and minimum exponents. kinds.default_float_kind.MAX_10_EXP (equal to 308 on my Linux box, for example) tells you how big an exponent a floating point number can have. MIN_10_EXP (-307 for me) is also there. Work around on your convergence test: instead of testing x**2 you might test log10(x) vs. a constant or some expression involving kinds.default_float_kind.MIN_10_EXP.
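Paul's log10 workaround can be made concrete. A minimal sketch, assuming the kinds module is importable as he describes; the choice of threshold (half of MIN_10_EXP) is only illustrative:

import math
import kinds   # distributed with Numeric

# Stay well away from the underflow limit; the factor of 2 is an arbitrary margin.
TINY_EXP = kinds.default_float_kind.MIN_10_EXP / 2.0

def converged(err):
    # Compare log10(|err|) to a safe exponent instead of squaring err,
    # so the convergence test itself can neither underflow nor raise.
    if err == 0.0:
        return 1
    return math.log10(abs(err)) < TINY_EXP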
From andymac at bullseye.apana.org.au Sat Mar 2 10:18:22 2002 From: andymac at bullseye.apana.org.au (Andrew MacIntyre) Date: Sat Mar 2 10:18:22 2002 Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation? In-Reply-To: Message-ID:

On 2 Mar 2002, Konrad Hinsen wrote:
> Tim Peters writes:
> > > > # Python 2.2
> > > >
> > > > >>> 1e-200**2
> > > Traceback (most recent call last):
> > >   File "<stdin>", line 1, in ?
> > > OverflowError: (34, 'Numerical result out of range')
> >
> > That one is surprising and definitely not intended: it suggests your
> > platform libm is setting errno to ERANGE for pow(1e-200, 2.0), or that your
> > platform C headers define INFINITY but incorrectly, or that your platform C
> > headers define HUGE_VAL but incorrectly, or that your platform C compiler
> > generates bad code, or optimizes incorrectly, for negating and/or comparing
>
> I just tested and found the same behaviour, on RedHat Linux 7.1 running on a Pentium machine. Python 2.1, compiled and running on the same machine, returns 0. So does the Python 1.5.2 that comes with the RedHat installation. Although there might certainly be something wrong with the C compiler and/or header files, something has likely changed in Python as well in going to 2.2; the only other explanation I see would be a compiler optimization bug that didn't have an effect with earlier Python releases.

Other examples... FreeBSD 4.4:

Python 2.1.1 (#1, Sep 13 2001, 18:12:15) [GCC 2.95.3 20010315 (release) [FreeBSD]] on freebsd4
Type "copyright", "credits" or "license" for more information.
>>> 1e-200**2
0.0
>>> 1e200**2
Inf

Python 2.3a0 (#1, Mar 1 2002, 00:00:52) [GCC 2.95.3 20010315 (release) [FreeBSD]] on freebsd4
Type "help", "copyright", "credits" or "license" for more information.
>>> 1e-200**2
0.0
>>> 1e200**2
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: (34, 'Result too large')

Both builds were built with "./configure --with-fpectl", although the optimisation settings for the 2.3a0 (CVS of yesterday) build were tweaked (2.1.1: "-g -O3"; 2.3a0: "-s -m486 -Os"). My 2.2 OS/2 EMX build (which uses gcc 2.8.1 -O2) produces exactly the same result as 2.3a0 on FreeBSD. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac at bullseye.apana.org.au | Snail: PO Box 370 andymac at pcug.org.au | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia

From pearu at cens.ioc.ee Sun Mar 3 04:41:15 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Sun Mar 3 04:41:15 2002 Subject: [Numpy-discussion] How can CDOUBLE_to_CDOUBLE work correctly? Message-ID:

Hi! I am trying to copy a 2-d complex array to another 2-d complex array in an extension module. Both arrays may be noncontiguous. But using the same routine (namely Travis's copy_ND_array; you can find it at the end of this message) as for real arrays seems not to work. After some playing around and reading docs about strides and stuff, I found that the reason might be in how Numeric (20.3) defines the CDOUBLE_to_CDOUBLE function:

static void CDOUBLE_to_CDOUBLE(double *ip, int ipstep,
                               double *op, int opstep, int n)
{int i; for(i=0;i<2*n;i++,ip+=ipstep,op+=opstep) {*op = (double)*ip;}}

It seems not to take into account that the real and imaginary parts are always contiguous in memory, even if the array itself is not. Actually, I don't understand how this code can work (unless some magic is done in places where this code is used). I would have expected the code for CDOUBLE_to_CDOUBLE to be analogous to the corresponding functions for real data.
For example, DOUBLE_to_DOUBLE is defined as

static void DOUBLE_to_DOUBLE(double *ip, int ipstep,
                             double *op, int opstep, int n)
{int i; for(i=0;i<n;i++,ip+=ipstep,op+=opstep) {*op = (double)*ip;}}

So, I would have expected CDOUBLE_to_CDOUBLE to be

static void CDOUBLE_to_CDOUBLE(double *ip, int ipstep,
                               double *op, int opstep, int n)
{ int i;
  for(i=0;i<n;i++,ip+=ipstep,op+=opstep) {
    *op = (double)*ip;         /* copy real part */
    *(op+1) = (double)*(ip+1); /* copy imaginary part that always
                                  follows the real part in memory */
  }
}

Here is copy_ND_array:

#define INCREMENT(ret_ind, nd, max_ind) \
{ \
  int k = (nd) - 1; \
  if (++(ret_ind)[k] >= (max_ind)[k]) { \
    while (k >= 0 && ((ret_ind)[k] >= (max_ind)[k]-1)) \
      (ret_ind)[k--] = 0; \
    if (k >= 0) (ret_ind)[k]++; \
    else (ret_ind)[0] = (max_ind)[0]; \
  } \
}

#define CALCINDEX(indx, nd_index, strides, ndim) \
{ \
  int i; \
  indx = 0; \
  for (i=0; i < (ndim); i++) \
    indx += nd_index[i]*strides[i]; \
}

extern int copy_ND_array(const PyArrayObject *in, PyArrayObject *out)
{
  /* This routine copies an N-D array in to an N-D array out where both
     can be discontiguous. An appropriate (raw) cast is made on the data. */
  /* It works by using an N-1 length vector to hold the N-1 first indices
     into the array. This counter is looped through copying (and casting)
     the entire last dimension at a time. */
  int *nd_index, indx1;
  int indx2, last_dim;
  int instep, outstep;

  if (0 == in->nd) {
    in->descr->cast[out->descr->type_num]((void *)in->data,1,
                                          (void *)out->data,1,1);
    return 0;
  }
  if (1 == in->nd) {
    in->descr->cast[out->descr->type_num]((void *)in->data,1,
                                          (void *)out->data,1,in->dimensions[0]);
    return 0;
  }
  nd_index = (int *)calloc(in->nd-1,sizeof(int));
  last_dim = in->nd - 1;
  instep = in->strides[last_dim] / in->descr->elsize;
  outstep = out->strides[last_dim] / out->descr->elsize;
  if (NULL == nd_index ) {
    fprintf(stderr,"Could not allocate memory for index array.\n");
    return -1;
  }
  while(nd_index[0] != in->dimensions[0]) {
    CALCINDEX(indx1,nd_index,in->strides,in->nd-1);
    CALCINDEX(indx2,nd_index,out->strides,out->nd-1);
    /* Copy (with an appropriate cast) the last dimension of the array */
    (in->descr->cast[out->descr->type_num])((void *)(in->data+indx1),instep,
                                            (void *)(out->data+indx2),outstep,
                                            in->dimensions[last_dim]);
    INCREMENT(nd_index,in->nd-1,in->dimensions);
  }
  free(nd_index);
  return 0;
}
/* EOF copy_ND_array */

From pearu at cens.ioc.ee Sun Mar 3 07:10:08 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Sun Mar 3 07:10:08 2002 Subject: [Numpy-discussion] How can CDOUBLE_to_CDOUBLE work correctly? In-Reply-To: Message-ID:

Hi again,

On Sun, 3 Mar 2002, Pearu Peterson wrote:
> So, I would have expected CDOUBLE_to_CDOUBLE to be
>
> static void CDOUBLE_to_CDOUBLE(double *ip, int ipstep,
>                                double *op, int opstep, int n)
> { int i;
>   for(i=0;i<n;i++,ip+=ipstep,op+=opstep) {
>     *op = (double)*ip;         /* copy real part */
>     *(op+1) = (double)*(ip+1); /* copy imaginary part that always
>                                   follows the real part in memory */
>   }
> }

After some testing I found that CDOUBLE_to_CDOUBLE should be

static void CDOUBLE_to_CDOUBLE(double *ip, int ipstep,
                               double *op, int opstep, int n)
{ int i;
  for(i=0;i<n;i++,ip+=2*ipstep,op+=2*opstep) {
    *op = (double)*ip;
    *(op+1) = (double)*(ip+1);
  }
}

Pearu

From tim.one at comcast.net (Tim Peters) Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation? Message-ID:

A lot of this speculation should have been cut short by my first msg. Yes, something changed in 2.2; follow the referenced link: http://sf.net/tracker/?group_id=5470&atid=105470&func=detail&aid=496104 For the rest of it, it looks like the "1e-200**2 raises OverflowError" glitch is unique to platforms using glibc. What isn't clear is whether it's dependent on which version of glibc, or on whether Python is linked with -lieee, or both. Unfortunately, the C standard (neither one) isn't a lick of help here -- error reporting from C math functions is a x-platform crapshoot. Can someone who sees this problem confirm or deny that they link with -lieee? If they see this problem and don't link with -lieee, also please try linking with -lieee and see whether the problem goes away then. On boxes with this problem, I'm also curious what

import math
print math.pow(1e-200, 2.0)

does under 2.1.
One probably-relevant thing that changed between 2.1 and 2.2 is that float**int calls the platform pow(float, int) in 2.2. 2.1 did it with repeated multiplication instead, but screwed up endcases. An example under 2.1:

>>> x = -1.
>>> import sys
>>> x**(-sys.maxint-1L)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: negative number cannot be raised to a fractional power
>>>

The same thing under 2.2 returns 1.0, provided your platform pow() isn't braindead. Repeated multiplication is also less accurate than a decent-quality pow().

From paul at pfdubois.com Mon Mar 4 08:21:18 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Mar 4 08:21:18 2002 Subject: [Numpy-discussion] Matrix.py change breaks LinearAlgebra Message-ID: <000201c1c398$8c1a8c80$1001a8c0@NICKLEBY>

Your change to Matrix.py has a fatal flaw:

Python 2.2 (#1, Mar 1 2002, 11:11:28) [GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-81)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import LinearAlgebra
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/home/dubois/linux/lib/python2.2/site-packages/Numeric/LinearAlgebra.py", line 10, in ?
    import MLab
  File "/home/dubois/linux/lib/python2.2/site-packages/Numeric/MLab.py", line 10, in ?
    import Matrix
  File "/home/dubois/linux/lib/python2.2/site-packages/Numeric/Matrix.py", line 8, in ?
    from LinearAlgebra import inverse
ImportError: cannot import name inverse
>>>

From paul at pfdubois.com Mon Mar 4 11:55:03 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Mar 4 11:55:03 2002 Subject: [Numpy-discussion] RE: Matrix.py change breaks LinearAlgebra In-Reply-To: Message-ID: <000101c1c3b6$60369aa0$1001a8c0@NICKLEBY>

Travis fixed the error he accidentally made when improving MLab.py. This incident points out that our test suite does not include any tests for the non-core modules. Recently the discussion over the meaning of cov shows this too: I didn't even have a test that showed what the INTENDED answer is. We need test suites for FFT, LinearAlgebra, MLab, etc. If you have the subject competence to make a test file for us, please volunteer. I'd like them to be like the ones in Test/test.py, but as a separate file so that I can test the modules separately.

From bsder at allcaps.org Mon Mar 4 13:12:23 2002 From: bsder at allcaps.org (Andrew P. Lentvorski) Date: Mon Mar 4 13:12:23 2002 Subject: [Numpy-discussion] Interface for numpy C-API <-> simple C++ matrix class In-Reply-To: <20020215211700.GA22189@gibbs.physik.uni-konstanz.de> Message-ID: <20020304131041.G51226-100000@mail.allcaps.org>

You might want to check out the Boost Python Library. It is peer reviewed and seems to get most things correct. It should make writing wrappers a lot easier. -a

From bsder at allcaps.org Mon Mar 4 13:17:15 2002 From: bsder at allcaps.org (Andrew P. Lentvorski) Date: Mon Mar 4 13:17:15 2002 Subject: [Numpy-discussion] Interface for numpy C-API <-> simple C++ matrix class In-Reply-To: <20020304131041.G51226-100000@mail.allcaps.org> Message-ID: <20020304131616.P51226-100000@mail.allcaps.org>

Ummmm, it helps if I include the URL. Sorry. http://www.boost.org/libs/python/doc/ -a

On Mon, 4 Mar 2002, Andrew P. Lentvorski wrote: > You might want to check out the Boost Python Library. It is peer reviewed > and seems to get most things correct. > > It should make writing wrappers a lot easier.
> -a

From tim.one at comcast.net Tue Mar 5 16:51:03 2002 From: tim.one at comcast.net (Tim Peters) Date: Tue Mar 5 16:51:03 2002 Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation? In-Reply-To: Message-ID:

[Huaiyu Zhu] > ... > 1. The -lieee is indeed the most direct cure. On the specific platform tried. The libm errno rules changed between C89 and C99, and I'm afraid there's no errno behavior Python can rely on anymore. So I expect more changes will be needed in Python, regardless of how things turn out on this specific platform. > ... > 2. Is there a configure option to guarantee -lieee? If anyone can answer this question, please don't answer it here: it will just get lost. Attach it to Huaiyu's bug report instead: Thanks. > ... > 3. errno 34 (or -lieee) may not be the sole reason. > > On a RedHat 6.1 upgraded to 7.1 (both gcc and glibc), errno 34 > is indeed raised in a C program linked without -lieee, and Python is > indeed compiled without -lieee, but Python does not raise > OverflowError. I expect you're missing something. Skip posted the Python code before, and if errno gets set, Python *does* raise OverflowError:

errno = 0;   /* Skip forgot to post this line, and it's important */
...
ix = pow(iv, iw);
...
if (errno != 0) {
    /* XXX could it be another type of error? */
    PyErr_SetFromErrno(PyExc_OverflowError);
    return NULL;

If you can read C, believe your eyes. What you may be missing is what an utter mess C is here. How libm behaves may depend on compiler options, linker options, global system flags, and even options set for other libraries you link with. > ... > 4. Is there an easier way to debug such problems? The cause was obvious to the first person (Skip) who stepped into Python to see what the code did on a platform where it failed. It's not going to be obvious to someone who doesn't. > 5. How is 1e200**2 handled? It goes through exactly the same code. > Since both 1e-200**2 and 1e200**2 produce the same errno all the time, > but Python still raises OverflowError for 1e200**2 when linked with > -lieee, there must be a separate mechanism at work. You're speculating from a false base: if platform pow(x, y) sets errno to any non-zero value, Python x**y raises OverflowError. What differs is when platform pow(x, y) does not set errno. In that case, Python synthesizes errno=ERANGE if the pow() result equals +- platform HUGE_VAL. > What is that and how can I override it? Sorry, you can't override it. > I know this is by design, but I think the design is dumb (to put it > politely). I won't get into an argument here. I'll write > up my rationale against this when I have some time. I'm afraid a rationale won't do any good. I'm in favor of supplying full 754 compatibility myself, but: A) Getting from here to there requires volunteers to design, implement, document, and test the code. Given the atrocious state of C's 754 story, and the subtlety of 754's requirements, this needs volunteers who are both 754 experts and platform C experts. That combination is rare. B) Python's floating-point users will never agree it's a good thing, so such a change requires careful backward compatibility work too. This isn't likely to get done by someone who characterizes the other side as "dumb (to put it politely)" <0.9 wink>.
Note that the only significant floating-point code ever contributed to the Python core was the fpectl module, and its purpose is to *break* 754 "non-stop" exception semantics in the places Python happens to let them sneak through. > I do remember there being several discussions in the past, but I don't > remember any convincing argument for the current decision. Any URL > would be greatly appreciated, beside the one pointed by Tim. Which "current decision" do you have in mind? There is no design doc for Python's numerics, if that's what you're looking for. As the text at the URL I gave you said, much of Python's fp behavior is accidental, inherited from platform C quirks.

From newton at admin.kias.re.kr Tue Mar 5 16:52:19 2002 From: newton at admin.kias.re.kr (Kee-Hyoung Joo) Date: Tue Mar 5 16:52:19 2002 Subject: [Numpy-discussion] [Q] RandomArray has a problem ?? Message-ID: <001701c1c4a8$d135d340$e31d62d2@kias.re.kr>

Hi, does RandomArray have a problem? I use python 2.0, Numpy 20.3. The simple source code is:

#!/usr/bin/env python
import math
import Numeric
import RandomArray
import sys

RandomArray.seed(1234,5678)
i=0L
while 1:
    i = i+1
    a = RandomArray.randint(0,100)
    if a==100:
        print 'i=',i, 'a=',a

and the result is:

i= 70164640 a= 100
i= 152242967 a= 100
i= 159619195 a= 100
i= 173219763 a= 100
i= 200933959 a= 100
i= 233219191 a= 100
i= 276114822 a= 100
i= 313589319 a= 100
i= 340689813 a= 100
i= 402397265 a= 100
i= 456099215 a= 100
i= 506078935 a= 100
i= 547758957 a= 100
i= 559163554 a= 100
i= 570211180 a= 100
..........

RandomArray.randint(0,100) should have the range 0 <= RandomArray.randint(0,100) < 100, but the result does not: sometimes a==100 arises. So I upgraded to python 2.2 and Numpy 21b3, but I met the same problem. And so I changed the OS from Mandrake 8.0 to Redhat 7.2, but the same problem... I don't know what my mistake is... Please help me ... Kee-Hyoung Joo
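Until the cause is pinned down, a defensive workaround is to re-draw whenever the supposedly exclusive upper bound slips through. This is only a sketch of a guard, not a diagnosis of the underlying bug:

import RandomArray

def safe_randint(lo, hi):
    # RandomArray.randint documents the range lo <= a < hi; enforce it anyway.
    while 1:
        a = RandomArray.randint(lo, hi)
        if lo <= a < hi:
            return a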
From oliphant.travis at ieee.org Tue Mar 5 20:36:02 2002 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Mar 5 20:36:02 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. Message-ID:

Recently there has been discussion on the list about the awkwardness of matrix syntax when using Numeric Python. Matrix expressions can be awkward to express in Numeric, which is a negative mark on an otherwise excellent computing environment. Currently part of the problem can be solved by working with Matrix objects explicitly:

a = Matrix.Matrix("[1 2 3; 4 5 6]")   # Notice the strings.

However, most operations return arrays which have to be recast to matrices using, at best, a character with parentheses:

M = Matrix.Matrix
M(sin(a)) * M(cos(a)).T

The suggestion was made to add ".M" as an attribute of arrays which returns a matrix. Thus, the code above can be written:

sin(a).M * cos(a).M.T

While some aesthetic simplicity is obtained, the big advantage is in consistency. Somebody else may decide that P = Matrix.Matrix is a better choice. But, if we establish that .M always returns a matrix for arrays < 2d, then we gain consistency. I've made this change and am ready to commit the change to the Numeric tree, unless there are strong objections. I know some people do not like the proliferation of attributes, but in this case the notational convenience it affords to otherwise overly burdened syntax and the consistency it allows Numeric to deal with Matrix equations may be worth it. What do you think?

-Travis Oliphant
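For readers following the thread, a small before/after sketch of what the proposal buys, using the names from Travis's message. The .M form only exists if his patch is applied, so it is left commented out here:

import Numeric, Matrix
from Numeric import sin, cos

a = Numeric.array([[1., 2.], [3., 4.]])

# Today: every array result must be recast to a Matrix by hand.
M = Matrix.Matrix
y1 = M(sin(a)) * M(cos(a)).T

# With the proposed attribute (requires the patch):
# y2 = sin(a).M * cos(a).M.T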
From oliphant.travis at ieee.org Tue Mar 5 20:44:14 2002 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Mar 5 20:44:14 2002 Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking Message-ID:

I have not heard any feedback on my proposal to add a final object to the extended slice syntax of current Numeric to allow for unambiguous index and mask-array access. As a modification to the proposal, suppose we just check to see if the last argument (of at least two) is a 0d array of type signed byte (currently this is illegal and will raise an error). This number would be a flag indicating how to interpret the previous objects. Of course these numbers would be hidden from the user, who would write:

a[index_array, _I] =
b = a[index_array, _I]

or

a[mask_array, _M] =
b = a[mask_array, _M]

where _M is a 0d signed byte array indicating that the mask_array should be interpreted as a mask while _I is a 0d signed byte array indicating that the index_array should be interpreted as integers into the flattened version of a. Other indexing schemes could be envisioned as well:

a[a1,a2,a3,_X]

could be the cross product of the integer arrays a1, a2, and a3, for example, or

a[a1, a2, a3, _Z]

could select elements from a by "zipping" the sequences a1, a2, and a3 together to form a list of tuples to grab from a. Comments?

From pearu at cens.ioc.ee Tue Mar 5 23:46:02 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Tue Mar 5 23:46:02 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID:

Hi! On Tue, 5 Mar 2002, Travis Oliphant wrote: > The suggestion was made to add ".M" as an attribute of arrays which returns a > matrix. Thus, the code above can be written: > > sin(a).M * cos(a).M.T > > While some aesthetic simplicity is obtained, the big advantage is in > consistency. > I've made this change and am ready to commit the change to the Numeric tree, > unless there are strong objections. I know some people do not like the > proliferation of attributes, but in this case the notational convenience it > affords to otherwise overly burdened syntax and the consistency it allows > Numeric to deal with Matrix equations may be worth it. > > What do you think?

Would it be possible to use one's own Matrix class instead of what is in Matrix.py? I gather that there must be some setter method in Numeric for that:

Numeric.set_matrix_factory(MyMatrixClass)

with a requirement that MyMatrixClass must be a subclass of Matrix.Matrix. I think it would be a very important feature, as users could define their own matrix operations, for example, using their own BLAS routines to speed up operations with matrices (yes, I am thinking of a SciPy-specific Matrix class). Thanks, Pearu

From hinsen at cnrs-orleans.fr Wed Mar 6 00:50:04 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed Mar 6 00:50:04 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. References: Message-ID: <200203060813.g268DeA09890@chinon.cnrs-orleans.fr>

Travis Oliphant writes: > I've made this change and am ready to commit the change to the Numeric tree, > unless there are strong objections. I know some people do not like the > proliferation of attributes, but in this case the notational convenience it

At the risk of sounding unconstructively negative, I think this is a misuse of attributes. For someone used to reading standard Python code, where attributes are, well, attributes, code using this notation is just weird.
Personally, consistent notation is more important than short notation. The Pythonesque solution to this problem, in my opinion, is separate matrix and array objects (which can and should of course share implementation code) plus explicit constructors to convert between the two. I am a bit worried that kludges such as fake attributes set bad precedents for the future. One of the main reasons why I like Python is its clean syntax and its simple object model. This kind of notation messes up both of them. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais -------------------------------------------------------------------------------

From geus at inf.ethz.ch Wed Mar 6 00:51:10 2002 From: geus at inf.ethz.ch (Roman Geus) Date: Wed Mar 6 00:51:10 2002 Subject: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines Message-ID: <3C85D86D.EDCCB1EE@inf.ethz.ch>

Dear Numerical Python users and developers,

I ran into the following problem: The python application I'm developing uses Numerical Python and other C modules that call LAPACK. My application runs well on 32bit architectures. When I tried to run the application on a HP-UX 64bit machine, the application produced bus errors. After a long debugging session I found out that Fortran integers are still 32bit wide on this machine. Therefore the HP LAPACK library also has to be called using 32bit integers. Numerical Python however codes Fortran integers as C 'long int' variables, which are 64bit wide on this machine. To make my application run on the HP-UX 64bit machine, I had to change all 'long int' to 'int' variables in Src/lapack_litemodule.c, which is a rather painful hack (see end of message for an example). My question is: Should Fortran integers not be coded as 'int' instead of 'long int' in Numerical Python? This way, it would still work on all 32-bit machines and also on the 64-bit machines I know of. Would this work on all 64-bit machines? Thanks for your comments/help. -- Roman Geus

E.g. the lapack_lite_dgetrf() function now looks like this:

static PyObject *lapack_lite_dgetrf(PyObject *self, PyObject *args)
{
    int lapack_lite_status__;
    int m;
    int n;
    PyObject *a;
    int lda;
    PyObject *ipiv;
    int info;
    int i;
    int *ipiv_int;
    int ipiv_len;

    TRY(PyArg_ParseTuple(args,"iiOiOi",&m,&n,&a,&lda,&ipiv,&info));
    TRY(lapack_lite_CheckObject(a,PyArray_DOUBLE,"a","PyArray_DOUBLE","dgetrf"));
    TRY(lapack_lite_CheckObject(ipiv,PyArray_LONG,"ipiv","PyArray_LONG","dgetrf"));

    /* Copy the 64-bit pivot array into a 32-bit scratch buffer for Fortran. */
    ipiv_len = m < n ? m : n;
    ipiv_int = (int *)malloc(ipiv_len * sizeof(int));
    assert(ipiv_int);
    for (i = 0; i < ipiv_len; i++)
        ipiv_int[i] = LDATA(ipiv)[i];

#if defined(NO_APPEND_FORTRAN)
    lapack_lite_status__ = dgetrf(&m,&n,DDATA(a),&lda,ipiv_int,&info);
#else
    lapack_lite_status__ = dgetrf_(&m,&n,DDATA(a),&lda,ipiv_int,&info);
#endif

    for (i = 0; i < ipiv_len; i++)
        LDATA(ipiv)[i] = ipiv_int[i];
    free(ipiv_int);   /* free the scratch buffer, not the Python object */

    return Py_BuildValue("{s:l,s:l,s:l,s:l,s:l}","dgetrf_",(long)lapack_lite_status__,
                         "m",(long)m,"n",(long)n,"lda",(long)lda,"info",(long)info);
}

From Roy.Dragseth at cc.uit.no Wed Mar 6 01:03:03 2002 From: Roy.Dragseth at cc.uit.no (Roy Dragseth) Date: Wed Mar 6 01:03:03 2002 Subject: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines In-Reply-To: <3C85D86D.EDCCB1EE@inf.ethz.ch> References: <3C85D86D.EDCCB1EE@inf.ethz.ch> Message-ID: <200203060902.g2692Df12551@newton.cc.uit.no>

On Wednesday 06 March 2002 09:50 am, Roman Geus wrote: > Dear Numerical Python users and developers > > I ran into the following problem: > > The python application I'm developing uses Numerical Python and other > C modules that call LAPACK. My application runs well on 32bit > architectures. > > When I tried to run the application on a HP-UX 64bit machine the > application produced bus errors. After a long debugging session I > found out that Fortran integers are still 32bit wide on this > machine. Therefore the HP LAPACK library also has to be called using > 32bit integers. Numerical Python however codes Fortran integers as C > 'long int' variables, which are 64bit wide on this machine.

Have you tried the +i8 flag for the HP fortran compiler? It converts all fortran integers to 8-byte entities. Regards, Roy.

From geus at inf.ethz.ch Wed Mar 6 01:27:03 2002 From: geus at inf.ethz.ch (Roman Geus) Date: Wed Mar 6 01:27:03 2002 Subject: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines References: <3C85D86D.EDCCB1EE@inf.ethz.ch> <200203060902.g2692Df12551@newton.cc.uit.no> Message-ID: <3C85E0DE.F7F3ADC7@inf.ethz.ch>

Roy Dragseth wrote: > Have you tried the +i8 flag for the HP fortran compiler? It converts all > fortran integers to 8-byte entities.

I think this wouldn't help. The optimized BLAS/LAPACK library supplied by HP expects 32bit integers, and other software incorporated into my python application (e.g. SuperLU) calls the BLAS/LAPACK library using 32bit integers (C 'int' type). So, what really needs to be changed (at least for this machine) is how Numerical Python calls BLAS/LAPACK. It also needs to use 32bit integers. So this means using 'int' instead of 'long int'.

Regards, Roman
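One way to phrase the fix Roman, Pearu, and Konrad are converging on: hide the Fortran integer width behind a single typedef in the wrapper sources, so the choice is made once per platform. A sketch only; f77_int is a name invented here for illustration, not anything in Numeric:

/* On LP64 platforms such as 64-bit HP-UX, Fortran INTEGER stays 32 bits,
 * so plain C 'int' matches it, while C 'long' (64 bits there) corrupts
 * the call. Centralizing the choice makes the wrappers easy to adjust. */
typedef int f77_int;

/* Prototypes in lapack_litemodule.c would then read, e.g.: */
extern int dgetrf_(f77_int *m, f77_int *n, double *a,
                   f77_int *lda, f77_int *ipiv, f77_int *info);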
From pearu at cens.ioc.ee Wed Mar 6 01:43:01 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Wed Mar 6 01:43:01 2002 Subject: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines In-Reply-To: <3C85E0DE.F7F3ADC7@inf.ethz.ch> Message-ID:

Hi, On Wed, 6 Mar 2002, Roman Geus wrote: > So, what really needs to be changed (at least for this machine) is how > Numerical Python calls BLAS/LAPACK. It also needs to use 32bit integers. > So this means using 'int' instead of 'long int'.

Having wrapped a lot of Fortran codes to Python, I agree that Numerical Python should use 'int' instead of 'long'. Though I have little influence to make this change happen in Numeric, I am just agreeing with you. Pearu

From hinsen at cnrs-orleans.fr Wed Mar 6 03:10:05 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed Mar 6 03:10:05 2002 Subject: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines In-Reply-To: References: Message-ID:

Pearu Peterson writes: > Having wrapped a lot of Fortran codes to Python, I agree that Numerical > Python should use 'int' instead of 'long'. Though I have little > influence to make this change happen in Numeric, I am just agreeing with > you.

Even without the Fortran aspect, I'd prefer 'int' for integer arrays in general. There may be applications that need 64-bit integers, but any portable application wouldn't rely on them anyway. 64-bit arrays take up more memory, and when you pickle them you cannot read those files on 32-bit machines. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais -------------------------------------------------------------------------------

From gpk at bell-labs.com Wed Mar 6 05:28:13 2002 From: gpk at bell-labs.com (Greg Kochanski) Date: Wed Mar 6 05:28:13 2002 Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking In-Reply-To: Message-ID:

Please, no. There are many short names that you could use that would avoid overloading the [] operator. Especially in Python, where one cannot trivially decide the type of a variable, the behavior should change as little as possible as the type of each variable changes. Here, the indexing operation changes completely if you change the last index from an int to an array. That means you have to execute the code to understand it -- one can't just look and assume from local syntax. Besides, you know some idiot is going to eventually write code that looks like this:

def access(a, b, x):
    return a[b, x]    # I think that a must be a 2-D array...

# 1000 lines later...
access(a, _I)         # Whoops, all my assumptions were wrong...

> From: Travis Oliphant > Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking > > > I have not heard any feedback on my proposal to add a final object to > the extended slice syntax of current Numeric to allow for unambiguous index > and mask-array access.
> ...hidden from the user who would write: > > a[index_array, _I] = > b = a[index_array, _I] > > or > > a[mask_array, _M] = > b = a[mask_array, _M] > > where _M is a 0d signed byte array indicating that the mask_array should be > interpreted as a mask while _I is a 0d signed byte array indicating that the > index_array should be interpreted as integers into the flattened version of > a. > > Other indexing schemes could be envisioned as well...

From perry at stsci.edu Wed Mar 6 09:05:02 2002 From: perry at stsci.edu (Perry Greenfield) Date: Wed Mar 6 09:05:02 2002 Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking In-Reply-To: Message-ID:

Travis Oliphant writes: > > I have not heard any feedback on my proposal to add a final > object to > the extended slice syntax of current Numeric to allow for > unambiguous index > and mask-array access. > > As a modification to the proposal, suppose we just check to see > if the last > argument (of at least two) is a 0d array of type signed byte > (currently this > is illegal and will raise an error). This number would be a > flag indicating > how to interpret the previous objects. Of course these numbers would be > hidden from the user who would write: > > a[index_array, _I] = > b = a[index_array, _I] > > or > > a[mask_array, _M] = > b = a[mask_array, _M] > > where _M is a 0d signed byte array indicating that the > mask_array should be > interpreted as a mask while _I is a 0d signed byte array > indicating that the > index_array should be interpreted as integers into the > flattened version of > a. > > Other indexing schemes could be envisioned as well > > a[a1,a2,a3,_X] could be the cross product of the integer arrays > a1, a2, and > a3 for example. > > or > > a[a1, a2, a3, _Z] could select elements from a by "zipping" the > sequences a1, > a2, and a3 together to form a list of tuples to grab from a. > > Comments? >

Like Greg I'm wary of having many different interpretations for indexing behavior (I'm not even that crazy about having numarray handle boolean index arrays differently than the others --something we haven't implemented yet, and perhaps we shouldn't). Before discussing the merits of this, shouldn't we take the attitude that absence of feedback is not necessarily equivalent to approval, particularly for something that affects the public interface of the module? I would feel better about this if I saw several people affirming the need for such features rather than just a few openly opposing it. But if one were to do something like this, I would use a different kind of object than 0d arrays, e.g., an instance of a class defined for just that purpose. You would really want to make sure that no data could mistakenly be interpreted as a flag, even if the chances were remote. I would also not use an underscore as the beginning of the name. Maybe I'm wrong about this, but I've come to take that to mean it's a private variable that should not be used by users of the module, and that usage would confuse that. Finally, the name of the flag should be descriptive (e.g. MaskInd). But there could be better alternatives.
As an example, x[nonzero(maskarray)] instead of x[maskarray, MaskInd] (Yes, it does generate a temporary so that is a drawback) Perry

From jwpark at aeroguy.snu.ac.kr Wed Mar 6 10:03:16 2002 From: jwpark at aeroguy.snu.ac.kr (Jin Woo Park) Date: Wed Mar 6 10:03:16 2002 Subject: [Numpy-discussion] problem with integer type array object Message-ID: <1015437799.22162.23.camel@Maestro>

I was working on an external module built in C where a function returns a 2-D integer array. However, when I imported the module, I found a strange behavior concerning the type of the elements of the array. This is what basically happens:

static PyObject* foo(PyObject* arg, PyObject* args)
{
    PyObject* a;
    int dims[2]={2,2};
    a = PyArray_FromDims(2,dims,PyArray_INT);
    return a;
}

in python,

>>> a=foo()
>>> type(a[0,0])
<type 'array'>

I expected the type 'int'. If you make a Numpy array in python, then you get the expected type of 'int'.

>>> b=array([[0,0],[0,0]])
>>> type(b[0,0])
<type 'int'>

Is this somehow intended? Thanks for any info, -- +----------------------------------------+ | Jin Woo Park (jwpark at aeroguy.snu.ac.kr)| | Research Assistant,Dept.Aerospace Eng. | | Seoul National University, Korea | +----------------------------------------+

From jwpark at aeroguy.snu.ac.kr Wed Mar 6 10:44:22 2002 From: jwpark at aeroguy.snu.ac.kr (Jin Woo Park) Date: Wed Mar 6 10:44:22 2002 Subject: [Numpy-discussion] Re: problem with integer type array object In-Reply-To: <1015437799.22162.23.camel@Maestro> References: <1015437799.22162.23.camel@Maestro> Message-ID: <1015440300.22162.29.camel@Maestro>

Guess I didn't read the DOC carefully. I just found out that a Python int is equivalent to a C long, and PyArray_INT doesn't have a corresponding Python scalar type. Sorry for a 'dumb' question. -- +----------------------------------------+ | Jin Woo Park (jwpark at aeroguy.snu.ac.kr)| | Research Assistant,Dept.Aerospace Eng. | | Seoul National University, Korea | +----------------------------------------+

From perry at stsci.edu Wed Mar 6 10:49:05 2002 From: perry at stsci.edu (Perry Greenfield) Date: Wed Mar 6 10:49:05 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID:

Travis Oliphant writes: > > > > .M always returns a matrix for arrays < 2d, then we gain consistency. > > I've made this change and am ready to commit the change to the > Numeric tree, > unless there are strong objections. I know some people do not like the > proliferation of attributes, but in this case the notational > convenience it > affords to otherwise overly burdened syntax and the consistency it allows > Numeric to deal with Matrix equations may be worth it. > > What do you think? > > -Travis Oliphant

I'd have to agree with Konrad and Paul on this one. While it makes simple matrix expressions clearer, it opens a whole can of worms that were discussed (and never resolved) a couple years ago. Suppose I do this:

x = a.M * libfunc(b.M, c.M)

where libfunc is a 3rd party module written in Python that was written assuming that operators were elementwise operators. It may silently do a matrix multiplication (depending on the shapes) instead of the intended elementwise multiplication. Yet the usage above looks just as legitimate as

x = a.M * b.M

In other words, it raises the issue of having incompatible modules, some written with Numeric objects in mind, others with Matrix objects in mind. Conceivably there will be modules useful for both kinds of objects. Do we need to support two kinds? How do we deal with this?
This is still a problem if we don't allow the .M attribute but still have widespread usage of an array object with different behavior for operations. Unlike masked arrays, whose basic behavior is unchanged for "good" data, the behavior for identical data is completely different. I wish I had a good answer for this. I don't remember all of the past suggestions, but it looks like one of the following solutions is needed:

1) Campaign for new operators in Python (there have been various proposals to this effect). This is probably best from the Numeric point of view (maybe not from Python in general though).
2) Allow different array classes with different behavior, but come up with conventions and utilities for library developers to produce versions of arrays compatible with the convention assumed for the module (and convert back to the input type for output values). This doesn't prevent all opportunities for confusion and errors, however. It also puts a stronger burden on library developers.
3) Do nothing and deal with the resulting mess. Perhaps the two camps have little need for each other's tools and it won't be much of a problem. Do option 2 retroactively if it is a problem.

Other suggestions? Perry

From oliphant at ee.byu.edu Wed Mar 6 10:50:08 2002 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Mar 6 10:50:08 2002 Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking In-Reply-To: Message-ID:

> Please, no. There are many short names that you could use that would > avoid overloading the [] operator. Especially in Python, where > one cannot trivially decide the type of a variable, > the behavior should change as little as possible as the type > of each variable changes.

Right now, what I'm suggesting fails with an error. Everyone talks about things not changing when types change, but this is actually almost never the case. There is almost always a different behavior if the objects have different types. Are you opposed to anything going inside the [] operator to help indicate how the objects inside should be interpreted?

> Here, the indexing operation changes completely if you change the > last index from an int to an array. That means you have to > execute the code to understand it -- one can't just look and > assume from local syntax.

No, only a very specific kind of array. Currently, such a change gives you an error. And if the array was in the wrong place it would also give you an error. Your concerns seem motivated by not really understanding the suggested change. Could you provide me with other examples that show precisely what you mean?

> Besides, you know some idiot is going to eventually write > code that looks like this: > > def access(a, b, x): > return a[b, x] # I think that a must be a 2-D array... > > # 1000 lines later... > access(a, _I) # Whoops all my assumptions were wrong...

I have no idea what your concern is here. This would result in an error currently and under the scheme I suggested. -Travis

From tpitts at accentopto.com Wed Mar 6 11:21:37 2002 From: tpitts at accentopto.com (Todd Alan Pitts, Ph.D.) Date: Wed Mar 6 11:21:37 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: ; from perry@stsci.edu on Wed, Mar 06, 2002 at 01:48:15PM -0500 References: Message-ID: <20020306122211.A32137@fermi.accentopto.com>

I often (perhaps inappropriately) fall into the "silent user" category.
However, many of those in this conversation have put significant effort into python development and the least I can do is offer a comment from the standpoint of someone who uses python and Numeric extensively. Perhaps I am stepping into the middle of a conversation here -- I hope I have read all the relevant material. People may "like" Matlab syntax because it requires less typing or because it pleases them aesthetically. I personally feel that explicit function based operators (like transpose()) are very clear and unambiguous. While I understand the desire to have the code and the "math" look similar I think, in general, this is leads to the same kind of difficulty one has with notation in mathematics -- Notation that works well in some fields is extremely cumbersome in others. I don't expect it to look like an equation. I find orderly, predictable behavior that doesn't send me to the source code too often to figure out what is happening very helpful. Treating 1d or 2d arrays as matrices is admittedly *very* useful in some applications but cumbersome in others. This problem is reminiscent of the "clash" between the PIL and Numeric modules or between the C-language row-major matrix storage format and the (in my opinion) better thought out FORTRAN column-major matrix storage format. These differences place limitations on the potential for synergistic profits in the project. It is my personal experience/opinion that "convenience" methods are best added in a specific application that is not intended to be released generally. -Todd * Perry Greenfield (perry at stsci.edu) wrote: > Travis Oliphant writes: > > > > > > .M always returns a matrix for arrays < 2d, then we gain consistency. > > > > I've made this change and am ready to commit the change to the > > Numeric tree, > > unless there are strong objections. I know some people do not like the > > proliferation of attributes, but in this case the notational > > convenience it > > affords to otherwise overly burdened syntax and the consistency it allows > > Numeric to deal with Matrix equations may be worth it. > > > > What do you think? > > > > -Travis Oliphant > > > > > I'd have to agree with Konrad and Paul on this one. While it makes simple > matrix expressions clearer, it opens a whole can of worms that were > discussed (and never resolved) a couple years ago. Suppose I do this: > > x = a.M * libfunc(b.M, c.M) > > where libfunc is a 3rd party module written in Python that was written > assuming that operators were elementwise operators. It may silently > do a matrix multiplication (depending on the shapes) instead of the > intended elementwise multiplication. Yet the usage above looks just as > legitimate as > > x = a.M * b.M > > In other words, it raises the issues of having incompatible modules, some > written with Numeric objects in mind, others with Matrix objects in mind. > Conceivably there will be modules useful for both kinds of objects. Do > we need to support two kinds? How do we deal with this? > > This is still a problem if we don't allow the .M attribute but still have > widespread usage of a array object with different behavior for operations. > Unlike masked arrays, whose basic behavior is unchanged for "good" data, > the behavior for identical data is completely different. > > I wish I had a good answer for this. I don't remember all of the past > suggestions, but it looks like one of the following solutions is needed: > > 1) Campaign for new operators in Python (various proposals to this affect. 
> This is probably best from the Numeric point of view (maybe not from
> Python in general though).
> 2) Allow different array classes with different behavior, but come up with
> conventions and utilities for library developers to produce versions of
> arrays compatible with the convention assumed for the module (and convert
> back to the input type for output values). This doesn't prevent all
> opportunities for confusion and errors however. It also puts a stronger
> burden on library developers.
> 3) Do nothing and deal with the resulting mess. Perhaps the two camps have
> little need for each other's tools and it won't be much of a problem.
> Do option 2 retroactively if it is a problem.
>
> Other suggestions?
>
> Perry
>
> Perry
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion

From hinsen at cnrs-orleans.fr Wed Mar 6 11:25:07 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Wed Mar 6 11:25:07 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: References: Message-ID:

"Perry Greenfield" writes:
> discussed (and never resolved) a couple of years ago. Suppose I do this:
>
> x = a.M * libfunc(b.M, c.M)
>
> where libfunc is a 3rd party module written in Python that was written
> assuming that operators were elementwise operators. It may silently

Then you are calling a routine with the wrong arguments - that can happen in Python all the time.

From my point of view, arrays and matrices are two entirely different things. A function written for matrix objects cannot be expected to work with array objects, and vice versa. Matrix operations should return matrix objects, and array operations should return array objects.

What arrays and matrices have in common is not semantics, but implementation. That is something that implementors should profit from, but users shouldn't even need to know about.

The discussion about matrices has focused on matrix multiplication as the main difference between the two objects. I suppose this was motivated by comparisons to Matlab and similar environments, which do not have the notion of data types and thus cannot properly distinguish between matrices and arrays. I don't see why we should follow this limited approach.

A matrix object should not only do matrix multiplication properly, but also provide methods such as diagonalization, application of functions as matrix functions, etc. That would be much more than syntactic sugar, it would be a real implementation of the mathematical concept "matrix".

Seen from this point of view, it is not at all clear why an array should have an attribute that is an "equivalent" matrix, as no such equivalence exists in general (only for 2D arrays).

> Conceivably there will be modules useful for both kinds of objects. Do

I don't think so. The only analogous operations between arrays and matrices are addition, subtraction, negation, and multiplication with a scalar, and those would use the same syntax anyway. Konrad.
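A matrix type along these lines might expose the mathematical operations directly; a bare sketch, with hypothetical method names, leaning on the existing LinearAlgebra module for the actual computations:

import LinearAlgebra
from Numeric import dot

class SquareMatrix:
    def __init__(self, data):
        self.data = data   # a rank-2 Numeric array
    def __mul__(self, other):
        # '*' means the matrix product for this type
        return SquareMatrix(dot(self.data, other.data))
    def inverse(self):
        return SquareMatrix(LinearAlgebra.inverse(self.data))
    def eigenvalues(self):
        return LinearAlgebra.eigenvalues(self.data)

Addition, subtraction, negation and multiplication by a scalar could be forwarded to the array implementation unchanged, which is exactly the shared-implementation point made above.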
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From paul at pfdubois.com Wed Mar 6 11:38:51 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Wed Mar 6 11:38:51 2002 Subject: [Numpy-discussion] Final prep for 21.0 Message-ID: <000001c1c546$45df76e0$1001a8c0@NICKLEBY> I have committed to CVS that which I expect to become 21.0. The final set of changes is a revision to the way we use distutils to make RPMs. I am unqualified to test these but the submitter (Vermeulen) did. Please update your CVS and test this version. Developers, please make no further commits until I make the release. I will make the release this weekend unless I receive advice to the contrary from testers. From paul at pfdubois.com Wed Mar 6 11:49:22 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Wed Mar 6 11:49:22 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: <000801c1c547$dcfdb950$1001a8c0@NICKLEBY> I believe the correct solution is a major upgrade to Matrix.py along the lines of what is done in MA; that is, to craft an object that uses Numeric for its implementation but which defines all its own operators in a manner that is semantically sensible for the type of object it is. Such an upgrade could subsequently be improved by using different underlying software for various operations, or even more sophisticated changes such as using a transposed attribute to lazily evaluate transposes in a cleaner way than Numeric does it. Also an upgrade to Numarray would then be virtually painless. If you have never looked at MA, please examine source file Packages/MA/Lib/MA.py before commenting. This file is fairly complex and the required changes to Matrix.py would be considerably simpler; but you can verify that it is fairly straightforward to do. On my project we have done something similar to create a "climate data variable" object. Such a design includes an "exit" function to allow the instance to cheaply view itself as the underlying Numeric array. (In MA, this is "filled", which makes a Numeric array by replacing missing values, but if there are no missing values returns the underlying Numeric array). I'm willing to do this for the community but it would have a side effect; if anyone has been doing "from Matrix import *" they would suddenly get a lot more names imported that would conflict with any imported from Numeric. From pearu at cens.ioc.ee Wed Mar 6 12:00:31 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Wed Mar 6 12:00:31 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: On Wed, 6 Mar 2002, Perry Greenfield wrote: > Other suggestions? Here is one suggestion that is based on the observation that all we need is an easy way to tell that the following operation should be applied as a matrix operation. So, the suggestion is to provide an attribute or a member function that returns an array (note, not a Matrix instance) that has a bit, called it asmatrix, set true but _only_ temporally. The bit is cleaned on every operation. 
And before applying an operation, the corresponding method (currently there seem to be only four relevant methods: __mul__, __pow__ and their r-versions) checks if either of the operants has asmatrix bit true, then performs the corresponding matrix operation, otherwise the default element-wise operation. And before returning, it cleans up asmatrix bit. For the sake of an example, let .m be Numeric array attribute that when accessed sets asmatrix=1 and returns the array. Examples: a * b - is element-wise multiplication a.m * b, a * b.m - are matrix multiplications, the resulting array, as well a and b, have asmatrix=0 a.m ** -1 - is matrix inverse sin(a) - element-wise sin sin(a.m) - matrix sin To summarize the main ideas: * array has asmatrix bit that most of the time is false. * there is a way to set the asmatrix bit true, either by .m or .M attributes or .m(), .M(), .. methods that return the same array. * __mul__, __pow__, etc. methods check if either operant has asmatrix true, then performs the corresponding matrix operation, otherwise the corresponding element-wise operation. * all operations clean asmatrix bit. So, what do you think? Pearu From perry at stsci.edu Wed Mar 6 12:43:59 2002 From: perry at stsci.edu (Perry Greenfield) Date: Wed Mar 6 12:43:59 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: Pearu writes: > > Here is one suggestion that is based on the observation that all we need > is an easy way to tell that the following operation should be applied > as a matrix operation. So, the suggestion is to provide an attribute or a > member function that returns an array (note, not a Matrix instance) that > has a bit, called it asmatrix, set true but _only_ temporally. The bit is > cleaned on every operation. And before applying an operation, the > corresponding method (currently there seem to be only four relevant > methods: __mul__, __pow__ and their r-versions) checks if either of > the operants has asmatrix bit true, then performs the corresponding matrix > operation, otherwise the default element-wise operation. And before > returning, it cleans up asmatrix bit. > > For the sake of an example, let .m be Numeric array attribute that when > accessed sets asmatrix=1 and returns the array. > Examples: > > a * b - is element-wise multiplication > a.m * b, a * b.m - are matrix multiplications, the resulting > array, as well a and b, have asmatrix=0 > a.m ** -1 - is matrix inverse > sin(a) - element-wise sin > sin(a.m) - matrix sin > > To summarize the main ideas: > * array has asmatrix bit that most of the time is false. > * there is a way to set the asmatrix bit true, either by .m or .M > attributes or .m(), .M(), .. methods that return the same array. > * __mul__, __pow__, etc. methods check if either operant has asmatrix > true, then performs the corresponding matrix operation, otherwise > the corresponding element-wise operation. > * all operations clean asmatrix bit. > > So, what do you think? > > Pearu > This is a clever idea that reminds me of something we were considering for something else (exactly what I can't quite remember :-). But like all such schemes it still does produce an object, and a user might infer (reasonably or unreasonably) that if they can type x = a.m * b Then they can treat a.m as an array, e.g., x = a.m In that case, the special case behavior still becomes camouflaged, when x is used later, albeit only to bite you once. 
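In rough pure-Python terms, the one-shot bit might behave like this (a hypothetical sketch with made-up names, not a proposed implementation):

from Numeric import multiply, matrixmultiply

class MArray:
    def __init__(self, data):
        self.data = data
        self.asmatrix = 0
    def __getattr__(self, name):
        # only called for attributes not found the normal way
        if name == 'm':
            self.asmatrix = 1   # set the one-shot bit ...
            return self         # ... and return the same object
        raise AttributeError, name
    def __mul__(self, other):
        if self.asmatrix or other.asmatrix:
            result = MArray(matrixmultiply(self.data, other.data))
        else:
            result = MArray(multiply(self.data, other.data))
        self.asmatrix = other.asmatrix = 0   # every operation clears the bit
        return result

After x = a.m the bit stays set on a (x is the same object) until the next operation, which is precisely the camouflaged behavior described above.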
It is clear what the operation does when the attribute is used in the expression, but if it isn't, there is still room for confusion. I like the idea, but I'll have to think about whether the downside outweighs the benefits. Perry From perry at stsci.edu Wed Mar 6 12:49:06 2002 From: perry at stsci.edu (Perry Greenfield) Date: Wed Mar 6 12:49:06 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: Konrad Hinsen writes: > > discussed (and never resolved) a couple years ago. Suppose I do this: > > > > x = a.M * libfunc(b.M, c.M) > > > > where libfunc is a 3rd party module written in Python that was written > > assuming that operators were elementwise operators. It may silently > > Then you are calling a routine with wrong arguments - that can happen > in Python all the time. > > From my point of view, arrays and matrices are two entirely different > things. A function written for matrix objects cannot be expected to > work with array objects, and vice versa. Matrix operations should > return matrix objects, and array operations should return array > objects. > > What arrays and matrices have in common is not semantics, but > implementation. That is something that implementors should profit > from, but users shouldn't even need to know about. > > The discussion about matrices has focused on matrix multiplication as > the main difference between the two objects. I suppose this was > motivated by comparisons to Matlab and similar environments, which do > not have the notion of data types and thus cannot properly distinguish > between matrices and arrays. I don't see why should follow this > limited approach. > > A matrix object should not only do matrix multiplication properly, but > also provide methods such as diagonalization, application of functions > as matrix functions, etc. That would be much more than syntactic > sugar, it would be a real implementation of the mathematical concept > "matrix". > > Seen from this point of view, it is not at all clear why an array > should have an attribute that is an "equivalent" matrix, as no such > equivalence exists in general (only for 2D arrays). > Not an unreasonable position. Are you also arguing that the two types should know about each other and raise an exception if there is an attempt to mix them in operations? Perry From bsder at allcaps.org Wed Mar 6 12:50:18 2002 From: bsder at allcaps.org (Andrew P. Lentvorski) Date: Wed Mar 6 12:50:18 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: <20020306123153.U55509-100000@mail.allcaps.org> Well, I believe that it solves the wrong problem. What I really want are Matrix objects that stay as Matrix objects even through their associated functions. And Array objects that stay as array objects. Why add any characters or casts at all when the objects can stay in their original type? Please correct me if I'm missing something here. -a On Tue, 5 Mar 2002, Travis Oliphant wrote: > > Recently there has been discussion on the list about the awkwardness of > matrix syntax when using Numeric Python. > > Matrix expressions can be awkard to express in Numeric which is a negative > mark on an otherwise excellent computing environment. > > Currently part of the problem can be solved by working with Matrix objects > explicitly: > > a = Matrix.Matrix("[1 2 3; 4 5 6]") # Notice the strings. 
> > However, most operations return arrays which have to be recast to matrices > using at best a character with parenthesis: > > M = Matrix.Matrix > > M(sin(a)) * M(cos(a)).T > > The suggestion was made to add ".M" as an attribute of arrays which returns a > matrix. Thus, the code above can be written: > > sin(a).M * cos(a).M.T > > While some aesthestic simplicity is obtained, the big advantage is in > consistency. Somebody else may decide that > > P = Matrix.Matrix is a better choice. But, if we establish that > > .M always returns a matrix for arrays < 2d, then we gain consistency. > > I've made this change and am ready to commit the change to the Numeric tree, > unless there are strong objections. I know some people do not like the > proliferation of attributes, but in this case the notational convenience it > affords to otherwise overly burdened syntax and the consistency it allows > Numeric to deal with Matrix equations may be worth it. > > What do you think? > > -Travis Oliphant > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From oliphant at ee.byu.edu Wed Mar 6 13:01:39 2002 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Mar 6 13:01:39 2002 Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking In-Reply-To: Message-ID: > Like Greg I'm wary of having many different interpretations > for indexing behavior (I'm not even that crazy about having > numarray handle boolean index arrays differently than the others > --something we haven't implemented yet, and perhaps we shouldn't). > You may be wary, but there are already multiple ways people think about using integers to index arrays. I'm trying to suggest a facility that allows several different interpretations of array access. > Before discussing the merits of this, shouldn't we take the attitude > that absence of feedback is not necessarily equivalent to approval, > particularly for something that affects the public interface of > the module? I would feel better about this if I saw several > affirming the need for such features rather than few openly > opposing it. > I do have this view. I'm not changing anything, right now. Well, I affirm that this is one of the drawbacks of Numeric as compared with other array-oriented environments. We definitely need a way to index an array using integers and masks. I guess if nobody else feels this way, then I'm alone in my discomfort. > But if one were to do something like this, I would use a different kind > of object than 0d arrays, e.g., an instance of a class defined for just > that purpose. We could do that as well. > You would really want to make sure that no data could > mistakenly be interpreted as a flag, even if the chances were remote. > I would also not use an underscore as the beginning of the name. I'm not particularly wedded to _I notation, it was just a start. > Maybe > I'm wrong about this, but I've come to take that to mean its a private > variable that should not be used by users of the module, and that usage > would confuse that. Finally, the name of the flag should be descriptive > (e.g. MaskInd). > > But there could be better alternatives. As an example, > > x[nonzero(maskarray)] instead of x[maskarray, MaskInd] I've thought about that, too, it would work if nonzero returned some class that stored away (but didn't copy) the maskarray info. 
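For reference, the temporary-generating spellings that already work in today's Numeric:

from Numeric import array, nonzero, take, compress

x = array([10, 20, 30, 40])
mask = array([0, 1, 0, 1])
print take(x, nonzero(mask))   # the index array [1 3] is built as a temporary
print compress(mask, x)        # same selection, [20 40], in a single call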
-Travis

From pearu at cens.ioc.ee Wed Mar 6 13:03:40 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Wed Mar 6 13:03:40 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID:

On Wed, 6 Mar 2002, Perry Greenfield wrote:
> This is a clever idea that reminds me of something we were considering
> for something else (exactly what I can't quite remember :-). But like
> all such schemes it still does produce an object, and a user might
> infer (reasonably or unreasonably) that if they can type
>
> x = a.m * b
>
> Then they can treat a.m as an array, e.g.,
>
> x = a.m

Yes, I was also thinking about the same issue. If we could somehow convince users that the .m attribute comes only with operations, e.g. a.m * b, then it should be safe... Note that in order to use the .m feature, a user must read about it somewhere, say, in a tutorial. And it should be noted explicitly where _not_ to use the .m feature, that is, in assignments and in function arguments, in order to avoid any unwanted side effects. Actually, library functions probably use asarray for arguments; this function could also clean up the asmatrix bit. Pearu

From pearu at cens.ioc.ee Wed Mar 6 13:05:57 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Wed Mar 6 13:05:57 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID:

On Wed, 6 Mar 2002, Pearu Peterson wrote:
> Actually, library functions probably use asarray for arguments, this
> function could also clean up the asmatrix bit.

Ok, ignore this remark. Library functions actually would use this bit for triggering different operations, e.g. sin(a), sin(a.m). Pearu

From oliphant at ee.byu.edu Wed Mar 6 13:06:50 2002 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Mar 6 13:06:50 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID:

> > Other suggestions?
>
> Here is one suggestion that is based on the observation that all we need
> is an easy way to tell that the following operation should be applied
> as a matrix operation. So, the suggestion is to provide an attribute or a
> member function that returns an array (note, not a Matrix instance) that
> has a bit, called it asmatrix, set true but _only_ temporally. The bit is
> cleaned on every operation. And before applying an operation, the
> corresponding method (currently there seem to be only four relevant
> methods: __mul__, __pow__ and their r-versions) checks if either of
> the operants has asmatrix bit true, then performs the corresponding matrix
> operation, otherwise the default element-wise operation. And before
> returning, it cleans up asmatrix bit.

Frankly, I like this kind of proposal. I disagree with Konrad about the separation between arrays and matrices. From my discussions with other people, it sounds like this is actually a point of disagreement for many in the broader community. To me, matrices are just arrays of rank <=2 which should be interpreted with their specific algebra.

> For the sake of an example, let .m be Numeric array attribute that when
> accessed sets asmatrix=1 and returns the array.
> Examples:
>
> a * b - is element-wise multiplication
> a.m * b, a * b.m - are matrix multiplications, the resulting
> array, as well a and b, have asmatrix=0
> a.m ** -1 - is matrix inverse
> sin(a) - element-wise sin
> sin(a.m) - matrix sin
>
> To summarize the main ideas:
> * array has asmatrix bit that most of the time is false.
> * there is a way to set the asmatrix bit true, either by .m or .M > attributes or .m(), .M(), .. methods that return the same array. > * __mul__, __pow__, etc. methods check if either operant has asmatrix > true, then performs the corresponding matrix operation, otherwise > the corresponding element-wise operation. > * all operations clean asmatrix bit. > Again, I wouldn't mind it, but I suspect the more aesthetically critical on the list will dislike it because it blurs the (currently clumsy) distinction between arrays and Matrices that I'm beginning to see people actually like. -Travis From pearu at cens.ioc.ee Wed Mar 6 13:25:25 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Wed Mar 6 13:25:25 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: <000801c1c547$dcfdb950$1001a8c0@NICKLEBY> Message-ID: On Wed, 6 Mar 2002, Paul F Dubois wrote: > If you have never looked at MA, please examine source file > Packages/MA/Lib/MA.py before commenting. This file is fairly complex and You mean those who never used MA should take a day off to read 2000 line code in order to understand the implications of using MA and give a comment? ;-) Indeed, I have never used MA but as a first look it does not look too promising regarding performance: e.g. there seem to be lots of python code involved to apply a simple multiplication of arrays. Could someone more familiar with MA give a comment on performance issues, especially, keeping in mind number crunchers? Pearu From paul at pfdubois.com Wed Mar 6 13:52:07 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Wed Mar 6 13:52:07 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: Message-ID: <000101c1c559$1ce27950$1001a8c0@NICKLEBY> Travis wrote: To me, matrices are just arrays of rank <=2 which should be interpreted with their specific algebra. -- If a class is roughly data plus behaviors, a matrix is not simply an array of rank <=2. You can express the concept of a matrix most cleanly as a separate class. Adding an argumentless member function .M to "convert" from one class to the other, and not make the other class explicit, is a bit weird. But if the other class "Matrix" is explicit, you needn't give it a privleged status with respect to Numeric.array by having a member function in Numeric.array that amounts to a Matrix constructor. The only real motivation for that seems to me to be the feeling that M(x) is somehow less clear than x.M. Note that except for a tricky property behavior, you really ought to have to write the latter as x.M(). As I said, I think we can beef up Matrix to make the linear algebra freaks happy, even to making things like transpose(A)*(B) as optimized operations. From gpk at bell-labs.com Wed Mar 6 14:19:01 2002 From: gpk at bell-labs.com (Greg Kochanski) Date: Wed Mar 6 14:19:01 2002 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #406 - 13 msgs References: Message-ID: <3C869538.202DC99F@bell-labs.com> > Are you opposed to anything going inside the [] operator to help indicate > how the objects inside should be interpreted I'm opposed to overloading any operator so that it becomes confusing. Confusion is generated when a look at a small part of code (a line or two) does not tell you what the code is doing. This is a real problem for languages like Python and Perl, where any variable can contain any data, IF the code behaves in different ways, dependent on the data type. 
"Different" is defined by the user, and it roughly translates into the amount of coffee you have to drink to fix the code, if the input data changes it's type. > > Besides, you know some idiot is going to eventually write > > code that looks like this: > > > > def access(a, b, x): > > return a[b, x] # I think that a must be a 2-D array... > > > > # 1000 lines later... > > access(a, _I) # Whoops all my assumptions were wrong... > > I have no idea, what your concern is here. This would result in an error > currently and under the scheme I suggested. Then, perhaps you should explain again. I assumed you were proposing a magic second argument to [] that would cause the first argument to be interpreted differently. If so, I think that's a bad idea from a human-interface point of view, because (a) To a person who only uses Numpy occasionally, it is not obvious that an argument is "magic". That makes the code less readable. (b) It is possible to write code where one can pass in the "magic" value in a variable, and no simple inspection of the code will tell if it is magic or not. Using an explicit function or method call fixes (a) by telling the naive user that "this is not normal array access here." It also fixes (b) by making it more obvious that fancy stuff is going on. From oliphant at ee.byu.edu Wed Mar 6 17:42:03 2002 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Mar 6 17:42:03 2002 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #406 - 13 msgs In-Reply-To: <3C869538.202DC99F@bell-labs.com> Message-ID: > > > # 1000 lines later... > > > access(a, _I) # Whoops all my assumptions were wrong... > > > > I have no idea, what your concern is here. This would result in an error > > currently and under the scheme I suggested. > > > Then, perhaps you should explain again. No you grasp it, I think your example contained errors (you only called access with two arguments rather than the three expected by the interface, for example). Thank you for explaining your concerns in more detail, below. > > If so, I think that's a bad idea from a human-interface point of view, > because > > (a) To a person who only uses Numpy occasionally, it is not obvious > that an argument is "magic". That makes the code less readable. That can be a "problem", but it is a "problem" in many languages that currently are in wide-spread use in numerical computing. Apparently, the convenience outweighs the perceived concern. Currently, whenever you use variables to index arrays you know that something is going on that you can't see by just looking at it (i.e. something fancy). For example, you can currently write a[b] and have this do different things depending on whether b is a sequence or a slice object or an integer. I don't see how the addition of another check drastically changes the current state. I actually think the flexiblility is a good thing and it makes Python very powerful. It comes down to trusting people to write code you can understand (if you have any reason to interface with them in the first place). My programming philosophy definitely leans toward empowering people, even if it means they can do something stupid later. > > (b) It is possible to write code where one can pass in the "magic" value > in a variable, and no simple inspection of the code will tell if it is > magic > or not. This is already possible (and frequently used), now. > Using an explicit function or method call fixes (a) by telling the naive > user > that "this is not normal array access here." 
> You are focusing on the naive user at the expense of convenience for the power user. I think this is appropriate sometimes, but not when we are talking about a language that somebody will use constantly for many years to implement their daily work. I think it makes the code much more readable and therefore understandable and maintainable to overload the [] operator rather than use a method call.. MATLAB's big advantage over Numeric Python right now is that it allows this sort of indexing already which Python users currently have to implement using a function call. > It also fixes (b) by making it more obvious that fancy stuff is going > on. > Whenever you see a[b] instead of a[1:3,4] you alread know that something fancy is going on... Nothing would change here. From eric at enthought.com Wed Mar 6 22:05:02 2002 From: eric at enthought.com (eric) Date: Wed Mar 6 22:05:02 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. References: Message-ID: <020601c1c595$24bd76c0$6b01a8c0@ericlaptop> Boy did this one get a rise! Nice to hear so many voices. I also feel we need a more compact notation for linear algebra and would like to be able to do it without explicitly casting arrays to Matrix.Matrix objects. This attribute approach will work, but I wonder if trying the "adding an operator to Python" approach one more time would be worth while. At Python10 developer's day, Guido explicitly mentioned the linear algebra operator in a short comment saying something to the affect that, if the numeric community could agree on an appropriate operator, he would strongly consider the addition. He also mentioned the strangness of the 2 PEPs proposed on the topic at a coffee break... I noticed the status of both PEPs is "deferred." http://python.sourceforge.net/peps/pep-0211.html This one proposes the @ operator for outer products. http://python.sourceforge.net/peps/pep-0225.html This one proposes decorating the current binary ops with some symbols to indicate that they have different behavior than the standard binary ops. This is similar to Matlab's use of * for matrix multiplication and .* for element-wise multiplication or to R's use of * for element-wise multiplication and %*% for "object-wise" multiplication. It proposes prepending ~ to operators to change their behavior so that ~* would become matrix multiply. The PEP is a little more general, but this gives the flavor. My hunch is that some form of the second (perhaps drastically reduced) would meet with more success. The suggested ~* or even the %*% operator are both palitable. Such details can be decided later. The question is whether there is sufficient interest to try and push the operator idea through? It would take much longer than choosing something we can do ourselves (like .M), but the operator solution seems more desirable to me. eric ----- Original Message ----- From: "Travis Oliphant" To: Sent: Tuesday, March 05, 2002 11:44 PM Subject: [Numpy-discussion] adding a .M attribute to the array. > > Recently there has been discussion on the list about the awkwardness of > matrix syntax when using Numeric Python. > > Matrix expressions can be awkard to express in Numeric which is a negative > mark on an otherwise excellent computing environment. > > Currently part of the problem can be solved by working with Matrix objects > explicitly: > > a = Matrix.Matrix("[1 2 3; 4 5 6]") # Notice the strings. 
> > However, most operations return arrays which have to be recast to matrices > using at best a character with parenthesis: > > M = Matrix.Matrix > > M(sin(a)) * M(cos(a)).T > > The suggestion was made to add ".M" as an attribute of arrays which returns a > matrix. Thus, the code above can be written: > > sin(a).M * cos(a).M.T > > While some aesthestic simplicity is obtained, the big advantage is in > consistency. Somebody else may decide that > > P = Matrix.Matrix is a better choice. But, if we establish that > > .M always returns a matrix for arrays < 2d, then we gain consistency. > > I've made this change and am ready to commit the change to the Numeric tree, > unless there are strong objections. I know some people do not like the > proliferation of attributes, but in this case the notational convenience it > affords to otherwise overly burdened syntax and the consistency it allows > Numeric to deal with Matrix equations may be worth it. > > What do you think? > > -Travis Oliphant > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From eric at enthought.com Wed Mar 6 23:27:05 2002 From: eric at enthought.com (eric) Date: Wed Mar 6 23:27:05 2002 Subject: [Numpy-discussion] big picture? Message-ID: <022f01c1c5a0$a5364100$6b01a8c0@ericlaptop> While we're discussing several big issues, it would be worthwhile to step back and get an idea of what people feel is still missing for numeric computation in the Python language and also what is missing from Numeric itself. I'm not talking about libraries here, but issues with notation and array functionality that you run into in day-to-day programming. Things are pretty dang good right now, but there are some areas (array indexing, matrix multiply, etc.) that some people see as sub-optimal or better implemented in other langauges. A list will provide a "big picture" of where we want to go (maybe we are there...) and also help us pick our battles on language changes, etc. So, what mathematical expressions are commonly used and yet difficult to write in Python? I don't mean integrals divergence, etc. I mean things like matrix multiply and transpose. Here is a beginning to the list: 1. Matrix Multiply -- should we ask for ~*? 2. Transpose -- In a perfect world, we'd have an operator for this. 3. complex conjugate -- An operator for this would also be welcomed. 4. Others?? These three have all been discussed on this list or on the SciPy list in the last month, so they are obvious. I don't think there is a solution for 2 and 3 besides using the current function or method calls (but they are still on my list). As I mentioned in my last post, 1 might be fixable. As far as core Numeric functionality: 1. Array indexing with arrays. 2. .M attributes -- an alternative to (1) in language changes. And I'll add a third, that I'd like. 3. tensor notation indexing as in the Blitz++ array library http://www.oonumerics.org/blitz/manual/blitz03.html#l75 NewAxis and Ellipses allow for the same functionality, but the tensor notation is much easier to read. This requires yet more indexing trickery though... Please limit this thread to language changes or Numeric enhancements. Desired changes to the current behavior or interface of Numeric should be saved for a different discussion. thanks, eric -- Eric Jones Enthought, Inc. 
[www.enthought.com and www.scipy.org] (512) 536-1057 From nwagner at mecha.uni-stuttgart.de Thu Mar 7 00:23:02 2002 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Thu Mar 7 00:23:02 2002 Subject: [Numpy-discussion] OverflowError: math range error Message-ID: <3C873258.C238E33D@mecha.uni-stuttgart.de> Hi, >gdb /usr/bin/python GNU gdb 20010316 Copyright 2001 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-suse-linux"...(no debugging symbols found)... (gdb) run fredholm.py Starting program: /usr/bin/python fredholm.py (no debugging symbols found)...[New Thread 1024 (LWP 13596)] (no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)... (no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)... (no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)... Traceback (most recent call last): File "fredholm.py", line 2, in ? from LinearAlgebra import * File "/usr/lib/python2.1/site-packages/Numeric/LinearAlgebra.py", line 10, in ? import MLab File "/usr/lib/python2.1/site-packages/Numeric/MLab.py", line 17, in ? import RandomArray File "/usr/lib/python2.1/site-packages/Numeric/RandomArray.py", line 30, in ? seed() File "/usr/lib/python2.1/site-packages/Numeric/RandomArray.py", line 24, in seed ndigits = int(math.log10(t)) OverflowError: math range error (no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)... Program exited with code 01. (gdb) Any idea ? Thanks in advance. Nils From hinsen at cnrs-orleans.fr Thu Mar 7 00:59:04 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Mar 7 00:59:04 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. References: Message-ID: <200203070858.g278wJJ14269@chinon.cnrs-orleans.fr> "Perry Greenfield" writes: > Not an unreasonable position. Are you also arguing that the two types > should know about each other and raise an exception if there is an > attempt to mix them in operations? No need to know about each other, they'd be different types and therefore by default incompatible. There should of course be some explicit conversion facility. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Thu Mar 7 01:12:05 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Mar 7 01:12:05 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. References: <020601c1c595$24bd76c0$6b01a8c0@ericlaptop> Message-ID: <200203070910.g279AlE14306@chinon.cnrs-orleans.fr> "eric" writes: > Matrix.Matrix objects. This attribute approach will work, but I > wonder if trying the "adding an operator to Python" approach one > more time would be worth while. 
At Python10 developer's day, Guido If it were only one operator, perhaps, although I might even give up on Python completely if starts to use Perlish notations like ~@!. But if you really want to have a short-hand syntax for the common matrix operations, you'd need multiplication, division (shorthand for multiplying by inverse), power, transpose and hermitian transpose. If you want to go the "operator way", the goal should rather be something like APL, with composite operators. Matrix multiplication would then be a special case of a reduction operator that uses multiplication and addition (in APL this is written as "+.x"). Note that I am *not* suggesting this, my opinion is still that matrices and arrays should be semantically different types. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From geus at inf.ethz.ch Thu Mar 7 01:21:04 2002 From: geus at inf.ethz.ch (Roman Geus) Date: Thu Mar 7 01:21:04 2002 Subject: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines References: Message-ID: <3C8730E9.293490E8@inf.ethz.ch> Hello, Pearu Peterson wrote: > > Hi, > > On Wed, 6 Mar 2002, Roman Geus wrote: > > > So, what really needs to be changed (at least for this machine) is how > > Numerical Python calls BLAS/LAPACK. It also needs to use 32bit integers. > > So this means using 'int' instead of 'long int'. > > Having wrapped a lot of Fortran codes to Python, I agree, that Numerical > Python should use 'int' instead of, 'long'. Though I have little > influence to make this change to happen in Numeric but just agreeing with > you. > > Pearu What would be the best way to convince the NumPy developers to use 'int' instead 'long' for Fortran integers? I would be willing to help making the necessary changes. -- Roman From hinsen at cnrs-orleans.fr Thu Mar 7 01:22:04 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Mar 7 01:22:04 2002 Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking References: Message-ID: <200203070915.g279Fhl14309@chinon.cnrs-orleans.fr> Travis Oliphant writes: > Well, I affirm that this is one of the drawbacks of Numeric as compared > with other array-oriented environments. We definitely need a way to index > an array using integers and masks. > > I guess if nobody else feels this way, then I'm alone in my discomfort. No, I basically agree, I just don't have that need immediately and therefore am less motivated to work on it. My preferred solution would be to use special objects (in the spirit of the slice object) for special indexing methods, rather than special cases of existing objects. The advantage is that any number of those can be added over time as the need arises, and there is never a risk of changing the meaning of existing code. However, I do think that this should be thought out and discussed carefully, but unfortunately I won't be able to help much due to lack of time. Konrad. 
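Such an index object could be very small indeed; a sketch, with a hypothetical name:

class MaskIndex:
    # an index object in the spirit of the slice object: it can be
    # printed, inspected, and type-tested
    def __init__(self, mask):
        self.mask = mask
    def __repr__(self):
        return 'MaskIndex(%s)' % repr(self.mask)

An array's __getitem__ would then test for it explicitly, so a[MaskIndex(m)] announces its meaning at the point of use while plain integer lists keep their current behavior.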
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Thu Mar 7 01:24:01 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Mar 7 01:24:01 2002 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #406 - 13 msgs References: Message-ID: <200203070922.g279MUl14314@chinon.cnrs-orleans.fr> Travis Oliphant writes: > > (a) To a person who only uses Numpy occasionally, it is not obvious > > that an argument is "magic". That makes the code less readable. > > That can be a "problem", but it is a "problem" in many languages that > currently are in wide-spread use in numerical computing. > Apparently, the convenience outweighs the perceived concern. Currently, But some people, like me, prefer Python to other languages for exactly that reason. I'll give up shortness for clarity any time. So I am certainly agains "magic" objects, but different kinds of indexing objects, provided that they can be inspected/printed in the code, are nothing magic to me. At the moment, each axis index can be an integer, a range, and a slice. Adding a "boolean mask" to this seems like a natural extension. And even "reduction" operations such as Paul mentioned make perfect sense. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From bsder at allcaps.org Thu Mar 7 01:32:03 2002 From: bsder at allcaps.org (Andrew P. Lentvorski) Date: Thu Mar 7 01:32:03 2002 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #406 - 13 msgs In-Reply-To: <200203070922.g279MUl14314@chinon.cnrs-orleans.fr> Message-ID: <20020307012721.T56438-100000@mail.allcaps.org> On Thu, 7 Mar 2002, Konrad Hinsen wrote: > So I am certainly agains "magic" objects, but different kinds of > indexing objects, provided that they can be inspected/printed in the How do the new Iterator mechanisms now active in Python 2.2 play into these ideas of indexing objects? Can we get these kinds of slices by providing appropriate iterator objects and references? -a From bsder at allcaps.org Thu Mar 7 02:31:04 2002 From: bsder at allcaps.org (Andrew P. Lentvorski) Date: Thu Mar 7 02:31:04 2002 Subject: [Numpy-discussion] big picture? In-Reply-To: <022f01c1c5a0$a5364100$6b01a8c0@ericlaptop> Message-ID: <20020307014154.N56438-100000@mail.allcaps.org> On Thu, 7 Mar 2002, eric wrote: > 1. Matrix Multiply -- should we ask for ~*? > 2. Transpose -- In a perfect world, we'd have an operator for this. > 3. complex conjugate -- An operator for this would also be welcome I, personally, don't find the arguments particularly compelling for extra operators for numeric stuff. While extra operators may make code more "math-like", "MATLAB-like" or "Fortran-like", it won't help with efficiency. If I have code to compute A*x+B*y+C, I'm going to have to call out the A*x+Z and Z=B*y+C primitives as functions anyway. 
No set of binary operators will work out that optimization. Requesting domain specific operators actually scares me. The main problem is that it is *impossible* to remove them if your choices later turn out to be confusing or wrong. If operators must be added, I would rather see a generic operator mechanism in place in Python. Choice 1 would be a fixed set of operators getting allocated (~* ~+ ~- etc.) which the core language *does not use*. Then any domain can override with their special meaning without collapsing the base language under the weight of domain specific extensions. Now the specific domains can make their changes and only break their own users rather than the Python community at large. Choice 2 would be for a way for Python to actually adjust the interpretation semantics and introduce new operators from inside code. This is significantly trickier and more troublesome, but has the potential of being a much more generally useful solution (far beyond the realm of numerics). Furthermore, it allows people to make code look like whatever they choose. -a From hinsen at cnrs-orleans.fr Thu Mar 7 02:42:02 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Mar 7 02:42:02 2002 Subject: [Numpy-discussion] Re: Numpy-discussion digest, Vol 1 #406 - 13 msgs In-Reply-To: <20020307012721.T56438-100000@mail.allcaps.org> (bsder@allcaps.org) References: <20020307012721.T56438-100000@mail.allcaps.org> Message-ID: <200203071040.g27AewT14812@chinon.cnrs-orleans.fr> > How do the new Iterator mechanisms now active in Python 2.2 play into > these ideas of indexing objects? Can we get these kinds of slices by > providing appropriate iterator objects and references? An iterator needs to be called for each element, which is probably too slow for a general "extended indexing" solution. But an iterator yielding boolean values could perhaps be a useful class of index object in some cases. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Thu Mar 7 03:23:02 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Mar 7 03:23:02 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: <000801c1c547$dcfdb950$1001a8c0@NICKLEBY> (paul@pfdubois.com) References: <000801c1c547$dcfdb950$1001a8c0@NICKLEBY> Message-ID: <200203071121.g27BLe714885@chinon.cnrs-orleans.fr> > I believe the correct solution is a major upgrade to Matrix.py along the > lines of what is done in MA; that is, to craft an object that uses > Numeric for its implementation but which defines all its own operators > in a manner that is semantically sensible for the type of object it is. That is exactly my idea as well. However, from a quick glance at MA, it seems that this solution could suffer from performance problems when done in Python. What are the real-life experiences with MA in that respect? I suppose the new type inheritance mechanisms in Python 2.2 could help to make this more efficient, but I haven't used them for anything yet. Konrad. 
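In outline, the upgraded class would follow the MA pattern; a bare sketch with hypothetical names, no masking and no error handling:

import Numeric

def _data(x):
    # accept Matrix instances, Numeric arrays, and scalars alike
    if isinstance(x, Matrix):
        return x.array
    return x

class Matrix:
    def __init__(self, data):
        self.array = Numeric.asarray(data)
    def filled(self):
        # the cheap "exit" back to the underlying Numeric array
        return self.array
    def __add__(self, other):
        return Matrix(self.array + _data(other))
    def __mul__(self, other):
        # '*' is matrix multiplication for this type
        return Matrix(Numeric.matrixmultiply(self.array, _data(other)))

Each operation costs one extra Python-level call and an asarray over raw Numeric; that overhead, rather than the arithmetic itself, is presumably where the performance question arises.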
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From paul at pfdubois.com Thu Mar 7 07:57:03 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Thu Mar 7 07:57:03 2002 Subject: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines In-Reply-To: <3C8730E9.293490E8@inf.ethz.ch> Message-ID: <000001c1c5f0$8ca17920$1001a8c0@NICKLEBY> If someone is going to make the change they should change the source to use FortranInt or some similar typedef so that one ifdef could be used to change it. I believe the current lapack/blas were made by an automatic conversion tool. It is easy to make a case that they shouldn't even be in the distribution, that rather a user should install their own library. However, this is a problem on Windows, where many users do not have a development environment, and in general, because it makes the instructions for installing more complicated. So we have sort of felt stuck with it. I have no real way of convincing myself that the proposed change won't break some other platform, although it seems unlikely. -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Roman Geus Sent: Thursday, March 07, 2002 1:21 AM To: numpy-discussion at lists.sourceforge.net Subject: Re: [Numpy-discussion] Numerical Python and LAPACK on 64-bit machines Hello, Pearu Peterson wrote: > > Hi, > > On Wed, 6 Mar 2002, Roman Geus wrote: > > > So, what really needs to be changed (at least for this machine) is > > how Numerical Python calls BLAS/LAPACK. It also needs to use 32bit > > integers. So this means using 'int' instead of 'long int'. > > Having wrapped a lot of Fortran codes to Python, I agree, that > Numerical Python should use 'int' instead of, 'long'. Though I have > little influence to make this change to happen in Numeric but just > agreeing with you. > > Pearu What would be the best way to convince the NumPy developers to use 'int' instead 'long' for Fortran integers? I would be willing to help making the necessary changes. -- Roman _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From oliphant at ee.byu.edu Thu Mar 7 09:11:08 2002 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Mar 7 09:11:08 2002 Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking In-Reply-To: <200203070915.g279Fhl14309@chinon.cnrs-orleans.fr> Message-ID: > No, I basically agree, I just don't have that need immediately and > therefore am less motivated to work on it. > > My preferred solution would be to use special objects (in the spirit > of the slice object) for special indexing methods, rather than special > cases of existing objects. The advantage is that any number of those > can be added over time as the need arises, and there is never a risk > of changing the meaning of existing code. > Thanks for the comments you have made. I always appreciate them. Are you suggesting something like: b = IndexArray([1,3,10,100]) a[b]? This is really not much different than. 
a[[1,3,10,100],IndexArray]

which is essentially what I've suggested (I was looking for shortcuts), but in principle IndexArray could be a class with a method that the code in Numeric interfaces with.

> However, I do think that this should be thought out and discussed
> carefully, but unfortunately I won't be able to help much due to lack
> of time.

Thanks for participating thus far. -Travis

From hinsen at cnrs-orleans.fr Thu Mar 7 10:32:04 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Thu Mar 7 10:32:04 2002 Subject: [Numpy-discussion] Adding a flag to allow integer array access and masking In-Reply-To: (message from Travis Oliphant on Thu, 7 Mar 2002 10:13:05 -0500 (EST)) References: Message-ID: <200203071830.g27IUxA16088@chinon.cnrs-orleans.fr>

> Are you suggesting something like:
>
> b = IndexArray([1,3,10,100])
>
> a[b]?

Exactly. With IndexArray being some special object (if only a thin wrapper), that prints differently from a simple array and can be type-tested.

> This is really not much different than.
>
> a[[1,3,10,100],IndexArray]

Except that in the first case, there is exactly one indexing object per axis, the operation can be a different one along each axis, and the index object specifies its own meaning. But the effect is the same, of course. Konrad.

-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais -------------------------------------------------------------------------------

From perry at stsci.edu Thu Mar 7 17:29:02 2002 From: perry at stsci.edu (Perry Greenfield) Date: Thu Mar 7 17:29:02 2002 Subject: [Numpy-discussion] big picture? One proposal In-Reply-To: <022f01c1c5a0$a5364100$6b01a8c0@ericlaptop> Message-ID:

Eric makes a good point about stepping back and thinking about these issues in a broader context. Along these lines I'd like to make a proposal and see what people think.

I think Konrad made a very good point about matrix vs array representation. If we made it illegal to combine them in expressions without explicit conversions, we could prevent much confusion about what kind of operations would be performed. An attempt to use one kind in place of the other would trigger an exception, and thus users would always know when that was a problem. Implementing this behavior in numarray would be simple, as would having both share the same implementation for common operations (without any extra performance penalty).

That still leaves the question of how to do the conversions, i.e., one of the following options:

matrix(a) * b # matrix multiply of array (a) with matrix (b)
a.M * b
a.M() * b

likewise:

a * array(b) # element-wise multiply of array (a) with matrix (b)
a * b.A
a * b.A()

I strongly prefer the first (functional) form. Rick White has also convinced me that this alone isn't sufficient. There are numerous occasions where people would like to use matrix multiply, even in a predominantly "array" context, enough so that this would justify a special operator for matrix multiplication. If the Numeric community is united on this, I think Guido would be receptive. We might suggest a particular operator symbol or pair (triple) but leave him some room to choose alternatives he feels are better for Python (he could well come up with a better one).
It would be nice if it were a single character (such as @) but I'd be happy with many of the other operator suggestions (~*, (*), etc.). Note that this does not imply we don't need a separate matrix object. I think it is clear that simply providing a matrix multiply operator is not going to answer all the needs of matrix users.

As to the other related issues that Eric raises, in particular operators for transpose and complex conjugate, I guess I don't see these as so important. Both of these are unary operators, and as such either of the following options does not seem to be notationally much worse (whereas using binary functions in place of binary operators is much less readable):

transpose(x) conjugate(x)
x.transpose() x.conjugate()
x.T() x.C()
x.T x.C

(Personally, I prefer the first two)

Perry

From hinsen at cnrs-orleans.fr Fri Mar 8 00:06:04 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Mar 8 00:06:04 2002 Subject: [Numpy-discussion] big picture? One proposal References: Message-ID: <200203080804.g2884uY16995@chinon.cnrs-orleans.fr>

"Perry Greenfield" writes:

> That still leaves the question of how to do the conversions, i.e., one
> of the following options
...
> I strongly prefer the first (functional) form.

Me too. I wouldn't call it "functional" though, it's exactly the way object constructors are written.

> Rick White has also convinced me that this alone isn't sufficient.
> There are numerous occasions where people would like to use matrix
> multiply, even in a predominantly "array" context, enough so that this
> would justify a special operator for matrix multiplication. If the

Could you summarize those reasons please? I know that there are applications of matrix multiplication in array processing, but in my experience they are rare enough that writing dot(a, b) is not a major distraction.

Maybe we need to take another step back as well: Python is a general-purpose language, with several specialized subcommunities such as ours, some of them even larger. Most likely they are having similar discussions. Perhaps the database guys are discussing why they need two more special operators for searching and concatenating databases. I don't think such requests are reasonable. It is tempting to think that it doesn't matter: if you don't need that operator, you just don't use it. But a big advantage of Python is readability. If we get our (well, *yours*, I don't want it ;-) matrix multiply operator, a month later someone will decide that it's just great for his database application, and the database community will have to get used to it as well.

> Numeric community is united on this, I think Guido would be receptive.
> We might suggest a particular operator symbol or pair (triple) but

Actually I feel quite safe: there might be a majority for another operator, but I don't expect we'd ever agree on a symbol :-)

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2 | Deutsch/Esperanto/English/
France | Nederlands/Francais
-------------------------------------------------------------------------------

From pearu at cens.ioc.ee Fri Mar 8 02:27:13 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Fri Mar 8 02:27:13 2002 Subject: [Numpy-discussion] big picture?
One proposal In-Reply-To: <200203080804.g2884uY16995@chinon.cnrs-orleans.fr> Message-ID:

Hi,

On Fri, 8 Mar 2002, Konrad Hinsen wrote:

> > Numeric community is united on this, I think Guido would be receptive.
> > We might suggest a particular operator symbol or pair (triple) but
>
> Actually I feel quite safe: there might be a majority for another
> operator, but I don't expect we'd ever agree on a symbol :-)

Thanks Konrad for this excellent point. Seeing all these proposals about solving our matrix-array issues also makes me feel safer with the current situation. My general point is that a _good_ solution is simple; if a solution is not simple, then it is probably a bad solution.

I find separating array and matrix instances (in the sense of raising an exception when doing, say, array * matrix) not a very simple solution: new concepts are introduced that actually do not solve the simplicity problem of representing matrix operations. As I see it, they only introduce restrictions, and the main assumption behind the rationale is that "users are dumb and they don't know what is best for them". This is how I interpret the raised exception, as behind the scenes matrix and array are the same (in the sense of data representation).

Let me remind myself, at least, that the discussion started from a proposal that aimed to write the matrix multiplication of arrays, Numeric.dot(a,b), in a somewhat simpler form. The current solution is to use the Matrix class which, being just a wrapper of arrays, redefines the __mul__ method; after one has defined

a = Matrix.Matrix(a)

the matrix multiplication of arrays looks simple:

a * b

Travis's proposal was to remove that first step and have it inside an expression in a short form:

a.M * b

(there have been two implementation approaches proposed for this: (i) a.M returns a Matrix instance, (ii) a.M returns the same array with a temporarily set bit saying that the following operation is somehow special). To me, this looks like a safe solution. Though it is a hack, at least it is simple and understandable anywhere it is used (with a * b where b can be either matrix or array, it is not predictable from just looking at the code what the result will be -- not very pythonic indeed). The main objection to this proposal seems to be that it deviates from a good pythonic style (ie don't mess with attributes in this way). I'd say that if Python does not provide a good solution to our problem, then we are entitled to deviate from a general style. After all, in doing numerics the efficiency issue has a rather high weight. And a generally good style of Python cannot always support that. I guess I am missing here the goal of getting Numeric or numarray joined to Python.

With this perspective the only efficient solution seems to be introducing a new operator (or new operators). A few candidates have been proposed:

a ~* b - BTW, to me this looks like dot(conjugate(a),b).
a (*) b - note the in situ version of it: a (*)= b
a (**) b - looks ugly enough? ;-)

Actually, why not a [*] b, a {*} b for direct products of matrices (BTW (*) seems more appropriate here). So, my ideal preference would be:

a .* b - element-wise multiplication of arrays, 2nd pref.: a * b
a * b - matrix multiplication of arrays, 2nd preference: a [*] b
a (*) b - direct matrix multiplication (also known as tensor product) of arrays
a~ - conjugate of arrays
a` - transpose of arrays

This looks great but requires many new features in Python (new operators, the concept of a right-hand unary operator).
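For reference, here is how each of these spellings reads with what Numeric offers today (just a sketch to pin down the intended semantics; the comments show the proposed notation, which of course is not valid Python today):

import Numeric

a = Numeric.array([[1., 2.], [3., 4.]])
b = Numeric.array([[5., 6.], [7., 8.]])

Numeric.multiply(a, b)      # proposed a .* b  (what plain a * b does today)
Numeric.dot(a, b)           # proposed a * b   (or, 2nd preference, a [*] b)
Numeric.conjugate(a)        # proposed a~
Numeric.transpose(a)        # proposed a`
# there is no single Numeric call for the direct (tensor) product;
# outerproduct comes closest, but it flattens its arguments first:
Numeric.outerproduct(a, b)  # proposed a (*) b, up to a reshape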
I don't think that Python should introduce these new operators just because of the Numeric community. It is fine if they get used in other fields as well that suffer from the lack of operators.

About unary operations, transpose and conjugate: BTW, in complex linear algebra their composition is an equally frequent operation. Let me propose the following solution: to have

a ** T for Numeric.transpose(a)
a ** H for Numeric.transpose(Numeric.conjugate(a))

define

T = TransposeOp()
H = TransposeOp(conjugate=1)

where

import Numeric

class TransposeOp:
    def __init__(self, conjugate=0):
        self.conjugate = conjugate
    def __rpow__(self, arr):
        # a ** T falls back to T.__rpow__(a), so arr is the array operand
        if self.conjugate:
            return Numeric.transpose(Numeric.conjugate(arr))
        return Numeric.transpose(arr)

Looks Pythonic to me;-)

Regards, Pearu

From hinsen at cnrs-orleans.fr Fri Mar 8 03:38:18 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Mar 8 03:38:18 2002 Subject: [Numpy-discussion] big picture? One proposal Message-ID: <200203081137.g28BbKF17817@chinon.cnrs-orleans.fr>

> situation. My general point is that a _good_ solution is simple;
> if a solution is not simple, then it is probably a bad solution.

Agreed. So we just need to agree on what is "simple".

> I find separating array and matrix instances (in the sense of raising
> an exception when doing, say, array * matrix) not a very simple solution:
> new concepts are introduced that actually do not solve the simplicity
> problem of representing matrix operations. As I see it, they

I disagree there, separating the concepts of matrix and array *does* solve the simplicity problem in my opinion. Matrices use operators for common matrix operations, and arrays use the same operators for common array operations.

> only introduce restrictions, and the main assumption behind the rationale
> is that "users are dumb and they don't know what is best for them".

No, not at all. On the contrary, the rationale is "users are smart and know that arrays and matrices are different" ;-)

Unfortunately, I have the impression that there are two schools of thought in collision here (and not just when it comes to programming). There is the "mathematical" school that defines matrices and arrays as abstract entities with certain properties and associated operations. And there is the "engineering" school that sees arrays as a convenient data structure to express certain operations, of which "matrix operations" are a subset. As a student, I had a friend who studied mechanical engineering, and his math exercises made me go mad more than once. When I read "...the vector of the masses...", I just had to scream ;-) Many engineering textbooks have the same effect on me. Now obviously I belong to the "mathematical" school, but I don't expect to convert everyone else to it. So my arguments will remain pythonic and pragmatic: the "mathematical" approach solves the problem without asking for new operators, and thus has a better chance of getting realized.

> This is how I interpret the raised exception, as behind the scenes matrix
> and array are the same (in the sense of data representation).

But data representation and data semantics are two different things. Readability of code depends on semantics, not on internal representations or even implementation. Using the same representation merely implies that conversion should be efficient, but not necessarily implicit.

> The main objection to this proposal seems to be that it deviates from a
> good pythonic style (ie don't mess with attributes in this way).
> I'd say that if Python does not provide a good solution to our problem,
> then we are entitled to deviate from a general style. After all, in doing

That's another point where I disagree. I use Python for many different purposes; numerics is only one of them (though the most important one). Uniformity of style is an important value for me. Moreover, I claim that Python *does* provide a good solution, it is merely a very different one.

> numerics the efficiency issue has a rather high weight. And a generally
> good style of Python cannot always support that.

Computational efficiency is not the issue here. If that's all you want, call a BLAS routine for matrix multiplication with two array arguments - doable today, without any modification whatsoever. Even Fortran programmers do that, instead of suggesting that Fortran 2002 should add a "multiply-by-calling-BLAS" operator.

> define
>
> T = TransposeOp()
> H = TransposeOp(conjugate=1)

Does that work? I'd expect that a**T would first call a.__pow__(T), which quite probably crashes... (Not that it matters to me, I find this almost as abusive as the matrix attributes.)

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2 | Deutsch/Esperanto/English/
France | Nederlands/Francais
-------------------------------------------------------------------------------

From peterson at math.utwente.nl Fri Mar 8 05:05:06 2002 From: peterson at math.utwente.nl (Pearu Peterson) Date: Fri Mar 8 05:05:06 2002 Subject: [Numpy-discussion] big picture? One proposal In-Reply-To: <200203081117.g28BHXK17770@chinon.cnrs-orleans.fr> Message-ID:

On Fri, 8 Mar 2002, Konrad Hinsen wrote:

> Unfortunately, I have the impression that there are two schools of
> thought in collision here (and not just when it comes to programming).
> There is the "mathematical" school that defines matrices and arrays
> as abstract entities with certain properties and associated operations.
> And there is the "engineering" school that sees arrays as a convenient
> data structure to express certain operations, of which "matrix operations"
> are a subset.

I see arrays as a convenient data structure (implemented in computer programs) to hold matrices (members of a mathematical concept). I guess that my views are narrow-minded (but willing to widen it) regarding considering arrays as a mathematical concept too. It is just that in mathematics I never (need to) use arrays in that way (my fields are mathematical analysis, integrable systems, and not computer science nor engineering). So, I also belong to the school of "mathematics", but maybe into a different one.

> That's another point where I disagree. I use Python for many different
> purposes; numerics is only one of them (though the most important one).
> Uniformity of style is an important value for me.

Me too. It's just that I care less about a constant style than about whether something can be accomplished efficiently. To be honest, I don't like programming in Python because it has a nice style, but because I can accomplish a lot with it in a very efficient way (and not only by using efficient algorithms). Writing, for example, Numeric.transpose(a) instead of a**T, a.T, a`, or whatever just reduces this efficiency.
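To see what I mean, compare a typical expression in both notations (a small sketch; the commented line is the proposed syntax, which is not valid Python today):

import Numeric

A = Numeric.array([[1., 2.], [3., 4.]])
v = Numeric.array([1., 1.])

x = Numeric.dot(Numeric.transpose(A), Numeric.dot(A, v))  # today
# x = A` * (A * v)                                        # proposed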
I also realize and respect that for computer scientists (which I presume the developers of Python are) it is crucial to have a consistent style, for their own reasons. Sometimes this style makes some site-specific simple tasks too verbose to follow.

> Moreover, I claim that Python *does* provide a good solution, it is
> merely a very different one.

So, what is it?

> Does that work? I'd expect that a**T would first call a.__pow__(T),
> which quite probably crashes...

Yes, it works:

>>> from Numeric import *
>>> class T: __rpow__ = lambda s,o: transpose(o)
...
>>> print array([[1,2],[3,4]]) ** T()
[[1 3]
 [2 4]]

And I don't understand why it is abusive (because it is a different approach?). It's just an idea.

Pearu

From rlw at stsci.edu Fri Mar 8 05:22:15 2002 From: rlw at stsci.edu (Rick White) Date: Fri Mar 8 05:22:15 2002 Subject: [Numpy-discussion] big picture? One proposal In-Reply-To: <200203080804.g2884uY16995@chinon.cnrs-orleans.fr> Message-ID:

On Fri, 8 Mar 2002, Konrad Hinsen wrote:

> "Perry Greenfield" writes:
>
> > Rick White has also convinced me that this alone isn't sufficient.
> > There are numerous occasions where people would like to use matrix
> > multiply, even in a predominantly "array" context, enough so that this
> > would justify a special operator for matrix multiplication. If the
>
> Could you summarize those reasons please? I know that there are
> applications of matrix multiplication in array processing, but in my
> experience they are rare enough that writing dot(a, b) is not a major
> distraction.

A couple of quick examples: I do lots of image processing (e.g. deconvolution) using arrays. It is often helpful to take the outer product of two 1-D vectors; e.g. if there is a separable function f(x,y) = g(x)*h(y), you can compute separate g & h vectors and then combine them with an outer product (a special case of matrix multiply) to get the desired 2-D image.

Another example: when I'm working with either 2-D images or 1-D vectors, it is helpful to be able to compute projections using a set of basis vectors (e.g. for singular value decomposition, eigenvectors, etc.). This is most easily expressed using matrix multiplies - but most uses of the data still treat them as simple arrays instead of matrices. Being able to group these operations together is helpful both for readability of the code and for efficiency of execution.

Having said that, I think I actually agree with Konrad that these sorts of operations are rare enough (in the data processing context) that it is no great burden to write them using function calls instead of operators. If we could agree on a matrix-multiply operator, that would be nice -- but if we can't, I can live with that too. For my purposes, I certainly don't see the need to add special operations to do things like transpose. Those should be limited to a separate matrix class as Konrad proposes and should be available as function calls for arrays.

Rick

------------------------------------------------------------------
Richard L. White rlw at stsci.edu http://sundog.stsci.edu/rick/
Space Telescope Science Institute Baltimore, MD

From hinsen at cnrs-orleans.fr Fri Mar 8 06:52:15 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Mar 8 06:52:15 2002 Subject: [Numpy-discussion] big picture?
One proposal In-Reply-To: (message from Pearu Peterson on Fri, 8 Mar 2002 14:04:15 +0100 (CET)) References: Message-ID: <200203081449.g28EnET18086@chinon.cnrs-orleans.fr>

> regarding considering arrays as a mathematical concept too. It is just
> that in mathematics I never (need to) use arrays in that way (my fields
> are mathematical analysis, integrable systems, and not computer science nor

I meant "mathematical" as a school of thought (going from the abstract to the concrete), not as a domain of research. I don't know any area of mathematics either that uses the array concept, but it is definitely common in computer science (as a structured collection of similar data). Image data is a good example.

> something can be accomplished efficiently. To be honest, I don't like
> programming in Python because it has a nice style, but because I can
> accomplish a lot with it in a very efficient way (and not only by using

I want both :-)

> > Moreover, I claim that Python *does* provide a good solution, it is
> > merely a very different one.
>
> So, what is it?

Separate matrix and array objects, with computationally efficient but explicit (verbose) interconversion.

> Yes, it works:
> >>> from Numeric import *
...

Right, it works as long as the left argument doesn't try to do the power operation itself.

> And I don't understand why it is abusive (because it is a different
> approach?). It's just an idea.

For me, "power" is a shorthand for repeated multiplication, with certain properties attached to it. I have no problem with using the ** operator for something else, but then on different data types. The idea that a**b could be completely different operations for the same a as a function of b is not very appealing to me. In fact, the idea that an operand instead of the operator defines the operation is not very appealing to me.

There's also a more pragmatic objection which is purely technical: I like to stay away from playing tricks with the binary operator type coercion system in Python. Sooner or later it always bites back. And the details have changed over Python releases, which is a compatibility nightmare.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2 | Deutsch/Esperanto/English/
France | Nederlands/Francais
-------------------------------------------------------------------------------

From perry at stsci.edu Fri Mar 8 07:12:11 2002 From: perry at stsci.edu (Perry Greenfield) Date: Fri Mar 8 07:12:11 2002 Subject: [Numpy-discussion] big picture? One proposal In-Reply-To: Message-ID:

Pearu Peterson writes:
[...]
>
> I find separating array and matrix instances (in the sense of raising
> an exception when doing, say, array * matrix) not a very simple solution:
> new concepts are introduced that actually do not solve the simplicity
> problem of representing matrix operations. As I see it, they
> only introduce restrictions, and the main assumption behind the rationale
> is that "users are dumb and they don't know what is best for them".
> This is how I interpret the raised exception, as behind the scenes matrix
> and array are the same (in the sense of data representation).
I don't think the issue is whether users are "dumb" but rather, will it be more or less transparent to them what is supposed to happen. Remember, this particular proposal affects in no way the notational convenience when operands are of the same type. It doesn't even affect the notational convenience of most of the examples presented (e.g., a.M * b.M) as long as the resulting operands are of the same type. It only affects cases involving mixed types. Do we really want * to mean two different things, for example (yes, it would be possible to have one type always dominate over the other, regardless of order)? Will a user always be aware that a module function returns arrays rather than matrices? Yes, users ought to check the documentation, but they often don't or they misremember. The more I think about it the more I come to think it really is better to be safer in this case. It will not be hard for users to explicitly convert, nor should it be notationally cumbersome. E.g. (just to use one of the proposed options):

matrix(a) * b
a * array(b)

I don't see this as a big burden. I would rather do it this way myself for my own code.

[...]
> (there have been two implementation approaches proposed for this: (i) a.M
> returns a Matrix instance, (ii) a.M returns the same array with a
> temporarily set bit saying that the following operation is somehow
> special).
> To me, this looks like a safe solution. Though it is a hack, at least it
> is simple and understandable anywhere it is used (with a * b
> where b can be either matrix or array, it is not predictable from just
> looking at the code what the result will be -- not very pythonic indeed).

It's safer, but it isn't safe. Besides, one could still do this and raise exceptions on mixed types. Is the issue that people strongly want to do (if both a and b are arrays)

a.M * b (or matrix(a) * b)

instead of

a.M * b.M (or matrix(a) * matrix(b))

to get matrix behavior?

Perry

From pearu at cens.ioc.ee Fri Mar 8 07:38:04 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Fri Mar 8 07:38:04 2002 Subject: [Numpy-discussion] big picture? One proposal In-Reply-To: Message-ID:

On Fri, 8 Mar 2002, Perry Greenfield wrote:

> It's safer, but it isn't safe. Besides, one could still do this and raise
> exceptions on mixed types. Is the issue that people strongly want to do
> (if both a and b are arrays)
>
> a.M * b (or matrix(a) * b)
>
> instead of
>
> a.M * b.M (or matrix(a) * matrix(b))
>
> to get matrix behavior?

Just to be clear, my suggestion consisted of the following points:

1) not to introduce any new type or concept such as matrix
2) to forget the current Matrix class
3) all objects are arrays, including a.M
4) a.M has a temporary bit set only for the following operation

So, with this setup there is no issue with mixed types at all and it is easy to implement. But if a new type, matrix, is introduced, then this setup does not work. The reason why I proposed the above setup was exactly because I didn't like that a.M would return a different object type in the middle of an expression.

Pearu

From dfb at mrao.cam.ac.uk Fri Mar 8 08:05:29 2002 From: dfb at mrao.cam.ac.uk (David Buscher) Date: Fri Mar 8 08:05:29 2002 Subject: [Numpy-discussion] big picture? One proposal In-Reply-To: <200203081449.g28EnET18086@chinon.cnrs-orleans.fr> Message-ID:

On Fri, 8 Mar 2002, Konrad Hinsen wrote:

> > regarding considering arrays as a mathematical concept too. It is just
> > that in mathematics I never (need to) use arrays in that way (my fields
> > are mathematical analysis, integrable systems, and not computer science nor
>
> I meant "mathematical" as a school of thought (going from the abstract
> to the concrete), not as a domain of research. I don't know any area
> of mathematics either that uses the array concept, but it is
> definitely common in computer science (as a structured collection of
> similar data). Image data is a good example.

Just my 2c worth: I count myself in the "mathematical" school despite being a physicist. I look at matrices as having a specific algebra which, for instance, cannot be easily made to apply to higher-dimensional arrays. Therefore they are not just arrays looked at in a different way. For object-oriented thinkers this means they are different objects. They may "inherit" a lot of attributes from arrays but are not arrays.

Another point to note is that a specific complaint earlier in the thread was the computational inefficiency of using numpy arrays for matrix-intensive operations. It seems to me that it would be far easier to write an optimised set of code for matrices if they were known to be a separate class. An example (which is probably not useful, but serves for illustration) is that one could "cache" or delay transposes etc, knowing that a matrix-multiply was likely to be about to come up. This sort of thing would be more difficult if the result of the transpose would have to be sensible when followed by a generic array operation.

David

From paul at pfdubois.com Fri Mar 8 08:25:06 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Fri Mar 8 08:25:06 2002 Subject: [Numpy-discussion] An historical precedent for matrix operation symbols In-Reply-To: <200203080804.g2884uY16995@chinon.cnrs-orleans.fr> Message-ID: <000001c1c6bd$97902750$1001a8c0@NICKLEBY>

To sum up my previous postings: I prefer the "constructor" notation, not the funny-attribute notation. However, I believe an efficient matrix class can be done in Python on top of a numeric array object. Shadow classing just doesn't seem to be an overhead problem in most cases. The coding for MA is very complicated because its semantics are so different; for Matrix it would be much less complicated.

When I designed Basis (1984) I was faced with the operator issue. The vast majority of the operations were going to be elementwise but some users would also want matrix multiply and the solution of linear systems. (Aside: a/b, meaning the solution x of bx = a, is best calculated not as (b**-1) * a but by solving the system without forming the inverse.) My choice was: matrix multiply and divide were *!, /!. This was successful in two senses: the users found it easy to remember, and you could implement it in the tokenizer or just in the grammar. I chose to make it a token so as to forbid internal spaces, but for Python it could be done without touching the tokenizer, and it doesn't use up any new symbols. Symbols are precious; when I designed Basis I had a keyboard map and I would cross out the keys I had "used up". If I were Guido I would be very reluctant to give up anything valuable like @ for our purposes.

One should not have any illusions: putting such operators on the array class is just expediency, a way to give the arrays a bit of a dual life. But a real matrix facility would have an abstract base class, be restricted to <= 2 dimensions, have realizations including symmetric, Hermitian, sparse, tridiagonal, yada yada yada.
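To sketch what I mean (purely illustrative; none of these classes exist anywhere, and the names are made up):

import Numeric

class MatrixBase:
    "Abstract base: at most 2 dimensions, matrix semantics for the operators."
    def __mul__(self, other):
        raise NotImplementedError   # matrix multiply, never elementwise
    def solve(self, rhs):
        raise NotImplementedError   # x with self * x = rhs; no explicit inverse

class DenseMatrix(MatrixBase):
    def __init__(self, data):
        self.data = Numeric.array(data)
    def __mul__(self, other):
        return DenseMatrix(Numeric.dot(self.data, other.data))

class TridiagonalMatrix(MatrixBase):
    "Stores just the three diagonals, so solve() could run in O(n)."
    def __init__(self, lower, diag, upper):
        self.lower, self.diag, self.upper = lower, diag, upper

Each realization keeps its own storage format while presenting the same algebra to the user.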
Another aside: There are mathematical operations whose output type depends not just on the input type but also on some of these other considerations. This led me to believe that the Numpy approach is essentially correct, that the type of the elements be variable rather than having separate classes for each type.

From paul at pfdubois.com Fri Mar 8 08:46:04 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Fri Mar 8 08:46:04 2002 Subject: [Numpy-discussion] How the a**T trick works Message-ID: <000801c1c6c0$9df53d80$1001a8c0@NICKLEBY>

I confess I admire the a**T suggestion for a notation for transpose(a). The question was raised as to whether this really works and an empirical proof was offered that it does. Here is how it works.

In a**T the first thing tried is to ask a.__pow__ if it can manage with T as an argument. The array says, hmm, T is an instance of a class that I never heard of, and it doesn't have an __array__ attribute to call in order to make it an array. I pass. Then T's class' __rpow__ is given a turn, and asked if it can do the job with a as an argument, which it can. This is how 2**a works, or 2 - a, too. The int type disavows any knowledge of what to do, so the other operand gets a chance to save the day. So this is a "trick" we use every day.

From perry at stsci.edu Fri Mar 8 08:48:09 2002 From: perry at stsci.edu (Perry Greenfield) Date: Fri Mar 8 08:48:09 2002 Subject: [Numpy-discussion] RE: An historical precedent for matrix operation symbols In-Reply-To: <000001c1c6bd$97902750$1001a8c0@NICKLEBY> Message-ID:

Paul Dubois writes:
[...]

Paul puts this very well and I agree with virtually everything he says.

> One should not have any illusions: putting such operators on the array
> class is just expediency, a way to give the arrays a bit of a dual life.
> But a real matrix facility would have an abstract base class, be
> restricted to <= 2 dimensions, have realizations including symmetric,
> Hermitian, sparse, tridiagonal, yada yada yada.
>

A few comments on this point. I do think that this is correct. If a matrix is stored as an array intrinsically, then constructing an array representation (e.g., array(b) where b is a matrix) would be very efficient since a new array object consists of creating a new object that still uses the same underlying data buffer the matrix object does. There is no significant increase in memory. On the other hand, if a matrix class does use other representations, particularly such as sparse or tridiagonal, then there naturally would be some cost to creating an array object since a new copy of the data would be required. Having such various matrix representations would certainly be useful (particularly sparse) but will certainly require work to support (not something we (STScI) can sign up to do, but I hope someone else is willing).

Perry

From hinsen at cnrs-orleans.fr Fri Mar 8 09:56:06 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Mar 8 09:56:06 2002 Subject: [Numpy-discussion] big picture? One proposal In-Reply-To: <20020308104153.A145426@oakland.edu> (message from Jon Moody on Fri, 8 Mar 2002 10:41:53 -0500) References: <200203080804.g2884uY16995@chinon.cnrs-orleans.fr> <20020308104153.A145426@oakland.edu> Message-ID: <200203081754.g28HshQ19721@chinon.cnrs-orleans.fr>

> The Python core has long had at least 2 examples of operators which
> act as object constructors: 'j' which performs complex() and 'L' which
> performs long() (you can't get much more `pythonic' than a built-in
> type).
Those are suffixes for constants, not operators. If they were operators, you could apply them to variables - which you can't. More importantly, the L suffix wouldn't even work as an operator, as the preceding number might extend the range of integers before it has a chance of being converted to a long integer.

> I would venture to say that the numeric community is pretty high up
> there in importance if not size, given the early appearance of the
> complex number type and strong math capacity not to mention GvR's

The complex type was introduced for the benefit of NumPy (I remember it all too well, as I did the initial implementation), but after a long discussion on the Python list, with many expressing disapproval because of its special-need status. I'd say it shows the limits of what one can get accepted.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2 | Deutsch/Esperanto/English/
France | Nederlands/Francais
-------------------------------------------------------------------------------

From jjl at pobox.com Fri Mar 8 13:42:07 2002 From: jjl at pobox.com (John J. Lee) Date: Fri Mar 8 13:42:07 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: <200203070910.g279AlE14306@chinon.cnrs-orleans.fr> Message-ID:

On Thu, 7 Mar 2002, Konrad Hinsen wrote:

> "eric" writes:
>
> > Matrix.Matrix objects. This attribute approach will work, but I
> > wonder if trying the "adding an operator to Python" approach one
> > more time would be worth while. At Python10 developer's day, Guido
[...]
> If you want to go the "operator way", the goal should rather be
> something like APL, with composite operators. Matrix multiplication
[...]

How about general operator-function equivalence, as explained here by Alex Martelli? The change is large in one sense, but it is conceptually very simple:

http://groups.google.com/groups?q=operator+Martelli+Haskell+group:comp.lang.python&hl=en&selm=8t4dl301a4%40news2.newsguy.com&rnum=1

> 2 div 3
> or
> div(2,3)
> or
> 2 `div 3
> [Haskell-ishly syntax-sugar note: Haskell lets you
> use any 2-operand function as an infix operator by
> just enclosing its name in ``; in Py3K, I think a
> single leading ` would suffice -- far nicer than the
> silly current use of ` for the rare need of repr --
> and we might also, with pleasing symmetry, let any
> operator be used as a normal function a la
> `+(a,b)
> i.e., the ` marker could lexically switch functions
> to operators and operators to functions, without
> needing to 'import operator' and recall what the
> operator-name for a given operator IS...!-). The
> priority and associativity of these infinitely
> many "new operators" could be fixed ones...].

Since GvR seems to have given up the idea of 'Py3K' in favour of gradual changes, perhaps this is a real possibility? Travis'

r = a.M * b.M

would then be written as

M = Numeric.matrixmultiply
r = a `M b

(Konrad also complains about Perl's nasty syntax. This is frequently complained about, but do you really think the syntax is the problem -- surely it's Perl's horribly complicated semantics that is the real issue? The syntax is just inconvenient, in comparison at least. Sorry, a bit OT...)

John

From bsder at allcaps.org Fri Mar 8 15:36:02 2002 From: bsder at allcaps.org (Andrew P.
Lentvorski) Date: Fri Mar 8 15:36:02 2002 Subject: [Numpy-discussion] big picture? One proposal In-Reply-To: Message-ID: <20020308144903.L2621-100000@mail.allcaps.org>

On Fri, 8 Mar 2002, Pearu Peterson wrote:

> This is how I interpret the raised exception, as behind the scenes matrix
> and array are the same (in the sense of data representation).

Matrices can have many different storage formats: sparse, banded, dense, triangular, etc. Arrays and matrices are the same behind the scenes *for the moment*. For examples of matrix storage formats, check the BLAST Technical Forum documentation at netlib. Unfortunately, www.netlib.org appears to be down right now.

Locking the assumption that Matrix and Array are "the same behind the scenes" into the main Python specification is not a good idea.

-a

From tim.one at comcast.net Fri Mar 8 21:18:01 2002 From: tim.one at comcast.net (Tim Peters) Date: Fri Mar 8 21:18:01 2002 Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation? In-Reply-To: Message-ID:

I hope the bogus underflow problem is fixed in CVS Python now. Since my platform didn't have the problem (and still doesn't, of course), we won't know for sure until people try it and report back. Python used to link with -lieee, but somebody changed it for a reason history may have lost. See comments attached to Huaiyu Zhu's bug report for more on that:

If someone on a box that had the problem can build current CVS Python with and without -lieee (I'm told it defaults to "without" today, although that appears to be mixed up with whether or not __fpu_control is found outside of libieee (search for "ieee" in configure.in)), please try the following in a shell:

import math
x = 1e200
y = 1/x
math.pow(x, 2) # expect OverflowError
math.pow(y, 2) # expect 0.
pow(x, 2) # expect OverflowError
pow(y, 2) # expect 0.
x**2 # expect OverflowError
y**2 # expect 0.

If those all work as annotated on a box that had the problem, please report the success *as a comment to the bug report* (link above). If one or more still fail, likewise please give box details and paste the failures into a comment on the bug report. Thanks!

From hinsen at cnrs-orleans.fr Sat Mar 9 14:38:02 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Sat Mar 9 14:38:02 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. References: Message-ID: <200203092236.g29Matv21585@chinon.cnrs-orleans.fr>

"John J. Lee" writes:

> (Konrad also complains about Perl's nasty syntax. This is frequently
> complained about, but do you really think the syntax is the problem --
> surely it's Perl's horribly complicated semantics that is the real issue?
> The syntax is just inconvenient, in comparison at least. Sorry, a bit
> OT...)

It's both, of course. I don't really wish to decide which is worse, especially not because I'd have to read more Perl code to reach such a decision ;-)

But syntax is an issue for readability. There are some symbols that are generally used as operators in computer languages, and I think Python uses all of them already. Moreover, the general semantics are quite uniform as well: * stands for multiplication, for example, although the details of what multiplication means can vary. Symbols like @ are not operators everywhere, and where they are there is no uniform meaning attached to them, so they create confusion. As a test, take a Python program and replace all * by @. It does look weird.

Konrad.
From jmiller at stsci.edu Tue Mar 12 14:21:02 2002 From: jmiller at stsci.edu (Todd Miller) Date: Tue Mar 12 14:21:02 2002 Subject: [Numpy-discussion] ANN: numarray-0.3 Message-ID: <3C8E7F39.7090100@stsci.edu>

Numarray 0.3
------------

Numarray is a Numeric replacement which features c-code generated from python template scripts, the capacity to operate directly on arrays in files, and improved type promotion semantics.

Numarray-0.3 incorporates safety checks to prevent crashing Python when a user accidentally changes private variables in numarray. The new safety checks ensure that:

1. Numarray C-functions are called with properly sized buffers.
2. Numarray C-functions are called with properly aligned buffers.
3. Parameters match the C-function in count and i/o direction.
4. The correct generic function wrapper is used to call each C-function.
5. All indices implied by the array strides are valid.

Failed checks result in python exceptions.

A new memory object fixes an unfortunate limitation of the python buffer object, namely the lack of guaranteed double-aligned storage.

The largest generated source module, _ufuncmodule.c, has been partitioned by data type into several smaller, more gcc-friendly modules, e.g. _ufuncFloat64module.c.

The sort and argsort functions are fixed. The dot function is fixed for 1D arrays. Transpose, swapaxes, and reshape once again return views.

WHERE
-----------

Numarray-0.3 Windows executable installers and the source code tar ball are here:

http://sourceforge.net/project/showfiles.php?group_id=1369

Numarray is hosted by Source Forge in the same project which hosts Numeric:

http://sourceforge.net/projects/numpy/

The web page for Numarray information is at:

http://stsdas.stsci.edu/numarray/index.html

Trackers for Numarray Bugs, Feature Requests, Support, and Patches are at the Source Forge project for NumPy at:

http://sourceforge.net/tracker/?group_id=1369

REQUIREMENTS
--------------------------

numarray-0.3 requires Python 2.0 or greater.

AUTHORS, LICENSE
------------------------------

Numarray was written by Perry Greenfield, Rick White, Todd Miller, JC Hsu, Paul Barrett, and Phil Hodge at the Space Telescope Science Institute. Numarray is made available under a BSD-style License. See LICENSE.txt in the source distribution for details.

-- Todd Miller jmiller at stsci.edu STSCI / SSG (410) 338 4576

From paul at pfdubois.com Wed Mar 13 15:02:06 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Wed Mar 13 15:02:06 2002 Subject: [Numpy-discussion] Numerical Python 21.0 released Message-ID: <000001c1cae3$0b5aede0$0e01a8c0@NICKLEBY>

Version 21.0 March 13, 2002

Fixed bugs:

[ #482603 ] Memory leak in MA/Numeric/Python. Reported by Reggie Dugard. Turned out to be *two* memory leaks in one case in a routine in Numeric, array_objectype. (Dubois)

[ none ] If vals was a null array array([]), putmask and put would crash. Fixed with a check.

[ #469951 ] n = n1[0] gives array which shares dimension of n1 array.
This causes bugs if the shape of n1 is changed (n didn't use to have its own dimensions array). (Travis Oliphant)

[ #514588 ] MLab.cov(x,x) != MLab.cov(x) (Travis Oliphant)

[ #518702 ] segfault when invalid typecode for asarray (Travis Oliphant)

[ #497530 ] MA __getitem__ prevents 0 len arrays (Reggie Duggard)

[ #508363 ] outerproduct of noncontiguous arrays (Martin Wiechert)

[ #513010 ] memory leak in comparisons (Byran Nollett)

[ #512223 ] Character typecode not defined (Jochen Kupper)

[ #500784 ] MLab.py diff error (anonymous, fixed by Dubois)

[ #503741 ] accuracy of MLab.std(x) (Katsunori Waragai)

[ #507568 ] overlapping copy a[2:5] = a[3:6]. Changed uses of memcpy to memmove, which allows overlaps.

[ numpy-Patches-499722 ] size of buffer created from array is bad (Michel Sanner).

[ #502186 ] a BUG in RandomArray.normal (introduced by the last bug fix in 20.3) (Katsunori Waragai).

Fixed errors for Mac (Jack Jensen). Make rpm's properly, better Windows installers. (Gerard Vermeulen) Added file setup.cfg; setup calculates rpm_install.sh to use current Python. New setup.py, eliminate setup_all.py. Use os.path.join everywhere. Revision in b6 added file README.RPM, further improvements.

Implement true division operations for Python 2.2. (Bruce Sherwood) Note: true division of all integer types results in an array of floats, not doubles. This decision is arbitrary and there are arguments either way, so users of this new feature should be aware that the decision may change in the future.

New functions in Numeric; they work on any sequence a that can be converted to a Numeric array. Similar change to average in MA. (Dubois)

def rank (a):
    "Get the rank of a (the number of dimensions, not a matrix rank)"

def shape (a):
    "Get the shape of a"

def size (a, axis=None):
    "Get the number of elements in a, or along a certain axis."

def average (a, axis=0, weights=None, returned = 0):
    """average(a, axis=0, weights=None)
    Computes the average along the indicated axis. If axis is None,
    average over the entire array. Inputs can be integer or floating
    types; the result is type Float. If weights are given, the result is
    sum(a*weights)/sum(weights); weights must have a's shape or be 1-d
    with length the size of a in the given axis. Integer weights are
    converted to Float. Not supplying weights is equivalent to supplying
    weights that are all 1. If returned is true, return a tuple: the
    result and the sum of the weights or count of values. The shape of
    these two results will be the same. Raises ZeroDivisionError if
    appropriate when the result is scalar. (The version in MA does not --
    it returns masked values.)
    """

From amundt at pvv.org Thu Mar 14 05:36:10 2002 From: amundt at pvv.org (Amund Tveit) Date: Thu Mar 14 05:36:10 2002 Subject: [Numpy-discussion] Plans for RandomArray support in numarray? Message-ID: <03b401c1cb5d$27272b40$4132f181@AMUND>

Are there any plans for supporting RandomArray in numarray type arrays?

Amund http://www.idi.ntnu.no/~amundt/

From perry at stsci.edu Thu Mar 14 06:30:16 2002 From: perry at stsci.edu (Perry Greenfield) Date: Thu Mar 14 06:30:16 2002 Subject: [Numpy-discussion] Plans for RandomArray support in numarray? In-Reply-To: <03b401c1cb5d$27272b40$4132f181@AMUND> Message-ID:

> Are there any plans for supporting RandomArray in numarray type arrays?
>
> Amund
> http://www.idi.ntnu.no/~amundt/

Yes, we plan to support that and some version of the other existing libraries available (e.g. linear algebra, fft...).
Now that we are done (mostly anyway) with the changes needed to add safety checks to numarray, our short-term plans are:

1) Further changes to make numarray more backward compatible. (This is fairly minor work and probably involves less than a week's work.)

2) Documenting the C-API by:
   a) adding the appropriate chapters to the numarray manual.
   b) providing examples of various kinds of C functions, e.g.,
      i) how to add a ufunc.
      ii) how to add simple C functions...
      iii) how to add more sophisticated C functions...
      iv) (possibly) how to use SWIG with numarray

3) To help us do 2) better, add some existing Numeric libraries. In particular:
   a) FFT
   b) RandomArray
   c) linear algebra

Probably in that order. Items 2) or 3) may or may not require new releases of numarray. If no basic changes to numarray are required, we will probably try to release the work piecemeal (e.g., updates to the manual, examples, or libraries as add-ons). Hopefully, you will begin to see results here starting in 2 to 4 weeks.

Perry

From paul at pfdubois.com Thu Mar 14 11:43:15 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Thu Mar 14 11:43:15 2002 Subject: [Numpy-discussion] Numerical Python 21.0 released In-Reply-To: <02031419303100.10325@taco.polycnrs-gre.fr> Message-ID: <000001c1cb90$64e2d140$0e01a8c0@NICKLEBY>

Gerard Vermeulen asked me to remove the RPMs from SourceForge, and I have done so. Apparently what is made by default is not useful except for me.

I absolutely refuse to learn about RPMs; it is not useful for my job. If the RPM-cult wants this stuff to work then I need patches to setup.py that ALWAYS produce a suitable RPM, or a volunteer who will make and install such files for each release. I will run an automated test if supplied, but it can't interfere with my system or require superuser privs, because I don't have that.

I appreciate people trying to help and I'm sorry if it wasn't clear just how incompetent I intend to be on this.

From gvermeul at polycnrs-gre.fr Thu Mar 14 23:41:11 2002 From: gvermeul at polycnrs-gre.fr (Gerard Vermeulen) Date: Thu Mar 14 23:41:11 2002 Subject: [Numpy-discussion] Numerical Python 21.0 released In-Reply-To: <000001c1cb90$64e2d140$0e01a8c0@NICKLEBY> References: <000001c1cb90$64e2d140$0e01a8c0@NICKLEBY> Message-ID: <02031508403700.11584@taco.polycnrs-gre.fr>

Because all RPM based Linux distributions have subtle incompatibilities, it is impossible to write a setup.py script that produces RPMs that will work on all those distributions. Normally, RPMs built on a particular version of a particular distribution will work on other systems with exactly the same version of the same distribution (Paul's setup is not "normal", because his Python interpreter lives in his home directory). If anybody wants to provide RPMs, please code distribution+version in the name of the RPM. The RPM.README in the tar.gz explains how to do this.

Gerard

On Thursday 14 March 2002 20:42, Paul F Dubois wrote:
> Gerard Vermeulen asked me to remove the RPMs from SourceForge, and I
> have done so. Apparently what is made by default is not useful except
> for me.
>
> I absolutely refuse to learn about RPMs; it is not useful for my job. If
> the RPM-cult wants this stuff to work then I need patches to setup.py
> that ALWAYS produce a suitable RPM, or a volunteer who will make and
> install such files for each release. I
> will run an automated test if supplied but it can't interfere with my
> system or require superuser privs, because I don't have that.
>
> I appreciate people trying to help and I'm sorry if it wasn't clear just
> how incompetent I intend to be on this.
>

From hinsen at cnrs-orleans.fr Fri Mar 15 01:35:06 2002 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Mar 15 01:35:06 2002 Subject: [Numpy-discussion] Numerical Python 21.0 released In-Reply-To: <02031508403700.11584@taco.polycnrs-gre.fr> References: <000001c1cb90$64e2d140$0e01a8c0@NICKLEBY> <02031508403700.11584@taco.polycnrs-gre.fr> Message-ID:

Gerard Vermeulen writes:

> Because all RPM based Linux distributions have subtle incompatibilities, it
> is impossible to write a setup.py script that produces RPMs that will work on
> all those distributions.

Or at least not simple. I like the idea of providing RPMs for as many distributions as possible, and I volunteer to participate in the effort. In fact, I already make RPMs of all my Python-related packages for my own use (seven machines), for RedHat 7.x systems.

However, I have had difficulties with uploading to SourceForge for almost a full year now, and I don't expect it to get better soon. While building RPMs is a small job for me, uploading the result costs me an enormous effort each time, and I am not willing to waste time on that. I don't know how many people are in a similar situation, but perhaps we could get more RPM packaging volunteers by opening an RPM archive elsewhere.

Konrad.

From karshi.hasanov at utoronto.ca Fri Mar 15 20:21:02 2002 From: karshi.hasanov at utoronto.ca (Karshi) Date: Fri Mar 15 20:21:02 2002 Subject: [Numpy-discussion] fft_help Message-ID: <20020316042020Z234697-26166+4@bureau8.utcc.utoronto.ca>

Hi all,

How do I use the Matlab-like "fftshift" in NumPy?

Thanks

From huaiyu_zhu at yahoo.com Mon Mar 18 00:40:05 2002 From: huaiyu_zhu at yahoo.com (Huaiyu Zhu) Date: Mon Mar 18 00:40:05 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: <020601c1c595$24bd76c0$6b01a8c0@ericlaptop> Message-ID:

I'm a little late to this discussion, but it gives me a chance to read all the existing comments. I'd like to offer some background to one possible solution.

On Thu, 7 Mar 2002, eric wrote:

> > http://python.sourceforge.net/peps/pep-0225.html
>
> This one proposes decorating the current binary ops with some
> symbols to indicate that they have different behavior than
> the standard binary ops. This is similar to Matlab's use of
> * for matrix multiplication and .* for element-wise multiplication
> or to R's use of * for element-wise multiplication and %*% for
> "object-wise" multiplication.
>
> It proposes prepending ~ to operators to change their behavior so
> that ~* would become matrix multiply.
>
> The PEP is a little more general, but this gives the flavor.
>
> My hunch is that some form of the second (perhaps drastically reduced) would
> meet with more success. The suggested ~* or even the %*% operator are both
> palatable. Such details can be decided later. The question is whether there is
> sufficient interest to try and push the operator idea through? It would take
> much longer than choosing something we can do ourselves (like .M), but the
> operator solution seems more desirable to me.
>

I'm one of the coauthors of this PEP. I'm very glad to see additional interest in this proposal. It is not just a proposal - there was actually a patch made by Gregory Lielens for the ~op operators for Python 2.0. It redefines ~ so that it can be combined with + - * / ** to form new operators.
It is quite ingenious in that all the original bitwise operations on ~ alone are still valid. The new operators can be assigned any semantics with hooks like __tmul__ and __rtmul__. The idea is that a matrix class would define __mul__ so that * is matrix multiplication and define __tmul__ so that ~* is the elementwise operation. There is a test implementation on the MatPy homepage (matpy.sourceforge.net).

So what was holding it back? Well, last time around when this was discussed, it appears that most of the heavyweights in the Numeric community favored either keeping the status quo, or using the ~* symbol for arrays. We hoped to use the MatPy package as a test case to show that it is possible to have two entirely different kinds of objects, where the meanings of * and ~* are switched. However, for various reasons I was not able to act upon it for months, and Python evolved into 2.1 and 2.2. I never had much time to update the patch, and felt the attempt was futile as 1) Python was evolving quite fast, and 2) I had not heard much about this issue since then. I often feel guilty about the lapse.

Now it might be a good time to revive this proposal, as the idea of having matrices and arrays with independent semantics but possibly related implementations appears to be gaining some additional acceptance. Some ancillary issues that hindered the implementation at that time have also been solved. For example, using .I for inverse, .T for transpose, etc, was costly because of the need to override __getattr__ and __coerce__, making a matrix class less attractive in practice. These can now be implemented efficiently using the new set/get mechanism.

I'd like to hear any suggestions on how to proceed. My own favorite would be to have separate array and matrix classes with easy but explicit conversions between them. Without conversions, arrays and matrices would be completely independent semantically. In other words, I'm mostly in favor of Konrad Hinsen's position, with the addition of using ~ operators for elementwise operations for matrix-like classes. The PEP itself also discussed ideas of extending the meaning of ~ to other parts of Python for elementwise operations on aggregate types, but my impression of people's impressions is that it has a better chance without that part.

Huaiyu

From a.schmolck at gmx.net Mon Mar 18 06:55:20 2002 From: a.schmolck at gmx.net (A.Schmolck) Date: Mon Mar 18 06:55:20 2002 Subject: [Numpy-discussion] adding a .M attribute to the array. In-Reply-To: References: Message-ID:

[Sorry about the crossposting, but it also seemed relevant to both scipy and numpy...]

Huaiyu Zhu writes:
[...]
> I'd like to hear any suggestions on how to proceed. My own favorite would
> be to have separate array and matrix classes with easy but explicit
> conversions between them. Without conversions, arrays and matrices would
> be completely independent semantically. In other words, I'm mostly in
> favor of Konrad Hinsen's position, with the addition of using ~ operators
> for elementwise operations for matrix-like classes. The PEP itself also
> discussed ideas of extending the meaning of ~ to other parts of Python for
> elementwise operations on aggregate types, but my impression of people's
> impressions is that it has a better chance without that part.
Well, from my impression of the previous discussions, the situation (both for numpy and scipy) seems to boil down to me as follows: either `array` currently is too much of a matrix, or too little. Linear algebra functionality is currently exclusively provided by `array` and libraries that operate on and return `array`s, but the computational and notational efficiency leaves something to be desired (compared to e.g. Matlab) in some areas, importantly matrix multiplications, which are up to 40 times slower and really awkward to write (and, much more importantly, to decipher afterwards).

So I think what one should really do is discuss the advantages and disadvantages of the two possible ways out of this situation, namely providing:

1) a new (efficient) `matrix` class/type (and appropriate libraries that operate on it) [the Matrix class that comes with Numeric is more of a syntactic-sugar wrapper -- AFAIK it's not used as a return type or argument in any of the functions that only make sense for arrays that are matrices].

2) the additional functionality that is needed for linear algebra in `array` and the libraries that operate on it.

(see [1] below for what I feel is currently missing and could be done either in way 1) or 2))

I think it might be helpful to investigate these "macro"-issues before one gets bogged down in discussions about operators (I admit these are not entirely unrelated -- given that one of the reasons for the creation of a Matrix type would be that '*' is already taken in 'array's and there is no way to add a new operator without modifying the python core -- just for the record and ignoring my own advice, _iff_ there is a chance of getting '~*' into the language, I'd rather have '*' do the same for both matrices and arrays).

My impression is that the best path also very much depends on what the feature aspirations and divisions of labor of numpy/numarray and scipy are going to be. For example, scipy is really aimed at scientific users, who need performance and are willing to buy it with inconvenience (like the necessity to install other libraries on one's machine, most prominently atlas and blas). The `array` type and the functions in `Numeric`, on the other hand, potentially target a much wider community -- the efficient storage and indexing facilities (rich comparisons, strides, the take, choose etc. functions) make it highly useful for code that is not necessarily numeric (as an example I'm currently using it for feature selection algorithms, without doing any numerical computations on the arrays). So maybe (a subset of) numpy should make it into the python core (or an as yet `non-existent sumo-distribution`) [BTW, I also wonder whether the python-core array module could be superseded/merged with numpy's `array`? One potential show stopper seems to be that it is e.g. `pop`able].

In such a scenario, where numpy remains relatively general (and might even aim at incorporation into the core), it would be a no-no to bloat it with too much code aimed at improving efficiency (calling blas when possible, sparse storage etc.). On the other hand, people who want to do serious numerical work will need this -- and the scipy community already requires atlas etc. and targets a more specialized audience.
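To give a rough feel for the kind of gap at stake, a crude timing sketch (absolute figures will of course vary with machine and build; the point is only that plain Numeric `dot` does not call an optimized BLAS):

import time
import Numeric

n = 500
a = Numeric.ones((n, n), 'd')
start = time.clock()
Numeric.dot(a, a)   # plain Numeric matrix multiply
print 'dot on a %dx%d Float64 array: %.2f s' % (n, n, time.clock() - start)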
Under this consideration it might be an attractive solution to incorporate good matrix functionality (and possibly other improvements for hard-core number crunchers) in scipy only (or at least limit the efficient _implementation_ of matrices to scipy, providing only a pure python class or so in numpy). I'm not suggesting, BTW, to necessarily put all of [1] into a single class -- it seems sensible to have a couple of subclasses (for masked, sparse representations etc.) of `matrix` (maybe the parent class should even be a relatively naïve Numpy implementation, with the good stuff as subclasses in scipy...).

In any event, creating a new matrix class/type would also mean that matrix functionality in libraries should use and return this class (existing libraries should presumably largely still operate on arrays for backwards-compatibility (or both -- after a typecheck), and some matrix operations are so useful that it makes sense to provide array versions of them (e.g. dot) -- but on the whole it makes little sense to have a computationally and space-efficient matrix type if one has to cast it around all the time). A `matrix` class is more specialized than an `array`, and since the operations one will often do on it are consequently more limited, I think it should provide most of the important functionality as methods (rather than as external functions; see [2] for a list of suggestions).

Approach 2), on the other hand, would have the advantage that the current interface would stay pretty much the same, and as long as 2D arrays can just be regarded as matrices, there is no absolutely compelling reason not to stuff everything into array (at least the scipy version thereof).

Another important question to ask before deciding what to change, how, and if, is obviously how many people in the scipy/numpy community do lots of linear algebra (and how many defectors from matlab etc. one could hope to win if one spiced things up a bit for them...), but I would suppose there must be quite a few (but I'm certainly biased ;).

Unfortunately, I've really got to do some work again now, but before I return to number-crunching I'd like to say that I'd be happy to help with the implementation of a matrix class/type in python (I guess a .py-prototype would be helpful to start with, but ultimately a (subclassable) C(++)-type will be called for, at least in scipy).

--alex

Footnotes:
[1] The required improvements for serious linear algebra seem to be:
- optional use of (atlas) blas routines for real and complex matrix-matrix `dot`s if atlas is available on the build machine (see http://www.scipy.org/Members/aschmolck for a patch -- it produces speedups of more than a factor of 40 for big matrices; I'd be willing to provide an equivalent patch for the scipy distribution if there is interest)
- making sure that no unnecessary copies are created (e.g. when transposing a matrix to use it in `dot` -- AFAIK although the transpose itself only creates a new view, using it for dot results in a copy (but I might be wrong here))
- allowing more space-efficient storage forms for special cases (e.g. sparse matrices, upper triangular etc.). IO libraries that can save and load such representations are also needed (methods and static methods might be a good choice to keep things transparent to the user).
- providing a convenient and above all legible notation for common matrix operations (better than `dot(transpose(A),B)` etc. -- possibilities include A * B.T or A ~* B.T or A * B ** T (by overloading __rpow__ as suggested in a previous post))
- (in the case of a new `matrix` class): indexing functionality (e.g. `where`, `choose` etc. should be available without having to cast; e.g. for the common case that I want to set everything under a certain threshold to 0., I don't want to have to cast my sparse matrix to an array etc.)

[2] What should a matrix class contain?
- a dot operator (certainly eventually, but if there is a good chance to get ~* into python, maybe '*' should remain unimplemented till this can be decided)
- most or all of what scipy's linalg module does
- possibly IO (reading as a static method)
- indexing (the like of take, choose etc. (some should maybe be functions or static methods))

--
Alexander Schmolck     Postgraduate Research Student
                       Department of Computer Science
                       University of Exeter
A.Schmolck at gmx.net     http://www.dcs.ex.ac.uk/people/aschmolc/

From hinsen at cnrs-orleans.fr Mon Mar 18 08:00:08 2002
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Mon Mar 18 08:00:08 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To: References: Message-ID:

a.schmolck at gmx.net (A.Schmolck) writes:

> Linear algebra functionality is currently exclusively provided by `array` and
> libraries that operate on and return `array`s, but the computational and
> notational efficiency leaves something to be desired (compared to e.g. Matlab)
> in some areas, importantly matrix multiplications (which are up to 40 times
> slower) and really awkward to write (and much more importantly, decipher
> afterwards).

Computational and notational efficiency are rather well separated, fortunately. Both the current dot function and a hypothetical matrix multiply operator could be implemented in straightforward C code or using a high-performance library such as Atlas. In fact, this should even be an installation choice in my opinion, as installing Atlas isn't trivial on all machines (e.g. with some gcc versions), and I consider it important for fundamental libraries that they work everywhere easily, even if not optimally.

> My impression is that the best path also very much depends on the what the
> feature aspirations and divisions of labor of numpy/numarray and scipy are
> going to be. For example, scipy is really aimed at scientific users, who
> need performance, and are willing to buy it with inconvenience (like the

I see the main difference in distribution philosophy. NumPy is an add-on package to Python, which is in turn used by other add-on packages in a modular way. SciPy is rather a monolithic super-distribution for scientific users.

Personally I strongly favour the modular package approach, and in fact I haven't installed SciPy on my system for that reason, although I would be interested in some of its components.

> algorithms, without doing any numerical computations on the arrays). So maybe
> (a subset of) numpy should make it into the python core (or an as yet

This has been discussed already, and it might well happen one day, but not with the current NumPy implementation. Numarray looks like a much better candidate, but isn't ready yet.

> In such a scenario, where numpy remains relatively general (and
> might even aim at incorporation into the core), it would be a no-no
> to bloat it with too much code aimed at improving efficiency
> (calling blas when possible, sparse storage etc.).
> On the other hand

The same approach as for XML could be used: a slim-line version in the standard distribution that could be replaced by a high-efficiency extended version for those who care.

> attractive solution to incorporate good matrix functionality (and
> possibly other improvements for hard-core number crunchers) in scipy
> only (or at least limit the efficient _implementation_ of matrices
> to scipy, providing only a pure python class or so in numpy). I'm

I'd love to have efficient matrices without having to install the whole SciPy package!

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                             | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS)  | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                        | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                     | Deutsch/Esperanto/English/
France                                    | Nederlands/Francais
-------------------------------------------------------------------------------

From pearu at cens.ioc.ee Mon Mar 18 12:35:24 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Mon Mar 18 12:35:24 2002
Subject: [SciPy-dev] Re: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To: Message-ID:

Off-topic warning

On 18 Mar 2002, Konrad Hinsen wrote:

> I see the main difference in distribution philosophy. NumPy is an
> add-on package to Python, which is in turn used by other add-on
> packages in a modular way. SciPy is rather a monolithic
> super-distribution for scientific users.
>
> Personally I strongly favour the modular package approach, and in fact
> I haven't installed SciPy on my system for that reason, although I
> would be interested in some of its components.

Me too. In what I have contributed to SciPy, I have tried to follow this modularity approach. Modularity is also an important property from the development point of view: it minimizes possible interference with other unrelated modules and their bugs. What I am trying to say here is that SciPy can (and should?, +1 from me) provide its components separately, though currently only a few of its components seem to be available in that way without some changes.

Pearu

From a.schmolck at gmx.net Mon Mar 18 14:33:05 2002
From: a.schmolck at gmx.net (A.Schmolck)
Date: Mon Mar 18 14:33:05 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To: References: Message-ID:

Konrad Hinsen writes:

> Computational and notational efficiency are rather well separated,
> fortunately. Both the current dot function and a hypothetical matrix

Yes, the only thing they have in common is that both are currently unsatisfactory (for matrix operations) in numpy, at least for my needs. Although I've solved my most pressing performance problems by patching Numeric [1], I'm obviously interested in a more official solution (i.e. one that is maintained by others :)

[...] [order changed by me]

> a.schmolck at gmx.net (A.Schmolck) writes:
> > My impression is that the best path also very much depends on the what the
> > feature aspirations and divisions of labor of numpy/numarray and scipy are
       ^^^^^^^

Darn, I made a confusing mistake -- this should read _future_.

> > going to be. For example, scipy is really aimed at scientific users, who
> > need performance, and are willing to buy it with inconvenience (like the

> I see the main difference in distribution philosophy. NumPy is an
> add-on package to Python, which is in turn used by other add-on
> packages in a modular way. SciPy is rather a monolithic
> super-distribution for scientific users.
> Personally I strongly favour the modular package approach, and in fact
> I haven't installed SciPy on my system for that reason, although I
> would be interested in some of its components.

[...]

> The same approach as for XML could be used: a slim-line version in the
> standard distribution that could be replaced by a high-efficiency
> extended version for those who care.

[...]

I personally agree with all your above points -- if you have a look at our "dotblas"-patch mentioned earlier (see [1]), you will find that it aims to provide that -- have dot run anywhere without a hassle but run (much) faster if the user is willing to install atlas.

My main concern was that the argument should shift away a bit from syntactic and implementation details to what audiences and what needs numpy/numarray and scipy are supposed to address and, in this light, how to best strike the balance between convenience for users and maintainers, speed and bloat, generality and efficiency etc.

As an example, adding the dotblas patch [1] to Numeric is, I think, more convenient for the users (granting a few assumptions (like that it actually works :) for the sake of the argument) -- it gives users that have atlas better performance, and those who don't won't (or at least shouldn't) notice. It is, however, inconvenient for the maintainers. Whether one should bother including it in this or some other way depends, besides the obvious question of whether there is a better way to achieve what it does for both groups (like creating a dedicated Matrix class), also on what numpy is really supposed to achieve. I'm not entirely clear on that. For example I don't know how many numpy users deeply care about their matrix multiplications for big (1000x1000) matrices being 40 times faster.

The monolithic approach is not entirely without its charms (remember python's "batteries included" jingle?). Apart from convenience factors it also has the not inconsiderable advantage that people use _one_ standard module for a certain thing -- rather than 20 different solutions. This certainly helps to improve code quality. Not least because someone goes through the trouble of deciding what merits inclusion in the "Big Thing", possibly urging changes, but at least almost certainly taking more time for evaluation than an individual programmer who just wants to get a certain job done. It also makes life easier for module writers -- they can rely on certain stuff being around (and don't have to reinvent the wheel, another potential improvement to code quality). As such it makes life easier for maintainers, as does the scipy commandment that you have to install atlas/lapack, full stop (and if it doesn't run on your machine -- well, at least it works fast for some people, and that might well be better than working slow for everyone in this context).

So, I think what's good really depends on what you're aiming at; that's why I'd like to know what users and developers think about these matters. My points regarding scipy and numpy/numarray were just one attempt at interpreting what these respective libraries try to/should/could attempt to be or become. Now, not being a developer for either of them (I've only submitted a few minor patches to scipy), I'm not in a particularly good position to venture such interpretations, but I hoped that it would provoke other and more knowledgeable people to share their opinions and insights on this matter (as indeed you did).

> I'd love to have efficient matrices without having to install the
> whole SciPy package!
Welcome to the linear algebra lobby group ;) Yep, that would be nice, but my impression was that the scipy folks are currently more concerned about performance issues than the numpy/numarray folks, and I could live with either package providing what I want.

Ideally, I'd like to see a slim core numarray, without any frills (and more streamlined to behave like standard python containers (e.g. indexing and type/casts behavior)) for the python core, something more enabled and efficient for numerics (including matrices!) as a separate package (like the XML example you quote). And then maybe a bigger pre-bundled collection of (ideally rather modular) numerical libraries for really hard-core scientific users (maybe in the spirit of xemacs-packages and sumo-tar-balls -- no bloat if you don't need it, plenty of features in an instant if you do).

Anyway, is there at least general agreement that there should be some new and wonderful matrix class (plus supporting libraries) somewhere (rather than souping up array)?

alex

Footnotes:
[1] patch for faster dot product in Numeric
    http://www.scipy.org/Members/aschmolck

--
Alexander Schmolck     Postgraduate Research Student
                       Department of Computer Science
                       University of Exeter
A.Schmolck at gmx.net     http://www.dcs.ex.ac.uk/people/aschmolc/

From hinsen at cnrs-orleans.fr Tue Mar 19 03:09:02 2002
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Tue Mar 19 03:09:02 2002
Subject: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To: References: Message-ID:

a.schmolck at gmx.net (A.Schmolck) writes:

> > > feature aspirations and divisions of labor of numpy/numarray and scipy are
> ^^^^^^^
> Darn, I made a confusing mistake -- this should read _future_.

Or perhaps __future__ ;-)

> I personally agree with all your above points -- if you have a look at our
> "dotblas"-patch mentioned earlier (see [1]), you will find that it aims to

And I didn't even know about this...

> It is, however, inconvenient for the maintainers. Whether one should bother
> including it in this or some other way depends, besides the obvious question of

There could be two teams, one maintaining a standard portable implementation, and another one taking care of optimization add-ons. From the user's point of view, what matters most is a single entry point for finding everything that is available.

> The monolithic approach is not entirely without its charms (remember
> python's "batteries included" jingle?). Apart from convenience

Sure, but... That's the standard library. Everybody has it, in identical form, and its consistency and portability is taken care of by the Python development team. There can be only *one* standard library that works like this.

I see no problem either with providing a larger integrated distribution for specific user communities. But such distribution and packaging strategies should be distinct from development projects. If I can get a certain package only as part of a huge distribution that I can't or don't want to install, then that package is effectively lost for me. Worse, if one package comes with its personalized version of another package (SciPy with NumPy), then I end up having to worry about internal conflicts within my installation.

On the other hand, package interdependencies are a big problem in the Open Source community at large, and I have personally been bitten more than once by an incompatible change in NumPy that broke my modules. But I don't see any other solution than better communication between development teams.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                             | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS)  | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                        | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                     | Deutsch/Esperanto/English/
France                                    | Nederlands/Francais
-------------------------------------------------------------------------------
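(A minimal sketch of the `A * B ** T` notation suggested in the footnotes above -- all names hypothetical; the trick is a marker object whose __rpow__ returns the transpose, so no new operator is needed:)

    import Numeric

    class Matrix:
        def __init__(self, data):
            self.array = Numeric.array(data)
        def __mul__(self, other):          # '*' as matrix multiplication
            return Matrix(Numeric.dot(self.array, other.array))
        def transpose(self):
            return Matrix(Numeric.transpose(self.array))

    class _TransposeMarker:
        # Matrix defines no __pow__, so `M ** T` falls back to T.__rpow__(M)
        def __rpow__(self, base):
            return base.transpose()

    T = _TransposeMarker()

    A = Matrix([[1., 2.], [3., 4.]])
    B = Matrix([[5., 6.], [7., 8.]])
    C = A * B ** T    # dot(A, transpose(B)); '**' binds tighter than '*'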
From DavidA at ActiveState.com Tue Mar 19 14:05:03 2002
From: DavidA at ActiveState.com (David Ascher)
Date: Tue Mar 19 14:05:03 2002
Subject: [Numpy-discussion] ANN: numarray-0.3
References: <3C8E7F39.7090100@stsci.edu>
Message-ID: <3C97B5C8.5045F50C@activestate.com>

Can I suggest that the generated files not be included in the source tarball? I'm making sure that numarray and Numeric will be available in our PPM repository, and it's confusing the build script when the setup.py tries to overwrite files that are under source code control.

--david ascher

PS: PPMs for numarray 0.3 and Numeric 21 coming up soon. =)

From jmiller at stsci.edu Tue Mar 19 14:18:01 2002
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Mar 19 14:18:01 2002
Subject: [Numpy-discussion] ANN: numarray-0.3
References: <3C8E7F39.7090100@stsci.edu> <3C97B5C8.5045F50C@activestate.com>
Message-ID: <3C97B8CB.3090403@stsci.edu>

David Ascher wrote:

>Can I suggest that the generated files not be included in the source
>tarball? I'm making sure that numarray and Numeric will be available in
>our PPM repository, and it's confusing the build script when the
>setup.py tries to overwrite files that are under source code control.
>
>--david ascher
>
>PS: PPMs for numarray 0.3 and Numeric 21 coming up soon. =)

Yes. I'll tighten up the MANIFEST.in for the next release.

Todd

--
Todd Miller     jmiller at stsci.edu
STSCI / SSG     (410) 338 4576

From jochen at unc.edu Tue Mar 19 17:10:22 2002
From: jochen at unc.edu (Jochen Küpper)
Date: Tue Mar 19 17:10:22 2002
Subject: [Numpy-discussion] Re: Python 2.2 seriously crippled for numerical computation?
In-Reply-To: References: Message-ID:

The following message is a courtesy copy of an article that has been posted to comp.lang.python as well.

On Sat, 09 Mar 2002 00:17:40 -0500 Tim Peters wrote:

Tim> I hope the bogus underflow problem is fixed in CVS Python now.
Tim> Since my platform didn't have the problem (and still doesn't, of
Tim> course), we won't know for sure until people try it and report
Tim> back.

Ok, looking at SourceForge and google this seems to be fixed in cvs HEAD. Would it be possible to put the same patch into the cvs python-2.2 branch, please? [1]

Greetings,
Jochen

Footnotes:
[1] If it is in there, it doesn't work for me with current python cvs branch release22-maint. I still have to manually add -lieee. (RedHat-7.0 with current updates.)
--
University of North Carolina                 phone: +1-919-962-4403
Department of Chemistry                      phone: +1-919-962-1579
Venable Hall CB#3290 (Kenan C148)            fax:   +1-919-843-6041
Chapel Hill, NC 27599, USA                   GnuPG key: 44BCCD8E

From tim.one at comcast.net Tue Mar 19 17:21:09 2002
From: tim.one at comcast.net (Tim Peters)
Date: Tue Mar 19 17:21:09 2002
Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation?
In-Reply-To: Message-ID:

[jochen at bock.chem.unc.edu]
> Ok, looking at SourceForge and google this seems to be fixed in cvs
> HEAD. Would it be possible to put the same patch into the cvs
> python-2.2 branch, please? [1]
>
> Greetings,
> Jochen
>
> Footnotes:
> [1] If it is in there, it doesn't work for me with current python cvs
> branch release22-maint. I still have to manually add -lieee.
> (RedHat-7.0 with current updates.)

I don't know what "current" meant to you at the time you wrote this. Michael Hudson did backport the patch into 2.2.1c1, which was released yesterday. So please try 2.2.1c1, and if you still have a problem, file a bug report about it on SourceForge. 2.2.1 final is expected in about a week.

From jochen at unc.edu Tue Mar 19 17:54:02 2002
From: jochen at unc.edu (Jochen Küpper)
Date: Tue Mar 19 17:54:02 2002
Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation?
In-Reply-To: References: Message-ID:

On Tue, 19 Mar 2002 20:20:36 -0500 Tim Peters wrote:

Thanks for the quick answer. The problem is resolved.

Tim> [jochen at bock.chem.unc.edu]
>> [1] If it is in there, it doesn't work for me with current python cvs
>> branch release22-maint. I still have to manually add -lieee.
>> (RedHat-7.0 with current updates.)

Tim> I don't know what "current" meant to you at the time you wrote this.

About 20:00 (8:00pm) EST today, March 19.

Tim> Tim> Michael Hudson did backport the patch into 2.2.1c1, which
Tim> Tim> was released yesterday. So please try 2.2.1c1, and if you
Tim> Tim> still have a problem, file a bug report about it on
Tim> Tim> SourceForge. 2.2.1 final is expected in about a week.

Well, changing cvs from
    release22-maint
to
    r221c1
helps. That is, everything seems to work fine with the cvs sources tagged r221c1.

Then, is it really necessary to mess up the cvs tags so much? Why isn't it possible to have a single python-2.2 branch which one could follow to get all the stuff that's incorporated into that version? [1] There are huge differences between release22-maint and r221c1, it seems, from the number of patches applied when going from one to the other. But then some files are in the same (non-main) branch. ???

Thanks for all your work, and thank you for the quick help again.
Greetings,
Jochen

--
University of North Carolina                 phone: +1-919-962-4403
Department of Chemistry                      phone: +1-919-962-1579
Venable Hall CB#3290 (Kenan C148)            fax:   +1-919-843-6041
Chapel Hill, NC 27599, USA                   GnuPG key: 44BCCD8E

From tim.one at comcast.net Tue Mar 19 18:21:03 2002
From: tim.one at comcast.net (Tim Peters)
Date: Tue Mar 19 18:21:03 2002
Subject: [Numpy-discussion] RE: Python 2.2 seriously crippled for numerical computation?
In-Reply-To: Message-ID:

[jochen at bock.chem.unc.edu]
> Thanks for the quick answer. The problem is resolved.

Cool! Glad to hear it.

> Well, changing cvs from
> release22-maint
> to
> r221c1
> helps. That is, everything seems to work fine with the cvs sources
> tagged r221c1.

That shouldn't have made any difference -- r221c1 is merely a tag on the release22-maint branch. Now I can spend a lot of time trying to guess why your checkout is screwed up (probably stale sticky flags, if it is), or you can try blowing away your checkout and starting over. I know which one gets my vote. CVS branches and tags are a nightmare: when in any doubt, kill the beast and start over.

> Then, is it really necessary to mess up the cvs tags so much? Why
> isn't it possible to have a single python-2.2 branch which one could
> follow to get all the stuff that's incorporated into that version?

That's what the release22-maint branch is supposed to be (and, AFAIK, is).

> There are huge differences between release22-maint and r221c1,

What makes you think so? I just did

    cvs diff -r release22-maint -r r221c1

and it turned up expected differences in the handful of files that indeed *have* changed since r221c1 was tagged, mostly in the docs and under the Mac subdirectory:

Index: Doc/lib/libcopyreg.tex
Index: Doc/lib/libthreading.tex
Index: Lib/urllib.py
Index: Mac/_checkversion.py
Index: Mac/Build/PythonCore.mcp
Index: Mac/Distributions/(vise)/Python 2.2.vct
Index: Mac/Include/macbuildno.h
Index: Mac/Modules/macfsmodule.c
Index: Mac/Modules/macmodule.c
Index: Misc/NEWS
Index: PCbuild/BUILDno.txt

> ...
> Thanks for all your work, and thank you for the quick help again.

And thanks for checking that your problem is fixed in 221c1. Had anyone tried this stuff in 22a1 or 22a2 or 22a3 or 22a4 or 22b1 or 22b2 or 22c1 (yes, we actually cut 7 full prerelease distributions for 2.2!), it would have worked in 2.2 out of the box. Keep that in mind when 2.3a1 comes out.

From prabhu at aero.iitm.ernet.in Wed Mar 20 10:22:14 2002
From: prabhu at aero.iitm.ernet.in (Prabhu Ramachandran)
Date: Wed Mar 20 10:22:14 2002
Subject: [SciPy-dev] Re: [Numpy-discussion] adding a .M attribute to the array.
In-Reply-To: References: Message-ID: <15512.54075.312312.299325@monster.linux.in>

hi,

I'm sorry I haven't been following the discussion too closely and this post might be completely unrelated.

>>>>> "AS" == A Schmolck writes:

AS> Ideally, I'd like to see a slim core numarray, without any
AS> frills (and more streamlined to behave like standard python
AS> containers (e.g. indexing and type/casts behavior)) for the
AS> python core, something more enabled and efficient for numerics
AS> (including matrices!) as a separate package (like the XML
AS> example you quote).
AS> And then maybe a bigger pre-bundled
AS> collection of (ideally rather modular) numerical libraries for
AS> really hard-core scientific users (maybe in the spirit of
AS> xemacs-packages and sumo-tar-balls -- no bloat if you don't
AS> need it, plenty of features in an instant if you do).

AS> Anyway, is there at least general agreement that there should
AS> be some new and wonderful matrix class (plus supporting
AS> libraries) somewhere (rather than souping up array)?

Ideally, I'd like something that also has a reasonably easy-to-use interface from C/C++. The idea is that it should be easy (and natural) for someone to use the same library from C/C++ when performance is desired. This would be really nice and very useful.

prabhu

From Aureli.Soria_Frisch at ipk.fhg.de Wed Mar 20 10:48:04 2002
From: Aureli.Soria_Frisch at ipk.fhg.de (Aureli Soria Frisch)
Date: Wed Mar 20 10:48:04 2002
Subject: [Numpy-discussion] Different behaviour logical_and/and
In-Reply-To: References: Message-ID:

Hi all,

In the version of Numeric with MacPython2.2 the functions "Numeric.logical_and" and "and" behave differently, although according to the on-line documentation they should behave the same:

>>> Numeric.logical_and(a,b)
array([0, 1, 0, 0, 0, 0, 1, 0, 0, 0])
>>> a and b
array([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])

for arrays:

>>> a
array([0, 1, 0, 0, 1, 0, 1, 0, 0, 1])
>>> b
array([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])

or am I misunderstanding something...?

Regards,

Aureli

#################################
Aureli Soria Frisch
Fraunhofer IPK
Dept. Pattern Recognition

post: Pascalstr. 8-9, 10587 Berlin, Germany
e-mail: aureli at ipk.fhg.de
fon: +49 30 39006-150
fax: +49 30 3917517
web: http://vision.fhg.de/~aureli/web-aureli_en.html
#################################

From paul at pfdubois.com Wed Mar 20 13:14:10 2002
From: paul at pfdubois.com (Paul Dubois)
Date: Wed Mar 20 13:14:10 2002
Subject: [Numpy-discussion] [ANN] Pyfort 7.0 Extending Numpy with C
Message-ID: <000701c1d053$c501d860$09860cc0@CLENHAM>

A beta version of Pyfort 7.0 is available at pyfortran.sf.net. The documentation is not yet upgraded to this version.

Pyfort 7.0 adds the ability to wrap Fortran-like C coding to extend Numpy. Dramatically illustrating the virtue of open source software, credit for this improvement goes to:

Michiel de Hoon
Human Genome Center
University of Tokyo

For example, if you have this C code:

double mydot(int n, double* x, double *y) {
    int i;
    double d;
    d = 0.0;
    for (i=0; i < n; ++i) d += x[i] * y[i];
    return d;
}

Then you can create a Pyfort input file mymod.pyf:

function mydot (n, x, y)
    integer:: n=size(x)
    doubleprecision x(n), y(n)
    doubleprecision mydot
end

Compile mydot.c into a library libmydot.a. Then:

pyfort -c cc -i -L. -l mydot mymod.pyf

builds and installs the module mymod containing function mydot, which you can use from Python:

import Numeric, mymod
x=Numeric.array([1., 2., 3.])
y=Numeric.array([5., -1., -1.])
print mymod.mydot(x,y)

Note that by wrapping mydot in this way, Pyfort takes care of problems like converting input arrays of the wrong type, such as integer; making sure that x and y have the same length; and making sure x and y are contiguous.

I added directory testc that contains an example like this and one where an array is output.

Mr. de Hoon explained his patch as follows.

"I have modified fortran_compiler.py to add gcc as a compiler. This enables pyfort to be used for C code instead of Fortran code only. To use this option, call pyfort with the -cgcc option to specify gcc as the compiler.
In order to switch off the default TRANSPOSE and MIRROR options, some small modifications were needed in generator.py also. [Editor's note: both -c gcc and -c cc will work]

Before writing this addition to pyfort, I tried to use swig to generate the wrapper for my C code. However, pyfort was easier to use in my case because it is integrated with numpy. I wasn't able to get swig to use numpy arrays. In addition, I am writing extension code both in fortran and C, so it is easier having to use only one tool (pyfort) for both. In a sense, it is easier to extend python with C than with fortran because you don't have to worry about transposing the array. I tried to be minimally intrusive on the existing pyfort code to switch off transposing arrays; there may be prettier ways to do this than what I have done.

With this modification, I was able to pass one- and two-dimensional numpy arrays from Python to C and back without problems, as well as scalar variables with intent(in) and intent(out). I have also used the modified Pyfort on some Fortran routines to make sure I didn't break something in the fortran part of Pyfort. I haven't done an extensive test of this addition, but I haven't found any problems with it so far. I hope this patch will be useful to other people trying to extend Python/numpy with C routines."

Michiel de Hoon
Human Genome Center
University of Tokyo
mdehoon at ims.u-tokyo.ac.jp

From jochen at jochen-kuepper.de Wed Mar 20 19:06:03 2002
From: jochen at jochen-kuepper.de (Jochen Küpper)
Date: Wed Mar 20 19:06:03 2002
Subject: [Numpy-discussion] Different behaviour logical_and/and
In-Reply-To: References: Message-ID:

On Wed, 20 Mar 2002 19:43:03 +0100 Aureli Soria Frisch wrote:

>>>> a
Aureli> array([0, 1, 0, 0, 1, 0, 1, 0, 0, 1])
>>>> b
Aureli> array([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])

>>>> Numeric.logical_and(a,b)
Aureli> array([0, 1, 0, 0, 0, 0, 1, 0, 0, 0])

This looks correct...

>>>> a and b
Aureli> array([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])

... and this looks suspiciously like b.

Greetings,
Jochen

--
Einigkeit und Recht und Freiheit                http://www.Jochen-Kuepper.de
Liberté, Égalité, Fraternité                    GnuPG key: 44BCCD8E
Sex, drugs and rock-n-roll

From pearu at cens.ioc.ee Wed Mar 20 23:58:24 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Wed Mar 20 23:58:24 2002
Subject: [Numpy-discussion] Different behaviour logical_and/and
In-Reply-To: Message-ID:

On 20 Mar 2002, Jochen Küpper wrote:

> >>>> a
> Aureli> array([0, 1, 0, 0, 1, 0, 1, 0, 0, 1])
> >>>> b
> Aureli> array([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])
>
> >>>> Numeric.logical_and(a,b)
> Aureli> array([0, 1, 0, 0, 0, 0, 1, 0, 0, 0])
>
> This looks correct...
>
> >>>> a and b
> Aureli> array([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])
>
> ... and this looks suspiciously like b.

.. and also correct. It is the default behaviour of Python's `and' operation. From the Python Language Reference:

    The expression `x and y' first evaluates `x'; if `x' is false, its value
    is returned; otherwise, `y' is evaluated and the resulting value is
    returned.

So, the `and' operation is an "object"-operation (unless redefined to something else) while logical_and is an "elementwise"-operation.

Pearu
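(The rule is easy to see with plain Python lists, quite apart from Numeric:)

>>> a = [0, 1, 0, 0, 1]
>>> b = [1, 1, 0, 1, 0]
>>> a and b          # a is a non-empty list, hence true, so b is returned
[1, 1, 0, 1, 0]
>>> [] and b         # an empty list is false, so it is returned as-is
[]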
From huaiyu_zhu at yahoo.com Sat Mar 23 00:32:02 2002
From: huaiyu_zhu at yahoo.com (Huaiyu Zhu)
Date: Sat Mar 23 00:32:02 2002
Subject: [Numpy-discussion] Different behaviour logical_and/and
In-Reply-To: Message-ID:

On Thu, 21 Mar 2002, Pearu Peterson wrote:

> .. and also correct. It is the default behaviour of Python's `and'
> operation. From the Python Language Reference:
>
> The expression `x and y' first evaluates `x'; if `x' is false, its value
> is returned; otherwise, `y' is evaluated and the resulting value is
> returned.
>
> So, the `and' operation is an "object"-operation (unless redefined to
> something else) while logical_and is an "elementwise"-operation.

There is a section in PEP-225 for elementwise/objectwise operators to extend the meaning of ~ to an "elementizer", so that

[1, 0, 1, 0]  and [0, 1, 1, 0]  => [0, 1, 1, 0]
[1, 0, 1, 0] ~and [0, 1, 1, 0]  => [0, 0, 1, 0]

There are several other places, entirely unrelated to numerical computation, where elementization of an operator makes sense.

Huaiyu

From paul at pfdubois.com Sun Mar 24 08:49:05 2002
From: paul at pfdubois.com (Paul F Dubois)
Date: Sun Mar 24 08:49:05 2002
Subject: [Numpy-discussion] [ANN] Pyfort-7.0b2 -- extending Numeric with C routines
Message-ID: <000001c1d353$c816d390$0f01a8c0@NICKLEBY>

Pyfort 7.0b2 is now available at sf.net/projects/pyfortran. Nummies who do not use Fortran may be interested in using Pyfort; with Michiel de Hoon's "C" option, it is now extremely easy to wrap a simple kind of C code for processing Numeric's arrays, like this:

double ctry(int n, double* x, double* y)
{
    int i;
    double d;
    d = 0.0;
    for (i=0; i < n; ++i) {
        d += x[i] * y[i];
    }
    return d;
}

void cout(int n, double* x, double* y)
{
    int i;
    for (i = 0; i < n; ++i) {
        y[i] = 1.414159 * x[i];
    }
}

double c2(int n, int m, double x[n][m])
{
    double sum = 0.0;
    int i, j;
    for (i=0; i < n; i++) {
        for (j=0; j < m; j++) {
            sum += x[i][j];
        }
    }
    return sum;
}

and then call it from Python like this:

import testc, Numeric
x = Numeric.array([1.,2.,3.])
y = Numeric.array([6.,-1.,-1.])
z = Numeric.arange(6) * 1.0
z.shape = (3,2)
print "Should be 1.0:", testc.ctry(x,y)
print "Should be sqrt(2) * [1,2,3]:", testc.cout(x)
print "Should be 15.0:", testc.c2(z)
z.shape = (2,3)
print "Should be 15.0:", testc.c2(z)

----------- notes

Somehow 7.0b1 was missing the C examples. I apparently lost the changes I had made to the MANIFEST and did not realize my CVS commits had failed. Bad day at the office, I guess. I added testc back in, and added a 2-dimensional example. I also eliminated some warning errors in the generated code, and fixed an error in that case.

My thanks to Michiel de Hoon and J.S. Whitaker.

From pearu at cens.ioc.ee Sun Mar 24 09:59:04 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Sun Mar 24 09:59:04 2002
Subject: f2py comparison Re: [Numpy-discussion] [ANN] Pyfort 7.0 Extending Numpy with C
In-Reply-To: <000701c1d053$c501d860$09860cc0@CLENHAM>
Message-ID:

Hi,

Nummies might be interested in how to wrap C codes with f2py.
On Wed, 20 Mar 2002, Paul Dubois wrote:

> For example, if you have this C code:
>
> double mydot(int n, double* x, double *y) {
>     int i;
>     double d;
>     d = 0.0;
>     for (i=0; i < n; ++i) d += x[i] * y[i];
>     return d;
> }
>
> Then you can create a Pyfort input file mymod.pyf:
> function mydot (n, x, y)
>     integer:: n=size(x)
>     doubleprecision x(n), y(n)
>     doubleprecision mydot
> end

Different from pyfort, f2py needs the following signature file:

python module mymod
    interface
        function mydot (n, x, y)
            intent(c) mydot
            integer intent(c):: n=size(x)
            doubleprecision x(n), y(n)
            doubleprecision mydot
        end
    end interface
end python module

> Compile mydot.c into a library libmydot.a.
> Then:
>
> pyfort -c cc -i -L. -l mydot mymod.pyf
>
> builds and installs the module mymod containing function mydot,

With f2py the above is equivalent to

f2py -c mydot.c mymod.pyf

This compiles mydot.c and builds the module mymod into the current directory.

> which you
> can use from Python:
>
> import Numeric, mymod
> x=Numeric.array([1., 2., 3.])
> y=Numeric.array([5., -1., -1.])
> print mymod.mydot(x,y)

Python session with f2py-generated mymod:

>>> import mymod
>>> print mymod.mydot([1,2,3],[1,2,4.])
17.0

Regards,
Pearu

From oliphant.travis at ieee.org Sun Mar 24 16:49:03 2002
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sun Mar 24 16:49:03 2002
Subject: [Numpy-discussion] Re: [SciPy-dev] accuracy problem in filterdesign.py
In-Reply-To: <000b01c1d37c$4a64a5c0$e5db9e3e@arrow>
References: <000b01c1d37c$4a64a5c0$e5db9e3e@arrow>
Message-ID:

On Sunday 24 March 2002 02:38 pm, you wrote:
> Hello,
>
> while writing a test driver for a minimum phase calculation routine I came
> across the following problem. It is causing asymmetries in the output of
>
> >>> N=512
> >>> lastpoint=2*pi
> >>> w1=arange(0,lastpoint,lastpoint/N)
> >>> w2=arange(0,N)*(lastpoint/N)
> >>> lastpoint-w1[511]-w1[1]
> -6.3546390371982397e-014
> >>> lastpoint-w2[511]-w2[1]
> 4.0245584642661925e-016
> >>> w1[511]
> 6.2709134608765646
> >>> w2[511]
> 6.2709134608765007
> >>> w2[511]-w1[511]
> -6.3948846218409017e-014

I just fixed this in Numeric. The arange in Numeric used to increment the value by the step amount. It now computes the value using value = start + i*step, which fixes the problem. Thanks for pointing this out.

From magnus at hetland.org Wed Mar 27 11:23:19 2002
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Wed Mar 27 11:23:19 2002
Subject: [Numpy-discussion] Lazy video array?
Message-ID: <20020327202138.A4132@idi.ntnu.no>

Just something I've been thinking about for a few years (and never gotten around to doing anything about)...

How realistic would it be to wrap a video file as a type of three-dimensional (assuming grayscale) array object and then use e.g. numarray to manipulate it? And how easy would it be to make this sort of thing "lazy", so that only the parts needed for the parts you actually access (for display or whatever) are processed?

E.g. (silly example):

>>> a = videoarray('somefile.mpg')
>>> b = sin(a) # No real computation here
>>> for frame in b:
...     displayFrame(frame) # Computation performed here...

Or something... Or maybe I'm just bonkers ;)

By the way: Is there any documentation of the numarray C API anywhere yet?

--
Magnus Lie Hetland                                  The Anygui Project
http://hetland.org                                  http://anygui.org
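(A rough sketch of the deferred-evaluation idea from Magnus's post -- class and function names are hypothetical, and a synthesized frame stands in for real video decoding:)

    import Numeric

    class LazyFrames:
        def __init__(self, source, func=None):
            self.source = source          # callable: frame index -> 2-D array
            self.func = func              # deferred elementwise operation
        def map(self, func):
            return LazyFrames(self.source, func)   # no computation yet
        def __getitem__(self, i):
            frame = self.source(i)        # work happens only on access
            if self.func is not None:
                frame = self.func(frame)
            return frame

    def fake_decoder(i):                  # stands in for reading frame i of a file
        return float(i) * Numeric.ones((4, 4), Numeric.Float)

    a = LazyFrames(fake_decoder)
    b = a.map(Numeric.sin)    # no real computation here
    print b[2]                # sin applied only to the single frame accessed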
From jochen at unc.edu Wed Mar 27 13:09:00 2002
From: jochen at unc.edu (Jochen Küpper)
Date: Wed Mar 27 13:09:00 2002
Subject: [Numpy-discussion] Lazy video array?
In-Reply-To: <20020327202138.A4132@idi.ntnu.no>
References: <20020327202138.A4132@idi.ntnu.no>
Message-ID:

On Wed, 27 Mar 2002 20:21:38 +0100 Magnus Lie Hetland wrote:

Magnus> By the way: Is there any documentation of the numarray C API
Magnus> anywhere yet?

I think
,----
| http://stsdas.stsci.edu/numarray/DesignOverview.html
`----
is all there is for now. And it's probably not exactly right any more...

Greetings,
Jochen

--
University of North Carolina                 phone: +1-919-962-4403
Department of Chemistry                      phone: +1-919-962-1579
Venable Hall CB#3290 (Kenan C148)            fax:   +1-919-843-6041
Chapel Hill, NC 27599, USA                   GnuPG key: 44BCCD8E

From amitha_linux at yahoo.com Thu Mar 28 06:59:10 2002
From: amitha_linux at yahoo.com (Amitha P)
Date: Thu Mar 28 06:59:10 2002
Subject: [Numpy-discussion] Returning a 2-D PyArrayObject
Message-ID: <20020328145811.66461.qmail@web13403.mail.yahoo.com>

Hi all,

I am new to Python/Python extension programming. I am writing a C extension to Python which takes a 3-dimensional array object and returns a 2-dimensional array object. The C function returns the 2-D array and converts it into a Python object, but the 2-D Python array object does not contain the correct values; it's printing out some irrelevant values. I could successfully return a one-dimensional Python array object but I am not able to return a 2-D Python array object. I am attaching the code. Please look at it and point out the errors. Thank you very much..

-------------------------------------------------------
CMATRIXMUL16.c - This takes the array (the python 3D array) and returns the 2D array
------------------------------------------------------
#include #include

float* cmatrixmul16(float *array,float *paddarr, int n,int r,int col,float **store)
{
int i,j,k;
int devices;
int pathpoints;
int column;
int len;
float sum;
float **a,**b,**c,**d;
float *temp;
static float **tracematrix;
tracematrix = store;
pathpoints=r;
column = col;
devices= n;
len = pathpoints*2;
temp = (float *)malloc( len* sizeof(float));
sum =0.0;
a = (float **)malloc(devices * sizeof(float));
b = (float **)malloc(devices * sizeof(float));
c = (float **)malloc(devices * sizeof(float));
d = (float **)malloc(devices * sizeof(float));
for(i=0;i #include #include

static PyObject * Py_arraytest1 (PyObject *, PyObject *);
static char _tests17_module_doc[] ="tests11: module documentation";
static char arraytest1__doc__[]= "mytest:function documentation";

/********************************Python symbol table *****************************************/
static PyMethodDef _tests17_methods[] = {
    {"arraytest1" , (PyCFunction)Py_arraytest1 , METH_VARARGS,arraytest1__doc__ },
    {NULL, (PyCFunction)NULL,0,NULL } /* terminates the list of methods */
};
/*********************************End of Symbol table ***************************************/

void init_tests17()
{
/* We will be using C-functions defined in the array module. So we
 * need to be sure that the shared library defining these functions
 * is loaded.
 */
import_array();
(void) Py_InitModule4( "_tests17", /* module name */
    _tests17_methods, /* structure containing python symbol info */
    _tests17_module_doc, /* module documentation string */
    (PyObject *) NULL, PYTHON_API_VERSION);
}

/*function to calculate the product of two arrays */
static PyObject * Py_arraytest1(PyObject *self, PyObject *args)
{
PyArrayObject *array, *paddarr, *product;
char *c;
int n,r,col,i,j,k;
int dimensions[2];
float **store;
if (!PyArg_ParseTuple(args, "O!O!", &PyArray_Type, &array,&PyArray_Type,&paddarr))
    return NULL;
/* The arguments are the 3-D array and the 1-Dpaddarr */
n = (long) array->dimensions[0];
r = (long) array->dimensions[1];
col = (long) array->dimensions[2];
store = (float **)malloc(n*sizeof(float));
for(i=0;idata,(float *)paddarr->data,n,r,col,store);
dimensions[0] = n;
dimensions[1] = 2*r;
product = (PyArrayObject *)PyArray_FromDimsAndData(2, dimensions, PyArray_FLOAT,c);
PyArray_Return(product);
}

From hinsen at cnrs-orleans.fr Thu Mar 28 08:28:09 2002
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Thu Mar 28 08:28:09 2002
Subject: [Numpy-discussion] Returning a 2-D PyArrayObject
In-Reply-To: <20020328145811.66461.qmail@web13403.mail.yahoo.com>
References: <20020328145811.66461.qmail@web13403.mail.yahoo.com>
Message-ID:

Amitha P writes:

> irrelevant values. I could successfully return a one-dimensional
> Python array object but I am not able to return a 2-D Python array
> object. I am attaching the code. Please look at it and point out the errors.

> product = (PyArrayObject *)PyArray_FromDimsAndData(2,
> dimensions,
> PyArray_FLOAT,c);

The variable c in this call must point to a storage area which holds the elements of the matrix, i.e. a one-dimensional float array. What you pass in your code is a list of pointers to the rows of the matrix.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                             | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS)  | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                        | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                     | Deutsch/Esperanto/English/
France                                    | Nederlands/Francais
-------------------------------------------------------------------------------

From johanfo at ohman.no Fri Mar 29 12:03:06 2002
From: johanfo at ohman.no (Johan Fredrik Øhman)
Date: Fri Mar 29 12:03:06 2002
Subject: [Numpy-discussion] Right behavior
Message-ID: <000d01c1d75c$9bfe1e50$c167f081@matrisen>

This code generates very non-random numbers, even when the seed value is reinitialized. Take a look at the first number in each run!

Is this right?

--
JFØ
#!/usr/local/bin/python2.1

# Python Virtual clock
import RandomArray

print "Seed", RandomArray.get_seed()
for i in range(1000000,10000000,1000000):
    print "Clock at time:" , i/1000000, ":", RandomArray.normal(10,2)

[root at blekkulf /root]# ./t2.py
Seed (101743, 1951)
Clock at time: 1 : 7.98493051529
Clock at time: 2 : 10.8439420462
Clock at time: 3 : 7.59234881401
Clock at time: 4 : 7.32021093369
Clock at time: 5 : 10.9444898367
Clock at time: 6 : 10.1128772199
Clock at time: 7 : 13.1178274155
Clock at time: 8 : 11.779414773
Clock at time: 9 : 10.7529922128
[root at blekkulf /root]# ./t2.py
Seed (101743, 1953)
Clock at time: 1 : 7.98525762558
Clock at time: 2 : 9.38142818213
Clock at time: 3 : 7.11979293823
Clock at time: 4 : 10.867649436
Clock at time: 5 : 9.62882992625
Clock at time: 6 : 12.1940765381
Clock at time: 7 : 6.84895467758
Clock at time: 8 : 8.13472533226
Clock at time: 9 : 8.15638375282
[root at blekkulf /root]# ./t2.py
Seed (101743, 1959)
Clock at time: 1 : 7.98623776436
Clock at time: 2 : 14.5040078163
Clock at time: 3 : 11.3408681154
Clock at time: 4 : 6.32757425308
Clock at time: 5 : 8.94617521763
Clock at time: 6 : 12.1802353859
Clock at time: 7 : 12.0685124397
Clock at time: 8 : 10.5330892205
Clock at time: 9 : 10.9744755626

From tchur at optushome.com.au Fri Mar 29 13:01:15 2002
From: tchur at optushome.com.au (Tim Churches)
Date: Fri Mar 29 13:01:15 2002
Subject: [Numpy-discussion] Right behavior
References: <000d01c1d75c$9bfe1e50$c167f081@matrisen>
Message-ID: <3CA4D3BB.617F2883@optushome.com.au>

Johan Fredrik Øhman wrote:
>
> This code generates very non-random numbers, even when the seed value
> is reinitialized.
> Take a look at the first number in each run!
>

The first numbers in each of your three runs are 7.98493051529, 7.98525762558 and 7.98623776436. They look like different numbers to me. If you want the difference between initial values to be greater, you need to make the difference in your seeds greater. For example, if I run your code now, I get 8.29225027561, 8.29484963417 and 8.29744851589, but setting the seed to (1,2) gives an initial value of 5.69397783279. Remember, these are only pseudorandom numbers.

Tim C

> Is this right ?
>
> --
> JFØ
> #!/usr/local/bin/python2.1
>
> # Python Virtual clock
> import RandomArray
>
> print "Seed", RandomArray.get_seed()
> for i in range(1000000,10000000,1000000):
>     print "Clock at time:" , i/1000000, ":", RandomArray.normal(10,2)
>
> [root at blekkulf /root]# ./t2.py
> Seed (101743, 1951)
> Clock at time: 1 : 7.98493051529
> Clock at time: 2 : 10.8439420462
> Clock at time: 3 : 7.59234881401
> Clock at time: 4 : 7.32021093369
> Clock at time: 5 : 10.9444898367
> Clock at time: 6 : 10.1128772199
> Clock at time: 7 : 13.1178274155
> Clock at time: 8 : 11.779414773
> Clock at time: 9 : 10.7529922128
> [root at blekkulf /root]# ./t2.py
> Seed (101743, 1953)
> Clock at time: 1 : 7.98525762558
> Clock at time: 2 : 9.38142818213
> Clock at time: 3 : 7.11979293823
> Clock at time: 4 : 10.867649436
> Clock at time: 5 : 9.62882992625
> Clock at time: 6 : 12.1940765381
> Clock at time: 7 : 6.84895467758
> Clock at time: 8 : 8.13472533226
> Clock at time: 9 : 8.15638375282
> [root at blekkulf /root]# ./t2.py
> Seed (101743, 1959)
> Clock at time: 1 : 7.98623776436
> Clock at time: 2 : 14.5040078163
> Clock at time: 3 : 11.3408681154
> Clock at time: 4 : 6.32757425308
> Clock at time: 5 : 8.94617521763
> Clock at time: 6 : 12.1802353859
> Clock at time: 7 : 12.0685124397
> Clock at time: 8 : 10.5330892205
> Clock at time: 9 : 10.9744755626

From johanfo at ohman.no Fri Mar 29 13:28:11 2002
From: johanfo at ohman.no (Johan Fredrik Øhman)
Date: Fri Mar 29 13:28:11 2002
Subject: [Numpy-discussion] Right behavior
References: <000d01c1d75c$9bfe1e50$c167f081@matrisen> <3CA4D3BB.617F2883@optushome.com.au>
Message-ID: <002e01c1d768$86ef39c0$c167f081@matrisen>

> The first numbers in each of your three runs are 7.98493051529,
> 7.98525762558 and 7.98623776436. They look like different numbers to me.

First, thanks for your answer Tim. I do agree, they are different. But I wouldn't call it random. I didn't expect that the small difference in the initial seed would affect the first number so little.
Usually the seed numbers I have experienced in other places have a much more dramatic effect on the numbers, if you see what I mean...

> If you want the difference between initial values to be greater, you need
> to make the difference in your seeds greater. For example, if I run your
> code now, I get 8.29225027561, 8.29484963417 and 8.29744851589, but
> setting the seed to (1,2) gives an initial value of 5.69397783279.
> Remember, these are only pseudorandom numbers.

Yes, they are pseudorandom and that is OK. What I just want is some more initial difference between the runs without setting the seed number manually. But now I know this is not a flaw in the RNG, but "it's the way it is supposed to be".

Thanks

--
Johan Fredrik Øhman

From tchur at optushome.com.au Fri Mar 29 14:51:03 2002
From: tchur at optushome.com.au (Tim Churches)
Date: Fri Mar 29 14:51:03 2002
Subject: [Numpy-discussion] Right behavior
References: <000d01c1d75c$9bfe1e50$c167f081@matrisen> <3CA4D3BB.617F2883@optushome.com.au> <002e01c1d768$86ef39c0$c167f081@matrisen>
Message-ID: <3CA4ED89.14AE55B0@optushome.com.au>

Johan Fredrik Øhman wrote:
>
> First, thanks for your answer Tim.
> I do agree, they are different. But I wouldn't call it random. I didn't expect
> that the small difference in the initial seed would affect the first number
> so little. Usually the seed numbers I have experienced in other places have
> a much more dramatic effect on the numbers, if you see what I mean...

OK, you need to use Konrad Hinsen's excellent RNG module which comes with Numeric Python:

#################################
# Python Virtual clock
import RNG

dist = RNG.NormalDistribution(10, 2)
rng = RNG.CreateGenerator(0, dist)
for i in range(1000000,10000000,1000000):
    print "Clock at time:" , i/1000000, ":", rng.ranf()
##################################

The above code gives 8.46183655136, 7.29889782477 and 5.58243682462 as the first values in three successive runs on my system.

Hope this helps,

Tim C

> Yes, they are pseudorandom and that is OK. What I just want is some more
> initial difference between the runs without setting the seed number manually.
> But now I know this is not a flaw in the RNG, but "it's the way it is
> supposed to be".
>
> Thanks
>
> --
> Johan Fredrik Øhman

From cnetzer at mail.arc.nasa.gov Fri Mar 29 18:01:02 2002
From: cnetzer at mail.arc.nasa.gov (Chad Netzer)
Date: Fri Mar 29 18:01:02 2002
Subject: [Numpy-discussion] RandomArray question
Message-ID: <200203300200.SAA13173@mail.arc.nasa.gov>

> writes:
>Take a look at the first number in each run !

That is because the random starting seed is (probably, I haven't looked at the code) set from the clock, and doesn't change all that much from run to run. You'll see similar results when you substitute:

print "Clock at time:" , i, ":", RandomArray.random_integers(10)

or

print "Clock at time:" , i, ":", RandomArray.uniform(1, 10)

into your code. The part before the decimal point is always the same on the first call of each run (assuming you run them at roughly the same time).

Note that the 'seed' is really the internal state of the RNG and changes at each call. You could call the random function a few dozen times before using results, or hash the first result and use that as a new seed, etc. But basically, the generator will produce similar initial results (i.e. one call) for similar seeds, which is what the time value is causing.

I'd propose that the implementation, when setting the seed from the time, generate at least one dummy RNG generation before returning results.

--
Chad Netzer
chad.netzer at stanfordalumni.org

From edcjones at erols.com Sat Mar 30 07:55:18 2002
From: edcjones at erols.com (Edward C. Jones)
Date: Sat Mar 30 07:55:18 2002
Subject: [Numpy-discussion] IM = Numeric + PIL + OpenCV
Message-ID: <3CA5E432.4040503@erols.com>

IM (pronounced with a long I) is a Python module that makes it easy to use Numeric and PIL together in programs. Typical functions in IM are:

Open: Opens an image file using PIL and converts it to Numeric, PIL, or OpenCV formats.
Save: Converts an array to PIL and saves it to a file.
Array_ToArrayCast: Converts images between formats and between pixel types.

In addition to Numeric and PIL, IM works with the Intel OpenCV computer vision system (http://www.intel.com/research/mrl/research/opencv/).
OpenCV is available for Linux at the OpenCV Yahoo Group (http://groups.yahoo.com/group/OpenCV/).

IM currently runs under Linux only. It should not be too difficult to port the basic IM system to Windows or Mac. The OpenCV wrapper is large and complex and uses SWIG. It will be harder to port. The IM system appears to be pretty stable. On the other hand, the OpenCV wrapper is probably very buggy.

To download the software go to http://members.tripod.com/~edcjones/pycode.html and download "PyCV.032502.tgz".

Edward C. Jones
edcjones at hotmail.com

From rob at pythonemproject.com Sat Mar 30 08:01:15 2002
From: rob at pythonemproject.com (rob)
Date: Sat Mar 30 08:01:15 2002
Subject: [Numpy-discussion] Where is SciPy?
Message-ID: <3CA5E06E.BCE32F8D@pythonemproject.com>

Their site is dead here. If anyone has the latest copy of Weave tar'd up, could they send it to me? I finally have a bug report to make. For some reason, Weave still can't find libstdc++ on FreeBSD.

Rob.

--
-----------------------------
The Numeric Python EM Project

www.pythonemproject.com

From rob at pythonemproject.com Sat Mar 30 08:30:07 2002
From: rob at pythonemproject.com (rob)
Date: Sat Mar 30 08:30:07 2002
Subject: [Numpy-discussion] need help with simple conjugate gradient Laplace solver
Message-ID: <3CA5E712.5516AD93@pythonemproject.com>
From rob at pythonemproject.com Sat Mar 30 09:00:03 2002
From: rob at pythonemproject.com (rob)
Date: Sat Mar 30 09:00:03 2002
Subject: [Numpy-discussion] need help with simple conjugate gradient Laplace solver
References: <3CA5E712.5516AD93@pythonemproject.com>
Message-ID: <3CA5EE17.3C5CED2D@pythonemproject.com>

After I post, I always see the dumb error: I wasn't including the 6x
term in my finite difference equation. It now converges, but I get a
weird-looking V map.

Rob.

Here is the fixed code:

from math import *
from Numeric import *

#*** ENTER DATA
filename = "out"
bobfile = open(filename + ".bob", "w")
print "\n" * 30

NX = 30        # number of cells
NY = 30
NZ = 30
N = 30         # size of box
resmax = 1e-3  # conj-grad tolerance

# allocate arrays
##Ex = zeros((NX+2, NY+2, NZ+2), Float)
##Ey = zeros((NX+2, NY+2, NZ+2), Float)
##Ez = zeros((NX+2, NY+2, NZ+2), Float)
p = zeros((NX+1, NY+1, NZ+1), Float)
q = zeros((NX+1, NY+1, NZ+1), Float)
r = zeros((NX+1, NY+1, NZ+1), Float)
u = zeros((NX+1, NY+1, NZ+1), Float)

u[0:N, 0:N, 0] = 0   # box at 1V with one side at 0V
u[0:N, 0, 0:N] = 1
u[0, 0:N, 0:N] = 1
u[0:N, 0:N, N] = 1
u[0:N, N, 0:N] = 1
u[N, 0:N, 0:N] = 1

# initialize r matrix (now with the -6u diagonal term)
r[1:NX-1, 1:NY-1, 1:NZ-1] = (u[1:NX-1, 0:NY-2, 1:NZ-1] +
                             u[1:NX-1, 2:NY, 1:NZ-1] +
                             u[0:NX-2, 1:NY-1, 1:NZ-1] +
                             u[2:NX, 1:NY-1, 1:NZ-1] +
                             u[1:NX-1, 1:NY-1, 0:NZ-2] +
                             u[1:NX-1, 1:NY-1, 2:NZ] -
                             6 * u[1:NX-1, 1:NY-1, 1:NZ-1])
p[...] = r[...]   # initialize p matrix

#**** START ITERATIONS
N = (NX-2)*(NY-2)*(NZ-2)   # left over from Jacobi solution, not used
KK = 0                     # iteration counter
res = 0.0                  # set residual = 0

while 1:
    # finite difference eq
    q[1:NX-1, 1:NY-1, 1:NZ-1] = (6 * p[1:NX-1, 1:NY-1, 1:NZ-1] -
                                 p[1:NX-1, 0:NY-2, 1:NZ-1] -
                                 p[1:NX-1, 2:NY, 1:NZ-1] -
                                 p[0:NX-2, 1:NY-1, 1:NZ-1] -
                                 p[2:NX, 1:NY-1, 1:NZ-1] -
                                 p[1:NX-1, 1:NY-1, 0:NZ-2] -
                                 p[1:NX-1, 1:NY-1, 2:NZ])

    # Calculate r dot p and p dot q
    rdotp = add.reduce(ravel(r[1:NX-1, 1:NY-1, 1:NZ-1] *
                             p[1:NX-1, 1:NY-1, 1:NZ-1]))
    pdotq = add.reduce(ravel(p[1:NX-1, 1:NY-1, 1:NZ-1] *
                             q[1:NX-1, 1:NY-1, 1:NZ-1]))

    # Set alpha value
    alpha = rdotp / pdotq

    # Update solution and residual
    u[1:NX-1, 1:NY-1, 1:NZ-1] += alpha * p[1:NX-1, 1:NY-1, 1:NZ-1]
    r[1:NX-1, 1:NY-1, 1:NZ-1] += -alpha * q[1:NX-1, 1:NY-1, 1:NZ-1]

    # calculate beta
    rdotq = add.reduce(ravel(r[1:NX-1, 1:NY-1, 1:NZ-1] *
                             q[1:NX-1, 1:NY-1, 1:NZ-1]))
    beta = rdotq / pdotq

    # Set the new search direction
    p[1:NX-1, 1:NY-1, 1:NZ-1] = (r[1:NX-1, 1:NY-1, 1:NZ-1] -
                                 beta * p[1:NX-1, 1:NY-1, 1:NZ-1])

    res = sort(ravel(r[1:NX-1, 1:NY-1, 1:NZ-1]))[-1]   # find largest residual
    # resmax = max(resmax, abs(res))
    KK += 1
    # print "Iteration Number %d Residual %1.2e" % (KK, abs(res))
    if abs(res) <= resmax:
        break   # if residual is small enough, break out

print "Number of Iterations ", KK
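One detail worth flagging in both listings: res = sort(ravel(r))[-1]
returns the largest signed entry of the residual, not the largest
magnitude, so a residual whose biggest entry is strongly negative can
pass the abs(res) test too early; sorting is also an O(n log n) way to
find a maximum. Below is a standalone sketch of a sign-safe O(n)
alternative; the small test array merely stands in for the sliced r of
the listings.

from Numeric import array, absolute, maximum, ravel

def max_abs_residual(r):
    # Largest residual magnitude, computed in one pass.
    return maximum.reduce(absolute(ravel(r)))

r = array([[0.1, -0.9], [0.2, 0.3]])   # biggest-magnitude entry is negative
print max_abs_residual(r)              # 0.9; sort(ravel(r))[-1] gives 0.3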
From paul at pfdubois.com Sat Mar 30 19:02:04 2002
From: paul at pfdubois.com (Paul F Dubois)
Date: Sat Mar 30 19:02:04 2002
Subject: [Numpy-discussion] Right behavior
In-Reply-To: <3CA4D3BB.617F2883@optushome.com.au>
Message-ID: <000001c1d860$58bfcf30$0f01a8c0@NICKLEBY>

About random number generation with Numeric:

a. IMHO, RNG is the right choice if you are picky about the quality of
the generator. That generator has a long history of heavy use.
RandomArray is in the core because someone put it there early, not
because it is the best. I do not claim to be an authority on this, but
that is my understanding.

b. The suggestion made by one correspondent, that a generator should
generate and throw away one value when the seed is set, sounds correct
from the point of view of initially seeding a single stream. But many
users need multiple streams that are independent and reproducible. This
is done by saving the state of the generator and restoring it later. It
is important that such a save/restore not change the results compared
to not doing it: the presence or absence of another computation, or the
frequency of the dumps and restarts that require a save/restore, must
not affect the result. Thus the decision to throw away a value must
come from the application level.
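Paul's point (b) is easy to demonstrate. The sketch below uses the
standard library's random module, which exposes generator state through
getstate()/setstate(), purely for illustration; the same discipline
applies to any generator whose state can be saved and restored.

import random

random.seed(42)
start = random.random()      # advance the stream a little

state = random.getstate()    # checkpoint the stream here
detour = [random.random() for i in range(3)]   # some other computation
random.setstate(state)       # restore: the detour leaves no trace

a = random.random()
random.setstate(state)
b = random.random()
assert a == b                # draws after each restore are identical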