From nwagner at mecha.uni-stuttgart.de Thu Feb 7 01:00:05 2002 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Thu Feb 7 01:00:05 2002 Subject: [Numpy-discussion] Simulating white noise with Numpy Message-ID: <3C625102.5863F1EE@mecha.uni-stuttgart.de> Hi, How can I simulate a stationary, white noise, Gaussian random process (zero mean and unit variance) with Numpy ? Nils From jsaenz at wm.lc.ehu.es Thu Feb 7 02:14:02 2002 From: jsaenz at wm.lc.ehu.es (Jon Saenz) Date: Thu Feb 7 02:14:02 2002 Subject: [Numpy-discussion] Simulating white noise with Numpy In-Reply-To: <3C625102.5863F1EE@mecha.uni-stuttgart.de> Message-ID: import RandomArray mean=0.0 stdev=1.0 RandomArray.normal(mean,stdev,shape=ReturnFloat) Manual of Numpy, chapter 17, page 97. Edition October 200. Jon Saenz. | Tfno: +34 946012445 Depto. Fisica Aplicada II | Fax: +34 944648500 Facultad de Ciencias. \\ Universidad del Pais Vasco \\ Apdo. 644 \\ 48080 - Bilbao \\ SPAIN On Thu, 7 Feb 2002, Nils Wagner wrote: > Hi, > > How can I simulate a stationary, white noise, Gaussian random process > (zero mean and unit variance) with Numpy ? > > Nils > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From paul at pfdubois.com Thu Feb 14 10:06:22 2002 From: paul at pfdubois.com (Paul F. Dubois) Date: Thu Feb 14 10:06:22 2002 Subject: [Numpy-discussion] Proposal, recoded average. Message-ID: <000001c1b581$82612de0$0a01a8c0@freedom> I recoded the new average function in both Numeric and MA. This version ensures that in the returned=1 case the weights returned will be of the same shape as the average. The new version is in CVS and labeled as 21.0b2. The average now allows weights that, rather than having to be the same size as the input, can be the same size as the input's axis'th dimension. This relieves the user of the rather tricky code needed to make an array of a given shape broadcasting a vector along a certain axis. This is a relatively frequent need and so I will write a separate routine for it soon. My proposed calling sequence is: def broadcast(v, axis=0, s) """Returns an array of the type of v and the shape s, such that setting the axis'th subscript to j results in an array whose values are all v[j]. """ Comments about this idea or the way I explain it would be welcome. From mathew at fugue.jpl.nasa.gov Thu Feb 14 18:25:01 2002 From: mathew at fugue.jpl.nasa.gov (Mathew Yeates) Date: Thu Feb 14 18:25:01 2002 Subject: [Numpy-discussion] subsampling Message-ID: <200202150224.SAA22330@fugue.jpl.nasa.gov> Hi- Maybe this is a dumb question .... but.... how can I take a matrix and subsample it down to an arbitrary size? It seems that using slices is limited to constant integer step sizes while, if I'm converting between arbitrary sizes, I would like to subsample at a nonuniform rate. Am I missing something? Mathew From cgw at alum.mit.edu Thu Feb 14 18:43:01 2002 From: cgw at alum.mit.edu (Charles G Waldman) Date: Thu Feb 14 18:43:01 2002 Subject: [Numpy-discussion] subsampling In-Reply-To: <200202150224.SAA22330@fugue.jpl.nasa.gov> References: <200202150224.SAA22330@fugue.jpl.nasa.gov> Message-ID: <15468.30210.93588.763998@dragonfly.sportsdatabase.com> Mathew Yeates writes: > how can I take a matrix and subsample it down to > an arbitrary size? 
It seems that using slices is > limited to constant integer step sizes while, if > I'm converting between arbitrary sizes, I would like > to subsample at a nonuniform rate. I'm not sure exactly what you're trying to do, but maybe the following Python session will show you the way. >>> from Numeric import * >>> take.__doc__ 'take(a, indices, axis=0). Selects the elements in indices from array a along t he given axis.' >>> a = fromfunction(lambda i,j: 10*i+j, (10,5)) >>> a array([[ 0, 1, 2, 3, 4], [10, 11, 12, 13, 14], [20, 21, 22, 23, 24], [30, 31, 32, 33, 34], [40, 41, 42, 43, 44], [50, 51, 52, 53, 54], [60, 61, 62, 63, 64], [70, 71, 72, 73, 74], [80, 81, 82, 83, 84], [90, 91, 92, 93, 94]]) >>> b = take(a,(0,2,3,5,7)) >>> b array([[ 0, 1, 2, 3, 4], [20, 21, 22, 23, 24], [30, 31, 32, 33, 34], [50, 51, 52, 53, 54], [70, 71, 72, 73, 74]]) >>> c = take(b,(0,2,3), 1) >>> c array([[ 0, 2, 3], [20, 22, 23], [30, 32, 33], [50, 52, 53], [70, 72, 73]]) From jjl at pobox.com Fri Feb 15 12:08:17 2002 From: jjl at pobox.com (John J. Lee) Date: Fri Feb 15 12:08:17 2002 Subject: [Numpy-discussion] subsampling In-Reply-To: <15468.30210.93588.763998@dragonfly.sportsdatabase.com> Message-ID: On Thu, 14 Feb 2002, Charles G Waldman wrote: > Mathew Yeates writes: > > > how can I take a matrix and subsample it down to > > an arbitrary size? It seems that using slices is > > limited to constant integer step sizes while, if > > I'm converting between arbitrary sizes, I would like > > to subsample at a nonuniform rate. > > I'm not sure exactly what you're trying to do, but maybe the following > Python session will show you the way. [examples of take() usage snipped] I think -- correct me if I'm wrong -- he was asking about interpolation. If a Python loop is too slow, you can probably do it with the standard numpy functions, with some experimentation. John From cgw at alum.mit.edu Fri Feb 15 12:36:56 2002 From: cgw at alum.mit.edu (Charles G Waldman) Date: Fri Feb 15 12:36:56 2002 Subject: [Numpy-discussion] subsampling In-Reply-To: References: <15468.30210.93588.763998@dragonfly.sportsdatabase.com> Message-ID: <15469.28286.394897.690559@dragonfly.sportsdatabase.com> John J. Lee writes: > I think -- correct me if I'm wrong -- he was asking about interpolation. I think that only Mr Yeates knows for sure! I read the query a few times and I think he's looking for straight-ahead subsampling without interpolation. > If a Python loop is too slow, you can probably do it with the standard > numpy functions, with some experimentation. If interpolation is desired, it might also be worthwhile looking at the Python Imaging Library (PIL), I believe it has routines for resizing images with bilinear interpolation. Since PIL supports many different data formats (even floating point) and it's fairly easy, if inelegant, to inter-convert PIL images and NumPy arrays using "fromstring/tostring" methods (there should be tighter integration between these packages, maybe somebody has done this since last I looked), you might find that the PIL image resizing does what you're looking for. From mathew at fugue.jpl.nasa.gov Fri Feb 15 12:50:15 2002 From: mathew at fugue.jpl.nasa.gov (mathew at fugue.jpl.nasa.gov) Date: Fri Feb 15 12:50:15 2002 Subject: [Numpy-discussion] subsampling Message-ID: <200202152041.MAA14166@fugue.jpl.nasa.gov> John J. Lee writes: > I think -- correct me if I'm wrong -- he was asking about interpolation. I think that only Mr Yeates knows for sure! 
I read the query a few times and I think he's looking for straight-ahead subsampling without interpolation. Yes, I was just asking about subsampling although I did start to think about doing bilinear interpolation. But I was hoping to subsample using built in Numpy functions and .... your solution fits the bill!! Thank you! Mathew From Stefan.Heinrichs at uni-konstanz.de Fri Feb 15 13:21:54 2002 From: Stefan.Heinrichs at uni-konstanz.de (Stefan Heinrichs) Date: Fri Feb 15 13:21:54 2002 Subject: [Numpy-discussion] Interface for numpy C-API <-> simple C++ matrix class Message-ID: <20020215211700.GA22189@gibbs.physik.uni-konstanz.de> Hello, we would like to write routines processing numerical python arrays in C++, so that at least boundary checking can be enabled at runtime. While there are a lot of matrix libraries available for C++, I could not find the glue that interfaces such a library to the C-API of numerical python. Seamless access to a minimal C++ library would make the C++ part of programming much easier. Has anyone already written some wrapper/glue code? Thanks and best regards, Stefan -- ------------------------------------------------------------------- Email: Stefan.Heinrichs at uni-konstanz.de Address: Fakulaet fuer Physik, Universitaet Konstanz, Universitaetsstr.10, 78457 Konstanz, Germany Phone: +49 7531 88 3814 From rob at pythonemproject.com Sat Feb 16 10:04:03 2002 From: rob at pythonemproject.com (Rob) Date: Sat Feb 16 10:04:03 2002 Subject: [Numpy-discussion] Converting from FORTRAN Equivalence to a Numpy alternative Message-ID: <3C6E9E25.7C556248@pythonemproject.com> I hoped there might be a combo FORTRAN/Numpy hacker that might help me with this. I have the following statements in a FORTRAN routine that I am converting to Numpy. I am wondering how to handle the EQUIVALENCE statement. As I understand it, ARL1[1] should be equal to the first item in the memory block for AR1. But not knowing how FORTRAN allocates arrays, its hard to tell what to do in Numpy. I tried ARL1=ravel(AR1) but that didn't work. Similarly ARL=AR[:,1,1} didn't work. I'm lost. Thanks in advance for any help. Rob. #COMMON /GGRID/ AR1(11,10,4),AR2(17,5,4),AR3(9,8,4),EPSCF,DXA(3),& # &DYA(3),XSA(3),YSA(3),NXA(3),NYA(3 #DIMENSION ARL1(1), ARL2(1), ARL3(1) #EQUIVALENCE (ARL1,AR1), (ARL2,AR2), (ARL3,AR3), (XS2,XSA(2)), (YS3,YSA(3)) -- The Numeric Python EM Project www.pythonemproject.com From rob at pythonemproject.com Sat Feb 16 11:40:01 2002 From: rob at pythonemproject.com (Rob) Date: Sat Feb 16 11:40:01 2002 Subject: [Numpy-discussion] I solved my fortran equivalence to Python conversion Message-ID: <3C6EB4B5.6E67EECE@pythonemproject.com> Here is the code: ARL1=zeros((441),Complex) ARL2=zeros((341),Complex) ARL3=zeros((289),Complex) #DIMENSION ARL1(1), ARL2(1), ARL3(1) #EQUIVALENCE (ARL1,AR1), (ARL2,AR2), (ARL3,AR3), (XS2,XSA(2)), (YS3,YSA(3)) AR1=swapaxes(AR1,0,2) AR2=swapaxes(AR2,0,2) AR3=swapaxes(AR3,0,2) ARL1[1:]=ravel(AR1[1:,1:,1:]) ARL2[1:]=ravel(AR2[1:,1:,1:]) ARL3[1:]=ravel(AR3[1:,1:,1:]) -- The Numeric Python EM Project www.pythonemproject.com From kragen at pobox.com Mon Feb 18 12:13:19 2002 From: kragen at pobox.com (Kragen Sitaker) Date: Mon Feb 18 12:13:19 2002 Subject: [Numpy-discussion] memory-mapped Numeric arrays: arrayfrombuffer version 2 Message-ID: <20020218200719.AB961BDC5@panacea.canonical.org> (I thought I had sent this mail on January 30, but I guess I was mistaken.) Eric Nodwell writes: > Since I have a 2.4GB data file handy, I thought I'd try this > package with it. 
(Normally I process this data file by reading > it in a chunk at a time, which is perfectly adequate.) Not > surprisingly, it chokes: Yep, that's pretty much what I expected. I think that adding code to support mapping some arbitrary part of a file should be fairly straightforward --- do you want to run the tests if I write the code? > File "/home/eric/lib/python2.2/site-packages/maparray.py", line 15, > in maparray > m = mmap.mmap(fn, os.fstat(fn)[stat.ST_SIZE]) > OverflowError: memory mapped size is too large (limited by C int) This error message's wording led me to something that was *not* what I expected. That's a sort of alarming message --- it suggests that it won't work on >2G files even on LP64 systems, where longs and pointers are 64 bits but ints are 32 bits. The comments in the mmap module say: The map size is restricted to [0, INT_MAX] because this is the current Python limitation on object sizes. Although the mmap object *could* handle a larger map size, there is no point because all the useful operations (len(), slicing(), sequence indexing) are limited by a C int. Horrifyingly, this is true. Even the buffer interface function arrayfrombuffer uses to get the size of the buffer return int sizes, not size_t sizes. This is a serious bug in the buffer interface, IMO, and I doubt it will be fixed --- the buffer interface is apparently due for a revamp soon at any rate, so little changes won't be welcomed, especially if they break binary backwards compatibility, as this one would on LP64 platforms. Fixing this, so that LP64 Pythons can mmap >2G files (their birthright!), is a bit of work --- probably a matter of writing a modified mmap() module that supports a saner version of the buffer interface (with named methods instead of a type object slot), and can't be close()d, to boot. Until then, this module only lets you memory-map files up to two gigs. > (details: Python 2.2, numpy 20.3, Pentium III, Debian Woody, Linux > kernel 2.4.13, gcc 2.95.4) My kernel is 2.4.13 too, but I don't have any large files, and I don't know whether any of my kernel, my libc, or my Python even support them. > I'm not a big C programmer, but I wonder if there is some way for > this package to overcome the 2GB limit on 32-bit systems. That > could be useful in some situations. I don't know, but I think it would probably require extensive code changes throughout Numpy. -- Kragen Sitaker The sages do not believe that making no mistakes is a blessing. They believe, rather, that the great virtue of man lies in his ability to correct his mistakes and continually make a new man of himself. -- Wang Yang-Ming From mathew at fugue.jpl.nasa.gov Mon Feb 18 12:26:58 2002 From: mathew at fugue.jpl.nasa.gov (Mathew Yeates) Date: Mon Feb 18 12:26:58 2002 Subject: [Numpy-discussion] memory-mapped Numeric arrays: arrayfrombuffer version 2 In-Reply-To: Your message of "Mon, 18 Feb 2002 15:07:19 EST." <20020218200719.AB961BDC5@panacea.canonical.org> Message-ID: <200202182023.MAA25723@fugue.jpl.nasa.gov> Has anyone checked out VMaps at http://snafu.freedom.org/Vmaps/ ?? This might be what you're looking for. Mathew > (I thought I had sent this mail on January 30, but I guess I was > mistaken.) > > Eric Nodwell writes: > > Since I have a 2.4GB data file handy, I thought I'd try this > > package with it. (Normally I process this data file by reading > > it in a chunk at a time, which is perfectly adequate.) Not > > surprisingly, it chokes: > > Yep, that's pretty much what I expected. 
I think that adding code to > support mapping some arbitrary part of a file should be fairly > straightforward --- do you want to run the tests if I write the code? > > > File "/home/eric/lib/python2.2/site-packages/maparray.py", line 15, > > in maparray > > m = mmap.mmap(fn, os.fstat(fn)[stat.ST_SIZE]) > > OverflowError: memory mapped size is too large (limited by C int) > > This error message's wording led me to something that was *not* what I > expected. > > That's a sort of alarming message --- it suggests that it won't work > on >2G files even on LP64 systems, where longs and pointers are 64 > bits but ints are 32 bits. The comments in the mmap module say: > > The map size is restricted to [0, INT_MAX] because this is the current > Python limitation on object sizes. Although the mmap object *could* handle > a larger map size, there is no point because all the useful operations > (len(), slicing(), sequence indexing) are limited by a C int. > > Horrifyingly, this is true. Even the buffer interface function > arrayfrombuffer uses to get the size of the buffer return int sizes, > not size_t sizes. This is a serious bug in the buffer interface, IMO, > and I doubt it will be fixed --- the buffer interface is apparently > due for a revamp soon at any rate, so little changes won't be > welcomed, especially if they break binary backwards compatibility, as > this one would on LP64 platforms. > > Fixing this, so that LP64 Pythons can mmap >2G files (their > birthright!), is a bit of work --- probably a matter of writing a > modified mmap() module that supports a saner version of the buffer > interface (with named methods instead of a type object slot), and > can't be close()d, to boot. > > Until then, this module only lets you memory-map files up to two gigs. > > > (details: Python 2.2, numpy 20.3, Pentium III, Debian Woody, Linux > > kernel 2.4.13, gcc 2.95.4) > > My kernel is 2.4.13 too, but I don't have any large files, and I don't > know whether any of my kernel, my libc, or my Python even support > them. > > > I'm not a big C programmer, but I wonder if there is some way for > > this package to overcome the 2GB limit on 32-bit systems. That > > could be useful in some situations. > > I don't know, but I think it would probably require extensive code > changes throughout Numpy. > > -- > Kragen Sitaker > The sages do not believe that making no mistakes is a blessing. They believe, > rather, that the great virtue of man lies in his ability to correct his > mistakes and continually make a new man of himself. -- Wang Yang-Ming > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From nwagner at mecha.uni-stuttgart.de Tue Feb 19 05:33:05 2002 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Tue Feb 19 05:33:05 2002 Subject: [Numpy-discussion] Hilbert transform Message-ID: <3C7262DA.D1A551B7@mecha.uni-stuttgart.de> Hi, I am looking for a Numpy function that performs the Hilbert transform of a scalar time series. Thanks in advance. Nils Wagner From mathew at fugue.jpl.nasa.gov Fri Feb 22 16:20:02 2002 From: mathew at fugue.jpl.nasa.gov (Mathew Yeates) Date: Fri Feb 22 16:20:02 2002 Subject: [Numpy-discussion] memory not being freed Message-ID: <200202230019.QAA27573@fugue.jpl.nasa.gov> Hi I'm having problems with garbage collection I wrote an extension which creates an array and returns it foo() { arr = (PyArrayObject *) PyArray_FromDims( ..... 
ret = Py_BuildValue("O", arr); return ret; } but now if I do while 1: a=foo() memory is never free'd. I've even tried explicitly calling gc.collect and adding del(a) after a=foo. Is the problem that Py_BuildValue increases the reference count? Mathew From reggie at merfinllc.com Fri Feb 22 17:26:06 2002 From: reggie at merfinllc.com (Reggie Dugard) Date: Fri Feb 22 17:26:06 2002 Subject: [Numpy-discussion] memory not being freed In-Reply-To: <200202230019.QAA27573@fugue.jpl.nasa.gov> References: <200202230019.QAA27573@fugue.jpl.nasa.gov> Message-ID: <20020223.1251000@auk.merfinllc.com> I believe PyArray_FromDims() returns a new reference to the object (arr) and that the Py_BuildValue creates another reference so that you've got two references to that array and python is only going to DECREF one when it's done. I would suggest either 1) using the 'N' format character to Py_BuildValue so that another reference isn't created or 2) explicitly calling Py_DECREF(arr) just before you return. Reggie >>>>>>>>>>>>>>>>>> Original Message <<<<<<<<<<<<<<<<<< On 2/22/02, 4:19:15 PM, Mathew Yeates wrote regarding [Numpy-discussion] memory not being freed: > Hi > I'm having problems with garbage collection > I wrote an extension which creates an array > and returns it > foo() { > arr = (PyArrayObject *) PyArray_FromDims( ..... > ret = Py_BuildValue("O", arr); > return ret; > } > but now if I do > while 1: > a=foo() > memory is never free'd. I've even tried explicitly calling gc.collect and > adding del(a) after a=foo. > Is the problem that Py_BuildValue increases the reference count? > Mathew > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From paul at pfdubois.com Fri Feb 22 17:29:06 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Fri Feb 22 17:29:06 2002 Subject: [Numpy-discussion] memory not being freed In-Reply-To: <200202230019.QAA27573@fugue.jpl.nasa.gov> Message-ID: <000201c1bc09$4c1f1ec0$1001a8c0@NICKLEBY> You just want ret = (PyObject*) arr; I assume it is PyObject *foo() and you just didn't show it. -----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Mathew Yeates Sent: Friday, February 22, 2002 4:19 PM To: numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] memory not being freed Hi I'm having problems with garbage collection I wrote an extension which creates an array and returns it foo() { arr = (PyArrayObject *) PyArray_FromDims( ..... ret = Py_BuildValue("O", arr); return ret; } but now if I do while 1: a=foo() memory is never free'd. I've even tried explicitly calling gc.collect and adding del(a) after a=foo. Is the problem that Py_BuildValue increases the reference count? Mathew _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From mathew at fugue.jpl.nasa.gov Fri Feb 22 17:32:06 2002 From: mathew at fugue.jpl.nasa.gov (Mathew Yeates) Date: Fri Feb 22 17:32:06 2002 Subject: [Numpy-discussion] memory not being freed In-Reply-To: Your message of "Fri, 22 Feb 2002 17:27:44 PST." <000201c1bc09$4c1f1ec0$1001a8c0@NICKLEBY> Message-ID: <200202230131.RAA29048@fugue.jpl.nasa.gov> > You just want ret = (PyObject*) arr; > Paul okay, I tried that and it worked. 
But what about the circumstance where I want to return the array AND other values. So then I would build a list and return it. Again, won't I have too many references to may array? Thanks Mathew From cavallo at kip.uni-heidelberg.de Tue Feb 26 08:45:17 2002 From: cavallo at kip.uni-heidelberg.de (cavallo at kip.uni-heidelberg.de) Date: Tue Feb 26 08:45:17 2002 Subject: [Numpy-discussion] Interface for numpy C-API <-> simple C++ matrix class In-Reply-To: <20020215211700.GA22189@gibbs.physik.uni-konstanz.de> Message-ID: On Fri, 15 Feb 2002, Stefan Heinrichs wrote: > Hello, > > we would like to write routines processing numerical python arrays in > C++, so that at least boundary checking can be enabled at runtime. > While there are a lot of matrix libraries available for C++, I could > not find the glue that interfaces such a library to the C-API of > numerical python. Seamless access to a minimal C++ library would make > the C++ part of programming much easier. That's clear i had the same problem long time ago, and i written a small wrapper to the blitz library (www.oonumerics.org): i hadn't time to iron out all the details so it isn't ready for general release, but it works. I'm courrently using some sort of template skeleton to write modules in python for my phd (it is 4d digital image processing): if you like to help me to develop it in a usable way this should be useful for other people as well. I think someone else has developed such a glue, under scipy (isn't?): just look at that. regards, antonio > > Has anyone already written some wrapper/glue code? > > Thanks and best regards, > > Stefan > > -- > > > ------------------------------------------------------------------- > Email: Stefan.Heinrichs at uni-konstanz.de > Address: Fakulaet fuer Physik, Universitaet Konstanz, > Universitaetsstr.10, 78457 Konstanz, Germany > Phone: +49 7531 88 3814 > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From eric at enthought.com Tue Feb 26 11:48:13 2002 From: eric at enthought.com (eric) Date: Tue Feb 26 11:48:13 2002 Subject: [Numpy-discussion] Interface for numpy C-API <-> simple C++ matrix class References: Message-ID: <035a01c1bef5$c7f3c070$6b01a8c0@ericlaptop> Weave optionally uses blitz arrays to represent NumPy objects: www.scipy.org/site_content/weave Maybe something there will be of use. eric > > > On Fri, 15 Feb 2002, Stefan Heinrichs wrote: > > > Hello, > > > > we would like to write routines processing numerical python arrays in > > C++, so that at least boundary checking can be enabled at runtime. > > While there are a lot of matrix libraries available for C++, I could > > not find the glue that interfaces such a library to the C-API of > > numerical python. Seamless access to a minimal C++ library would make > > the C++ part of programming much easier. > > That's clear i had the same problem long time ago, > and i written a small wrapper to the blitz library (www.oonumerics.org): > i hadn't time to iron out all the details so it isn't ready for general > release, but it works. > I'm courrently using some sort of template skeleton to write modules in > python for my phd (it is 4d digital image processing): if you like to help > me to develop it in a usable way this should be useful for other people as > well. > > I think someone else has developed such a glue, under scipy (isn't?): just > look at that. 
> > regards, > antonio > > > > > Has anyone already written some wrapper/glue code? > > > > Thanks and best regards, > > > > Stefan > > > > -- > > > > > > ------------------------------------------------------------------- > > Email: Stefan.Heinrichs at uni-konstanz.de > > Address: Fakulaet fuer Physik, Universitaet Konstanz, > > Universitaetsstr.10, 78457 Konstanz, Germany > > Phone: +49 7531 88 3814 > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From oliphant at ee.byu.edu Tue Feb 26 17:08:15 2002 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue Feb 26 17:08:15 2002 Subject: [Numpy-discussion] Addition of masked and indexed arrays. In-Reply-To: <200202230131.RAA29048@fugue.jpl.nasa.gov> Message-ID: I'm proposing the addition of the following semantics to Numeric Arrays in order to support indexing arrays using mask and index arrays. Notice that the change required to support this behavior is minimal and all the work can be backloaded to the user. Let the array_subscript C code check to see if the index object passed in is a Tuple with the last element being a callable method. If this is the case, then the C code calls the callable method to handle the array assignment or array indexing code. So, for example one might say. a[a<3,mask] = 10 where mask is a callable method (or Cobject) which does effectively but has the right calling convention. putmask(a, a<3, 10). This same semantics could also handle integer indexing of arrays (using multiple styles). All that's required is a simple test in Numeric that does not break any current code. What do you think? Could I add the necessary check to support this in Numeric? Do others have a better idea? I know numarray is the solution to all of our problems. But, something tells me that the current version of Numeric is going to be around for a little while and it would be nice for it to have some of these useful features. -Travis From jmiller at stsci.edu Wed Feb 27 07:09:03 2002 From: jmiller at stsci.edu (Todd Miller) Date: Wed Feb 27 07:09:03 2002 Subject: [Numpy-discussion] ANN: numarray-0.2 Message-ID: <3C7CF5F3.1090003@stsci.edu> Numarray 0.2 ------------ Numarray is a Numeric replacement which features c-code generated from python template scripts, the capacity to operate directly on arrays in files, and improved type promotion semantics. Numarray-0.2 incorporates bug fixes and one very significant change to the user interface, namely in how shape, real, imag and flat attributes are handled. Also, when numarray objects compared to None will (arr == None) no longer produce an execption and generate a false value (rather than a Boolean array). The next version will incorporate safety checks to prevent possible crashing of Python if a user misuses or otherwise changes private variables in numarray. There will also be a new memory object type to fix a problem discovered with buffer objects (which are currently used to used to allocate memory). We expect to have this version ready in 3 weeks. 
WHERE ----------- Numarray is hosted by Source Forge in the same project which hosts Numeric: http://sourceforge.net/projects/numpy/ The web page for Numarray information is at: http://stsdas.stsci.edu/numarray/index.html REQUIREMENTS -------------------------- numarray-0.2 requires Python 2.0 or greater. AUTHORS, LICENSE ------------------------------ Numarray was written by Perry Greenfield, Rick White, Todd Miller, JC Hsu, and Phil Hodge at the Space Telescope Science Institute. Numarray is made available under a BSD-style License. See LICENSE.txt in the source distribution for details. CHANGES-0.2: --------------------- 1. Added support for python-2.2 properties, specifically: .shape, getshape(), setshape() .flat, getflat() .real, getreal(), setreal() .imag, getimag(), setimag() .imaginary, getimaginary(), setimaginary() # aliases for imag numarray-0.2 is not 100% compatible with numarray-0.11. To port numarray-0.11 code to numarray-0.2: a. Instances of array.reshape(x) must be changed to array.setshape(x) or array.shape = x. Users with python versions prior to 2.2 must use array.setshape(x). b. Instances of array.real() must be replaced with array.real or array.getreal(). Users with python versions prior to 2.2 must use array.getreal(). c. Instances of array.imag() must be replaced with array.imag or array.getimag(). Users with python versions prior to 2.2 must use array.getimag(). 2. Fixed bugs in some of the numarray functions related to: a. Making copies of the input arrays when required to do so. b. Supporting nested sequences of python numbers in lieu of arrays. affected functions: reshape, transpose, resize, diagonal, trace, sort, argsort, ravel, ... 3. Fixed a bug in Complex arrays related to handling of type Complex64. This bug manifested as incorrect results for many operations with Complex64 arrays. "ones" in particular, returned obviously incorrect results for type=Complex64. 4. Fixed a bug in type conversion in "where" with y = where (equal (x, 0.), 1., x) on single precision array 'x' resulting in a double precision 'y'. This fix also affects "choose". 5. Added getrank() method and associated property "rank" for python2.2 and on. 6. Fixed a bug in nonzero where the "input screening" code was truncating small floating point values to 0. 7. Fixed a bug in all unary/binary ufuncs where output buffer offset was hard-coded to 0 for "fast" mode. This caused the following failure: >>> a=arange(10) >>> a[5:8] += 3 >>> a array([3, 4, 5, ... ]) 8. Fixed bug / added support for array([], type=ZZZ). 9. Added pi to the numarray namespace by importing it from math. 10. Added arrayrange alias for arange. 11. Improved numarray (in)equality testing by adding handling for None and re-implementing __nonzero__ as the bitwise-or of the element-wise comparison of the array with 0. 12. Fixed bug in iscontiguous() which assumed that _bytestride == _itemsize for contiguous arrays. This is not true when slicing occurs, but perhaps should be. 13. Added typecodes which are compatible with NumPy typecodes: "d":Float64,"f":Float32,"l":Int32,"D":Complex128,"F":Complex64, "b":UInt8, "c":Int8 Modified NumericType.__cmp__ to support comparisons against aliases. Modified NumericType.__repr__ to return the type name. 14. Modified the doctest for recarray so that it works correctly on win32. 
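To make the porting notes in item 1 concrete, here is a minimal sketch (an illustration, not part of the original announcement) that uses only the method and property names listed above; exact behaviour should be checked against numarray-0.2 itself:

    import numarray

    a = numarray.arange(12)

    # numarray-0.11 spelling        numarray-0.2 spelling
    #   a.reshape((3, 4))    -->      a.setshape((3, 4))  or  a.shape = (3, 4)
    #   a.real()             -->      a.getreal()         or  a.real
    #   a.imag()             -->      a.getimag()         or  a.imag

    a.setshape((3, 4))       # works on any supported Python version
    print a.getshape()       # the new shape, (3, 4)

    a.shape = (4, 3)         # property form, Python 2.2 and later only
    print a.shape, a.rank    # .rank is also a 2.2-and-later property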
-- Todd Miller jmiller at stsci.edu STSCI / SSG (410) 338 4576 From perry at stsci.edu Wed Feb 27 07:24:14 2002 From: perry at stsci.edu (Perry Greenfield) Date: Wed Feb 27 07:24:14 2002 Subject: [Numpy-discussion] draft numarray manual available Message-ID: We have begun adapting the Numeric manual for numarray. An early draft of such a manual (in PDF format) is available from the numarray home page: http://stsdas.stsci.edu/numarray Comments and corrections are welcome. Over the next few months we will be adding chapters on how to interface C code to numarray as well as how to use non-numeric arrays. Perry Greenfield From jochen at unc.edu Wed Feb 27 07:50:13 2002 From: jochen at unc.edu (Jochen =?iso-8859-1?q?K=FCpper?=) Date: Wed Feb 27 07:50:13 2002 Subject: [Numpy-discussion] Re: draft numarray manual available In-Reply-To: References: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Wed, 27 Feb 2002 10:22:48 -0500 Perry Greenfield wrote: Perry> We have begun adapting the Numeric manual for numarray. An Perry> early draft of such a manual (in PDF format) is available from Perry> the numarray home page: http://stsdas.stsci.edu/numarray Would it be possible to use the python doc format? That would allow to create the nice HTML as Python has and dvi/ps/pdf as well as info. I am willing to help with the transition. What is the current original format? Greetings, Jochen - -- University of North Carolina phone: +1-919-962-4403 Department of Chemistry phone: +1-919-962-1579 Venable Hall CB#3290 (Kenan C148) fax: +1-919-843-6041 Chapel Hill, NC 27599, USA GnuPG key: 44BCCD8E -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6-cygwin-fcn-1 (Cygwin) Comment: Processed by Mailcrypt and GnuPG iD8DBQE8fQAaiJ/aUUS8zY4RAro0AJ0UQoyvvHHwgjJ+4tJXmwWyx18jRQCfY6v/ 1BBZoSU5KrzZ+4toNcYt7hk= =IOF2 -----END PGP SIGNATURE----- From perry at stsci.edu Wed Feb 27 08:06:09 2002 From: perry at stsci.edu (Perry Greenfield) Date: Wed Feb 27 08:06:09 2002 Subject: [Numpy-discussion] Re: draft numarray manual available In-Reply-To: Message-ID: Hi Jochen, Indeed, we are considering converting the format at some time to the Python doc format (it would be a requirement in order to get it into the standard library). If you would do that it would be a great help (presuming you can access the current format...other than retyping it!) It is currently in Framemaker format (that's what the original Numeric manual was in). I was just beginning to mull over how best to convert formats. There is a Framemaker Interchange Format (SGML-based) that might be the easiest way to deal with it. Or perhaps the brute force method of converting it to ascii and adding Python doc markup would be the fastest. Any other ideas? Thanks, Perry Greenfield From jh at comunit.de Wed Feb 27 08:17:06 2002 From: jh at comunit.de (Janko) Date: Wed Feb 27 08:17:06 2002 Subject: [Numpy-discussion] Re: draft numarray manual available In-Reply-To: References: Message-ID: <20020227171625.265ea831.jh@comunit.de> On Wed, 27 Feb 2002 11:05:59 -0500 "Perry Greenfield" wrote: > Hi Jochen, > > Indeed, we are considering converting the format at some time > to the Python doc format (it would be a requirement in order to > get it into the standard library). > > If you would do that it would be a great help (presuming > you can access the current format...other than retyping it!) > It is currently in Framemaker format (that's what the original > Numeric manual was in). > > I was just beginning to mull over how best to convert formats. 
> There is a Framemaker Interchange Format (SGML-based) that might > be the easiest way to deal with it. Or perhaps the brute force > method of converting it to ascii and adding Python doc markup > would be the fastest. Any other ideas? > I have done exactly this, there are tools like pdf2text or so, which do help with this. Jochen I'm doing the same effort for the scipy docs at the moment, the first step is nearly done. So if you want to collaborate send me a note privatly. __Janko From jochen at unc.edu Wed Feb 27 09:07:05 2002 From: jochen at unc.edu (Jochen =?iso-8859-1?q?K=FCpper?=) Date: Wed Feb 27 09:07:05 2002 Subject: [Numpy-discussion] draft numarray manual available In-Reply-To: References: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Wed, 27 Feb 2002 11:05:59 -0500 Perry Greenfield wrote: Perry> If you would do that it would be a great help (presuming you Perry> can access the current format...other than retyping it!) It is Perry> currently in Framemaker format (that's what the original Perry> Numeric manual was in). Framemake, huh:) Never used that... It shouldn't be too bad to put the raw ASCII into python-style LaTeX. But if you can create SGML one could try to use sgml2tex or similar to preserve the basic formatting. I am a little busy right now and will be out of town next week, but propose to do the work to get the doc converted after that (mid-March). If you could send me the sgml I could try out whether it actually helps or we have to go the raw-ASCII route. Greetings, Jochen - -- University of North Carolina phone: +1-919-962-4403 Department of Chemistry phone: +1-919-962-1579 Venable Hall CB#3290 (Kenan C148) fax: +1-919-843-6041 Chapel Hill, NC 27599, USA GnuPG key: 44BCCD8E -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6-cygwin-fcn-1 (Cygwin) Comment: Processed by Mailcrypt and GnuPG iD8DBQE8fRIDiJ/aUUS8zY4RAhh+AJ9ZP0H4szek5W13makOEVSQnKYhngCglzzU T1zJa2gKx6uYn4VCnOXlsJQ= =AzS3 -----END PGP SIGNATURE----- From perry at stsci.edu Wed Feb 27 12:50:02 2002 From: perry at stsci.edu (Perry Greenfield) Date: Wed Feb 27 12:50:02 2002 Subject: [Numpy-discussion] PyFITS 0.6.2 available Message-ID: We are announcing the availability of PyFITS, a Python module to provide a means of reading and writing FITS format files. The FITS (Flexible Image Transport System) format is standardized data format widely used for astronomical data (http://fits.gsfc.nasa.gov/). This module is based on the PyFITS module initially developed by Paul Barrett while he was at NASA/Goddard but has since been modified and adapted for use by the Space Telescope Science Institute. It has been developed primarily by Paul Barrett, Jin-Chung Hsu, and Todd Miller, with assistance from Warren Hack, Phil Hodge, and Michele De La Pena. It is being released under an Open Source License. A web page for PyFITS is available: http://stsdas.stsci.edu/pyfits There is a preliminary draft for a PyFITS manual available from that page. It is still in the early stages and is incomplete (particularly with regard to the details of how to manipulate table objects). However, there should be enough information to indicate how to perform basic operations with FITS files. PyFITS requires that numarray v0.2 be installed (http://stsdas.stsci.edu/numarray). Numarray is a relatively new replacement for the Numeric module (and largely backward- compatible at the Python level). It also currently lacks all of the 3rd-party libraries that Numeric has (this will begin to change within a couple of months). 
However, it is not compatible at the C-API level nor are numarray and Numeric arrays interchangeable. It is possible to have both modules loaded simultaneously and to convert between numarray and Numeric arrays using the tostring/fromstring mechanism each provides (at the expense of extra memory usage, of course) The PyFITS module is simply a single Python file (pyfits.py). Installation consists of placing that file in a directory in the Python search path. Besides the use of numarray, there are significant differences with the interface provided by the original PyFITS. We do not expect many future backward-incompatible changes to the interface of PyFITS. (Though more methods and functions will almost certainly be added.) This is an early version and undoubtably bugs will be discovered when used with a greater variety of FITS files. Currently PyFITS take a fairly strict interpretation of FITS files. There are likely to be problems with FITS data that do not strictly conform to the standard. We intend to accommodate such variances, particularly if they involve widely used or available data (so please let us know when such problems occur). Things not yet supported but are part of future development: 1) Verification and/or correction of FITS objects being written to disk so that they are legal FITS. This is being added now and should be available in about a month. Currently, one may construct FITS headers that are inconsistent with the data and write such FITS objects to disk. Future versions will provide options to either a) correct discrepancies and warn, b) correct discrepancies silently, c) throw a Python exception, or d) write illegal FITS (for test purposes!). 2) Support for ascii tables or random groups format. Support for ASCII tables will be done soon (~1 month). When random group support is added is uncertain. 3) Support for memory mapping FITS data (to reduce memory demands). We expect to provide this capability in about 3 months. 4) Support for columns in binary tables having scaled values (e.g. BSCALE or BZERO) or boolean values. Currently booleans are stored as Int8 arrays and users must explicitly convert them into a boolean array. Likewise, scaled columns must be copied with scaling and offset by testing for those attributes explicitly. Future versions will produce such copies automatically. 5) Support for tables with TNULL values. This awaits an enhancement to numarray to support mask arrays (planned). (At least a couple months off) Please contact help at stsci.edu for with questions about it usage, bug reports, or requests for enhancements. Perry Greenfield Science Software Group From paul at pfdubois.com Wed Feb 27 14:06:08 2002 From: paul at pfdubois.com (Paul Dubois) Date: Wed Feb 27 14:06:08 2002 Subject: [Numpy-discussion] last 21 beta Message-ID: <000701c1bfda$9923f1d0$09860cc0@CLENHAM> I've just put up Numeric-21.0b3.tar.gz; I intend this to be the final beta. Thanks to fellow developers for some recent bug fixes. Please give it a workout. Also, I need someone to try it with Python 2.1, as I didn't have one laying around and one of my fixes is for that. P.S. for Travis: Please put your name in the changes.txt if you fixed a bug, not the name of the person who complained unless they also supplied the fix. Thanks for knocking down some hard ones.) 
From perry at stsci.edu Wed Feb 27 15:39:04 2002 From: perry at stsci.edu (Perry Greenfield) Date: Wed Feb 27 15:39:04 2002 Subject: [Numpy-discussion] draft numarray manual available In-Reply-To: Message-ID: > Perry> If you would do that it would be a great help (presuming you > Perry> can access the current format...other than retyping it!) It is > Perry> currently in Framemaker format (that's what the original > Perry> Numeric manual was in). > > Framemake, huh:) Never used that... It shouldn't be too bad to put > the raw ASCII into python-style LaTeX. But if you can create SGML one > could try to use sgml2tex or similar to preserve the basic formatting. > > I am a little busy right now and will be out of town next week, but > propose to do the work to get the doc converted after that > (mid-March). If you could send me the sgml I could try out whether it > actually helps or we have to go the raw-ASCII route. > > Greetings, > Jochen > That would be great. I'd check with Janko to see if either his tools or experience could be used to make it easier. But I'll send you the MIF format (and perhaps some documentation on what it means). Thanks, Perry
From a.schmolck at gmx.net Thu Feb 28 11:28:03 2002 From: a.schmolck at gmx.net (A.Schmolck) Date: Thu Feb 28 11:28:03 2002 Subject: [Numpy-discussion] numarray interface and performance issues (for dot product and transpose) Message-ID: Hi, Numeric is an impressively powerful and in many respects easy and comfortable to use package (e.g. it's sophisticated slicing operations, not to mention the power and elegance of the underlying python language) and one would hope that it can one day replace Matlab (which is both expensive and a nightmare as a programming language) as a standard platform for numerical calculations. Two important drawbacks of Numeric (python) compared to Matlab currently seem to be the plotting functionality and the availability of specialist libraries (say for pattern recognition problems etc.). However python seems to be quickly catching up in this area (e.g. scipy and other projects are bustling with activity and, as far as I know, there is currently a promising undergraduate project on the way at my university to provide powerful plotting facilities for python). There is however a problem that, for the use to which I want to put Numeric, runs deeper and provides me with quite a headache: Two essential matrix operations (matrix-multiplication and transposition (which is what I am mainly using) are both considerably a) less efficient and b) less notationally elegant under Numeric than under Matlab. Before I expound on that, I'd better mention two things: Firstly, I, have to confess largely ignorant about the gritty details of implementing efficient numerical computation (and indeed, I can't even claim to possess a good grasp of the underlying Linear Algebra). Secondly, and partly as a consequence, I realize that I'm likely to view this from a rather narrow perspective. Nonetheless, I need to solve these problems at least for myself, even if such a solution might be inappropriate for a more general audience, so I'd be very keen to hear what suggestions or experiences from other Numeric users are. I also want to share what my phd supervisor, Richard Everson, and I have come up with so far in terms of a), as I think it might be quite useful for other people, too (see [2], more about this later). Ok, so now let me proceed to state clearly what I mean by a) and b): The following Matlab fragment M * (C' * C) * V' * u currently has to be written like this in python: dot(dot(dot(M, dot(transpose(C), C)), transpose(v)), u) Or, even worse if one doesn't want to pollute the namespace: Numeric.dot(Numeric.dot(Numeric.dot(Numeric.M, Numeric.dot(Numeric.transpose(C), C)), Numeric.transpose(v)), u) Part a) (efficiency) -------------------- Apart from the syntactic inconveniences, the Matlab above will execute __considerably__ faster if C is a rather larger matrix (say 1000x1000). There are AFAIK two chief reasons for this: 1. Matlab uses a highly optimized ATLAS[3] BLAS routine to compute the dot-product, whereas Numeric uses a straight-forward home-brewn solution. 2. Numeric performs unnecessary transpose operations (prior to 20.3, I think, more about this later). The transpose operation is really damaging with big matrices, because it creates a complete copy, rather than trying to do something lazy (if your memory is already almost half filled up with (matrix) C, then creating a (in principle superfluous) transposed copy is not going to do you any good). 
The above C' * C actually creates, AFAIK, _3_ versions of C, 2 of them transposed (prior to 20.3; dot(a,b) translates into innerproduct(a, swapaxes(b, -1, -2)) In newer versions of Numeric, this is replaced by multiarray.matrixproduct(a, b) which has the considerable advantage that it doesn't create an unnecessary copy and the considerable disadvantage that it seems to be factor 3 or so slower than the old (already not blazingly fast) version for large Matrix x Matrix multiplication, (see timing results [1])). Now fortunately, we made some significant progress on the performance issues. My supervisor valiantly patched Numeric's multiarrayobject.c to use ATLAS for {scalar,vector,matrix} * {scalar,vector,matrix} multiplications. As a consequence the multiplication of pair of 1000x1000 matrices is computed more than _40_ times faster on my athlon machine (see [1], it seems to work fine but both the timing results as well as the code should be treated with a bit of caution). The BLAS-enabled dot takes almost exactly the same time as Matlab doing the same thing. In my case, this makes the difference between waiting several minutes and a day, and between deciding to use python or something faster but more painful. Although to benefit from this patch, one has to install ATLAS this is straight-forward (see [2] and [3] for pointers). In addition it is easy to install Lapack at the same time, as ATLAS provides significantly optimised versions of some of the Lapack routines too. Using Lapack rather than the lapack_lite that comes with Numeric also has the advantage that it is extensively tested and works very reliably (e.g. the lapack_lite routine that is called by Heigenvalues is pretty broken and searching the net suggests that this has been the case for some time now). This still leaves point 2), transposes, as a potential source for improvement. As far as I understand, the Lapack routines operate on flattened arrays, i.e. the only difference on calling some routine on a matrix and its transpose is that one has to pass a different additional parameters in the second case that specifies the structure of the matrix. Therefore if one were to program C' * C in fortran, only C itself needed to exist in memory. Would it therefore be possible that operations like transpose, slicing etc. be carried out lazily (i.e. so that only _modifying_ the result of the transpose or slice or passing it to some function for some reason needs a real copy resulted in an actual copy operation being performed)? As far as I understand the slicing in numarray is no longer simple aliasing (which is a good thing, since it make arrays more compatible with lists), are there any plans for implementing such an on-demand scheme? Part b) (syntax) ________________ As I said, dot(dot(dot(M, dot(transpose(C), C)), transpose(v)), u) is pretty obscure compared to M * (C' * C) * V' * u) Although linear algebra is not the be all and end all of numerical calculations, many numerical algorithms end up involving a good deal of linear algebra. The awkwardness of python/Numeric's notation for common linear algebra operation severely impairs, in my opinion, the utility of writing, using and debugging algorithms in python/Numeric. This is a particular shame given python's generally elegant and expresssive syntax. So far, I've thought of the following possible solutions: 0. Do nothing: Just live with the awkward syntax. 1. Comment: add a comment like the Matlab line above above complex expressions. 2. Wait for fix. Hope that either: 2.1. 
numarray will overload '*' as matrix multiplication, rather than elementwise multiplication (highly unlikely, I suspect). 2.2. the core python team accepts one of the proposed operator additions (or even, fond phantasy, allows users to add new operators) 3. Wrap: create a DotMatrix class that overloads '*' to be dot and maybe self.t to return the transpose -- this also means that all the numerical libraries I frequently use need to be wrapped. 4. Preprocess: Write some preprocessor that translates e.g. A ~* B in dot(A,B). 5. Hack: create own customized version of array that overloads e.g. '<<' to mean dot-product. The possible downsides: 0. Do nothing: Seems unacceptable (unless as a temporary solution), because the code becomes much less readable (and consequently harder to understand and more error-prone), thus forgoing one of the main reasons to choose python in the first place. 1. Comment: Unsatisfactory, because there is no way to gurantee that comment and code are in synch, which very likely will lead to difficult to find bugs. 2. Wait for a fix: Either would be nice (I could live with having to write multiply(a,b) -- I suppose however some other people can't, because I can't think of another reason why Numeric didn't overload * for matrixproduct in the first place, moreover it would mean a significant interface change (and code breakage), so I guess it is rather unlikely). From what I gather from searching the net and looking at peps there is also not much of a chance to get a new operator anytime soon. 3. Wrap: Promises to be a maintenance nightmare (BTW: will the new array class be subclassable?), but otherwise looks feasible. Has anyone done this? 4. Preprocess: Would have the advantage that I could always get back to "standard" python/Numeric code, but would involve quite a bit of work and might also break tools that parse python. 5. Hack array: Seems evil, but in principle feasible, because AFAIK '<<' isn't used in Numeric itself and hopefully it wouldn't involve too much work. However, given a free choice '<<' is hardly the operator I would choose to represent the dot product. I am not completely sure what the policy and rationale behind the current division of labor between functions, methods and attributes in numarray is but as far as the lack of a missing transposition postfix operator is concerned, one reasonable approach to make the transpose operation more readable (without the need to change anything in python itself) would seem to me to provide arrays with an attribute .t or .T so that: a.t == transpose(a) While this would admittedly be a bit of a syntactic hack, the operation is so commonplace and the gain in readability (in my eyes) is so significant that it would seem to me to merit serious consideration (maybe a.cc or so might also be an idea for complex conjugate notation, but I'm less convinced about that?). If the addition of a single operator for the exclusive benefit of Numeric users is rejected by the core python team, maybe it's worthwhile lobbying for some mechanism that allows users to define new operators (like e.g. in Haskell)... OK, that's all -- thanks for bearing with me through this long email. 
Suggestions and comments about the patch [2] and possible solutions to issues raised are greatly appreciated, regards, alexander schmolck Footnotes: [1] timing results for the patched version of Numeric[2] comparing new and old 'dot' performance: http://www.dcs.ex.ac.uk/~aschmolc/Numeric/TimingsForAtlasDot.txt [2] A patch for Numeric 21.1b that speeds up Numeric's dot function by several factors for large matrices can be found here: http://www.dcs.ex.ac.uk/~aschmolc/Numeric/ [3] ATLAS (http://math-atlas.sourceforge.net/) is a project to provide a highly optimized version of BLAS (Basic Linear Algebra Subroutines, a standard and thouroughly tested implementation of basic linear algebra operations) and some LAPACK routines. The charm of ATLAS is that it is platform-independent and yet highly optimized, which it achieves by essentially fine tuning a number of parameters until optimum performance for the _particular_ machine on which it is built is reached. As a consequence complete builds can take some time, but binary versions for ATLAS for common processors are available from http://www.netlib.org/atlas/archives (moreover, even if one decides to build ATLAS oneself, the search space can be considerably cut down if one accepts the suggested "experience" values offered during the make process). -- Alexander Schmolck Postgraduate Research Student Department of Computer Science University of Exeter A.Schmolck at gmx.net http://www.dcs.ex.ac.uk/people/aschmolc/ From cgw at alum.mit.edu Thu Feb 28 12:28:41 2002 From: cgw at alum.mit.edu (Charles G Waldman) Date: Thu Feb 28 12:28:41 2002 Subject: [Numpy-discussion] numarray interface and performance issues (for dot product and transpose) In-Reply-To: References: Message-ID: <15486.37346.523821.173804@nyx.dyndns.org> A.Schmolck writes: > > Two essential matrix operations (matrix-multiplication and transposition > (which is what I am mainly using) are both considerably > > a) less efficient and > b) less notationally elegant Your comments about efficiency are well-taken. I have (in a previous life) done work on efficient (in terms of virtual memory access / paging behavior) transposes of large arrays. (Divide and conquer). Anyhow - if there were support for the operation of A*B' (and A'*B) at the C level, you wouldn't need to ever actually have a copy of the transposed array in memory - you would just exchange the roles of "i" and "j" in the computation... > 3. Wrap: create a DotMatrix class that overloads '*' to be dot and maybe > self.t to return the transpose -- this also means that all the numerical > libraries I frequently use need to be wrapped. I guess you haven't yet stumbled across the Matrix.py that comes with Numeric - it overrides "*" to be the dot-product. Unfortunately I don't see a really easy way to simplify the Transpose operator - at the very least you could do T = Numeric.transpose and then you're just writing T(A) instead of the long-winded version. Interestingly, the "~" operator is available, but it calls the function "__invert__". I guess it would be too weird to have ~A denote the transpose? Right now you get an error - one could set things up so that ~A was the matrix inverse of A, but we already have the A**-1 notation (among others) for that... 
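The two shortcuts mentioned above look roughly like this (an illustrative sketch, assuming the Matrix class shipped in Numeric's Matrix.py):

    import Numeric, Matrix

    T = Numeric.transpose                  # one-letter transpose shorthand
    a = Numeric.array([[1., 2.], [3., 4.]])
    print T(a)                             # same as Numeric.transpose(a)

    M = Matrix.Matrix(a)
    C = Matrix.Matrix([[0., 1.], [1., 0.]])
    print M * C                            # '*' is the matrix product here, not elementwise

Note that Matrix objects wrap ordinary arrays, so passing them through plain Numeric functions may hand back plain arrays.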
From tim.hochberg at ieee.org Thu Feb 28 13:13:17 2002 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Thu Feb 28 13:13:17 2002 Subject: [Numpy-discussion] numarray interface and performance issues (for dot product and transpose) References: Message-ID: <198c01c1c09b$4e981c60$74460344@cx781526b> Hi Alexander, [SNIP] > Two essential matrix operations (matrix-multiplication and transposition > (which is what I am mainly using) are both considerably > > a) less efficient and > b) less notationally elegant [Interesting stuff about notation and efficiency] > Or, even worse if one doesn't want to pollute the namespace: > > Numeric.dot(Numeric.dot(Numeric.dot(Numeric.M, > Numeric.dot(Numeric.transpose(C), C)), Numeric.transpose(v)), u) I compromise and use np.dot, etc. myself, but that's not really relevant to the issue at hand. [More snippage] > 2. Numeric performs unnecessary transpose operations (prior to 20.3, I think, > more about this later). The transpose operation is really damaging with big > matrices, because it creates a complete copy, rather than trying to do > something lazy (if your memory is already almost half filled up with > (matrix) C, then creating a (in principle superfluous) transposed copy is > not going to do you any good). The above C' * C actually creates, AFAIK, > _3_ versions of C, 2 of them transposed (prior to 20.3; I think you're a little off track here. The transpose operation doesn't normally make a copy, it just creates a new object that points to the same data, but with different stride values. So the transpose shouldn't be slow or take up more space. Numarray may well make a copy on transpose, I haven't looked into that, but I assume that at this point you are still talking about the old Numeric from the look of the code you posted. > > dot(a,b) > > translates into > > innerproduct(a, swapaxes(b, -1, -2)) > > In newer versions of Numeric, this is replaced by > > multiarray.matrixproduct(a, b) > > which has the considerable advantage that it doesn't create an unnecessary > copy and the considerable disadvantage that it seems to be factor 3 or so > slower than the old (already not blazingly fast) version for large Matrix x > Matrix multiplication, (see timing results [1])). Like I said, I don't think either of these should be making an extra copy unless it's happening inside multiarray.innerproduct or multiarray.matrixproduct. I haven't looked at the code for those in a _long_ time and then only glancingly, so I have no idea about that. [Faster! with Atlas] Sounds very cool. > > > As I said, > > dot(dot(dot(M, dot(transpose(C), C)), transpose(v)), u) > > is pretty obscure compared to > > M * (C' * C) * V' * u) Of the options that don't require new operators I'm somewhat fond of defining __call__ to be matrix multiply. If you toss in .t notation that you mention below, you get: (M)( (C.t)(C) ) (V.t)(u) Not perfect, but not too bad either. Note that I've tossed in some extra parentheses to make the above look better. It could actually be written: M( C.t(C) )(V.t)(u) But I think that's more confusing as it looks too much like a function call. (Although there is some mathematical precedent for viewing matrix multiplication as a function.) I'm a little iffy on the '.t' notation as it could get out of hand. Personally I could use conjugate as much as transpose, and it's a similar operation -- would we also add '.c'? And possibly '.s' and '.h' for skew and Hermitian matrices? That might be a little much. 
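[The point about transpose being cheap in Numeric is easy to check interactively. With Numeric 21.x the snippet below prints 99, because the transpose is only a re-striding of the same data; numarray's behaviour, as noted above, would have to be checked separately.]

import Numeric

a = Numeric.array([[1, 2, 3], [4, 5, 6]])
b = Numeric.transpose(a)      # no data are copied, only the strides differ

b[0, 1] = 99                  # write through the transposed view...
print a[1, 0]                 # ...and the original sees it: prints 99

# A copy is only forced when contiguous storage is actually needed,
# for example by ravel() on the non-contiguous transpose:
c = Numeric.ravel(b)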
The __call__ idea was not particularly popular last time, but I figured I'd toss it out again as an easy-to-implement possibility. -tim From pearu at cens.ioc.ee Thu Feb 28 13:32:09 2002 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Thu Feb 28 13:32:09 2002 Subject: [Numpy-discussion] Re: [SciPy-user] numarray interface and performance issues (for dot product and transpose) In-Reply-To: Message-ID: Hi, On 28 Feb 2002, A.Schmolck wrote: > So far, I've thought of the following possible solutions: > > 0. Do nothing: > Just live with the awkward syntax. Let me add a subsolution here: 0.1 Wait for scipy to mature (or better yet, help it to do that). Scipy already provides wrappers to both Fortran and C LAPACK and BLAS routines, though currently they are under revision. With the new wrappers to these routines you can optimize your code fragments as flexibly as if using them from C or Fortran. In principle, one could mix Fortran and C routines (i.e. the corresponding wrappers) so that one avoids all unnecessary transpositions. All matrix operations can be performed in-situ if so desired. Regards, Pearu From neelk at cswcasa.com Thu Feb 28 13:49:29 2002 From: neelk at cswcasa.com (Krishnaswami, Neel) Date: Thu Feb 28 13:49:29 2002 Subject: [Numpy-discussion] numarray interface and performance issues (for dot product and transpose) Message-ID: a.schmolck at gmx.net [mailto:a.schmolck at gmx.net] wrote: > > Numeric is an impressively powerful and in many respects easy and > comfortable to use package (e.g. it's sophisticated slicing > operations, not to mention the power and elegance of the underlying > python language) and one would hope that it can one day replace Matlab > (which is both expensive and a nightmare as a programming language) as > a standard platform for numerical calculations. I'm in much the same boat, only with Gauss as the language I want to replace. > There is however a problem that, for the use to which I want > to put Numeric, runs deeper and provides me with quite a headache: > > Two essential matrix operations (matrix-multiplication and > transposition (which is what I am mainly using) are both considerably > > a) less efficient and > b) less notationally elegant > > under Numeric than under Matlab. These are my two problems as well. I can live with the clumsy function-call interface to the matrix ops, but the loss of efficiency is a real killer for me. In my code, Gauss is about 8-10x faster than Numpy, which is a killer speed loss. (And Gauss is modestly slower than C, though I don't care about this because Gauss is fast enough.) Right now, I have a data-mining program that I prototyped in Numpy and am now rewriting in C. Because Numpy isn't fast enough, I have wasted close to a week on this rewrite. This sounds bitter, but it's not meant to. I have to deploy on VMS, and after we had gotten Numpy working on OpenVMS I really hoped that the Alpha would be fast enough that I could just use the Python prototype. -- Neel Krishnaswami neelk at cswcasa.com 
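[Since much of this thread turns on how big the multiplication gap really is, a minimal timing harness is easy to write. The sketch below only measures the stock Numeric dot(); it has nothing to do with the ATLAS patch or the figures in footnote [1], and the sizes are kept modest.]

import time
import Numeric

def time_dot(n, repeats=3):
    # Time n x n matrix multiplication with the stock Numeric dot().
    a = Numeric.ones((n, n), 'd')
    b = Numeric.ones((n, n), 'd')
    best = None
    for i in range(repeats):
        start = time.clock()
        c = Numeric.dot(a, b)
        elapsed = time.clock() - start
        if best is None or elapsed < best:
            best = elapsed
    return best

for n in (100, 200, 400):
    print n, time_dot(n)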
The developers of SciPy are quite concerned about speed, hence the required linking to ATLAS. As Pearu mentioned, all of the BLAS will be available (much of it is). This will enable very efficient algorithms. The question of notational elegance is stickier because we just can't add new operators. The solution I see is to use other classes. Right now, the Numeric array is an array of numbers (it is not a vector or a matrix) and that is why it has the operations it does. The Matrix class (delivered with Numeric) creates a Matrix object that uses a Numeric array as its array of numbers. It overloads the * operator and defines .T and .H for transpose and Hermitian transpose respectively. This requires explicitly making your objects matrices (not a bad thing in my book, as not all 2-D arrays fit perfectly in a matrix algebra).
> The following Matlab fragment
> M * (C' * C) * V' * u
This becomes (using SciPy, which defines Mat = Matrix.Matrix and could later redefine it to use the ATLAS libraries for matrix multiplication):
C, V, u, M = apply(Mat, (C, V, u, M))
M * (C.H * C) * V.H * u
Not bad... and with a Mat class that uses the ATLAS BLAS (not a very hard thing to do now), this could be made as fast as MATLAB. Perhaps, as a start, we could look at how to make the current Numeric use the BLAS, if it is installed, to do dot on real and complex arrays (I know you can get rid of lapack_lite and use your own LAPACK), but the dot function is defined in multiarray and would have to be modified to use the BLAS instead of its own homegrown algorithm. -Travis From jmiller at stsci.edu Thu Feb 28 16:01:03 2002 From: jmiller at stsci.edu (Todd Miller) Date: Thu Feb 28 16:01:03 2002 Subject: [Numpy-discussion] Numarray-0.2 issues with GCC and SPARC Message-ID: <3C7EC434.30904@stsci.edu> Hi, I'm Todd Miller and I work at STSCI on Numarray. Two people have reported problems with compiling Numarray-0.11 or 0.2 with GCC on SPARC. There are two problems:
1. Compiling the _ufuncmodule.c using gcc-2.95 on a SPARC (with the default switches) uses tons of virtual memory and typically fails.
a. This can be avoided by adding the compilation flags: EXTRA_COMPILE_ARGS=["-O0", "-Wno-return-type"] to your setup.py when compiling *numarray*.
b. Alternately, you can wait for numarray-0.3 which will partition the ufuncmodule into smaller compilation units. We suspect these will avoid the problem naturally and permit the use of optimization.
c. Lastly, if you have Sun cc, you might want to try using it instead of gcc. This is what we do at STSCI. You need to recompile Python itself if you want to do this and your python was already compiled with gcc.
2. Python compiled with gcc generates misaligned storage within buffer objects. Numarray-0.2 is dependent on the problematic variant of the buffer object, so if you want to use Float64 or Complex128 on a SPARC you may experience core dumps.
a. I have a non-portable patch which worked for me with gcc-2.95 on SPARC. I can e-mail this to anyone interested. Apply this patch and recompile *python*.
b. You might be able to fix this with gcc compilation switches for Python: try -munaligned-doubles and recompile *python*.
c. Numarray-0.3 will address this issue by providing its own minimal memory object which features correctly aligned storage. This solution will not require recompiling python, but won't be available until numarray-0.3.
d. Python compiled with Sun cc using the default switches doesn't manifest this bug. If you have Sun cc, you may want to recompile *python* using that. 
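[For anyone unsure where flags like the ones in workaround 1a normally end up, this is the generic distutils pattern. The module names below are purely illustrative, and numarray's actual setup.py wires its EXTRA_COMPILE_ARGS up in its own way.]

# Generic distutils illustration only -- not numarray's real setup.py.
from distutils.core import setup, Extension

EXTRA_COMPILE_ARGS = ["-O0", "-Wno-return-type"]   # workaround 1a above

setup(name="example",
      version="0.1",
      ext_modules=[Extension("_examplemodule",
                             sources=["_examplemodule.c"],
                             extra_compile_args=EXTRA_COMPILE_ARGS)])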
In general, I think the "better part of valor" is probably to wait 3 weeks for numarray-0.3 when both issues should be addressed. If you want to try numarray-0.2 now with GCC on a SPARC, I hope some of these ideas work for you. Todd -- Todd Miller jmiller at stsci.edu STSCI / SSG (410) 338 4576 
From mathew at fugue.jpl.nasa.gov Fri Feb 15 12:50:15 2002 From: mathew at fugue.jpl.nasa.gov (mathew at fugue.jpl.nasa.gov) Date: Fri Feb 15 12:50:15 2002 Subject: [Numpy-discussion] subsampling Message-ID: <200202152041.MAA14166@fugue.jpl.nasa.gov> John J. Lee writes: > I think -- correct me if I'm wrong -- he was asking about interpolation. I think that only Mr Yeates knows for sure! I read the query a few times and I think he's looking for straight-ahead subsampling without interpolation. Yes, I was just asking about subsampling although I did start to think about doing bilinear interpolation. But I was hoping to subsample using built in Numpy functions and .... your solution fits the bill!! Thank you! Mathew From Stefan.Heinrichs at uni-konstanz.de Fri Feb 15 13:21:54 2002 From: Stefan.Heinrichs at uni-konstanz.de (Stefan Heinrichs) Date: Fri Feb 15 13:21:54 2002 Subject: [Numpy-discussion] Interface for numpy C-API <-> simple C++ matrix class Message-ID: <20020215211700.GA22189@gibbs.physik.uni-konstanz.de> Hello, we would like to write routines processing numerical python arrays in C++, so that at least boundary checking can be enabled at runtime. While there are a lot of matrix libraries available for C++, I could not find the glue that interfaces such a library to the C-API of numerical python. Seamless access to a minimal C++ library would make the C++ part of programming much easier. Has anyone already written some wrapper/glue code? Thanks and best regards, Stefan -- ------------------------------------------------------------------- Email: Stefan.Heinrichs at uni-konstanz.de Address: Fakulaet fuer Physik, Universitaet Konstanz, Universitaetsstr.10, 78457 Konstanz, Germany Phone: +49 7531 88 3814 From rob at pythonemproject.com Sat Feb 16 10:04:03 2002 From: rob at pythonemproject.com (Rob) Date: Sat Feb 16 10:04:03 2002 Subject: [Numpy-discussion] Converting from FORTRAN Equivalence to a Numpy alternative Message-ID: <3C6E9E25.7C556248@pythonemproject.com> I hoped there might be a combo FORTRAN/Numpy hacker that might help me with this. I have the following statements in a FORTRAN routine that I am converting to Numpy. I am wondering how to handle the EQUIVALENCE statement. As I understand it, ARL1[1] should be equal to the first item in the memory block for AR1. But not knowing how FORTRAN allocates arrays, its hard to tell what to do in Numpy. I tried ARL1=ravel(AR1) but that didn't work. Similarly ARL=AR[:,1,1} didn't work. I'm lost. Thanks in advance for any help. Rob. 
#COMMON /GGRID/ AR1(11,10,4),AR2(17,5,4),AR3(9,8,4),EPSCF,DXA(3),& # &DYA(3),XSA(3),YSA(3),NXA(3),NYA(3 #DIMENSION ARL1(1), ARL2(1), ARL3(1) #EQUIVALENCE (ARL1,AR1), (ARL2,AR2), (ARL3,AR3), (XS2,XSA(2)), (YS3,YSA(3)) -- The Numeric Python EM Project www.pythonemproject.com From rob at pythonemproject.com Sat Feb 16 11:40:01 2002 From: rob at pythonemproject.com (Rob) Date: Sat Feb 16 11:40:01 2002 Subject: [Numpy-discussion] I solved my fortran equivalence to Python conversion Message-ID: <3C6EB4B5.6E67EECE@pythonemproject.com> Here is the code: ARL1=zeros((441),Complex) ARL2=zeros((341),Complex) ARL3=zeros((289),Complex) #DIMENSION ARL1(1), ARL2(1), ARL3(1) #EQUIVALENCE (ARL1,AR1), (ARL2,AR2), (ARL3,AR3), (XS2,XSA(2)), (YS3,YSA(3)) AR1=swapaxes(AR1,0,2) AR2=swapaxes(AR2,0,2) AR3=swapaxes(AR3,0,2) ARL1[1:]=ravel(AR1[1:,1:,1:]) ARL2[1:]=ravel(AR2[1:,1:,1:]) ARL3[1:]=ravel(AR3[1:,1:,1:]) -- The Numeric Python EM Project www.pythonemproject.com From kragen at pobox.com Mon Feb 18 12:13:19 2002 From: kragen at pobox.com (Kragen Sitaker) Date: Mon Feb 18 12:13:19 2002 Subject: [Numpy-discussion] memory-mapped Numeric arrays: arrayfrombuffer version 2 Message-ID: <20020218200719.AB961BDC5@panacea.canonical.org> (I thought I had sent this mail on January 30, but I guess I was mistaken.) Eric Nodwell writes: > Since I have a 2.4GB data file handy, I thought I'd try this > package with it. (Normally I process this data file by reading > it in a chunk at a time, which is perfectly adequate.) Not > surprisingly, it chokes: Yep, that's pretty much what I expected. I think that adding code to support mapping some arbitrary part of a file should be fairly straightforward --- do you want to run the tests if I write the code? > File "/home/eric/lib/python2.2/site-packages/maparray.py", line 15, > in maparray > m = mmap.mmap(fn, os.fstat(fn)[stat.ST_SIZE]) > OverflowError: memory mapped size is too large (limited by C int) This error message's wording led me to something that was *not* what I expected. That's a sort of alarming message --- it suggests that it won't work on >2G files even on LP64 systems, where longs and pointers are 64 bits but ints are 32 bits. The comments in the mmap module say: The map size is restricted to [0, INT_MAX] because this is the current Python limitation on object sizes. Although the mmap object *could* handle a larger map size, there is no point because all the useful operations (len(), slicing(), sequence indexing) are limited by a C int. Horrifyingly, this is true. Even the buffer interface function arrayfrombuffer uses to get the size of the buffer return int sizes, not size_t sizes. This is a serious bug in the buffer interface, IMO, and I doubt it will be fixed --- the buffer interface is apparently due for a revamp soon at any rate, so little changes won't be welcomed, especially if they break binary backwards compatibility, as this one would on LP64 platforms. Fixing this, so that LP64 Pythons can mmap >2G files (their birthright!), is a bit of work --- probably a matter of writing a modified mmap() module that supports a saner version of the buffer interface (with named methods instead of a type object slot), and can't be close()d, to boot. Until then, this module only lets you memory-map files up to two gigs. > (details: Python 2.2, numpy 20.3, Pentium III, Debian Woody, Linux > kernel 2.4.13, gcc 2.95.4) My kernel is 2.4.13 too, but I don't have any large files, and I don't know whether any of my kernel, my libc, or my Python even support them. 
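[Stepping back to Rob's EQUIVALENCE question two messages up: the key fact is that Fortran stores AR1(11,10,4) column-major, with the first index varying fastest, so the flat array that the EQUIVALENCE exposes is the ravel of the reversed-axis view. A small sketch with stand-in data and 0-based indexing -- not Rob's exact arrays or his 1-based padding:]

# Sketch only: how Fortran's column-major EQUIVALENCE layout relates to
# a Numeric array ar1 whose element ar1[i,j,k] holds AR1(i+1,j+1,k+1).
from Numeric import arange, ravel, reshape, transpose

ar1 = reshape(arange(11 * 10 * 4), (11, 10, 4))   # stand-in data

# Fortran's flat ARL1: first index fastest, i.e. the ravel of the
# reversed-axis view of ar1.
arl1 = ravel(transpose(ar1))                      # length 440

# Spot check: Fortran's ARL1(1 + i + 11*j + 110*k) is AR1(i+1,j+1,k+1).
i, j, k = 2, 1, 0
assert arl1[i + 11*j + 110*k] == ar1[i, j, k]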
> I'm not a big C programmer, but I wonder if there is some way for > this package to overcome the 2GB limit on 32-bit systems. That > could be useful in some situations. I don't know, but I think it would probably require extensive code changes throughout Numpy. -- Kragen Sitaker The sages do not believe that making no mistakes is a blessing. They believe, rather, that the great virtue of man lies in his ability to correct his mistakes and continually make a new man of himself. -- Wang Yang-Ming From mathew at fugue.jpl.nasa.gov Mon Feb 18 12:26:58 2002 From: mathew at fugue.jpl.nasa.gov (Mathew Yeates) Date: Mon Feb 18 12:26:58 2002 Subject: [Numpy-discussion] memory-mapped Numeric arrays: arrayfrombuffer version 2 In-Reply-To: Your message of "Mon, 18 Feb 2002 15:07:19 EST." <20020218200719.AB961BDC5@panacea.canonical.org> Message-ID: <200202182023.MAA25723@fugue.jpl.nasa.gov> Has anyone checked out VMaps at http://snafu.freedom.org/Vmaps/ ?? This might be what you're looking for. Mathew > (I thought I had sent this mail on January 30, but I guess I was > mistaken.) > > Eric Nodwell writes: > > Since I have a 2.4GB data file handy, I thought I'd try this > > package with it. (Normally I process this data file by reading > > it in a chunk at a time, which is perfectly adequate.) Not > > surprisingly, it chokes: > > Yep, that's pretty much what I expected. I think that adding code to > support mapping some arbitrary part of a file should be fairly > straightforward --- do you want to run the tests if I write the code? > > > File "/home/eric/lib/python2.2/site-packages/maparray.py", line 15, > > in maparray > > m = mmap.mmap(fn, os.fstat(fn)[stat.ST_SIZE]) > > OverflowError: memory mapped size is too large (limited by C int) > > This error message's wording led me to something that was *not* what I > expected. > > That's a sort of alarming message --- it suggests that it won't work > on >2G files even on LP64 systems, where longs and pointers are 64 > bits but ints are 32 bits. The comments in the mmap module say: > > The map size is restricted to [0, INT_MAX] because this is the current > Python limitation on object sizes. Although the mmap object *could* handle > a larger map size, there is no point because all the useful operations > (len(), slicing(), sequence indexing) are limited by a C int. > > Horrifyingly, this is true. Even the buffer interface function > arrayfrombuffer uses to get the size of the buffer return int sizes, > not size_t sizes. This is a serious bug in the buffer interface, IMO, > and I doubt it will be fixed --- the buffer interface is apparently > due for a revamp soon at any rate, so little changes won't be > welcomed, especially if they break binary backwards compatibility, as > this one would on LP64 platforms. > > Fixing this, so that LP64 Pythons can mmap >2G files (their > birthright!), is a bit of work --- probably a matter of writing a > modified mmap() module that supports a saner version of the buffer > interface (with named methods instead of a type object slot), and > can't be close()d, to boot. > > Until then, this module only lets you memory-map files up to two gigs. > > > (details: Python 2.2, numpy 20.3, Pentium III, Debian Woody, Linux > > kernel 2.4.13, gcc 2.95.4) > > My kernel is 2.4.13 too, but I don't have any large files, and I don't > know whether any of my kernel, my libc, or my Python even support > them. > > > I'm not a big C programmer, but I wonder if there is some way for > > this package to overcome the 2GB limit on 32-bit systems. 
That > > could be useful in some situations. > > I don't know, but I think it would probably require extensive code > changes throughout Numpy. > > -- > Kragen Sitaker > The sages do not believe that making no mistakes is a blessing. They believe, > rather, that the great virtue of man lies in his ability to correct his > mistakes and continually make a new man of himself. -- Wang Yang-Ming > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From nwagner at mecha.uni-stuttgart.de Tue Feb 19 05:33:05 2002 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Tue Feb 19 05:33:05 2002 Subject: [Numpy-discussion] Hilbert transform Message-ID: <3C7262DA.D1A551B7@mecha.uni-stuttgart.de> Hi, I am looking for a Numpy function that performs the Hilbert transform of a scalar time series. Thanks in advance. Nils Wagner From mathew at fugue.jpl.nasa.gov Fri Feb 22 16:20:02 2002 From: mathew at fugue.jpl.nasa.gov (Mathew Yeates) Date: Fri Feb 22 16:20:02 2002 Subject: [Numpy-discussion] memory not being freed Message-ID: <200202230019.QAA27573@fugue.jpl.nasa.gov> Hi I'm having problems with garbage collection I wrote an extension which creates an array and returns it foo() { arr = (PyArrayObject *) PyArray_FromDims( ..... ret = Py_BuildValue("O", arr); return ret; } but now if I do while 1: a=foo() memory is never free'd. I've even tried explicitly calling gc.collect and adding del(a) after a=foo. Is the problem that Py_BuildValue increases the reference count? Mathew From reggie at merfinllc.com Fri Feb 22 17:26:06 2002 From: reggie at merfinllc.com (Reggie Dugard) Date: Fri Feb 22 17:26:06 2002 Subject: [Numpy-discussion] memory not being freed In-Reply-To: <200202230019.QAA27573@fugue.jpl.nasa.gov> References: <200202230019.QAA27573@fugue.jpl.nasa.gov> Message-ID: <20020223.1251000@auk.merfinllc.com> I believe PyArray_FromDims() returns a new reference to the object (arr) and that the Py_BuildValue creates another reference so that you've got two references to that array and python is only going to DECREF one when it's done. I would suggest either 1) using the 'N' format character to Py_BuildValue so that another reference isn't created or 2) explicitly calling Py_DECREF(arr) just before you return. Reggie >>>>>>>>>>>>>>>>>> Original Message <<<<<<<<<<<<<<<<<< On 2/22/02, 4:19:15 PM, Mathew Yeates wrote regarding [Numpy-discussion] memory not being freed: > Hi > I'm having problems with garbage collection > I wrote an extension which creates an array > and returns it > foo() { > arr = (PyArrayObject *) PyArray_FromDims( ..... > ret = Py_BuildValue("O", arr); > return ret; > } > but now if I do > while 1: > a=foo() > memory is never free'd. I've even tried explicitly calling gc.collect and > adding del(a) after a=foo. > Is the problem that Py_BuildValue increases the reference count? > Mathew > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From paul at pfdubois.com Fri Feb 22 17:29:06 2002 From: paul at pfdubois.com (Paul F Dubois) Date: Fri Feb 22 17:29:06 2002 Subject: [Numpy-discussion] memory not being freed In-Reply-To: <200202230019.QAA27573@fugue.jpl.nasa.gov> Message-ID: <000201c1bc09$4c1f1ec0$1001a8c0@NICKLEBY> You just want ret = (PyObject*) arr; I assume it is PyObject *foo() and you just didn't show it. 
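[Putting Reggie's and Paul's advice together, the fragment becomes something like the sketch below (in C, like the original code). The dimensions, type and function name are placeholders, since the original function isn't shown in full; the point is only who owns the reference.]

/* PyArray_FromDims() already returns a new reference, so either hand
 * that single reference back directly, or let Py_BuildValue steal it
 * with the "N" format code instead of adding another one with "O".
 */
#include "Python.h"
#include "arrayobject.h"    /* Numeric header; exact include path varies */

static PyObject *
foo(PyObject *self, PyObject *args)
{
    int dims[2] = {3, 4};
    PyArrayObject *arr;

    if (!PyArg_ParseTuple(args, ""))
        return NULL;

    arr = (PyArrayObject *) PyArray_FromDims(2, dims, PyArray_DOUBLE);
    if (arr == NULL)
        return NULL;

    /* Simplest case: return just the array, with its one reference. */
    return (PyObject *) arr;

    /* If the array is returned together with other values (as asked
     * later in this thread), "N" avoids the extra reference:
     *
     *     return Py_BuildValue("Ni", arr, 42);
     */
}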
-----Original Message----- From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of Mathew Yeates Sent: Friday, February 22, 2002 4:19 PM To: numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] memory not being freed Hi I'm having problems with garbage collection I wrote an extension which creates an array and returns it foo() { arr = (PyArrayObject *) PyArray_FromDims( ..... ret = Py_BuildValue("O", arr); return ret; } but now if I do while 1: a=foo() memory is never free'd. I've even tried explicitly calling gc.collect and adding del(a) after a=foo. Is the problem that Py_BuildValue increases the reference count? Mathew _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From mathew at fugue.jpl.nasa.gov Fri Feb 22 17:32:06 2002 From: mathew at fugue.jpl.nasa.gov (Mathew Yeates) Date: Fri Feb 22 17:32:06 2002 Subject: [Numpy-discussion] memory not being freed In-Reply-To: Your message of "Fri, 22 Feb 2002 17:27:44 PST." <000201c1bc09$4c1f1ec0$1001a8c0@NICKLEBY> Message-ID: <200202230131.RAA29048@fugue.jpl.nasa.gov> > You just want ret = (PyObject*) arr; > Paul okay, I tried that and it worked. But what about the circumstance where I want to return the array AND other values. So then I would build a list and return it. Again, won't I have too many references to may array? Thanks Mathew From cavallo at kip.uni-heidelberg.de Tue Feb 26 08:45:17 2002 From: cavallo at kip.uni-heidelberg.de (cavallo at kip.uni-heidelberg.de) Date: Tue Feb 26 08:45:17 2002 Subject: [Numpy-discussion] Interface for numpy C-API <-> simple C++ matrix class In-Reply-To: <20020215211700.GA22189@gibbs.physik.uni-konstanz.de> Message-ID: On Fri, 15 Feb 2002, Stefan Heinrichs wrote: > Hello, > > we would like to write routines processing numerical python arrays in > C++, so that at least boundary checking can be enabled at runtime. > While there are a lot of matrix libraries available for C++, I could > not find the glue that interfaces such a library to the C-API of > numerical python. Seamless access to a minimal C++ library would make > the C++ part of programming much easier. That's clear i had the same problem long time ago, and i written a small wrapper to the blitz library (www.oonumerics.org): i hadn't time to iron out all the details so it isn't ready for general release, but it works. I'm courrently using some sort of template skeleton to write modules in python for my phd (it is 4d digital image processing): if you like to help me to develop it in a usable way this should be useful for other people as well. I think someone else has developed such a glue, under scipy (isn't?): just look at that. regards, antonio > > Has anyone already written some wrapper/glue code? 
> > Thanks and best regards, > > Stefan > > -- > > > ------------------------------------------------------------------- > Email: Stefan.Heinrichs at uni-konstanz.de > Address: Fakulaet fuer Physik, Universitaet Konstanz, > Universitaetsstr.10, 78457 Konstanz, Germany > Phone: +49 7531 88 3814 > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From eric at enthought.com Tue Feb 26 11:48:13 2002 From: eric at enthought.com (eric) Date: Tue Feb 26 11:48:13 2002 Subject: [Numpy-discussion] Interface for numpy C-API <-> simple C++ matrix class References: Message-ID: <035a01c1bef5$c7f3c070$6b01a8c0@ericlaptop> Weave optionally uses blitz arrays to represent NumPy objects: www.scipy.org/site_content/weave Maybe something there will be of use. eric > > > On Fri, 15 Feb 2002, Stefan Heinrichs wrote: > > > Hello, > > > > we would like to write routines processing numerical python arrays in > > C++, so that at least boundary checking can be enabled at runtime. > > While there are a lot of matrix libraries available for C++, I could > > not find the glue that interfaces such a library to the C-API of > > numerical python. Seamless access to a minimal C++ library would make > > the C++ part of programming much easier. > > That's clear i had the same problem long time ago, > and i written a small wrapper to the blitz library (www.oonumerics.org): > i hadn't time to iron out all the details so it isn't ready for general > release, but it works. > I'm courrently using some sort of template skeleton to write modules in > python for my phd (it is 4d digital image processing): if you like to help > me to develop it in a usable way this should be useful for other people as > well. > > I think someone else has developed such a glue, under scipy (isn't?): just > look at that. > > regards, > antonio > > > > > Has anyone already written some wrapper/glue code? > > > > Thanks and best regards, > > > > Stefan > > > > -- > > > > > > ------------------------------------------------------------------- > > Email: Stefan.Heinrichs at uni-konstanz.de > > Address: Fakulaet fuer Physik, Universitaet Konstanz, > > Universitaetsstr.10, 78457 Konstanz, Germany > > Phone: +49 7531 88 3814 > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From oliphant at ee.byu.edu Tue Feb 26 17:08:15 2002 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue Feb 26 17:08:15 2002 Subject: [Numpy-discussion] Addition of masked and indexed arrays. In-Reply-To: <200202230131.RAA29048@fugue.jpl.nasa.gov> Message-ID: I'm proposing the addition of the following semantics to Numeric Arrays in order to support indexing arrays using mask and index arrays. Notice that the change required to support this behavior is minimal and all the work can be backloaded to the user. Let the array_subscript C code check to see if the index object passed in is a Tuple with the last element being a callable method. If this is the case, then the C code calls the callable method to handle the array assignment or array indexing code. So, for example one might say. 
a[a<3,mask] = 10 where mask is a callable method (or CObject) which effectively does putmask(a, a<3, 10), but has the right calling convention. This same semantics could also handle integer indexing of arrays (using multiple styles). All that's required is a simple test in Numeric that does not break any current code. What do you think? Could I add the necessary check to support this in Numeric? Do others have a better idea? I know numarray is the solution to all of our problems. But, something tells me that the current version of Numeric is going to be around for a little while and it would be nice for it to have some of these useful features. -Travis From jmiller at stsci.edu Wed Feb 27 07:09:03 2002 From: jmiller at stsci.edu (Todd Miller) Date: Wed Feb 27 07:09:03 2002 Subject: [Numpy-discussion] ANN: numarray-0.2 Message-ID: <3C7CF5F3.1090003@stsci.edu>
Numarray 0.2
------------
Numarray is a Numeric replacement which features c-code generated from python template scripts, the capacity to operate directly on arrays in files, and improved type promotion semantics. Numarray-0.2 incorporates bug fixes and one very significant change to the user interface, namely in how shape, real, imag and flat attributes are handled. Also, comparing numarray objects to None (arr == None) will no longer produce an exception; it now generates a false value (rather than a Boolean array). The next version will incorporate safety checks to prevent possible crashing of Python if a user misuses or otherwise changes private variables in numarray. There will also be a new memory object type to fix a problem discovered with buffer objects (which are currently used to allocate memory). We expect to have this version ready in 3 weeks.
WHERE
-----
Numarray is hosted by Source Forge in the same project which hosts Numeric: http://sourceforge.net/projects/numpy/ The web page for Numarray information is at: http://stsdas.stsci.edu/numarray/index.html
REQUIREMENTS
------------
numarray-0.2 requires Python 2.0 or greater.
AUTHORS, LICENSE
----------------
Numarray was written by Perry Greenfield, Rick White, Todd Miller, JC Hsu, and Phil Hodge at the Space Telescope Science Institute. Numarray is made available under a BSD-style License. See LICENSE.txt in the source distribution for details.
CHANGES-0.2:
------------
1. Added support for python-2.2 properties, specifically:
.shape, getshape(), setshape()
.flat, getflat()
.real, getreal(), setreal()
.imag, getimag(), setimag()
.imaginary, getimaginary(), setimaginary() # aliases for imag
numarray-0.2 is not 100% compatible with numarray-0.11. To port numarray-0.11 code to numarray-0.2:
a. Instances of array.reshape(x) must be changed to array.setshape(x) or array.shape = x. Users with python versions prior to 2.2 must use array.setshape(x).
b. Instances of array.real() must be replaced with array.real or array.getreal(). Users with python versions prior to 2.2 must use array.getreal().
c. Instances of array.imag() must be replaced with array.imag or array.getimag(). Users with python versions prior to 2.2 must use array.getimag().
2. Fixed bugs in some of the numarray functions related to:
a. Making copies of the input arrays when required to do so.
b. Supporting nested sequences of python numbers in lieu of arrays.
affected functions: reshape, transpose, resize, diagonal, trace, sort, argsort, ravel, ...
3. Fixed a bug in Complex arrays related to handling of type Complex64. 
This bug manifested as incorrect results for many operations with Complex64 arrays. "ones" in particular, returned obviously incorrect results for type=Complex64. 4. Fixed a bug in type conversion in "where" with y = where (equal (x, 0.), 1., x) on single precision array 'x' resulting in a double precision 'y'. This fix also affects "choose". 5. Added getrank() method and associated property "rank" for python2.2 and on. 6. Fixed a bug in nonzero where the "input screening" code was truncating small floating point values to 0. 7. Fixed a bug in all unary/binary ufuncs where output buffer offset was hard-coded to 0 for "fast" mode. This caused the following failure: >>> a=arange(10) >>> a[5:8] += 3 >>> a array([3, 4, 5, ... ]) 8. Fixed bug / added support for array([], type=ZZZ). 9. Added pi to the numarray namespace by importing it from math. 10. Added arrayrange alias for arange. 11. Improved numarray (in)equality testing by adding handling for None and re-implementing __nonzero__ as the bitwise-or of the element-wise comparison of the array with 0. 12. Fixed bug in iscontiguous() which assumed that _bytestride == _itemsize for contiguous arrays. This is not true when slicing occurs, but perhaps should be. 13. Added typecodes which are compatible with NumPy typecodes: "d":Float64,"f":Float32,"l":Int32,"D":Complex128,"F":Complex64, "b":UInt8, "c":Int8 Modified NumericType.__cmp__ to support comparisons against aliases. Modified NumericType.__repr__ to return the type name. 14. Modified the doctest for recarray so that it works correctly on win32. -- Todd Miller jmiller at stsci.edu STSCI / SSG (410) 338 4576 From perry at stsci.edu Wed Feb 27 07:24:14 2002 From: perry at stsci.edu (Perry Greenfield) Date: Wed Feb 27 07:24:14 2002 Subject: [Numpy-discussion] draft numarray manual available Message-ID: We have begun adapting the Numeric manual for numarray. An early draft of such a manual (in PDF format) is available from the numarray home page: http://stsdas.stsci.edu/numarray Comments and corrections are welcome. Over the next few months we will be adding chapters on how to interface C code to numarray as well as how to use non-numeric arrays. Perry Greenfield From jochen at unc.edu Wed Feb 27 07:50:13 2002 From: jochen at unc.edu (Jochen =?iso-8859-1?q?K=FCpper?=) Date: Wed Feb 27 07:50:13 2002 Subject: [Numpy-discussion] Re: draft numarray manual available In-Reply-To: References: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Wed, 27 Feb 2002 10:22:48 -0500 Perry Greenfield wrote: Perry> We have begun adapting the Numeric manual for numarray. An Perry> early draft of such a manual (in PDF format) is available from Perry> the numarray home page: http://stsdas.stsci.edu/numarray Would it be possible to use the python doc format? That would allow to create the nice HTML as Python has and dvi/ps/pdf as well as info. I am willing to help with the transition. What is the current original format? 
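[Referring back to the porting notes in the numarray-0.2 announcement a couple of messages above, the item 1 incompatibilities translate into changes of roughly this shape. This is an untested sketch based only on the announcement's description of the 0.2 API.]

# Based solely on the porting notes above; not verified against the
# numarray-0.2 release itself.
import numarray

a = numarray.arange(12)
a.setshape((3, 4))          # replaces numarray-0.11's a.reshape((3, 4))
print a.getshape()          # (3, 4); under Python 2.2, a.shape also works

c = numarray.array([1 + 2j, 3 - 4j])
print c.getreal()           # replaces the 0.11 method call c.real()
print c.getimag()           # replaces c.imag()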
Greetings, Jochen - -- University of North Carolina phone: +1-919-962-4403 Department of Chemistry phone: +1-919-962-1579 Venable Hall CB#3290 (Kenan C148) fax: +1-919-843-6041 Chapel Hill, NC 27599, USA GnuPG key: 44BCCD8E -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6-cygwin-fcn-1 (Cygwin) Comment: Processed by Mailcrypt and GnuPG iD8DBQE8fQAaiJ/aUUS8zY4RAro0AJ0UQoyvvHHwgjJ+4tJXmwWyx18jRQCfY6v/ 1BBZoSU5KrzZ+4toNcYt7hk= =IOF2 -----END PGP SIGNATURE----- From perry at stsci.edu Wed Feb 27 08:06:09 2002 From: perry at stsci.edu (Perry Greenfield) Date: Wed Feb 27 08:06:09 2002 Subject: [Numpy-discussion] Re: draft numarray manual available In-Reply-To: Message-ID: Hi Jochen, Indeed, we are considering converting the format at some time to the Python doc format (it would be a requirement in order to get it into the standard library). If you would do that it would be a great help (presuming you can access the current format...other than retyping it!) It is currently in Framemaker format (that's what the original Numeric manual was in). I was just beginning to mull over how best to convert formats. There is a Framemaker Interchange Format (SGML-based) that might be the easiest way to deal with it. Or perhaps the brute force method of converting it to ascii and adding Python doc markup would be the fastest. Any other ideas? Thanks, Perry Greenfield From jh at comunit.de Wed Feb 27 08:17:06 2002 From: jh at comunit.de (Janko) Date: Wed Feb 27 08:17:06 2002 Subject: [Numpy-discussion] Re: draft numarray manual available In-Reply-To: References: Message-ID: <20020227171625.265ea831.jh@comunit.de> On Wed, 27 Feb 2002 11:05:59 -0500 "Perry Greenfield" wrote: > Hi Jochen, > > Indeed, we are considering converting the format at some time > to the Python doc format (it would be a requirement in order to > get it into the standard library). > > If you would do that it would be a great help (presuming > you can access the current format...other than retyping it!) > It is currently in Framemaker format (that's what the original > Numeric manual was in). > > I was just beginning to mull over how best to convert formats. > There is a Framemaker Interchange Format (SGML-based) that might > be the easiest way to deal with it. Or perhaps the brute force > method of converting it to ascii and adding Python doc markup > would be the fastest. Any other ideas? > I have done exactly this, there are tools like pdf2text or so, which do help with this. Jochen I'm doing the same effort for the scipy docs at the moment, the first step is nearly done. So if you want to collaborate send me a note privatly. __Janko From jochen at unc.edu Wed Feb 27 09:07:05 2002 From: jochen at unc.edu (Jochen =?iso-8859-1?q?K=FCpper?=) Date: Wed Feb 27 09:07:05 2002 Subject: [Numpy-discussion] draft numarray manual available In-Reply-To: References: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Wed, 27 Feb 2002 11:05:59 -0500 Perry Greenfield wrote: Perry> If you would do that it would be a great help (presuming you Perry> can access the current format...other than retyping it!) It is Perry> currently in Framemaker format (that's what the original Perry> Numeric manual was in). Framemake, huh:) Never used that... It shouldn't be too bad to put the raw ASCII into python-style LaTeX. But if you can create SGML one could try to use sgml2tex or similar to preserve the basic formatting. I am a little busy right now and will be out of town next week, but propose to do the work to get the doc converted after that (mid-March). 
If you could send me the sgml I could try out whether it actually helps or we have to go the raw-ASCII route. Greetings, Jochen - -- University of North Carolina phone: +1-919-962-4403 Department of Chemistry phone: +1-919-962-1579 Venable Hall CB#3290 (Kenan C148) fax: +1-919-843-6041 Chapel Hill, NC 27599, USA GnuPG key: 44BCCD8E -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6-cygwin-fcn-1 (Cygwin) Comment: Processed by Mailcrypt and GnuPG iD8DBQE8fRIDiJ/aUUS8zY4RAhh+AJ9ZP0H4szek5W13makOEVSQnKYhngCglzzU T1zJa2gKx6uYn4VCnOXlsJQ= =AzS3 -----END PGP SIGNATURE----- From perry at stsci.edu Wed Feb 27 12:50:02 2002 From: perry at stsci.edu (Perry Greenfield) Date: Wed Feb 27 12:50:02 2002 Subject: [Numpy-discussion] PyFITS 0.6.2 available Message-ID: We are announcing the availability of PyFITS, a Python module to provide a means of reading and writing FITS format files. The FITS (Flexible Image Transport System) format is standardized data format widely used for astronomical data (http://fits.gsfc.nasa.gov/). This module is based on the PyFITS module initially developed by Paul Barrett while he was at NASA/Goddard but has since been modified and adapted for use by the Space Telescope Science Institute. It has been developed primarily by Paul Barrett, Jin-Chung Hsu, and Todd Miller, with assistance from Warren Hack, Phil Hodge, and Michele De La Pena. It is being released under an Open Source License. A web page for PyFITS is available: http://stsdas.stsci.edu/pyfits There is a preliminary draft for a PyFITS manual available from that page. It is still in the early stages and is incomplete (particularly with regard to the details of how to manipulate table objects). However, there should be enough information to indicate how to perform basic operations with FITS files. PyFITS requires that numarray v0.2 be installed (http://stsdas.stsci.edu/numarray). Numarray is a relatively new replacement for the Numeric module (and largely backward- compatible at the Python level). It also currently lacks all of the 3rd-party libraries that Numeric has (this will begin to change within a couple of months). However, it is not compatible at the C-API level nor are numarray and Numeric arrays interchangeable. It is possible to have both modules loaded simultaneously and to convert between numarray and Numeric arrays using the tostring/fromstring mechanism each provides (at the expense of extra memory usage, of course) The PyFITS module is simply a single Python file (pyfits.py). Installation consists of placing that file in a directory in the Python search path. Besides the use of numarray, there are significant differences with the interface provided by the original PyFITS. We do not expect many future backward-incompatible changes to the interface of PyFITS. (Though more methods and functions will almost certainly be added.) This is an early version and undoubtably bugs will be discovered when used with a greater variety of FITS files. Currently PyFITS take a fairly strict interpretation of FITS files. There are likely to be problems with FITS data that do not strictly conform to the standard. We intend to accommodate such variances, particularly if they involve widely used or available data (so please let us know when such problems occur). Things not yet supported but are part of future development: 1) Verification and/or correction of FITS objects being written to disk so that they are legal FITS. This is being added now and should be available in about a month. 
Currently, one may construct FITS headers that are inconsistent with the data and write such FITS objects to disk. Future versions will provide options to either a) correct discrepancies and warn, b) correct discrepancies silently, c) throw a Python exception, or d) write illegal FITS (for test purposes!). 2) Support for ascii tables or random groups format. Support for ASCII tables will be done soon (~1 month). When random group support is added is uncertain. 3) Support for memory mapping FITS data (to reduce memory demands). We expect to provide this capability in about 3 months. 4) Support for columns in binary tables having scaled values (e.g. BSCALE or BZERO) or boolean values. Currently booleans are stored as Int8 arrays and users must explicitly convert them into a boolean array. Likewise, scaled columns must be copied with scaling and offset by testing for those attributes explicitly. Future versions will produce such copies automatically. 5) Support for tables with TNULL values. This awaits an enhancement to numarray to support mask arrays (planned). (At least a couple months off) Please contact help at stsci.edu for with questions about it usage, bug reports, or requests for enhancements. Perry Greenfield Science Software Group From paul at pfdubois.com Wed Feb 27 14:06:08 2002 From: paul at pfdubois.com (Paul Dubois) Date: Wed Feb 27 14:06:08 2002 Subject: [Numpy-discussion] last 21 beta Message-ID: <000701c1bfda$9923f1d0$09860cc0@CLENHAM> I've just put up Numeric-21.0b3.tar.gz; I intend this to be the final beta. Thanks to fellow developers for some recent bug fixes. Please give it a workout. Also, I need someone to try it with Python 2.1, as I didn't have one laying around and one of my fixes is for that. P.S. for Travis: Please put your name in the changes.txt if you fixed a bug, not the name of the person who complained unless they also supplied the fix. Thanks for knocking down some hard ones.) From perry at stsci.edu Wed Feb 27 15:39:04 2002 From: perry at stsci.edu (Perry Greenfield) Date: Wed Feb 27 15:39:04 2002 Subject: [Numpy-discussion] draft numarray manual available In-Reply-To: Message-ID: > Perry> If you would do that it would be a great help (presuming you > Perry> can access the current format...other than retyping it!) It is > Perry> currently in Framemaker format (that's what the original > Perry> Numeric manual was in). > > Framemake, huh:) Never used that... It shouldn't be too bad to put > the raw ASCII into python-style LaTeX. But if you can create SGML one > could try to use sgml2tex or similar to preserve the basic formatting. > > I am a little busy right now and will be out of town next week, but > propose to do the work to get the doc converted after that > (mid-March). If you could send me the sgml I could try out whether it > actually helps or we have to go the raw-ASCII route. > > Greetings, > Jochen > That would be great. I'd check with Janko to see if either his tools or experience could be used to make it easier. But I'll send you the MIF format (and perhaps some documentation on what it means). Thanks, Perry From mdtrail1 at yahoo.com Thu Feb 28 01:39:03 2002 From: mdtrail1 at yahoo.com (paul) Date: Thu Feb 28 01:39:03 2002 Subject: [Numpy-discussion] SPECIAL INVITATION ! Message-ID:

Our PowerTrail referral program is designed to be easy for new participants to make back their investment very quickly and from then on...BIG PROFITS!

mdtrail.com Compensation Plan is intended to be equitable to all from the start!! This will ENSURE that members stick with the program... This means real IMPACT to YOUR SHORT-TERM and LONG-TERM EARNINGS. mdtrail.com payout ratio is more than 80%. This is very critical to your success.

This program is LAUNCHING SOON and offers state-of-the-art healthcare delivery systems, and a dedicated back end support team.

It has taken more than 3 year to develop.

AND a big investment was made in the preparation of this interactive health club,

ALL for the success of YOU and YOUR PEOPLE ! 

Wouldn't you like to get a BIG piece of the action this time around?

From a.schmolck at gmx.net Thu Feb 28 11:28:03 2002 From: a.schmolck at gmx.net (A.Schmolck) Date: Thu Feb 28 11:28:03 2002 Subject: [Numpy-discussion] numarray interface and performance issues (for dot product and transpose) Message-ID: Hi, Numeric is an impressively powerful and in many respects easy and comfortable to use package (e.g. it's sophisticated slicing operations, not to mention the power and elegance of the underlying python language) and one would hope that it can one day replace Matlab (which is both expensive and a nightmare as a programming language) as a standard platform for numerical calculations. Two important drawbacks of Numeric (python) compared to Matlab currently seem to be the plotting functionality and the availability of specialist libraries (say for pattern recognition problems etc.). However python seems to be quickly catching up in this area (e.g. scipy and other projects are bustling with activity and, as far as I know, there is currently a promising undergraduate project on the way at my university to provide powerful plotting facilities for python). There is however a problem that, for the use to which I want to put Numeric, runs deeper and provides me with quite a headache: Two essential matrix operations (matrix-multiplication and transposition (which is what I am mainly using) are both considerably a) less efficient and b) less notationally elegant under Numeric than under Matlab. Before I expound on that, I'd better mention two things: Firstly, I, have to confess largely ignorant about the gritty details of implementing efficient numerical computation (and indeed, I can't even claim to possess a good grasp of the underlying Linear Algebra). Secondly, and partly as a consequence, I realize that I'm likely to view this from a rather narrow perspective. Nonetheless, I need to solve these problems at least for myself, even if such a solution might be inappropriate for a more general audience, so I'd be very keen to hear what suggestions or experiences from other Numeric users are. I also want to share what my phd supervisor, Richard Everson, and I have come up with so far in terms of a), as I think it might be quite useful for other people, too (see [2], more about this later). Ok, so now let me proceed to state clearly what I mean by a) and b): The following Matlab fragment M * (C' * C) * V' * u currently has to be written like this in python: dot(dot(dot(M, dot(transpose(C), C)), transpose(v)), u) Or, even worse if one doesn't want to pollute the namespace: Numeric.dot(Numeric.dot(Numeric.dot(Numeric.M, Numeric.dot(Numeric.transpose(C), C)), Numeric.transpose(v)), u) Part a) (efficiency) -------------------- Apart from the syntactic inconveniences, the Matlab above will execute __considerably__ faster if C is a rather larger matrix (say 1000x1000). There are AFAIK two chief reasons for this: 1. Matlab uses a highly optimized ATLAS[3] BLAS routine to compute the dot-product, whereas Numeric uses a straight-forward home-brewn solution. 2. Numeric performs unnecessary transpose operations (prior to 20.3, I think, more about this later). The transpose operation is really damaging with big matrices, because it creates a complete copy, rather than trying to do something lazy (if your memory is already almost half filled up with (matrix) C, then creating a (in principle superfluous) transposed copy is not going to do you any good). 
The above C' * C actually creates, AFAIK, _3_ versions of C, 2 of them transposed (prior to 20.3;

  dot(a,b)

translates into

  innerproduct(a, swapaxes(b, -1, -2))

In newer versions of Numeric, this is replaced by

  multiarray.matrixproduct(a, b)

which has the considerable advantage that it doesn't create an unnecessary copy and the considerable disadvantage that it seems to be a factor of 3 or so slower than the old (already not blazingly fast) version for large matrix x matrix multiplications; see the timing results [1]).

Now fortunately, we made some significant progress on the performance issues. My supervisor valiantly patched Numeric's multiarrayobject.c to use ATLAS for {scalar,vector,matrix} * {scalar,vector,matrix} multiplications. As a consequence, the multiplication of a pair of 1000x1000 matrices is computed more than _40_ times faster on my athlon machine (see [1]; it seems to work fine, but both the timing results and the code should be treated with a bit of caution). The BLAS-enabled dot takes almost exactly the same time as Matlab doing the same thing. In my case, this makes the difference between waiting several minutes and a day, and between deciding to use python or something faster but more painful.

Although one has to install ATLAS to benefit from this patch, this is straightforward (see [2] and [3] for pointers). In addition, it is easy to install Lapack at the same time, as ATLAS provides significantly optimised versions of some of the Lapack routines too. Using Lapack rather than the lapack_lite that comes with Numeric also has the advantage that it is extensively tested and works very reliably (e.g. the lapack_lite routine that is called by Heigenvalues is pretty broken, and searching the net suggests that this has been the case for some time now).

This still leaves point 2, transposes, as a potential source of improvement. As far as I understand, the Lapack routines operate on flattened arrays, i.e. the only difference between calling some routine on a matrix and on its transpose is that one has to pass a different value for the additional parameter that specifies the structure of the matrix. Therefore, if one were to program C' * C in fortran, only C itself would need to exist in memory. Would it therefore be possible for operations like transpose, slicing etc. to be carried out lazily (i.e. so that an actual copy operation is only performed when the result of the transpose or slice is _modified_, or when it is passed to some function that for some reason needs a real copy)? As far as I understand, the slicing in numarray is no longer simple aliasing (which is a good thing, since it makes arrays more compatible with lists); are there any plans for implementing such an on-demand scheme?

Part b) (syntax)
________________

As I said,

  dot(dot(dot(M, dot(transpose(C), C)), transpose(V)), u)

is pretty obscure compared to

  M * (C' * C) * V' * u

Although linear algebra is not the be all and end all of numerical calculations, many numerical algorithms end up involving a good deal of linear algebra. The awkwardness of python/Numeric's notation for common linear algebra operations severely impairs, in my opinion, the utility of writing, using and debugging algorithms in python/Numeric. This is a particular shame given python's generally elegant and expressive syntax.

So far, I've thought of the following possible solutions:

0. Do nothing: Just live with the awkward syntax.

1. Comment: add a comment like the Matlab line above complex expressions.

2. Wait for a fix: hope that either
   2.1. numarray will overload '*' as matrix multiplication, rather than elementwise multiplication (highly unlikely, I suspect), or

   2.2. the core python team accepts one of the proposed operator additions (or even, fond fantasy, allows users to add new operators).

3. Wrap: create a DotMatrix class that overloads '*' to be dot and maybe self.t to return the transpose -- this also means that all the numerical libraries I frequently use need to be wrapped.

4. Preprocess: write some preprocessor that translates e.g. A ~* B into dot(A, B).

5. Hack: create my own customized version of array that overloads e.g. '<<' to mean dot-product.

The possible downsides:

0. Do nothing: Seems unacceptable (unless as a temporary solution), because the code becomes much less readable (and consequently harder to understand and more error-prone), thus forgoing one of the main reasons to choose python in the first place.

1. Comment: Unsatisfactory, because there is no way to guarantee that comment and code are in sync, which very likely will lead to difficult-to-find bugs.

2. Wait for a fix: Either would be nice (I could live with having to write multiply(a,b) -- I suppose, however, that some other people can't, because I can't think of another reason why Numeric didn't overload * for the matrix product in the first place; moreover, it would mean a significant interface change (and code breakage), so I guess it is rather unlikely). From what I gather from searching the net and looking at PEPs, there is also not much of a chance to get a new operator anytime soon.

3. Wrap: Promises to be a maintenance nightmare (BTW: will the new array class be subclassable?), but otherwise looks feasible. Has anyone done this?

4. Preprocess: Would have the advantage that I could always get back to "standard" python/Numeric code, but would involve quite a bit of work and might also break tools that parse python.

5. Hack array: Seems evil, but in principle feasible, because AFAIK '<<' isn't used in Numeric itself and hopefully it wouldn't involve too much work. However, given a free choice, '<<' is hardly the operator I would choose to represent the dot product.

I am not completely sure what the policy and rationale behind the current division of labor between functions, methods and attributes in numarray is, but as far as the missing transposition postfix operator is concerned, one reasonable approach to make the transpose operation more readable (without the need to change anything in python itself) would seem to me to provide arrays with an attribute .t or .T, so that:

  a.t == transpose(a)

While this would admittedly be a bit of a syntactic hack, the operation is so commonplace and the gain in readability (in my eyes) is so significant that it would seem to me to merit serious consideration (maybe a.cc or so might also be an idea for complex conjugate notation, but I'm less convinced about that). If the addition of a single operator for the exclusive benefit of Numeric users is rejected by the core python team, maybe it's worthwhile lobbying for some mechanism that allows users to define new operators (like e.g. in Haskell)...

OK, that's all -- thanks for bearing with me through this long email.

Suggestions and comments about the patch [2] and possible solutions to the issues raised are greatly appreciated,

regards,

alexander schmolck

Footnotes:

[1] Timing results for the patched version of Numeric [2], comparing new and old 'dot' performance: http://www.dcs.ex.ac.uk/~aschmolc/Numeric/TimingsForAtlasDot.txt

[2] A patch for Numeric 21.1b that speeds up Numeric's dot function by several factors for large matrices can be found here: http://www.dcs.ex.ac.uk/~aschmolc/Numeric/

[3] ATLAS (http://math-atlas.sourceforge.net/) is a project to provide a highly optimized version of BLAS (Basic Linear Algebra Subroutines, a standard and thoroughly tested implementation of basic linear algebra operations) and some LAPACK routines. The charm of ATLAS is that it is platform-independent and yet highly optimized, which it achieves by essentially fine-tuning a number of parameters until optimum performance for the _particular_ machine on which it is built is reached. As a consequence, complete builds can take some time, but binary versions of ATLAS for common processors are available from http://www.netlib.org/atlas/archives (moreover, even if one decides to build ATLAS oneself, the search space can be considerably cut down if one accepts the suggested "experience" values offered during the make process).

-- 
Alexander Schmolck
Postgraduate Research Student
Department of Computer Science
University of Exeter
A.Schmolck at gmx.net
http://www.dcs.ex.ac.uk/people/aschmolc/

From cgw at alum.mit.edu Thu Feb 28 12:28:41 2002
From: cgw at alum.mit.edu (Charles G Waldman)
Date: Thu Feb 28 12:28:41 2002
Subject: [Numpy-discussion] numarray interface and performance issues (for dot product and transpose)
In-Reply-To: 
References: 
Message-ID: <15486.37346.523821.173804@nyx.dyndns.org>

A.Schmolck writes:

 > Two essential matrix operations, matrix multiplication and transposition
 > (which is what I am mainly using), are both considerably
 >
 >  a) less efficient and
 >  b) less notationally elegant

Your comments about efficiency are well-taken. I have (in a previous life) done work on efficient (in terms of virtual memory access / paging behavior) transposes of large arrays (divide and conquer). Anyhow - if there were support for the operation of A*B' (and A'*B) at the C level, you wouldn't ever need to actually have a copy of the transposed array in memory - you would just exchange the roles of "i" and "j" in the computation...

 > 3. Wrap: create a DotMatrix class that overloads '*' to be dot and maybe
 >    self.t to return the transpose -- this also means that all the numerical
 >    libraries I frequently use need to be wrapped.

I guess you haven't yet stumbled across the Matrix.py that comes with Numeric - it overrides "*" to be the dot-product. Unfortunately I don't see a really easy way to simplify the transpose operator - at the very least you could do

  T = Numeric.transpose

and then you're just writing T(A) instead of the long-winded version. Interestingly, the "~" operator is available, but it calls the function "__invert__". I guess it would be too weird to have ~A denote the transpose? Right now you get an error - one could set things up so that ~A was the matrix inverse of A, but we already have the A**-1 notation (among others) for that...
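
A rough illustration of the Matrix.py route suggested above, next to the plain-Numeric spelling. The '*'-as-dot-product behaviour and the T alias are exactly as described above, and the .T attribute on Matrix objects is described by Travis Oliphant later in this thread; constructor details and shape handling are assumptions rather than verified against the 2002 Matrix.py, so treat this as illustrative only.

    import Numeric, Matrix, RandomArray

    T = Numeric.transpose                        # short alias, as suggested above

    C = RandomArray.normal(0.0, 1.0, (500, 500))
    M = RandomArray.normal(0.0, 1.0, (500, 500))

    # Plain Numeric: nested function calls spelling M * (C' * C) * M'.
    r1 = Numeric.dot(Numeric.dot(M, Numeric.dot(T(C), C)), T(M))

    # Matrix.py wrapper: '*' is the dot product, .T the transpose
    # (the .T attribute is as described further down this thread).
    Cm = Matrix.Matrix(C)
    Mm = Matrix.Matrix(M)
    r2 = Mm * (Cm.T * Cm) * Mm.T

The two spellings should compute the same thing (up to the wrapper type); since Matrix.py ultimately defers to Numeric's own multiplication, a BLAS-accelerated dot such as the patch in [2] would presumably speed up both equally.
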

From tim.hochberg at ieee.org Thu Feb 28 13:13:17 2002
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Thu Feb 28 13:13:17 2002
Subject: [Numpy-discussion] numarray interface and performance issues (for dot product and transpose)
References: 
Message-ID: <198c01c1c09b$4e981c60$74460344@cx781526b>

Hi Alexander,

[SNIP]

> Two essential matrix operations, matrix multiplication and transposition
> (which is what I am mainly using), are both considerably
>
>  a) less efficient and
>  b) less notationally elegant

[Interesting stuff about notation and efficiency]

> Or, even worse, if one doesn't want to pollute the namespace:
>
>   Numeric.dot(Numeric.dot(Numeric.dot(M, Numeric.dot(Numeric.transpose(C), C)), Numeric.transpose(V)), u)

I compromise and use np.dot, etc. myself, but that's not really relevant to the issue at hand.

[More snippage]

> 2. Numeric performs unnecessary transpose operations (prior to 20.3, I think;
>    more about this later). The transpose operation is really damaging with big
>    matrices, because it creates a complete copy, rather than trying to do
>    something lazy (if your memory is already almost half filled up with
>    (matrix) C, then creating a (in principle superfluous) transposed copy is
>    not going to do you any good). The above C' * C actually creates, AFAIK,
>    _3_ versions of C, 2 of them transposed (prior to 20.3;

I think you're a little off track here. The transpose operation doesn't normally make a copy, it just creates a new object that points to the same data, but with different stride values. So the transpose shouldn't be slow or take up more space. Numarray may well make a copy on transpose, I haven't looked into that, but I assume that at this point you are still talking about the old Numeric from the look of the code you posted.

>   dot(a,b)
>
> translates into
>
>   innerproduct(a, swapaxes(b, -1, -2))
>
> In newer versions of Numeric, this is replaced by
>
>   multiarray.matrixproduct(a, b)
>
> which has the considerable advantage that it doesn't create an unnecessary
> copy and the considerable disadvantage that it seems to be a factor of 3 or
> so slower than the old (already not blazingly fast) version for large
> matrix x matrix multiplications; see the timing results [1]).

Like I said, I don't think either of these should be making an extra copy unless it's happening inside multiarray.innerproduct or multiarray.matrixproduct. I haven't looked at the code for those in a _long_ time, and then only glancingly, so I have no idea about that.

[Faster! with Atlas]

Sounds very cool.

> As I said,
>
>   dot(dot(dot(M, dot(transpose(C), C)), transpose(V)), u)
>
> is pretty obscure compared to
>
>   M * (C' * C) * V' * u

Of the options that don't require new operators, I'm somewhat fond of defining __call__ to be matrix multiply. If you toss in the .t notation that you mention below, you get:

  (M)( (C.t)(C) ) (V.t)(u)

Not perfect, but not too bad either. Note that I've tossed in some extra parentheses to make the above look better. It could actually be written:

  M( C.t(C) )(V.t)(u)

But I think that's more confusing, as it looks too much like a function call. (Although there is some mathematical precedent for viewing matrix multiplication as a function.)

I'm a little iffy on the '.t' notation as it could get out of hand. Personally I could use conjugate as much as transpose, and it's a similar operation -- would we also add '.c'? And possibly '.s' and '.h' for skew and Hermitian matrices? That might be a little much.
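
For concreteness, here is a minimal sketch of the kind of wrapper the __call__ idea implies. This is not an existing Numeric class: the name MatWrap and the __getattr__-based .t are made up for illustration, and conjugation, error handling and the rest of the array interface are left out.

    import Numeric

    class MatWrap:
        """Toy wrapper: A(B) means dot(A, B); A.t is the (wrapped) transpose."""

        def __init__(self, data):
            self.a = Numeric.asarray(data)

        def __getattr__(self, name):
            # A.t -> a wrapped transpose(A); other missing attributes fail as usual.
            if name == 't':
                return MatWrap(Numeric.transpose(self.a))
            raise AttributeError(name)

        def __call__(self, other):
            # A(B) -> a wrapped dot(A, B); accept either wrapped or plain arrays.
            if isinstance(other, MatWrap):
                other = other.a
            return MatWrap(Numeric.dot(self.a, other))

    # With M, C, V, u wrapped as MatWrap instances,
    #   M * (C' * C) * V' * u
    # would then be spelled roughly as
    #   M( C.t(C) )( V.t )( u )

A '.c' or '.h' attribute could be bolted on in the same __getattr__, which is exactly the proliferation worried about above.
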
The __call__ idea was not particularly popular last time, but I figured I'd toss it out again as an easy-to-implement possibility.

-tim

From pearu at cens.ioc.ee Thu Feb 28 13:32:09 2002
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Thu Feb 28 13:32:09 2002
Subject: [Numpy-discussion] Re: [SciPy-user] numarray interface and performance issues (for dot product and transpose)
In-Reply-To: 
Message-ID: 

Hi,

On 28 Feb 2002, A.Schmolck wrote:

> So far, I've thought of the following possible solutions:
>
> 0. Do nothing:
>    Just live with the awkward syntax.

Let me add a subsolution here:

0.1 Wait for scipy to mature (or better yet, help it to do that).

Scipy already provides wrappers to both the Fortran and C LAPACK and BLAS routines, though currently they are under revision. With the new wrappers to these routines you can optimize your code fragments as flexibly as if using them from C or Fortran. In principle, one could mix Fortran and C routines (i.e. the corresponding wrappers) so that one avoids all unnecessary transpositions. All matrix operations can be performed in situ if so desired.

Regards,
Pearu

From neelk at cswcasa.com Thu Feb 28 13:49:29 2002
From: neelk at cswcasa.com (Krishnaswami, Neel)
Date: Thu Feb 28 13:49:29 2002
Subject: [Numpy-discussion] numarray interface and performance issues (for dot product and transpose)
Message-ID: 

a.schmolck at gmx.net [mailto:a.schmolck at gmx.net] wrote:
>
> Numeric is an impressively powerful and in many respects easy and
> comfortable to use package (e.g. its sophisticated slicing
> operations, not to mention the power and elegance of the underlying
> python language) and one would hope that it can one day replace Matlab
> (which is both expensive and a nightmare as a programming language) as
> a standard platform for numerical calculations.

I'm in much the same boat, only with Gauss as the language I want to replace.

> There is however a problem that, for the use to which I want
> to put Numeric, runs deeper and provides me with quite a headache:
> two essential matrix operations, matrix multiplication and
> transposition (which is what I am mainly using), are both considerably
>
>  a) less efficient and
>  b) less notationally elegant
>
> under Numeric than under Matlab.

These are my two problems as well. I can live with the clumsy function call interface to the matrix ops, but the loss of efficiency is a real killer for me. In my code, Gauss is about 8-10x faster than Numpy, which is a killer speed loss. (And Gauss is modestly slower than C, though I don't care about this because the Gauss is fast enough.)

Right now, I have a data-mining program that I prototyped in Numpy and am now rewriting in C. Because Numpy isn't fast enough, I have wasted close to a week on this rewrite. This sounds bitter, but it's not meant to be. I have to deploy on VMS, and after we had gotten Numpy working on OpenVMS I really hoped that the Alpha would be fast enough that I could just use the Python prototype.

--
Neel Krishnaswami
neelk at cswcasa.com

From oliphant at ee.byu.edu Thu Feb 28 14:32:26 2002
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Thu Feb 28 14:32:26 2002
Subject: [Numpy-discussion] numarray interface and performance issues (for dot product and transpose)
In-Reply-To: 
Message-ID: 

On 28 Feb 2002, A.Schmolck wrote:

> Two essential matrix operations, matrix multiplication and transposition
> (which is what I am mainly using), are both considerably
>
>  a) less efficient and
>  b) less notationally elegant

You are not alone in your concerns.
The developers of SciPy are quite concerned about speed, hence the required linking to ATLAS. As Pearu mentioned, all of the BLAS will be available (much of it already is). This will enable very efficient algorithms.

The question of notational elegance is stickier, because we just can't add new operators. The solution I see is to use other classes. Right now, the Numeric array is an array of numbers (it is not a vector or a matrix) and that is why it has the operations it does. The Matrix class (delivered with Numeric) creates a Matrix object that uses a Numeric array of numbers underneath. It overloads the * operator and defines .T and .H for transpose and Hermitian transpose respectively. This requires explicitly making your objects matrices (not a bad thing in my book, as not all 2-D arrays fit perfectly in a matrix algebra).

> The following Matlab fragment
>
>   M * (C' * C) * V' * u

This becomes (using SciPy, which defines Mat = Matrix.Matrix and could later redefine it to use the ATLAS libraries for matrix multiplication):

  C, V, u, M = apply(Mat, (C, V, u, M))
  M * (C.H * C) * V.H * u

Not bad... and with a Mat class that uses the ATLAS BLAS (not a very hard thing to do now), this could be made as fast as MATLAB.

Perhaps, as a start, we could look at how to make the current Numeric use the BLAS, if it is installed, to do dot on real and complex arrays (I know you can get rid of lapack_lite and use your own lapack), but the dot function is defined in multiarray and would have to be modified to use the BLAS instead of its own homegrown algorithm.

-Travis

From jmiller at stsci.edu Thu Feb 28 16:01:03 2002
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Feb 28 16:01:03 2002
Subject: [Numpy-discussion] Numarray-0.2 issues with GCC and SPARC
Message-ID: <3C7EC434.30904@stsci.edu>

Hi,

I'm Todd Miller and I work at STSCI on Numarray. Two people have reported problems with compiling Numarray-0.11 or 0.2 with GCC on SPARC. There are two problems:

1. Compiling the _ufuncmodule.c using gcc-2.95 on a SPARC (with the default switches) uses tons of virtual memory and typically fails.

   a. This can be avoided by adding the compilation flags EXTRA_COMPILE_ARGS=["-O0", "-Wno-return-type"] to your setup.py when compiling *numarray* (an illustrative distutils sketch of where such flags end up follows after this list).

   b. Alternatively, you can wait for numarray-0.3, which will partition the ufuncmodule into smaller compilation units. We suspect these will avoid the problem naturally and permit the use of optimization.

   c. Lastly, if you have Sun cc, you might want to try using it instead of gcc. This is what we do at STSCI. You need to recompile Python itself if you want to do this and your python was already compiled with gcc.

2. Python compiled with gcc generates misaligned storage within buffer objects. Numarray-0.2 is dependent on the problematic variant of the buffer object, so if you want to use Float64 or Complex128 on a SPARC you may experience core dumps.

   a. I have a non-portable patch which worked for me with gcc-2.95 on SPARC. I can e-mail this to anyone interested. Apply this patch and recompile *python*.

   b. You might be able to fix this with gcc compilation switches for Python: try -munaligned-doubles and recompile *python*.

   c. Numarray-0.3 will address this issue by providing its own minimal memory object which features correctly aligned storage. This solution will not require recompiling python, but won't be available until numarray-0.3.

   d. Python compiled with Sun cc using the default switches doesn't manifest this bug. If you have Sun cc, you may want to recompile *python* using that.
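
Purely as an illustration of the effect of the flags in 1.a, a generic distutils fragment is sketched below. This is not numarray's actual setup.py (there, the flags are added via the EXTRA_COMPILE_ARGS line quoted in 1.a); the extension name and source path are placeholders.

    # Placeholder extension name and source path -- not numarray's actual layout.
    from distutils.core import setup, Extension

    EXTRA_COMPILE_ARGS = ["-O0", "-Wno-return-type"]    # workaround flags from 1.a

    setup(name="example",
          ext_modules=[Extension("_ufunc",                    # placeholder name
                                 ["Src/_ufuncmodule.c"],      # placeholder path
                                 extra_compile_args=EXTRA_COMPILE_ARGS)])
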
In general, I think the "better part of valor" is probably to wait 3 weeks for numarray-0.3, when both issues should be addressed. If you want to try numarray-0.2 now with GCC on a SPARC, I hope some of these ideas work for you.

Todd

-- 
Todd Miller
jmiller at stsci.edu
STSCI / SSG
(410) 338 4576