From faheem at email.unc.edu Fri Oct 1 10:19:02 2004
From: faheem at email.unc.edu (Faheem Mitha)
Date: Fri Oct 1 10:19:02 2004
Subject: [Numpy-discussion] random number facilities in numarray and main Python libs
In-Reply-To: <982cfc7f.8876956d.8220100@expms6.cites.uiuc.edu>
References: <982cfc7f.8876956d.8220100@expms6.cites.uiuc.edu>
Message-ID:

On Fri, 1 Oct 2004, Bruce Southey wrote:

> Hi,
>
> I presume that you have R and can build the standalone library. I have
> attached my SWIG Smath.i, the SWIG Smath_wrap.c and the
> Smath.py files. With these last two files, you shouldn't need SWIG.
>
> Note that I have not touched the void functions here as I have yet to check
> how these work in SWIG. Also, there are a few functions in the R header that
> are only headers. Eventually someone has to fix these and add suitable
> documentation in some package.

I'm not sure what you mean by void functions.

> If you have SWIG you can directly use the Smath.i file - while SWIG can take
> a .h file directly it would not work in Python. So I just edited the header
> file into a .i file.
>
> The following is my process using Linux (I don't know about other platforms):
>
> 0) Have swig installed and built the R math library
> 1) $ swig -python Smath.i
> 2) $ gcc -c Smath_wrap.c -I/usr/local/include/python2.3
>    -I/home/bsouthey/Rproject/R-1.9.1/src/nmath
>    -I/home/bsouthey/Rproject/R-1.9.1/include
> 3) $ ld -shared Smath_wrap.o -o _Smath.so -lm -lRmath
>    -L/home/bsouthey/Rproject/R-1.9.1/src/nmath/standalone
>
> Of course you must change the include (-I) and library (-L) paths to where
> python lives and the standalone Rmath library lives.

Thanks. I'm particularly interested in knowing how you interface with the random number generator at the top (Python) level. Can you supply an example?

Specifically, I'm looking for the following method.

1) When C/C++ code called, reads seed from python random state.
2) Does its stuff.
3) Writes seed back to python level when it exits.
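The three steps above can be sketched in pure Python. This is only a minimal illustration, assuming the standard library `random` module as "the Python random state"; `run_with_python_rng_state` is a hypothetical stand-in for an extension routine and is not part of numarray, Numeric or R.

```python
import random


def run_with_python_rng_state(state, n):
    """Hypothetical stand-in for a C/C++ routine that owns its own generator."""
    local_rng = random.Random()
    local_rng.setstate(state)                       # 1) read the state from the Python level
    draws = [local_rng.random() for _ in range(n)]  # 2) do its stuff
    return draws, local_rng.getstate()              # 3) hand the advanced state back


random.seed(12345)
draws, new_state = run_with_python_rng_state(random.getstate(), 3)
random.setstate(new_state)           # the Python-level generator continues where the routine stopped
next_from_python = random.random()

# Reference: the same four draws taken purely at the Python level.
random.seed(12345)
reference = [random.random() for _ in range(4)]
print(draws + [next_from_python] == reference)
```

With the state handed back and forth like this, the combined stream should not depend on which side made each call, which is the property asked for above.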
R has this built in, but here one needs to build one's own mechanism. This is complicated by the fact that Numarray and the base Python random library use different RNG mechanisms, so one has to choose which one to use. Which one did you use?

Faheem.

From jmiller at stsci.edu Fri Oct 1 10:21:04 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct 1 10:21:04 2004
Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric and amd64]
Message-ID: <1096651226.9400.25.camel@halloween.stsci.edu>

From fccoelho at fiocruz.br Fri Oct 1 13:06:10 2004
From: fccoelho at fiocruz.br (Flávio Codeço Coelho)
Date: Fri, 1 Oct 2004 17:06:10 +0000
Subject: [Matplotlib-users] warning: Numeric and amd64
Message-ID: <200410011706.10524.fccoelho@fiocruz.br>

Hi,

look at this:

>>> from RandomArray import *
>>> normal(2,2,10)
array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.])

This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a 32bit P4 and it ran fine.

Has anyone else seen this before?

For those who didn't understand: the normal function, as called above, is supposed to give me ten samples from a normal distribution with mean = 2 and standard deviation = 2.

luckily:

>>> from numarray.random_array import *
>>> normal(2,2,10)
array([-0.04525638, 4.31467819, -0.17468357, 5.29377031, 0.84202135,
       5.29593539, 4.69651532, 1.61354655, 1.10839236, 1.7743317 ])

If anybody still needed a reason for switching to numarray, there you go!

If anybody here subscribes to the numeric or numarray mailing lists (i.e. if they even exist), could you please forward this message to them?

Flavio

From jdhunter at ace.bsd.uchicago.edu Fri Oct 1 10:33:02 2004
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Fri Oct 1 10:33:02 2004
Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric and amd64]
In-Reply-To: <1096651226.9400.25.camel@halloween.stsci.edu> (Todd Miller's message of "01 Oct 2004 13:20:26 -0400")
References: <1096651226.9400.25.camel@halloween.stsci.edu>
Message-ID:

>>>>> "Todd" == Todd Miller writes:

>>>> from RandomArray import *
>>>> normal(2,2,10)
Todd> array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.])

I get this too on a 64bit Opteron 250. The root of the problem appears to be

>>> from RandomArray import standard_normal
>>> standard_normal(10)
array([ 5.31046164e-315, 1.57997427e-314, 5.16421382e-315,
        5.22924144e-315, 1.59247813e-314, 1.58920141e-314,
        5.23691141e-315, 5.24305935e-315, 5.20686204e-315,
        1.58739568e-314])

But MLab.randn, which uses a different approach, works fine.

I have this gnawing feeling I've seen this before, but I can't remember ....

JDH

From a.schmolck at gmx.net Fri Oct 1 11:34:01 2004
From: a.schmolck at gmx.net (Alexander Schmolck)
Date: Fri Oct 1 11:34:01 2004
Subject: [Numpy-discussion] dot!=matrixmultiply bug when dotblas is present
In-Reply-To: <4159BCA5.6090101@colorado.edu> (Fernando Perez's message of "Tue, 28 Sep 2004 13:33:57 -0600")
References: <4159BCA5.6090101@colorado.edu>
Message-ID:

Fernando Perez writes:

> Hi all,
>
> I found something today a bit unpleasant: if you install numeric without
> any BLAS support, 'matrixmultiply is dot==True', so they are fully
> interchangeable.
However, to my surprise, if you build numeric with the blas > optimizations, they are NOT identical. Oops, my bad (I submitted the patch and while pretty much all the real coding was done by Richard Everson this is my oversight). > The reason is a bug in Numeric.py. After defining dot, the code reads: > > #This is obsolete, don't use in new code > matrixmultiply = dot On the other hand, it gently nudges people to no longer use the obsoleted matrixmultiply ;) > In [4]: timing 1,dot,a,b > ------> timing(1,dot,a,b) > Out[4]: 0.55591500000000005 > > In [5]: timing 1,matrixmultiply,a,b > ------> timing(1,matrixmultiply,a,b) > Out[5]: 68.142640999999998 > > In [6]: _/__ > Out[6]: 122.57744619231356 > > Pretty significant difference... Yup, someone should incorporate optional atlas dot support into numarray if it hasn't happened already (won't be me, IIRC it took some convincing to get this into Numeric and I won't be using numarray for anything real in the near future). cheers, alex From stephen.walton at csun.edu Fri Oct 1 11:37:01 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Fri Oct 1 11:37:01 2004 Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric and amd64] In-Reply-To: References: <1096651226.9400.25.camel@halloween.stsci.edu> Message-ID: <1096655567.2678.2.camel@localhost.localdomain> On Fri, 2004-10-01 at 09:43, John Hunter wrote: > The root of the problem appears to be > > >>> from RandomArray import standard_normal > >>> standard_normal(10) > array([ 5.31046164e-315, 1.57997427e-314, > I've have this gnawing feeling I've seen this before, but I can't > remember .... Those values look suspiciously like what one sees if one reads a big-endian Float as little-endian or vice versa. I saw similar numbers recently when using pytables on a big-endian HDF5 (which generated a bug report for numarray if you recall). Is the Opteron big-endian? 
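As a side note on the magnitudes quoted above: x86-64 processors such as the Opteron are little-endian, and values near 5e-315 or 1.6e-314 are also what appears when the four bytes of a single-precision float sit in the low half of an eight-byte slot that is then read as a double. The sketch below only illustrates that bit pattern with the standard `struct` module; it is not a confirmed diagnosis of the Numeric bug, and `float32_bits_read_as_double` is a hypothetical helper name.

```python
import math
import struct


def float32_bits_read_as_double(x):
    """Put the 4 bytes of a C float in the low half of an 8-byte little-endian
    slot (upper half zero) and reinterpret the whole slot as a C double."""
    raw = struct.pack("<f", x) + b"\x00" * 4
    return struct.unpack("<d", raw)[0]


# Positive floats land near 5e-315; negative ones (sign bit set) near 1.6e-314.
for sample in (1.0, 2.25, -0.3):
    print(sample, "->", float32_bits_read_as_double(sample))
```

The results are subnormal doubles, so adding them to a mean of 2 leaves 2. unchanged at printing precision, which matches the normal(2,2,10) output reported earlier in the thread.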
From Fernando.Perez at colorado.edu Fri Oct 1 11:51:00 2004
From: Fernando.Perez at colorado.edu (Fernando Perez)
Date: Fri Oct 1 11:51:00 2004
Subject: [Numpy-discussion] dot!=matrixmultiply bug when dotblas is present
In-Reply-To:
References: <4159BCA5.6090101@colorado.edu>
Message-ID: <415DA6D7.4070407@colorado.edu>

Alexander Schmolck wrote:
> Fernando Perez writes:
>
>
>>Hi all,
>>
>>I found something today a bit unpleasant: if you install numeric without
>>any BLAS support, 'matrixmultiply is dot==True', so they are fully
>>interchangeable. However, to my surprise, if you build numeric with the blas
>>optimizations, they are NOT identical.
>
>
> Oops, my bad (I submitted the patch and while pretty much all the real coding
> was done by Richard Everson this is my oversight).

No prob. It's been fixed in Numeric 23.5, so no more worries.

>>Pretty significant difference...
>
>
> Yup, someone should incorporate optional atlas dot support into numarray if it
> hasn't happened already (won't be me, IIRC it took some convincing to get this
> into Numeric and I won't be using numarray for anything real in the near
> future).

I'll leave that question to the numarray guys, I have no idea where it stands in terms of blas/atlas support. I certainly hope it has it or that this optimization can be brought in, as it makes a huge difference for the large array case.

Best,

f

From perry at stsci.edu Fri Oct 1 11:57:02 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Oct 1 11:57:02 2004
Subject: [Numpy-discussion] dot!=matrixmultiply bug when dotblas is present
In-Reply-To: <415DA6D7.4070407@colorado.edu>
References: <4159BCA5.6090101@colorado.edu> <415DA6D7.4070407@colorado.edu>
Message-ID: <52083A9C-13DB-11D9-B931-000A95B68E50@stsci.edu>

On Oct 1, 2004, at 2:49 PM, Fernando Perez wrote:

> Alexander Schmolck wrote:
>>> Pretty significant difference...
>> Yup, someone should incorporate optional atlas dot support into >> numarray if it >> hasn't happened already (won't be me, IIRC it took some convincing to >> get this >> into Numeric and I won't be using numarray for anything real in the >> near >> future). > > I'll leave that question to the numarray guys, I have no idea where it > stands in terms of blas/atlas support. I certainly hope it has it or > that this optimization can be brought in, as it makes a huge > difference for the large array case. > > Best, > > f I'm not sure when it will get done, but we are working on the early stages of getting scipy working with numarray. You should see visible signs of that within a month (i.e., at least some parts of scipy working with numarray). It will probably take months to finish though. Perry From pearu at cens.ioc.ee Fri Oct 1 12:44:58 2004 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Fri Oct 1 12:44:58 2004 Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric and amd64] In-Reply-To: <1096651226.9400.25.camel@halloween.stsci.edu> Message-ID: On 1 Oct 2004, Todd Miller wrote: > look at this: > > >>> from RandomArray import * > > >>> normal(2,2,10) > array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]) > > This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a > 32bit P4 and it ran fine. > Has anyone else seen this before? Yes. I just fixed a similar issue in scipy.stats.rand module. Below is the corresponding patch for Numeric Src/ranlibmodule.c that fixes the issue for Opteron. Regards, Pearu *** ranlibmodule.c Fri Oct 1 22:29:57 2004 --- ranlibmodule.c.orig Fri Oct 1 22:12:13 2004 *************** *** 47,49 **** case 0: ! *out_ptr = (double) ((float (*)(void)) fun)(); break; --- 47,49 ---- case 0: ! *out_ptr = (double) ((double (*)()) fun)(); break; *************** *** 81,83 **** case 1: ! if( !PyArg_ParseTuple(args, "lf|i", &int_arg, &float_arg, &n) ) { return NULL; --- 81,83 ---- case 1: ! 
if( !PyArg_ParseTuple(args, "if|i", &int_arg, &float_arg, &n) ) { return NULL; *************** *** 213,215 **** ! if( !PyArg_ParseTuple(args, "lO|i", &num_trials, &priors_object, &n) ) { return NULL; --- 213,215 ---- ! if( !PyArg_ParseTuple(args, "iO|i", &num_trials, &priors_object, &n) ) { return NULL; From jmiller at stsci.edu Fri Oct 1 13:35:07 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Oct 1 13:35:07 2004 Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric and amd64] In-Reply-To: References: Message-ID: <1096662489.15037.1.camel@halloween.stsci.edu> Thanks Pearu. For some unknown reason, numarray.random_array already had the fixes, but I applied the patch to Numeric CVS. Regards, Todd On Fri, 2004-10-01 at 15:38, Pearu Peterson wrote: > On 1 Oct 2004, Todd Miller wrote: > > > look at this: > > > > >>> from RandomArray import * > > > > >>> normal(2,2,10) > > array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]) > > > > This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a > > 32bit P4 and it ran fine. > > Has anyone else seen this before? > > Yes. I just fixed a similar issue in scipy.stats.rand module. Below is the > corresponding patch for Numeric Src/ranlibmodule.c that fixes the issue > for Opteron. > > Regards, > Pearu > > *** ranlibmodule.c Fri Oct 1 22:29:57 2004 > --- ranlibmodule.c.orig Fri Oct 1 22:12:13 2004 > *************** > *** 47,49 **** > case 0: > ! *out_ptr = (double) ((float (*)(void)) fun)(); > break; > --- 47,49 ---- > case 0: > ! *out_ptr = (double) ((double (*)()) fun)(); > break; > *************** > *** 81,83 **** > case 1: > ! if( !PyArg_ParseTuple(args, "lf|i", &int_arg, &float_arg, &n) ) { > return NULL; > --- 81,83 ---- > case 1: > ! if( !PyArg_ParseTuple(args, "if|i", &int_arg, &float_arg, &n) ) { > return NULL; > *************** > *** 213,215 **** > > ! if( !PyArg_ParseTuple(args, "lO|i", &num_trials, &priors_object, &n) ) { > return NULL; > --- 213,215 ---- > > ! 
if( !PyArg_ParseTuple(args, "iO|i", &num_trials, &priors_object, &n) ) {
>       return NULL;
>

--

From faheem at email.unc.edu Fri Oct 1 22:28:41 2004
From: faheem at email.unc.edu (Faheem Mitha)
Date: Fri Oct 1 22:28:41 2004
Subject: [Numpy-discussion] numarray.random_array number generation in C code
Message-ID:

Dear People,

I want to write some C++ code to link with Python, using the Boost.Python interface. I need to generate random numbers in the C++ code, and I was wondering as to the best way of doing this.

Note that it is important that the random number generation interoperate seamlessly with Python, in the sense that the behavior of the calls to the RNG is the same whether calls are made at the C level or the Python level. I hope the reasons why this is important are obvious.

I was thinking that the method should go like this.

1) When C/C++ code called, reads seed from python random state.

2) Does its stuff.

3) Writes seed back to python level when it exits.

After doing a little investigation of the numarray.random_array python library and associated extension modules, it seems possible that the answer is simpler than I had supposed. However, I would appreciate it if someone would tell me if my understanding is incorrect in some places.

Summary: It seems that I can just call all the C entry point routines defined in ranlib.h, without worrying about getting or setting seeds.
Rationale:

The structure of this random number facility has three parts, all files in Packages/RandomArray2/Src.

1) low-level C routines: Packages/RandomArray2/Src/com.c and Packages/RandomArray2/Src/ranlib.c.

com.c: basic RNG stuff; getting and setting seeds etc.
ranlib.c: Random number generator algorithms for different distributions etc.

2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c.

This interfaces the stuff in com.c and ranlib.c.

3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py.

This wraps the C interface. In most cases it does not do much else besides some basic argument error checking.

From my perspective, the important thing is that the random number seed is only defined at C level as a static object, all the RNG stuff happens at C level, and the Python code just calls the C code as necessary. (I'm sketchy about the details of what is defined as the seed etc.)

This is in contrast with the R RNG facility (the only other RNG facility I am familiar with), which uses macros SetRNGstate() and GetRNGstate() to read and write the seed, which is defined at R level.

Therefore, the upshot is that the C routines in ranlib.h read and write the same seed as the python level functions do, so no special action is necessary with regard to the seed.

Is this correct?

In any case, it would be nice if something like the above was documented, so lost souls like myself don't have to go trawling through the source code to figure out what is going on. Of course it is nice that the source code is available, otherwise even that would be impossible.

R documents this stuff in the "Writing R Extensions" manual, online at http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray manual could have a small section about this too.

Regards, Faheem.
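The arrangement described in this message, a single generator state held in one place with both the low-level routines and the Python wrappers drawing from it, can be pictured with a toy model. Everything below is hypothetical: `ToyGenerator`, `c_level_draw`, `python_level_draw` and `set_seed` are illustrative names, and the linear congruential step merely stands in for whatever ranlib actually does.

```python
class ToyGenerator:
    """Toy linear congruential generator standing in for the static C-level seed."""

    def __init__(self, seed):
        self.state = seed

    def next_uniform(self):
        # Advance the single shared state and map it into [0, 1).
        self.state = (1103515245 * self.state + 12345) % 2**31
        return self.state / 2**31


_shared = ToyGenerator(42)  # plays the role of the one static seed


def c_level_draw():
    """Stands in for a routine called directly from C/C++ code."""
    return _shared.next_uniform()


def python_level_draw():
    """Stands in for the Python wrapper, which only forwards to the C level."""
    return c_level_draw()


def set_seed(seed):
    _shared.state = seed


# Interleaving the two entry points yields the same stream as using one alone.
set_seed(42)
mixed = [c_level_draw(), python_level_draw(), c_level_draw()]
set_seed(42)
plain = [c_level_draw() for _ in range(3)]
print(mixed == plain)
```

If the real library is organised this way, no explicit reading or writing of the seed would be needed around a C-level call, which is the conclusion the message asks to have confirmed.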
From fccoelho at gmail.com Mon Oct 4 07:59:12 2004 From: fccoelho at gmail.com (Flavio Coelho) Date: Mon Oct 4 07:59:12 2004 Subject: [Numpy-discussion] Bug Compiling Numeric on amd64 Message-ID: Hi, look at this: >>> from RandomArray import * >>> normal(2,2,10) array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]) This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a 32bit P4 and it ran fine. Has anyone else seen this before? luckily: >>> from numarray.random_array import * >>> normal(2,2,10) array([-0.04525638, 4.31467819, -0.17468357, 5.29377031, 0.84202135, 5.29593539, 4.69651532, 1.61354655, 1.10839236, 1.7743317 ]) Both modules were compiled on my gentoo box with: gcc version 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6) any comments? Flavio -- I use Linux daily to UP my productivity -- Microsoft, UP yours! From jmiller at stsci.edu Mon Oct 4 09:21:32 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Oct 4 09:21:32 2004 Subject: [Numpy-discussion] Bug Compiling Numeric on amd64 In-Reply-To: References: Message-ID: <1096906220.7641.55.camel@localhost.localdomain> On Mon, 2004-10-04 at 10:48, Flavio Coelho wrote: > Hi, > > look at this: > > >>> from RandomArray import * > > >>> normal(2,2,10) > array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]) > > This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a 32bit > P4 and it ran fine. > Has anyone else seen this before? > This was discussed here briefly last week after I forwarded your post from matplotlib-users. Pearu Peterson posted a patch which he had already performed for SciPy and I applied it to Numeric on Source Forge. Thanks for raising the issue. 
Regards, Todd > > luckily: > > >>> from numarray.random_array import * > > >>> normal(2,2,10) > array([-0.04525638, 4.31467819, -0.17468357, 5.29377031, 0.84202135, > 5.29593539, 4.69651532, 1.61354655, 1.10839236, 1.7743317 ]) > > Both modules were compiled on my gentoo box with: > > gcc version 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6) > > any comments? > > Flavio -- From Fernando.Perez at colorado.edu Mon Oct 4 10:59:49 2004 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Mon Oct 4 10:59:49 2004 Subject: [Numpy-discussion] Small bug in MA with arrays of rank > 1 Message-ID: <41618DFD.7030106@colorado.edu> Hi all, a while back I noticed a small problem with MA for rank 2 (and larger) arrays. Here's a simple example: In [1]: a=RA.random((3,3)) In [2]: a Out[2]: array([[ 0.002542, 0.70301 , 0.705466], [ 0.467305, 0.381492, 0.655857], [ 0.103372, 0.776988, 0.466528]]) In [3]: import MA In [4]: a Out[4]: [[ 0.002542, 0.70301 , 0.705466,] [ 0.467305, 0.381492, 0.655857,] [ 0.103372, 0.776988, 0.466528,]] The bug is that the commas at the end of each line are coming _before_ the closing bracket, instead of after. This seemingly trivial problem turns out to be pretty serious for me, because I use this string representation to export python arrays into Mathematica files, by simply replacing [] with {} (and playing some other tricks). Unfortunately, this bug means I can't use MA, which is otherwise great because of the way it gracefully handles the case where you accidentally say A when A is some monster array. With MA, instead of your CPU getting killed for 10 minutes, you get a nice summary of A's dimensions and typecode. Anyway, it would be great if one of the gurus had a chance to fix this one. 
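The export trick mentioned above, turning the bracketed representation into Mathematica braces, can also be done without depending on how any array type prints itself. The following is a minimal sketch, assuming the array has first been converted to nested Python lists; `to_mathematica` is a hypothetical helper and not part of Numeric, numarray or MA.

```python
def to_mathematica(values):
    """Render a nested list of numbers as a Mathematica list literal."""
    if isinstance(values, (list, tuple)):
        return "{" + ", ".join(to_mathematica(v) for v in values) + "}"
    # Caveat: very small or very large floats come out in Python's e-notation,
    # which Mathematica does not read as a number; those would need extra handling.
    return repr(float(values))


rows = [[0.002542, 0.70301], [0.467305, 0.381492]]
print(to_mathematica(rows))
```

Because the separators are generated explicitly, the result does not depend on where a particular repr places its commas and brackets.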
Best,

f

From graik at web.de Tue Oct 5 10:44:13 2004
From: graik at web.de (Raik Grünberg)
Date: Tue Oct 5 10:44:13 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
Message-ID: <200410051941.29807.graik@web.de>

Hi there,

I've just translated a package for molecular modelling, which makes extensive use of Numeric, from Numeric to numarray. The outcome is somewhat negative - for now we are basically going to postpone the transition - the reasons might be interesting for the list and the numarray developers out there (who are doing a brave job!).

Speed:
A typical task in our package is the least-squares fitting of a large array of coordinate frames ( N1 x N2 x 3) onto a set of reference or average coordinates (using a sub-set of coordinates for the matching). The example I looked at (500 x 876 x 3 items) took 1.3 s with Numeric and 4.7 s with numarray. The main culprits for the slow-down were:

* compress() - factor 10
* average() - factor 7 (average() is missing from numarray and I hence had to write a little function myself)
* LinearAlgebra.singular_value_decomposition() - factor 10

but a lot of extra time is also spent in ufunc.py and various numarraycore.py routines.

Memory efficiency:
I hoped numarray would solve some of the Out-of-memory problems that I get with Numeric but it turns out that it is rather less memory efficient for my kind of applications. Slicing an array that takes up 800MB on disc just about runs through with Numeric (and heavy swapping) but gives an Out-of-memory with numarray.

Suggestions:
OK, it's easy to make clever comments without contributing any real work...

- compress(), take(), etc, really need some optimization
- a C-coded average() routine would be helpful
- faster LinearAlgebra routines are necessary

Our sysadmin noted that unlike Numeric, numarray is not using any external math libraries (like LAPACK) that have been speed-optimized for decades and are available in CPU-optimized variants (e.g. ATLAS). It's probably difficult to match this efficiency with any new code ...

Greetings
Raik

PS: I didn't find any useful HowTo for the translation from Numeric to numarray. The practical issues were the different nonzero() return value, the more restrictive boolean comparison, that take doesn't support 'O' arrays any longer, and the missing average().

--
-----------------------------------------------------
Raik Grünberg | Bioinformatique Structurale
              | Institut Pasteur
              | Paris, France
-----------------------------------------------------

From southey at uiuc.edu Tue Oct 5 11:33:27 2004
From: southey at uiuc.edu (Bruce Southey)
Date: Tue Oct 5 11:33:27 2004
Subject: [Numpy-discussion] numarray.random_array number generation in C code
Message-ID: <6d0c2265.8aa2891d.81a0300@expms6.cites.uiuc.edu>

Hi,
It is rather hard to suggest anything without more detail on what you want to actually do.

As you describe it, why do you need the 'seed' returned? It would only make sense if you were going in and out of Python multiple times - a somewhat undesirable situation due to the overhead costs.

I see at least three options:
1) Do everything in Python/numarray.
2) Do parts in Python and the other in C/C++. For example, pass a matrix of random numbers to your code from Python. The 'seed' never needs to leave Python.
3) Do it all in C/C++ - pass the 'seed' into your code that includes the random number generator(s) - there is C/C++ code around for this. Do your stuff and then return the 'seed' back with whatever else is required.

You can email me privately if you want.

Bruce

---- Original message ----
>Date: Sat, 2 Oct 2004 01:23:21 -0400 (EDT)
>From: Faheem Mitha
>Subject: [Numpy-discussion] numarray.random_array number generation in C code
>To: numpy-discussion
>
>
>Dear People,
>
>I want to write some C++ code to link with Python, using the
>Boost.Python interface. I need to generate random numbers in the C++
>code, and I was wondering as to the best way of doing this.
> >Note that it is important that the random number generation interoperate >seamlessly with Python, in the sense that the behavior of the calls to >the RNG is the same whether calls are made at the C level or the Python >level. I hope the reasons why this is important are obvious. > >I was thinking that the method should go like this. > >1) When C/C++ code called, reads seed from python random state. > >2) Does its stuff. > >3) Writes seed back to python level when it exits. > >After doing a little investigation of the numarray.random_array python >library and associated extension modules, it seems possible that the >answer is simpler than I had supposed. However, I would appreciate it if >someone would tell me if my understanding is incorrect in some places. > >Summary: It seems that I can just call all the C entry point routines >defined in ranlib.h, without worrying about getting or setting seeds. > >Rationale: > >The structure of this random number facility has three parts, all files in >Packages/RandomArray2/Src. > >1) low-level C routines: Packages/RandomArray2/Src/com.c and >Packages/RandomArray2/Src/ranlib.c. > >com.c: basic RNG stuff; getting and setting seeds etc. >ranlib.c: Random number generator algorithms for different distributions >etc. > >2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c. > >This interfaces the stuff in com.c and ranlib.c. > >3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py. > >This wraps the C interface. In most cases it does not do much else besides >some basic argument error checking. > >From my perspective, the important thing is that the random number seed is >only defined at C level as a static object, all the RNG stuff happens at C >level, and the Python code just calls the C code as necessary. (I'm >sketchy about the details of what is defined as the seed etc.) 
>
>This is in contrast with the R RNG facility (the only other RNG facility I
>am familiar with), which uses macros SetRNGstate() and GetRNGstate() to
>read and write the seed, which is defined at R level.
>
>Therefore, the upshot is that the C routines in ranlib.h read and write
>the same seed as the python level functions do, so no special action is
>necessary with regard to the seed.
>
>Is this correct?
>
>In any case, it would be nice if something like the above was documented,
>so lost souls like myself don't have to go trawling through the source
>code to figure out what is going on. Of course it is nice that the source
>code is available, otherwise even that would be impossible.
>
>R documents this stuff in the "Writing R Extensions" manual, online at
>http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray
>manual could have a small section about this too.
>
> Regards, Faheem.

From stephen.walton at csun.edu Tue Oct 5 12:20:01 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Tue Oct 5 12:20:01 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <200410051941.29807.graik@web.de>
References: <200410051941.29807.graik@web.de>
Message-ID: <1097003873.13715.17.camel@freyer.sfo.csun.edu>

On Tue, 2004-10-05 at 10:41, Raik Grünberg wrote:

> Our sysadmin noted that unlike Numeric, numarray is not using any external
> math libraries (like LAPACK) that have been speed-optimized for decades and
> are available in CPU-optimized variants (e.g. ATLAS). It's probably difficult
> to match this efficiency with any new code ...

This is a key point. Have a look at addons.py in numarray, some previous comments on this list, and build numarray with the line

env USE_LAPACK=1 python setup.py build

after editing addons.py appropriately. You should see a major speed improvement.

--
Stephen Walton
Dept. of Physics & Astronomy, Cal State Northridge
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
URL:

From dd55 at cornell.edu Tue Oct 5 13:02:01 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Tue Oct 5 13:02:01 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <1097003873.13715.17.camel@freyer.sfo.csun.edu>
References: <200410051941.29807.graik@web.de> <1097003873.13715.17.camel@freyer.sfo.csun.edu>
Message-ID: <200410051600.38254.dd55@cornell.edu>

On Tuesday 05 October 2004 03:17 pm, Stephen Walton wrote:
> On Tue, 2004-10-05 at 10:41, Raik Grünberg wrote:
> > Our sysadmin noted that unlike Numeric, numarray is not using any
> > external math libraries (like LAPACK) that have been speed-optimized for
> > decades and are available in CPU-optimized variants (e.g. ATLAS). It's
> > probably difficult to match this efficiency with any new code ...
>
> This is a key point. Have a look at addons.py in numarray, some
> previous comments on this list, and build numarray with the line
>
> env USE_LAPACK=1 python setup.py build
>
> after editing addons.py appropriately. You should see a major speed
> improvement.

I would kindly suggest updating the numarray documentation. In the section on installation, it is easy to overlook the option to compile against existing libraries. That is explained in section 16, which appears to be out of date. The code listed in Packages/LinearAlgebra2/setup.py has been moved to addons.py, correct?
--
Darren

From jmiller at stsci.edu Tue Oct 5 13:37:42 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Oct 5 13:37:42 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <200410051600.38254.dd55@cornell.edu>
References: <200410051941.29807.graik@web.de> <1097003873.13715.17.camel@freyer.sfo.csun.edu> <200410051600.38254.dd55@cornell.edu>
Message-ID: <1097008567.27149.140.camel@halloween.stsci.edu>

On Tue, 2004-10-05 at 16:00, Darren Dale wrote:
> On Tuesday 05 October 2004 03:17 pm, Stephen Walton wrote:
> > On Tue, 2004-10-05 at 10:41, Raik Grünberg wrote:
> > > Our sysadmin noted that unlike Numeric, numarray is not using any
> > > external math libraries (like LAPACK) that have been speed-optimized for
> > > decades and are available in CPU-optimized variants (e.g. ATLAS). It's
> > > probably difficult to match this efficiency with any new code ...
> >
> > This is a key point. Have a look at addons.py in numarray, some
> > previous comments on this list, and build numarray with the line
> >
> >     env USE_LAPACK=1 python setup.py build
> >
> > after editing addons.py appropriately. You should see a major speed
> > improvement.
>
> I would kindly suggest updating the numarray documentation.

Thanks, will do.

> In the section on
> installation, it is easy to overlook the option to compile against existing
> libraries. That is explained in section 16, which appears to be out of date.
> The code listed in Packages/LinearAlgebra2/setup.py has been moved to
> addons.py, correct?

That's correct.
Regards, Todd From faheem at email.unc.edu Tue Oct 5 15:44:36 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Tue Oct 5 15:44:36 2004 Subject: [Numpy-discussion] numarray.random_array number generation in C code In-Reply-To: <6d0c2265.8aa2891d.81a0300@expms6.cites.uiuc.edu> References: <6d0c2265.8aa2891d.81a0300@expms6.cites.uiuc.edu> Message-ID: On Tue, 5 Oct 2004, Bruce Southey wrote: > Hi, > It is rather hard to suggest anything without more detail on what you want to > actually do. I could give you more details if you were interested. > As you describe it, why do you need the 'seed' returned? It would only > make sense if you were going in and out of Python multiple times - a > somewhat undesirable situation due to the overhead costs. Not really. One might (and I frequently do) want to run the same function (which in this case might be all in C++ code), interactively with different parameters. The kind of thing that I'm doing is akin to exploratory data analysis, and the specific code in question is a stochastic search algorithm. Doing all this in C++ would not be very interactive. Also, one often wants to postprocess data output using Python scripts. This involves multiple calls to C++ code, and would be impossible to do using C++, since one has to call other Python libraries. > I see at least three options: > 1) Do everything in Python/numarray. That's my current situation. > 2) Do parts in Python and the other in C/C++. > For example, pass a matrix of random numbers to your code from Python. The > 'seed' never needs to leave Python. This doesn't work very well unless you know in advance how many random numbers are needed (not the case, for example, for stochastic search algorithms), and in any case is a rather clumsy way to do things. No offense intended. > 3) Do it all in C/C++ - pass the 'seed' into your code that includes the > random number generator(s) - there is C/C++ code around for this. 
Do you stuff > and then return the 'seed' back with whatever else is required. Yes, but part of the point of mixed programming is that you have an interpreted front end which can easily hook into other routines. Also, in this case, you would not be passing the seed in, since there is nothing to pass it in from. One would simply call system time or something similar to obtain the seed. > You can email me privately if you want. I'll keep sending this to the list unless someone objects, since I think this is of some general interest. Really, my main question was to whether my understanding of how to use the Numarray random number facilities in C was correct or not. Faheem. From stephen.walton at csun.edu Tue Oct 5 16:15:31 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Tue Oct 5 16:15:31 2004 Subject: [Numpy-discussion] Numeric to numarray experiences In-Reply-To: References: <200410051941.29807.graik@web.de> <1097003873.13715.17.camel@freyer.sfo.csun.edu> Message-ID: <1097018077.22092.15.camel@freyer.sfo.csun.edu> On Tue, 2004-10-05 at 16:00, Flavio Coelho wrote: > I wrote > > env USE_LAPACK=1 python setup.py build > > > > after editing addons.py appropriately. You should see a major speed > > improvement. > > > > > If that is the case, why is it not the default?, at least when LAPACK > is installed? Well, I won't pretend to speak for the developers on this one. But I strongly suspect it is just too hard to find all possible LAPACK distributions; the default numarray setup should be self contained even if somewhat slower. The current version of Numeric also defaults to its own built-in BLAS and requires editing setup.py to use a different one. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From perry at stsci.edu Tue Oct 5 17:30:58 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Oct 5 17:30:58 2004 Subject: [Numpy-discussion] Numeric to numarray experiences In-Reply-To: <1097018077.22092.15.camel@freyer.sfo.csun.edu> Message-ID: Steve Walton wrote: > On Tue, 2004-10-05 at 16:00, Flavio Coelho wrote: > > I wrote > > > env USE_LAPACK=1 python setup.py build > > > > > > after editing addons.py appropriately. You should see a major speed > > > improvement. > > > > > > > > > If that is the case, why is it not the default?, at least when LAPACK > > is installed? > > Well, I won't pretend to speak for the developers on this one. But I > strongly suspect it is just too hard to find all possible LAPACK > distributions; the default numarray setup should be self contained even > if somewhat slower. The current version of Numeric also defaults to its > own built-in BLAS and requires editing setup.py to use a different one. > Well, it's been a while, and Todd handled that aspect of porting those from Numeric, but if I recall correctly, the situation was the same there, and I think Steve is correct. It was to provide the basic functionality as part of the distribution without requiring other installations. If you needed better performance, you jump through a couple more hoops. But requiring it to use LAPACK makes life more difficult for those who were looking for a self contained and easy to install solution. Perry From perry at stsci.edu Tue Oct 5 17:40:51 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Oct 5 17:40:51 2004 Subject: [Numpy-discussion] Numeric to numarray experiences In-Reply-To: <200410051941.29807.graik@web.de> Message-ID: I hadn't seen this until now. It's hard for us to understand exactly the reasons for the slower performance with such large arrays. 
Could you send us the code and an indication of the what inputs and parameters were used so we could try to figure out why some of these problems exist (we can check the specific functions you mention, but I want to make sure you aren't iterating over array slices or such). It's not obvious to me why you are having out of memory errors and this may help. Perry Greenfield > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net > [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Raik > Gr?nberg > Sent: Tuesday, October 05, 2004 1:41 PM > To: numpy-discussion at lists.sourceforge.net > Subject: [Numpy-discussion] Numeric to numarray experiences > > > Hi there, > > I've just translated a package for molecular modelling, which > makes extensive > use of Numeric, from Numeric to numarray. The outcome is somewhat > negative - > for now we are basically going to postpone the transition - the > reasons might > be interesting for the list and the numarray developpers out > there (who are > doing a brave job!). > > Speed: > A typical task in our package is the least-square fitting of a > large array of > coordinate frames ( N1 x N2 x 3) onto a set of reference or average > coordinates (using a sub-set of coordinates for the matching). > The example I > looked at (500 x 876 x 3 items) took 1.3 s with Numeric and 4.7 s with > numarray. The main culprits for the slow-down were: > * compress() - factor 10 > * average() - factor 7 (average() is missing from Numeric and I > hence had to > write a little function myself) > * LinearAlgebra.singular_value_decomposition() - factor 10 > but a lot of extra time is also spent in uufunc.py and various > numarraycore.py > routines. > > Memory efficiency: > I hoped numarray would solve some of the Out-of-memory problems > that I get > with Numeric but it turns out that it is rather less memory > efficient for my > kind of applications. 
> Slicing an array that takes up 800MB on disc just about runs through with
> Numeric (and heavy swapping) but gives an Out-of-memory with numarray.
>
> Suggestions:
> OK, it's easy to make clever comments without contributing any real work...
> - compress(), take(), etc, really need some optimization
> - a C-coded average() routine would be helpful
> - faster LinearAlgebra routines are necessary
>
> Our sysadmin noted that unlike Numeric, numarray is not using any external
> math libraries (like LAPACK) that have been speed-optimized for decades and
> are available in CPU-optimized variants (e.g. ATLAS). It's probably difficult
> to match this efficiency with any new code ...
>
> Greetings
> Raik
>
> PS:
> I didn't find any useful HowTo for the translation from Numeric to numarray.
> The practical issues were the different nonzero() return value, the more
> restrictive boolean comparison, that take doesn't support 'O' arrays any
> longer, and the missing average().
>
> --
> -----------------------------------------------------
> Raik Grünberg | Bioinformatique Structurale
>               | Institut Pasteur
>               | Paris, France
> -----------------------------------------------------
>

From perry at stsci.edu Tue Oct 5 18:14:00 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Oct 5 18:14:00 2004
Subject: [Numpy-discussion] numarray.random_array number generation in C code
In-Reply-To:
Message-ID:

Faheem Mitha wrote:
> Dear People,
>
> I want to write some C++ code to link with Python, using the
> Boost.Python interface. I need to generate random numbers in the C++
> code, and I was wondering as to the best way of doing this.
>
> Note that it is important that the random number generation interoperate
> seamlessly with Python, in the sense that the behavior of the calls to
> the RNG is the same whether calls are made at the C level or the Python
> level. I hope the reasons why this is important are obvious.
>
> I was thinking that the method should go like this.
>
> 1) When C/C++ code called, reads seed from python random state.
>
> 2) Does its stuff.
>
> 3) Writes seed back to python level when it exits.
>
> After doing a little investigation of the numarray.random_array python
> library and associated extension modules, it seems possible that the
> answer is simpler than I had supposed. However, I would appreciate it if
> someone would tell me if my understanding is incorrect in some places.
>
> Summary: It seems that I can just call all the C entry point routines
> defined in ranlib.h, without worrying about getting or setting seeds.
>
> Rationale:
>
> The structure of this random number facility has three parts, all
> files in Packages/RandomArray2/Src.
>
> 1) low-level C routines: Packages/RandomArray2/Src/com.c and
> Packages/RandomArray2/Src/ranlib.c.
>
> com.c: basic RNG stuff; getting and setting seeds etc.
> ranlib.c: Random number generator algorithms for different distributions > etc. > > 2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c. > > This interfaces the stuff in com.c and ranlib.c. > > 3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py. > > This wraps the C interface. In most cases it does not do much > else besides > some basic argument error checking. > > From my perspective, the important thing is that the random > number seed is > only defined at C level as a static object, all the RNG stuff > happens at C > level, and the Python code just calls the C code as necessary. (I'm > sketchy about the details of what is defined as the seed etc.) > > This is in contrast with the R RNG facility (the only other RNG > facility I > am familiar with), which uses macros SetRNGstate() and GetRNGstate() to > read and write the seed, which is defined at R level. > > Therefore, the upshot is that the C routines in ranlib.h read and write > the same seed as the python level functions do, so no special action is > necessary with regard to the seed. > > Is this correct? > > In any case, it would be nice if something like the above was documented, > so lost souls like myself don't have to go trawling through the source > code to figure out what is going on. Of course it is nice that the source > code is available, otherwise even that would be impossible. > > R documents this stuff in the "Writing R Extensions" manual, online at > http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray > manual could have a small section about this too. > > Regards, Faheem. > I'm not sure I understand what you want to do. Do you want to link directly to the extension code from your C++ code? If so I'm wondering why. It would make the most sense if the C++ code needed obtain small numbers of random numbers in some iterative loop, and you wish to use the same random number library that that numarray is using. 
Otherwise, I would normally obtain the random number array in python, then call the C++ extension. Perhaps I didn't read carefully enough. Normally linking to an extension module involves some hacks that I'm not sure were done for the randomarray module (the gory details are in the python docs for extension modules), Todd can check on that, I'm not sure I will have time (a superficial check seems to indicate that it doesn't support direct linking, though one could link to the underlying library I suppose). As an aside, it is likely that a better module can be done as some have suggested, we just took what Numeric had at the time. Doing that is not a high priority with us at the moment (anyone else want to tackle that?). Right now integration with scipy is our biggest priority so things like this will have to take a back seat for a while. Furthermore, we did what we needed to to port these modules from Numeric, but that didn't necessarily make us experts in how they worked. I wish we were, but we've generally been directing our energy elsewhere. I'd presume that the sensible way for the module to work is to initialize its seed from a time-based seed in the absence of any other seed initialization, and to keep the seed state in the extension module, but I could be wrong. Perry From faheem at email.unc.edu Tue Oct 5 18:41:02 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Tue Oct 5 18:41:02 2004 Subject: [Numpy-discussion] numarray.random_array number generation in C code In-Reply-To: References: Message-ID: On Tue, 5 Oct 2004, Perry Greenfield wrote: > I'm not sure I understand what you want to do. Do you want to link > directly to the extension code from your C++ code? Yes. > If so I'm wondering why. It would make the most sense if the C++ code > needed obtain small numbers of random numbers in some iterative loop, > and you wish to use the same random number library that that numarray is > using. 
I need to obtain an arbitrary (not known in advance) number of random numbers in the C++ code. I'm thinking of using the same random number library mostly because I assumed that using the same seed across the python/C interface would be supported. This is how it works in R (the only other place I have used this). Also, I had been using the same routines in the Python code I'm trying to convert to C++, so it would be a relatively smooth transfer. If I was to use a pure C/C++ library, I'd have to worry about copying the seed back and forth between Python and C. Is this what I'll have to do then? > Otherwise, I would normally obtain the random number array in python, > then call the C++ extension. Yes, this is what everyone suggests. But in my case, the number of random variates required is not known in advance. I get the feeling this situation does not arise very often for most people, but I work with stochastic processes which terminate according to some stopping criterion, and that is the standard situation in this case. Also generating these numbers in Python would give rise to serious performance issues. > Perhaps I didn't read carefully enough. Normally linking to an extension > module involves some hacks that I'm not sure were done for the > randomarray module (the gory details are in the python docs for > extension modules), Todd can check on that, I'm not sure I will have > time (a superficial check seems to indicate that it doesn't support > direct linking, though one could link to the underlying library I > suppose). Hmm. Well, this is unwelcome news. You mean I cannot link to ranlib.so? I assumed that including the ranlib.h header and linking my C++ module against ranlib.so would be enough. I suppose that was too optimistic. > As an aside, it is likely that a better module can be done as some > have suggested, we just took what Numeric had at the time. Doing that > is not a high priority with us at the moment (anyone else want to > tackle that?). 
Right now integration with scipy is our biggest > priority so things like this will have to take a back seat for > a while. > Furthermore, we did what we needed to to port these modules from > Numeric, but that didn't necessarily make us experts in how they > worked. I wish we were, but we've generally been directing our > energy elsewhere. I'd presume that the sensible way for the module > to work is to initialize its seed from a time-based seed in the > absence of any other seed initialization, and to keep the seed > state in the extension module, but I could be wrong. Yes. That is how R does it, anyway. Specifically, you declare the seed static, and then it persists across the Python/C interface. That is what I thought you had in the numarray code. Would it be hard to make it work like this? I'm no expert either. Faheem. From southey at uiuc.edu Wed Oct 6 07:01:38 2004 From: southey at uiuc.edu (Bruce Southey) Date: Wed Oct 6 07:01:38 2004 Subject: [Numpy-discussion] numarray.random_array number generation in C code Message-ID: Hi, My understanding is that you can use the Ranlib, R math, and GNU Scientific libraries in the manner you suggest or directly include the random number generator in your code. Usually you define the seed that should provide the same psuedo-random number stream every time these are used. If you don't use a seed then it is usually impossible to get the same stream of psuedo-random numbers. So I do not understand what you need to keep the same random number state. Not to mention that the common generators do repeat, some sooner than others. In your response to Perry, you indicate that you do not need an array of random numbers but rather the stream of random numbers. This is very different and I think you need to refine your algorithm to identify what parts need to be C/C++ and what need to be in Python/numarray. 
Since you currently have Python code, I would profile it to see what parts actually need extending - some times Python is rather surprising on how quick some things can be done (like using dictionaries). Providing those parts may be more fruitful to you than my vague responses. Regards Bruce ---- Original message ---- >Date: Tue, 5 Oct 2004 18:43:48 -0400 (EDT) >From: Faheem Mitha >Subject: Re: [Numpy-discussion] numarray.random_array number generation in C code >To: Bruce Southey >Cc: numpy-discussion > > > >On Tue, 5 Oct 2004, Bruce Southey wrote: > >> Hi, >> It is rather hard to suggest anything without more detail on what you want to >> actually do. > >I could give you more details if you were interested. > >> As you describe it, why do you need the 'seed' returned? It would only >> make sense if you were going in and out of Python multiple times - a >> somewhat undesirable situation due to the overhead costs. > >Not really. One might (and I frequently do) want to run the same function >(which in this case might be all in C++ code), interactively with >different parameters. The kind of thing that I'm doing is akin to >exploratory data analysis, and the specific code in question is a >stochastic search algorithm. Doing all this in C++ would not be very >interactive. Also, one often wants to postprocess data output using Python >scripts. This involves multiple calls to C++ code, and would be impossible >to do using C++, since one has to call other Python libraries. > > > I see at least three options: > >> 1) Do everything in Python/numarray. > >That's my current situation. > >> 2) Do parts in Python and the other in C/C++. >> For example, pass a matrix of random numbers to your code from Python. The >> 'seed' never needs to leave Python. > >This doesn't work very well unless you know in advance how many random >numbers are needed (not the case, for example, for stochastic search >algorithms), and in any case is a rather clumsy way to do things. 
No >offense intended. > >> 3) Do it all in C/C++ - pass the 'seed' into your code that includes the >> random number generator(s) - there is C/C++ code around for this. Do you stuff >> and then return the 'seed' back with whatever else is required. > >Yes, but part of the point of mixed programming is that you have an >interpreted front end which can easily hook into other routines. Also, in >this case, you would not be passing the seed in, since there is nothing to >pass it in from. One would simply call system time or something similar to >obtain the seed. > >> You can email me privately if you want. > >I'll keep sending this to the list unless someone objects, since I think >this is of some general interest. > >Really, my main question was to whether my understanding of how to use the >Numarray random number facilities in C was correct or not. > > Faheem. From jmiller at stsci.edu Wed Oct 6 23:47:31 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Oct 6 23:47:31 2004 Subject: [Numpy-discussion] numarray.random_array number generation in C code In-Reply-To: References: Message-ID: <1097073394.31512.76.camel@halloween.stsci.edu> On Tue, 2004-10-05 at 21:10, Perry Greenfield wrote: > Faheem Mitha wrote: > > > Dear People, > > > > I want to write some C++ code to link with Python, using the > > Boost.Python interface. I need to generate random numbers in the C++ > > code, and I was wondering as to the best way of doing this. > > > > Note that it is important that the random number generation interoperate > > seamlessly with Python, in the sense that the behavior of the calls to > > the RNG is the same whether calls are made at the C level or the Python > > level. I hope the reasons why this is important are obvious. > > > > I was thinking that the method should go like this. > > > > 1) When C/C++ code called, reads seed from python random state. > > > > 2) Does its stuff. > > > > 3) Writes seed back to python level when it exits. 
> > > > After doing a little investigation of the numarray.random_array python > > library and associated extension modules, it seems possible that the > > answer is simpler than I had supposed. However, I would appreciate it if > > someone would tell me if my understanding is incorrect in some places. > > > > Summary: It seems that I can just call all the C entry point routines > > defined in ranlib.h, without worrying about getting or setting seeds. > > > > Rationale: > > > > The structure of this random number facility has three parts, all > > files in > > Packages/RandomArray2/Src. > > > > 1) low-level C routines: Packages/RandomArray2/Src/com.c and > > Packages/RandomArray2/Src/ranlib.c. > > > > com.c: basic RNG stuff; getting and setting seeds etc. > > ranlib.c: Random number generator algorithms for different distributions > > etc. > > > > 2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c. > > > > This interfaces the stuff in com.c and ranlib.c. > > > > 3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py. > > > > This wraps the C interface. In most cases it does not do much > > else besides > > some basic argument error checking. > > > > From my perspective, the important thing is that the random > > number seed is > > only defined at C level as a static object, all the RNG stuff > > happens at C > > level, and the Python code just calls the C code as necessary. (I'm > > sketchy about the details of what is defined as the seed etc.) > > > > This is in contrast with the R RNG facility (the only other RNG > > facility I > > am familiar with), which uses macros SetRNGstate() and GetRNGstate() to > > read and write the seed, which is defined at R level. > > > > Therefore, the upshot is that the C routines in ranlib.h read and write > > the same seed as the python level functions do, so no special action is > > necessary with regard to the seed. > > > > Is this correct? 
> > > > In any case, it would be nice if something like the above was documented, > > so lost souls like myself don't have to go trawling through the source > > code to figure out what is going on. Of course it is nice that the source > > code is available, otherwise even that would be impossible. > > > > R documents this stuff in the "Writing R Extensions" manual, online at > > http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray > > manual could have a small section about this too. > > > > Regards, Faheem. > > > I'm not sure I understand what you want to do. Do you want to link > directly to the extension code from your C++ code? If so I'm wondering > why. It would make the most sense if the C++ code needed obtain > small numbers of random numbers in some iterative loop, and you wish > to use the same random number library that that numarray is using. > Otherwise, I would normally obtain the random number array > in python, then call the C++ extension. Perhaps I didn't read carefully > enough. Normally linking to an extension module involves some hacks > that I'm not sure were done for the randomarray module (the gory > details are in the python docs for extension modules), Todd can > check on that, I checked and there's no C level export of the ranlib interface, at least not in the "hacked" sense of an extension module C-API where the linkage is made indirect via an API pointer and bizarre macros. > I'm not sure I will have time (a superficial check > seems to indicate that it doesn't support direct linking, though > one could link to the underlying library I suppose). Ordinary C linkage to numarray.random_array.ranlib2 may be supported since as an extension it is also a shared library, but I've never tried it myself and I wonder if it would actually work. If anyone has tried something like that I'd be interested in hearing how it turned out. Without a really compelling reason, I'd avoid it myself. 
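
For comparison, the read-state / run / write-state pattern Faheem describes in the quoted message can be sketched with the standard library's `random` module. This is only an illustration of the idea under stated assumptions: it does not touch numarray.random_array or ranlib, and `draw_until` is a made-up stand-in for the compiled routine.

```python
import random


def draw_until(rng_state, threshold):
    """Hypothetical stand-in for the compiled routine.

    Restores the caller's RNG state, draws until a stopping criterion is
    met (so the number of draws is not known in advance), and hands back
    the draws together with the updated state.
    """
    rng = random.Random()
    rng.setstate(rng_state)        # 1) read the state from the Python level
    draws = []
    while True:                    # 2) do its stuff
        x = rng.random()
        draws.append(x)
        if x > threshold:
            break
    return draws, rng.getstate()   # 3) write the state back on exit


master = random.Random(12345)

first, state = draw_until(master.getstate(), 0.9)
master.setstate(state)
second, state = draw_until(master.getstate(), 0.9)
master.setstate(state)

# Two separate calls consume the same stream as one uninterrupted generator.
reference = random.Random(12345)
expected = [reference.random() for _ in range(len(first) + len(second))]
print(first + second == expected)   # True
```

If the generator state lived only inside the extension module, as the thread suggests it may for ranlib, the explicit hand-off in steps 1 and 3 would not be needed; the sketch only shows what the hand-off looks like when it is.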
Regards, Todd From dd55 at cornell.edu Sun Oct 10 12:51:58 2004 From: dd55 at cornell.edu (Darren Dale) Date: Sun Oct 10 12:51:58 2004 Subject: [Numpy-discussion] ieeespecial Message-ID: <200410101547.18413.dd55@cornell.edu> Hello, I am getting invalid numeric result exceptions when dividing a complex array by zero. Is this the desired behavior? Also, while trying to find a way around the above problem, I ran ieeespecial.test and got the following output. I am running numarray 1.1 on python 2.3.3. Todd, this might be correlated with the numerix package in matplotlib. I tried importing numarray and ieeespecial without matplotlib and the ieeespecial.test was successful. Thanks, Darren In [31]: ieeespecial.test() Out[31]: inf ***************************************************************** Failure in example: inf # the repr() of inf may vary from platform to platform from line #6 of numarray.ieeespecial Expected: inf Got: Out[31]: nan ***************************************************************** Failure in example: nan # the repr() of nan may vary from platform to platform from line #8 of numarray.ieeespecial Expected: nan Got: Out[31]: (array([0, 2]), array([0, 3])) ***************************************************************** Failure in example: getinf(b) from line #20 of numarray.ieeespecial Expected: (array([0, 2]), array([0, 3])) Got: Out[31]: array([[ 999., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 999.], [ 12., 13., 14., 15.]]) ***************************************************************** Failure in example: a from line #26 of numarray.ieeespecial Expected: array([[ 999., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 999.], [ 12., 13., 14., 15.]]) Got: Out[31]: (array([0, 1, 2]), array([1, 2, 3])) ***************************************************************** Failure in example: getnan(a) from line #35 of numarray.ieeespecial Expected: (array([0, 1, 2]), array([1, 2, 3])) Got: 
***************************************************************** 1 items had failures: 5 of 11 in numarray.ieeespecial ***Test Failed*** 5 failures. Out[31]: (5, 11) -- Darren From dd55 at cornell.edu Sun Oct 10 13:57:43 2004 From: dd55 at cornell.edu (Darren Dale) Date: Sun Oct 10 13:57:43 2004 Subject: [Numpy-discussion] ieeespecial In-Reply-To: <200410101547.18413.dd55@cornell.edu> References: <200410101547.18413.dd55@cornell.edu> Message-ID: <200410101653.51172.dd55@cornell.edu> On Sunday 10 October 2004 03:47 pm, Darren Dale wrote: > Hello, > > I am getting invalid numeric result exceptions when dividing a complex > array by zero. Is this the desired behavior? > > Also, while trying to find a way around the above problem, I ran > ieeespecial.test and got the following output. I am running numarray 1.1 on > python 2.3.3. Todd, this might be correlated with the numerix package in > matplotlib. I tried importing numarray and ieeespecial without matplotlib > and the ieeespecial.test was successful. > On a related note, ieeespecial.getnan appears to be incompatible with complex arrays, see below. I didnt mention in my last email that I built numarray for my existing blas/lapack libraries, will this change the behavior on my system from the default? Thanks, Darren >>> from numarray import * >>> from numarray.ieeespecial import * >>> b=arange(10,typecode=Complex64) >>> a=b/0 Warning: Encountered invalid numeric result(s) in divide >>> a array([ nan +nanj, nan +nanj, nan +nanj, nan +nanj, nan +nanj, nan +nanj, nan +nanj, nan +nanj, nan +nanj, nan +nanj]) >>> getnan(a) Traceback (most recent call last): File "", line 1, in ? 
  File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 117, in getnan
    return _spec.index(a, _spec.NAN)
  File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 95, in index
    return _na.nonzero(mask(a, msk))
  File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 87, in mask
    f = _na.ieeemask(a, m)
  File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 883, in _cache_miss2
    mode, win1, win2, wout, cfunc, ufargs = \
  File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 929, in _setup
    convtype1, convtype2, outtype, ucfunc \
  File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 471, in _typematch
    newInputSignature = (self._typePromoter(intype, atypelist),)*2
  File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 498, in _typePromoter
    raise TypeError("unable to find type to promote to")
TypeError: unable to find type to promote to
>>> getnan(a.real)
(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),)
>>>

From aisaac at american.edu Sun Oct 10 15:57:18 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Sun Oct 10 15:57:18 2004
Subject: [Numpy-discussion] documentation error
Message-ID:

In the Numeric manual, there are two different definitions
of the 'diagonal' function. The second definition
appears to be incorrect.

p.39:
diagonal(a, k=0, axis1=0, axis2 = 1) returns the entries along the k th
diagonal of a (k is an offset from the main diagonal). This is designed for
2d arrays. For larger arrays, it will return the diagonal of each 2d
sub-array.

p.44
diagonal(a, offset=0, axis1=0, axis2=1)
The diagonal function takes an array a, and returns an array of rank 1
containing all of the elements of a such that the difference between their
indices along the specified axes is equal to the specified offset. With the
default values, this corresponds to all of the elements of the diagonal of a
along the last two axes.
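
A small pure-Python sketch may help when comparing the two descriptions. It is not Numeric's implementation: the helper name `diagonal_2d` is made up for illustration, and it only covers a 2-d nested list with axis1=0 and axis2=1, where both descriptions reduce to "the elements whose column index minus row index equals the offset".

```python
def diagonal_2d(a, offset=0):
    """Return the elements a[i][j] of a 2-d nested list with j - i == offset.

    Hypothetical helper for illustration only, not Numeric's diagonal().
    """
    rows = len(a)
    cols = len(a[0]) if rows else 0
    return [a[i][i + offset] for i in range(rows) if 0 <= i + offset < cols]


a = [[0, 1, 2],
     [3, 4, 5],
     [6, 7, 8]]

print(diagonal_2d(a))       # [0, 4, 8]  main diagonal
print(diagonal_2d(a, 1))    # [1, 5]     one step above the main diagonal
print(diagonal_2d(a, -1))   # [3, 7]     one step below the main diagonal
```

For arrays of higher rank the two manual entries describe different results, which is the discrepancy raised in the message; the sketch deliberately stays with the 2-d case where they agree.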
fwiw, Alan Isaac From jmiller at stsci.edu Sun Oct 10 17:43:34 2004 From: jmiller at stsci.edu (Todd Miller) Date: Sun Oct 10 17:43:34 2004 Subject: [Numpy-discussion] ieeespecial In-Reply-To: <200410101653.51172.dd55@cornell.edu> References: <200410101547.18413.dd55@cornell.edu> <200410101653.51172.dd55@cornell.edu> Message-ID: <1097454870.3741.48.camel@localhost.localdomain> On Sun, 2004-10-10 at 16:53, Darren Dale wrote: > On Sunday 10 October 2004 03:47 pm, Darren Dale wrote: > > Hello, > > > > I am getting invalid numeric result exceptions when dividing a complex > > array by zero. Is this the desired behavior? > > > > Also, while trying to find a way around the above problem, I ran > > ieeespecial.test and got the following output. I am running numarray 1.1 on > > python 2.3.3. Todd, this might be correlated with the numerix package in > > matplotlib. I tried importing numarray and ieeespecial without matplotlib > > and the ieeespecial.test was successful. > > > > On a related note, ieeespecial.getnan appears to be incompatible with complex > arrays, see below. Thanks for pointing this out. It's an oversight in the implementation of ieeespecial and I'll fix it. > I didnt mention in my last email that I built numarray for > my existing blas/lapack libraries, will this change the behavior on my system > from the default? Regarding ieeespecial and complex division by zero, I am pretty sure blas/lapack linkage is irrelevant. But... I very rarely link with an external blas/lapack, so if there is an issue, I'm unlikely to have come across it myself. Still, off the top of my head, blas/lapack is unrelated. 
Regards, Todd > Thanks, > Darren > > >>> from numarray import * > >>> from numarray.ieeespecial import * > >>> b=arange(10,typecode=Complex64) > >>> a=b/0 > Warning: Encountered invalid numeric result(s) in divide > >>> a > array([ nan +nanj, > nan +nanj, > nan +nanj, > nan +nanj, > nan +nanj, > nan +nanj, > nan +nanj, > nan +nanj, > nan +nanj, > nan +nanj]) > >>> getnan(a) > Traceback (most recent call last): > File "", line 1, in ? > File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 117, > ingetnan > return _spec.index(a, _spec.NAN) > File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 95, in > index > return _na.nonzero(mask(a, msk)) > File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 87, in > mask > f = _na.ieeemask(a, m) > File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 883, in > _cache_miss2 > mode, win1, win2, wout, cfunc, ufargs = \ > File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 929, in > _setup > convtype1, convtype2, outtype, ucfunc \ > File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 471, in > _typematch > newInputSignature = (self._typePromoter(intype, atypelist),)*2 > File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 498, in > _typePromoter > raise TypeError("unable to find type to promote to") > TypeError: unable to find type to promote to > > >>> getnan(a.real) > (array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),) > >>> From dd55 at cornell.edu Sun Oct 10 18:08:10 2004 From: dd55 at cornell.edu (Darren Dale) Date: Sun Oct 10 18:08:10 2004 Subject: [Numpy-discussion] ieeespecial In-Reply-To: <1097454560.3741.41.camel@localhost.localdomain> References: <200410101547.18413.dd55@cornell.edu> <1097454560.3741.41.camel@localhost.localdomain> Message-ID: <200410102103.42221.dd55@cornell.edu> On Sunday 10 October 2004 08:29 pm, you wrote: > On Sun, 2004-10-10 at 15:47, Darren Dale wrote: > > Hello, > > > > I am getting invalid numeric result 
exceptions when dividing a complex > > array by zero. Is this the desired behavior? > > This is what I would have expected, and examining the definition I have > for complex division in numarray/Include/numarray/numcomplex.h, I don't > see a problem. The definition should probably be checked by an extra > set of eyes. Looks OK to me. Hi Todd, Sorry, I wasnt clear. I was wondering if it should raise a divide by zero exception and return an inf, as the real datatypes do, instead of an invalid numeric result and a nan. As it stands now, we have to handle divide by zero differently for different data types, if we need to filter/replace such values. Thanks, Darren From jmiller at stsci.edu Sun Oct 10 18:44:38 2004 From: jmiller at stsci.edu (Todd Miller) Date: Sun Oct 10 18:44:38 2004 Subject: [Numpy-discussion] ieeespecial In-Reply-To: <200410101547.18413.dd55@cornell.edu> References: <200410101547.18413.dd55@cornell.edu> Message-ID: <1097454560.3741.41.camel@localhost.localdomain> On Sun, 2004-10-10 at 15:47, Darren Dale wrote: > Hello, > > I am getting invalid numeric result exceptions when dividing a complex array > by zero. Is this the desired behavior? This is what I would have expected, and examining the definition I have for complex division in numarray/Include/numarray/numcomplex.h, I don't see a problem. The definition should probably be checked by an extra set of eyes. Looks OK to me. > Also, while trying to find a way around the above problem, I ran > ieeespecial.test and got the following output. I am running numarray 1.1 on > python 2.3.3. Todd, this might be correlated with the numerix package in > matplotlib. I tried importing numarray and ieeespecial without matplotlib and > the ieeespecial.test was successful. > I tried this with an ordinary Python shell and ieeespecial.test() completed without errors. Looking at your test output, I noticed it was skewed, and guessed there was an I/O synchronization issue messing up doctest. 
I tried the same test under IPython w/o matplotlib and duplicated your results, so I think the problem is an IPython/doctest issue. Regards, Todd > Thanks, > > Darren > > > In [31]: ieeespecial.test() > Out[31]: inf > ***************************************************************** > Failure in example: > inf # the repr() of inf may vary from platform to platform > from line #6 of numarray.ieeespecial > Expected: inf > Got: > Out[31]: nan > ***************************************************************** > Failure in example: > nan # the repr() of nan may vary from platform to platform > from line #8 of numarray.ieeespecial > Expected: nan > Got: > Out[31]: (array([0, 2]), array([0, 3])) > ***************************************************************** > Failure in example: getinf(b) > from line #20 of numarray.ieeespecial > Expected: (array([0, 2]), array([0, 3])) > Got: > Out[31]: > array([[ 999., 1., 2., 3.], > [ 4., 5., 6., 7.], > [ 8., 9., 10., 999.], > [ 12., 13., 14., 15.]]) > ***************************************************************** > Failure in example: a > from line #26 of numarray.ieeespecial > Expected: > array([[ 999., 1., 2., 3.], > [ 4., 5., 6., 7.], > [ 8., 9., 10., 999.], > [ 12., 13., 14., 15.]]) > Got: > Out[31]: (array([0, 1, 2]), array([1, 2, 3])) > ***************************************************************** > Failure in example: getnan(a) > from line #35 of numarray.ieeespecial > Expected: (array([0, 1, 2]), array([1, 2, 3])) > Got: > ***************************************************************** > 1 items had failures: > 5 of 11 in numarray.ieeespecial > ***Test Failed*** 5 failures. > Out[31]: (5, 11) -- From aisaac at american.edu Sun Oct 10 18:59:17 2004 From: aisaac at american.edu (Alan G Isaac) Date: Sun Oct 10 18:59:17 2004 Subject: [Numpy-discussion] location of tutorial Message-ID: p.29 of the Numeric manual refers to http://www.python.org/doc/tut/functional.html which no longer exists. 
I suggest substituting http://docs.python.org/tut/tut.html fwiw, Alan Isaac From jmiller at stsci.edu Mon Oct 11 04:28:51 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Oct 11 04:28:51 2004 Subject: [Numpy-discussion] ieeespecial In-Reply-To: <200410102103.42221.dd55@cornell.edu> References: <200410101547.18413.dd55@cornell.edu> <1097454560.3741.41.camel@localhost.localdomain> <200410102103.42221.dd55@cornell.edu> Message-ID: <1097493501.2619.26.camel@localhost.localdomain> On Sun, 2004-10-10 at 21:03, Darren Dale wrote: > On Sunday 10 October 2004 08:29 pm, you wrote: > > On Sun, 2004-10-10 at 15:47, Darren Dale wrote: > > > Hello, > > > > > > I am getting invalid numeric result exceptions when dividing a complex > > > array by zero. Is this the desired behavior? > > > > > > This is what I would have expected, and examining the definition I have > > for complex division in numarray/Include/numarray/numcomplex.h, I don't > > see a problem. The definition should probably be checked by an extra > > set of eyes. Looks OK to me. > > Hi Todd, > > Sorry, I wasn't clear. I was wondering if it should raise a divide by zero > exception and return an inf, as the real data types do, instead of an invalid > numeric result and a nan. As it stands now, we have to handle divide by zero > differently for different data types, if we need to filter/replace such > values. Numarray's error handling system is pretty flexible, and can raise exceptions on divide by zero if configured properly, or can ignore them altogether. See section 4.9 in the numarray-1.1 manual here: http://prdownloads.sourceforge.net/numpy/numarray-1.1.pdf?download It's an interesting question regarding the inf vs. nan. Looking at the complex division macro (NUM_CDIV) in numcomplex.h, I don't understand why we're getting nans now and not infs; it might be a bug in the macro, but I don't see it. 
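
One possible way for nans to appear, sketched in plain Python: if complex division is evaluated with the textbook formula (a+bi)/(c+di) = ((ac+bd) + (bc-ad)i)/(c^2+d^2), a zero divisor makes both numerators and the denominator zero, and IEEE 0/0 is nan rather than inf. This is an assumption about the arithmetic, not a reading of NUM_CDIV; `ieee_div` and `textbook_cdiv` are illustrative names, and `ieee_div` only emulates IEEE division because Python itself raises ZeroDivisionError.

```python
import math


def ieee_div(x, y):
    """Emulate IEEE 754 float division (Python raises ZeroDivisionError instead)."""
    if y != 0.0:
        return x / y
    if x == 0.0 or math.isnan(x):
        return float("nan")  # 0/0 is an invalid operation, so the result is nan
    # nonzero/0 is a divide-by-zero, so the result is a signed infinity
    sign = math.copysign(1.0, x) * math.copysign(1.0, y)
    return math.copysign(float("inf"), sign)


def textbook_cdiv(p, q):
    """Textbook complex division; returns (real, imag) as a tuple of floats."""
    a, b, c, d = p.real, p.imag, q.real, q.imag
    denom = c * c + d * d
    return ieee_div(a * c + b * d, denom), ieee_div(b * c - a * d, denom)


real_part, imag_part = textbook_cdiv(complex(3, 0), complex(0, 0))
print(real_part, imag_part)   # nan nan
print(ieee_div(3.0, 0.0))     # inf, which is what the real types give
```

Under that assumption the real case (3.0/0.0) signals divide-by-zero and gives inf, while the complex case multiplies by the zero divisor first and ends up at 0/0.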
Regards,
Todd

From stephen.walton at csun.edu Mon Oct 11 20:16:55 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Mon Oct 11 20:16:55 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To:
References:
Message-ID: <1097550159.2568.5.camel@localhost.localdomain>

On Sun, 2004-10-10 at 11:33, Alan G Isaac wrote:
> In the Numeric manual, there are two different definitions of the
> 'diagonal' function. The second definition appears to be incorrect.
>
> p.39:
> diagonal(a, k=0, axis1=0, axis2 = 1)
> p.44
> diagonal(a, offset=0, axis1=0, axis2=1)

Are you sure? On my system, it appears that the second definition is correct in both Numeric 23.3 and numarray 1.1.

From a.schmolck at gmx.net Tue Oct 12 02:40:55 2004
From: a.schmolck at gmx.net (Alexander Schmolck)
Date: Tue Oct 12 02:40:55 2004
Subject: [Numpy-discussion] A disconnected numarray rant
Message-ID:

Hi,

I'm taking a 1 month break from computers (i.e. I will be completely off-line), and I have to catch a train in an hour; but I've recently bitten the bullet and made a matrix class I've been using for some time work with numarray; I've written down a number of things that occurred to me while I was doing it, including some things which I think are bugs in numarray, so I thought at least posting the bugs would be a useful service; the rest is very raw and essentially unedited cut-and-paste of these notes -- sorry about that and I hope it doesn't contain anything particularly offensive.

P.S. just dumped the code for the matrix class (nummat) at http://www.dcs.ex.ac.uk/~aschmolc/Stuff/

'as

The following are my notes:

Things that fairly clearly seem to be bugs:

- numarray.Int32 etc.
can't be pickled - ``a = array(1+0j); a.imag = a.real * 10`` => IndexError - array(0, type=Float64) + 1e3000 => `inf` with right error modes but array(0, type=Float32) + 1e3000 => `OverflowError` - numarray.array(10)/numarray.array(0) => 0 - numarray.array(10000000000000L) => array(1316134912) - numarray.where(0,1,0) => array([0]) - l = [1,2,3]; numarray.put(l,numarray.array([1,2,0]),[0,0,0]); l => [1, 2, 3] a = array([1,2,3]); numarray.put(a,numarray.array([1,2,0]),[0,0,0]); a => array([0, 0, 0]) - repr(numarray.array([],typecode='i')) (etc. etc.) => "numarray.array([])" - getattr(array([1,2,3]), '_aligned') => SystemError - obscure: numarray.where(0, matrix(568, convert_scalars=True),2) => ValueError (tries __len__ which fails, as len(array(568)) also fails) Numeric incompatiblilities (that are either undocumented or bug-like) - numarray.array('a', typecode='O') => TypeError (object arrays) - for extra fun try: numarray.array(1, type=numarray.Object) -=> RuntimeError something entirely different - nonzero is completely incompatible - shape(None) etc. no longer works (IMHO a bug) - cross_correlate & average missing - left_shift et al missing - numarray.sqrt(a,a) is None (*not* the result, as it used to be) - num.put(a, [0,1,2,3], [10,20]) style behavior seems unavailable (without numarray.numeric) put(array([[ 0., 1., 2.], [ 3., 4., 5.]]), [1, 4], [10,40]) fails - boolean testing (not even bool(array(0)) works; I'm not sure this is good) - Generally different handling of rank0-arrays; e.g. ``type(num.array(1.0) + 0) is float``; one potentially very nasty gotcha are inplace operations (e.g. a**=2) which have totally different semantics for python scalars and rank0 arrays, which, unlike Attribute errors on ``a.shape``, can lead to nasty bugs in corner cases (e.g. 
when a reduction just infrequently yields scalar ``a``) -- I think this should be mentioned in a gotchas section (another possible entry would be the need to use .copy() to **save** memory on slicing and 1xN, Nx1 matrices versus vectors (people are not used to thinking properly about rank from mathematical training or matlab exposure)). - asarray downcasts arrays (e.g.: asarray(array([1.,2.,3.]),'i')) - numarray.ones(-5) => MemoryError (ValueError would be nicer) - numarray.ones(2.0), numarray.ones([2]) fail (cf. numarray.range(2.0)) b=num.array([[1,2,3,4],[5,6,7,8]]*2) assert eq(num.diagonal(b), [1,6,3,8]) assert eq(num.diagonal(b, -1), [5,2,7]) c = num.array([b,b]) assert eq(num.diagonal(c,1), [[2,7,4], [2,7,4]]) - no a.toscalar() !!! - matrixmultiply in the docs - what's the point of swapaxes (i.e. why not have a generalized in-place transpose?) - what's the point of innerproduct? - indexing by a list is different from indexing by tuple (I haven't had time to look closely at the docs whether that's intentional) - doesn't know about Numeric's bizzarre '\x0b' typecode - numarray.sqrt.reduce([]) raises (sensibly) TypeError, not ValueError - len(array(1)) or array(1)[0] won't work anymore (understandable, but should be documented) - (should maximim, minimum reduce to -inf and inf?) - is not a very helpful repr; should be possible to get to the ufunc itself - as in Numeric numarray.maximum.reduce(numarray.array([0,-0.])) => -0.0 - __array__ protocol no longer supported (how can a non-derived class convert itself efficiently to an array?) Documentation Gotchas - p. 34 IMO row vector is used incorrectly; row and column vectors are really matrices (i.e. have rank 2) so ``array([[1,2,3]])`` would be a row vector - No proper explanation of differences between Numeric and numarray, or numarray.numeric module differences to proper (e.g. argmin) - No migration and best-practice advice (e.g. 
there should be a standard way for packages which work with both numarray and numeric as backends to let the user choose his preference; how about setting an environment var NumPy or something?)

Waffle
------

- there *really* ought to be an array equality function (with optional tolerance); it's quite difficult to get right for a normal user (nans; zero-size arrays etc.) and it's often required, especially for testing
- rank-preserving reduction as an option would be nice -- e.g. to subtract out or divide by the reduced portion (which currently won't e.g. work for columns without adding a unit-dimension by hand).

Design

The (AFAICS) benefit-free but downside-rich introduction of `type`
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Is there any reason that Typecode objects that compare as desired to the relevant strings ("i", "d") wouldn't have done? Now there is an explosion and confusion of interfaces -- some numpy code will now only accept type(code)s as "typecode" keyword parameter (even in numarray! see numarray.mlab!) and other stuff

Never mind that type already is a highly overused word in the python world.

The big method bloat.
'''''''''''''''''''''

As it says in the Numeric manual introductions there were "good reasons" for "very few array methods" -- now there are **56** public methods and 8 public attributes (public == not starting with '_'); of those 56 methods about 11 are accessors and of the rest about half are redundant or worse (i.e. they either also exist as numarray functions (argmin, argmax, diagonal, ...) or they really ought to be functions (mean, stddev) or they are quite confusing (``a.min``, ``a.max`` which behave quite differently from ``a.argmin`` and ``a.argmax``, never mind ``numarray.minimum``) or simply utterly pointless (``a.nelements`` == ``a.size``)).

- argmin, argmax : what's wrong with numarray.argmin, numarray.argmax??? Why do argmin/argmax and max/min have completely different interfaces???
  If there really is a need for these (there isn't), then if anything a.min and a.max should be called a.flatmin, a.flatmax
- diagonal, mean, nelements, nonzero, ...
- perversely the **only** function that I can think of that could have sensibly become a method hasn't: ``put`` (it used to work only on arrays under Numeric and not without reason, so making it a method would have been sensible; numarray.put of course also "works" on non-arrays, it just doesn't do anything with them)

Test Code
'''''''''

numtest.py doesn't inspire full confidence (it's about 1000 lines of actual code but it doesn't seem that clearly structured and AFAICT contains no single loop (and that despite the diversity of shapes, types etc. that exist in numarray -- why not try something slightly more systematic?)).

From aisaac at american.edu Tue Oct 12 07:03:18 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Tue Oct 12 07:03:18 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <1097550159.2568.5.camel@localhost.localdomain>
References: <1097550159.2568.5.camel@localhost.localdomain>
Message-ID:

> On Sun, 2004-10-10 at 11:33, Alan G Isaac wrote:
>> In the Numeric manual, there are two different definitions of the
>> 'diagonal' function. The second definition appears to be incorrect.

On Mon, 11 Oct 2004, Stephen Walton apparently wrote:
> Are you sure? On my system, it appears that the second definition is
> correct in both Numeric 23.3 and numarray 1.1.

You did not quote the problematic portion:

The diagonal function takes an array a, and returns an array of rank 1 ... With the default values, this corresponds to all of the elements of the diagonal of a along the last two axes.
Contrast: >>> import Numeric >>> Numeric.__version__ '23.1' >>> x=[[[1,2],[3,4]],[[5,6],[7,8]]] >>> Numeric.diagonal(x) array([[1, 4], [5, 8]]) fwiw, Alan Isaac From stephen.walton at csun.edu Tue Oct 12 08:42:04 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Tue Oct 12 08:42:04 2004 Subject: [Numpy-discussion] documentation error In-Reply-To: References: <1097550159.2568.5.camel@localhost.localdomain> Message-ID: <1097595580.24491.4.camel@freyer.sfo.csun.edu> On Tue, 2004-10-12 at 07:00, Alan G Isaac wrote: > On Mon, 11 Oct 2004, Stephen Walton apparently wrote: > > Are you sure? On my system, it appears that the second definition is > > correct in both Numeric 23.3 and numarray 1.1. > > > You did not quote the problematic portion: > The diagonal function takes an array a, and returns > an array of rank 1 ... Ah, I thought you were referring to the fact that, in the first version in the documentation, the second, named argument is given as "k" but in the second version it is "offset". A look at the source reveals the second keyword name is the correct one. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From aisaac at american.edu Tue Oct 12 12:25:01 2004 From: aisaac at american.edu (Alan G Isaac) Date: Tue Oct 12 12:25:01 2004 Subject: [Numpy-discussion] documentation error In-Reply-To: <1097595580.24491.4.camel@freyer.sfo.csun.edu> References: <1097550159.2568.5.camel@localhost.localdomain><1097595580.24491.4.camel@freyer.sfo.csun.edu> Message-ID: > On Tue, 2004-10-12 at 07:00, Alan G Isaac wrote: >> You did not quote the problematic portion: >> The diagonal function takes an array a, and returns >> an array of rank 1 ... 
On Tue, 12 Oct 2004, Stephen Walton apparently wrote:
> A look at the source reveals the
> second keyword name is the correct one.

OK then, we have a double problem.
The first version gives the correct description
but uses the wrong keyword.
The second version gives the wrong description
but uses the correct keyword.

So, how do we file a documentation bug?

Cheers,
Alan Isaac

From perry at stsci.edu Tue Oct 12 12:31:17 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Oct 12 12:31:17 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To:
Message-ID:

> So, how do we file a documentation bug?
>
> Cheers,
> Alan Isaac
>

I'd say just like any other kind of bug.

Perry

From jmiller at stsci.edu Tue Oct 12 12:40:19 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Oct 12 12:40:19 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To:
References: <1097550159.2568.5.camel@localhost.localdomain> <1097595580.24491.4.camel@freyer.sfo.csun.edu>
Message-ID: <1097609991.30171.556.camel@halloween.stsci.edu>

On Tue, 2004-10-12 at 12:40, Alan G Isaac wrote:
> > On Tue, 2004-10-12 at 07:00, Alan G Isaac wrote:
> >> You did not quote the problematic portion:
> >> The diagonal function takes an array a, and returns
> >> an array of rank 1 ...
> >
> >
> On Tue, 12 Oct 2004, Stephen Walton apparently wrote:
> > A look at the source reveals the
> > second keyword name is the correct one.
>
>
> OK then, we have a double problem.
> The first version gives the correct description
> but uses the wrong keyword.
> The second version gives the wrong description
> but uses the correct keyword.
>
> So, how do we file a documentation bug?
>

Go here:

http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse

then "Submit New", and set the "category" to "documentation".
Regards,
Todd

> Cheers,
> Alan Isaac

--

From pearu at scipy.org Wed Oct 13 06:02:48 2004
From: pearu at scipy.org (Pearu Peterson)
Date: Wed Oct 13 06:02:48 2004
Subject: [Numpy-discussion] ANN: SciPy 0.3.2 Released
Message-ID:

Hi,

Scipy 0.3.2 has been released and binaries are available from the scipy.org site:

http://www.scipy.org

Scipy 0.3.2 is a bug fix release of Scipy 0.3 including the following new features:

- wxPython 2.5 support
- reading/writing dense/sparse matrices in Matrix Market format
- iterative solvers, new functions sqrtm, hessenberg
- Constrained Optimization BY Linear Approximation
- discrete Boltzmann, Planck, Levy distributions
- Scipy tests pass now also on 64-bit systems and Mac OSX
etc.

The complete release notes can be found here:

http://www.scipy.org/download/scipy_release_notes_0.3.2.html

Best regards,
Pearu

BTW Scipy is:
-------------
Scipy is an open source library of scientific tools for Python. Scipy supplements the popular Numeric module, gathering a variety of high level science and engineering modules together as a single package. Scipy includes modules for graphics and plotting, optimization, integration, special functions, signal and image processing, genetic algorithms, ODE solvers, and others.
From jmiller at stsci.edu Wed Oct 13 14:35:08 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Oct 13 14:35:08 2004 Subject: [Numpy-discussion] A disconnected numarray rant In-Reply-To: References: Message-ID: <1097703239.631.923.camel@halloween.stsci.edu> Hi Alexander, Thanks for taking the time to provide us with feedback. I've responded to many of your points below. [and in the interest of keeping the text bloat down, I've interjected my own comments in brackets--Perry] On Tue, 2004-10-12 at 05:37, Alexander Schmolck wrote: > Hi, > > I'm taking a 1 month break from computers (i.e. I will be completely > off-line), and I have to catch a train in an hour; but I've recently > bitten > the bullet and made a matrix class I've been using for some time work > with > numarray; I've written down a number of things that occured to me > while I was > doing it, including some things which I think are bugs in numarray, so > I > thought at least posting the bugs would be a useful service; the rest > is very > raw and essentially unedited cut-and-paste of these notes -- sorry > about that > and I hope it doesn't contain anything particularly offensive. > > P.S. just dumped the code for the matrix class (nummat) at > http://www.dcs.ex.ac.uk/~aschmolc/Stuff/ > > 'as > > The following are my notes: > > > Things that fairly clearly seem to be bugs: > - numarray.Int32 etc. can't be pickled Known limitation, but OK. Arrays can be pickled, as can Numeric typecodes so I'm not sure how critical this omission is. 
> - ``a = array(1+0j); a.imag = a.real * 10`` => IndexError > - array(0, type=Float64) + 1e3000 => `inf` with right error modes > but array(0, type=Float32) + 1e3000 => `OverflowError` > - numarray.array(10)/numarray.array(0) => 0 > - numarray.array(10000000000000L) => array(1316134912) > - numarray.where(0,1,0) => array([0]) There seems to be an infinity of rank-0 issues and so little justification for having them that at one point we considered ripping them out altogether. Noted, but low priority. [Amen. If I had known the problems that rank-0 zero arrays would cause I think I would have excluded them. I'm not sure I see the need for them now that coercion rules have changed and helper functions to change scalars into rank-1 len-1 arrays which serve almost all other purposes. I'm interested in seeing what real purpose they serve now (I understand the backward compatibility issue, but backward compatibility is not the be all and end all for numarray; more on that later)] > - l = [1,2,3]; numarray.put(l,numarray.array([1,2,0]),[0,0,0]); l > => [1, 2, 3] Should raise a TypeError I guess. > a = array([1,2,3]); > numarray.put(a,numarray.array([1,2,0]),[0,0,0]); a => array([0, 0, 0]) I don't see what's wrong here. > - repr(numarray.array([],typecode='i')) (etc. etc.) => > "numarray.array([])" Zero length arrays are rather like rank-0 arrays: low priority. Agreed... this is a small wart. > - getattr(array([1,2,3]), '_aligned') => SystemError Interesting. I've been thinking about ripping out the _align and _contiguous self-test hacks for a long time. You've made up my mind. > - obscure: numarray.where(0, matrix(568, convert_scalars=True),2) > => > ValueError (tries __len__ which fails, as len(array(568)) also > fails) I think this may boil down to "no where() for object arrays". numarray.where() can't handle object arrays and there is no numarray.objects.where(). Not implemented yet. 
> Numeric incompatiblilities (that are either undocumented or bug-like) The best Numeric compatibility in numarray comes from: import numarray.numeric as Numeric It's still not perfect, but it is more compatible than ordinary numarray. > - numarray.array('a', typecode='O') => TypeError (object arrays) > - for extra fun try: numarray.array(1, type=numarray.Object) -=> > RuntimeError > something entirely different Object arrays in numarray do not have the synergy they have in Numeric. In particular, numarray.array() can't create them, only numarray.objects.array(). [At the time we added object arrays, we noticed that they were not safe in Numeric; that is, Numeric was not properly handling reference counts of objects in arrays for at least some operations and it was possible to segfault object arrays. This may have changed since then; we haven't had a chance to check the current status. But the point is that handling object arrays safely is a lot more than just loading them with object pointers. Any function that can set values in arrays needs to handle their refcounts, and that isn't all that trivial. We took a short cut of using a Python implementation for object arrays that doesn't have all the old functionality, but also didn't have the problems that they did at the time.] > - nonzero is completely incompatible numarray.numeric covers this. numarray's nonzero() is more powerful, capable of handling multidimensional arrays, so it returns a tuple of values rather than a single value. It's unfortunate that we chose to use the name nonzero() for the "new" function; it has the right interface and the wrong name. Keep in mind though, our compatibility goals have grown immensely since we started. > - shape(None) etc. no longer works (IMHO a bug) This may be related to the object array synergy. I think numarray.asarray() is the problem here, since it doesn't know how to create object arrays. 
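
Going back to the nonzero() point: the tuple-returning form described above can be sketched in plain Python for a 2-d nested list, with one index list per axis so that zipping them recovers the coordinates. `nonzero_2d` is an illustrative name only and this is not numarray's code.

```python
def nonzero_2d(a):
    """Return (row_indices, col_indices) of the truthy entries of a 2-d nested list."""
    rows, cols = [], []
    for i, row in enumerate(a):
        for j, value in enumerate(row):
            if value:
                rows.append(i)
                cols.append(j)
    return rows, cols


a = [[0, 3, 0],
     [0, 0, 0],
     [7, 0, 9]]
rows, cols = nonzero_2d(a)
print(rows, cols)             # [0, 2, 2] [1, 0, 2]
print(list(zip(rows, cols)))  # [(0, 1), (2, 0), (2, 2)]
```

A single flat index list, which is what the 1-d case naturally returns, cannot express these coordinates without also knowing the shape; that is the motivation for the tuple.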
> - cross_correlate & average missing I think cross_correlate is in numarray.convolve.correlate. It was a conscious choice not to put it in core numarray. Average has never been implemented and should be, especially since it has different semantics than the mean() method. > - left_shift et al missing These were renamed lshift and rshift. Note that << works fine. Synonyms should probably be added. > - numarray.sqrt(a,a) is None (*not* the result, as it used to be) What do you want here? What we have now is, IMO, correct. [Amen. This was intentionally changed from Numeric.] > - num.put(a, [0,1,2,3], [10,20]) style behavior seems unavailable > (without numarray.numeric) I wasn't exactly sure what the expected behavior was for this, but guessed is was some kind of repeat. If that's what the behavior was, Perry and I don't really like it. Besides, numarray.numeric.put *is* Numeric.put, modulo numarray underpinnings. > put(array([[ 0., 1., 2.], [ 3., 4., 5.]]), [1, 4], [10,40]) > fails numarray.put() does have different semantics for multi-dimensional destinations... you need multi-dimensional indexes (i.e. a tuple of index arrays). Again, there's now numarray.numeric.put(). > - boolean testing (not even bool(array(0)) works; I'm not sure this is > good) [I am. This was a clear and explicit decision to not replicate Numeric behavior. I'm convinced that it is the right decision. There is just too much confusion about what the truth value of an array should be. Helper functions should be used to make it unambiguous.] > - Generally different handling of rank0-arrays; e.g. > ``type(num.array(1.0) + > 0) is float``; one potentially very nasty gotcha are inplace > operations > (e.g. a**=2) which have totally different semantics for python > scalars and > rank0 arrays, which, unlike Attribute errors on ``a.shape``, can > lead to > nasty bugs in corner cases (e.g. 
when a reduction just infrequently > yields > scalar ``a``) -- I think this should be mentioned in a gotchas > section We have areduce() for this case, which always returns an array. > (another possible entry would be the need to use .copy() to **save** > memory > on slicing and 1xN, Nx1 matrices versus vectors (people are not used > to > thinking properly about rank from mathematical training or matlab > exposure)). [You will need to elaborate about what you mean here. E.g., as to the first: I'm guessing you mean when a slice is taken and then the original array is deleted. But it isn't clear.] > - asarray downcasts arrays (e.g.: asarray(array([1.,2.,3.]),'i')) True enough. Is there some reason why the method should silently succeed (I know we wanted that) and the function should not? > - numarray.ones(-5) => MemoryError (ValueError would be nicer) Easy to change. > - numarray.ones(2.0), This fails, and that's fine by me. The idea of floating point shapes seems bogus. > numarray.ones([2]) AFIK, this works, and should work. > fail (cf. numarray.range(2.0)) IMHO, arange() is a special case and not really equivalent to numarray.ones(). > b=num.array([[1,2,3,4],[5,6,7,8]]*2) > assert eq(num.diagonal(b), [1,6,3,8]) > assert eq(num.diagonal(b, -1), [5,2,7]) > c = num.array([b,b]) > assert eq(num.diagonal(c,1), [[2,7,4], [2,7,4]]) > - no a.toscalar() !!! a.toscalar() is written a[()] in numarray. [This is one method that shouldn't be there IMO. What would people expect it to do for arrays with len>1 ?] > - matrixmultiply in the docs OK. > - what's the point of swapaxes (i.e. why not have a generalized > in-place > transpose?) It's a very common function in implementation of numarray/Numeric. [In many cases it is far easier to use than an generalized transpose (which does exist, but requires all axes to be explicitly given)] > - what's the point of innerproduct? Compatibility. [For a while the flavor is: "dammit, why aren't you compatible?" 
Now it's: "dammit, why are you compatible?"] > - indexing by a list is different from indexing by tuple (I haven't > had time > to look closely at the docs whether that's intentional) It's intentional. Indexing by a list is "array" indexing. Indexing by a tuple is not. Thus, a 3D array by [1,2,3] is pulling out 2D blocks, while (1,2,3) is pulling out a single scalar. [In particular, tuples have a special meaning for indexing; this distinction is unavoidable since it is a Python language issue.] > - doesn't know about Numeric's bizzarre '\x0b' typecode Me either. Should we add this? [Not unless there is a good reason. What's it for? Why are you using it (particularly since you called it bizarre)?] > - numarray.sqrt.reduce([]) raises (sensibly) TypeError, not ValueError Got lucky I guess. > - len(array(1)) or array(1)[0] won't work anymore (understandable, but > should be documented) OK. > - (should maximim, minimum reduce to -inf and inf?) Don't they? > - is not > a very helpful repr; should be possible to get to the ufunc itself Doesn't this comment fly in the face of Python itself? [I imagine it is possible, but why? repr(dir) doesn't give you a usable function creator, nor does it work in Numeric.] > - as in Numeric numarray.maximum.reduce(numarray.array([0,-0.])) => > -0.0 Talk about fine points... noted. I think the problem is that 0.0 == -0.0, so there's no way for the reduction to get it right without adding special code to look for this case, and that isn't gonna happen without a strong case being made. [Again, a very good case needs to be made for handling this. I doubt that it is important to many, and as Todd mentions, not easy to handle.] > - __array__ protocol no longer supported (how can a non-derived class > convert > itself efficiently to an array?) Maybe an old-timer can explain how this worked for Numeric. I think this is only partially implemented in numarray and that maybe we need to add a check for an __array__() method to numarray.array(). 
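The list-versus-tuple point made earlier in this reply can be seen without numarray at all. The sketch below uses plain present-day Python (not the Python 2.3 of this thread); the class name IndexProbe is hypothetical and exists only to report what the interpreter passes to __getitem__. It shows the language-level behavior only, not how numarray then interprets each kind of key.

```python
class IndexProbe:
    """Toy container that only reports what Python hands to __getitem__."""

    def __getitem__(self, key):
        # Python packs comma-separated subscripts into a tuple before
        # the container ever sees them.
        return type(key).__name__, key


probe = IndexProbe()

# a[1, 2, 3] and a[(1, 2, 3)] are the same expression to the interpreter.
comma_form = probe[1, 2, 3]
tuple_form = probe[(1, 2, 3)]

# a[[1, 2, 3]] passes one list object through untouched, so a library
# is free to treat it as "array" indexing instead.
list_form = probe[[1, 2, 3]]

print(comma_form)
print(tuple_form)
print(list_form)
```

Because the comma form and the tuple form are indistinguishable by the time the container is called, a library cannot give tuples the list meaning without also changing ordinary multi-axis indexing.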
> Documentation Gotchas > - p. 34 IMO row vector is used incorrectly; row and column vectors are > really > matrices (i.e. have rank 2) so ``array([[1,2,3]])`` would be a > row vector Sounds reasonable. > - No proper explanation of differences between Numeric and numarray, > or > numarray.numeric module differences to proper (e.g. argmin) If there is, I don't know where it is. Noted, but I'm not really an encyclopedia of these facts myself. > - No migration and best-practice advice (e.g. there should be a > standard way > for packages which work with both numarray and numeric as backends > to let > the user choose his preference; how about setting an environment var > NumPy > or something?) We're just working this out ourselves. [Let me elaborate more. We haven't really had much experience yet porting tons of Numeric code (MA is about the only example). We are working on scipy now so I expect that in a few months we will know much better what the most important porting issues are. At the moment, this is better documented by others.] > Waffle [meaning?] > ------ > > - there *really* ought to be an array equality function (with optional > tolerance); it's quite difficult to get right for are normal user > (nans; > zero-size arrays etc.) and it's often required, especially for > testing You're right. Want submit one? [Make sure it isn't dependent on the underlying C compiler's libraries for testing floating point special values!] > - rank preserving reduction seems useful as an option would be nice -- > e.g. to > subtract out or divide by the reduced portion (which currently won't > e.g. > work for columns without adding a unit-dimension by hand). Sounds like an interesting idea, but also method bloat. 
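As a rough illustration of the array equality function requested above, here is a minimal sketch over flat Python sequences. It is not numarray code: the name sequences_close, the parameters rtol, atol and equal_nan, and the default tolerances are all assumptions made for the example, and a real implementation would also have to deal with shapes and types.

```python
import math


def sequences_close(a, b, rtol=1e-5, atol=1e-8, equal_nan=False):
    """Hypothetical helper: compare two flat sequences of floats."""
    a, b = list(a), list(b)
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        x_nan, y_nan = math.isnan(x), math.isnan(y)
        if x_nan or y_nan:
            # NaN matches nothing unless the caller opts in and both are NaN.
            if not (equal_nan and x_nan and y_nan):
                return False
            continue
        if math.isinf(x) or math.isinf(y):
            # Infinities must match exactly, sign included.
            if x != y:
                return False
            continue
        if abs(x - y) > atol + rtol * abs(y):
            return False
    # Zero-size inputs fall through the loop and compare equal.
    return True


print(sequences_close([], []))
print(sequences_close([1.0, 2.0], [1.0, 2.0 + 1e-9]))
print(sequences_close([float("nan")], [float("nan")]))
```

The awkward cases named in the request (NaNs and zero-size arrays) each get an explicit branch or comment here, which is most of what makes such a function tedious to write by hand.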
> Design > > The (AFAICS) benefit-free but downside-rich introduction of `type` > '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' > > Is there any reason that Typecode objects that compare as desired to > the > relevant strings ("i", "d") wouldn't have done? Now there is an > explosion > and confusion of interfaces -- some numpy code will now only except > type(code)s as "typecode" keyword parameter (even in numarray! see > numarray.mlab!) and other stuff > > Never mind that type already is a highly overused word in the python > world. Personally, I like type because it's succinct and we have type objects, not single character codes. More importantly, Perry likes type, and the bottom line is that it's his shot to call and he's called it. [We wrestled with this a while. Given that the representation of the type had changed from a character code, typecode is clearly misleading and inappropriate. It is there only for backward compatibility; for new code to be used under numarray only, people shouldn't use it. Type certainly seemed by far the most descriptive and accurate term. It does have the drawback of overloading the type function. Other considerations were things like atype, but type is what we went with.] > The big method bloat. > ''''''''''''''''''''' > > As it says in the Numeric manual introductions there were "good > reasons" for I actually don't buy the reasons myself. Some methods are natural, convenient, and good so I need to hear more voices arguing this point before I'll budge. Clearly there is *some* bloat, but identifying what to ax is more difficult. I suppose we could do a vote to clean this up. > "very few array methods" -- now there are **56** public methods and > 8 public > attributes (public == not starting with '_'); of those 56 methods > about 11 > are accessors and of the rest about half are redundant or worse > (i.e. they > either also exist as numarray functions (argmin, argmax, diagonal, > ...) 
or Which of the public attributes do you have a problem with? Which accessors? > they really ought to be functions (mean, stddev) or they are quite > confusing The need for these is common so I thought it would be good to add them. Functions could be added as well. > (``a.min``, ``a.max`` These require tricks to get right so we added them. The doc-strings explain what they do. > which behave quite differenlty from ``a.argmin`` and > ``a.argmax``, Good point. These are inconsistent with min and max, which were added independently at a later date. I'm thinking we should deprecate the argmin and argmax methods, which I added hoping to do polymorphism for strings and records and if I recall correctly never did anyway. IMHO, min(), max(), mean(), and stddev() are simple, useful, and should remain. > never mind ``numarray.minimum``) or min != minimum, and because it is a little tricky to get right, we codified it as a method. > simply utterly pointless > (``a.nelements`` == ``a.size``)). I added nelements() because I needed it and didn't know about a.size()... simple as that. a.size() came later for compatibility only. [I'll argue that nelements is far clearer in meaning. What does size mean? Total bytes? Total number of elements? Sorry, I disagree on this one.] > If there really is a need for these (there isn't) if anything a.min > and a.max > should be called a.flatmin, a.flatmax flatmin is certainly clear, but the min/max docstrings also explain it with no fuzz. > - diagonal, mean, nelements, nonzero, ... nonzero(), and diagonal() I could care less about so they can probably be deprecated and removed. I like mean(). 
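On the min versus minimum point above, a small sketch in plain Python may help. It assumes, as the thread suggests, that a.min() returns the single smallest element of the whole array while minimum is a two-argument elementwise function whose reduce folds along the first axis; nested lists stand in for arrays, and elementwise_minimum is a hypothetical stand-in name.

```python
from functools import reduce

rows = [[4, 9, 2],
        [7, 1, 8]]


def elementwise_minimum(x, y):
    """Stand-in for a binary 'minimum': compare two rows item by item."""
    return [min(p, q) for p, q in zip(x, y)]


# Folding the binary function over the rows gives one value per column,
# which is the shape of result a reduce along the first axis would give.
column_minima = reduce(elementwise_minimum, rows)

# A "flat" minimum looks at every element and returns a single scalar.
overall_min = min(value for row in rows for value in row)

print(column_minima)
print(overall_min)
```

The two results differ in rank as well as value, which is why the thread treats min and minimum as different operations rather than synonyms.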
> - perversely the **only** function that I can think off that could > have > sensibly become a method hasn't: ``put`` (it used to work only on > arrays > under Numeric and not without reason, so making it a method would > have > been sensible; numarray.put of course also "works" on non-arrays, > it just > doesn't do anything with them) Well, we need the numarray.put() function for compatibility, and there's already a more succinct syntax for put(), which is array based indexing so I don't see any point in adding a put() method. > Test Code > ''''''''' > numtest.py doesn't inspire full confidence (it's about 1000 lines of > actual > code but it doesn't seem that clearly structured and AFAICT contains > no > single loop (and that despite the diversity of shapes, types etc. > that exist > in numarray -- why not try something slightly more systematic?)) Testing could certainly be better. unittest might work better for this kind of thing than doctest. I agree that we should test for a wider variety of shapes, types, sizes, and behaviors but it takes time and effort to do it so it hasn't been done yet. There's little doubt we'd find bugs and the system would be better for it. [On the other hand, is it the most important thing to do next? Any volunteers to improve the test suite? It may not be the most complete and systematic one out there, but it's at least as good as the one for Numeric ;-)] There's a lot of input here. We'll see what we can do. Thanks again. Regards, Todd [A few more editorial comments. When we started numarray, compatibility was not high on the list of priorities, so the initial implementation didn't focus on it. A number of the problems you point out reflect that origin. While it is more important, it isn't the only guide. We seek compatibility when there is no strong reason to be incompatible. 
But there are a number of issues where we definitely wanted different behavior (if it were to be completely compatible, we wouldn't have bothered in the first place; we needed some changes). Given the odd corners you've run into, it makes me curious to see the code that generated this; particularly with regard to rank-0 arrays. If I get a chance I'll take a look at the link you provided. I wonder if it is typical of what other users will encounter or not. I guess our experience in porting scipy will give us a better indication. To summarize what we see as work that should be done to address the points made: rank-0 issues: 1) a.imag doesn't work 2) array(0, type=Float64) + 1e3000 => `inf` with right error modes but array(0, type=Float32) + 1e3000 => `OverflowError` 3) numarray.array(10)/numarray.array(0) => 0 4) numarray.array(10000000000000L) => array(1316134912) 5) numarray.where(0,1,0) => array([0]) 6) documentation of behavior (how to turn into scalar, that len and [0] indexing doesn't work, etc.) Others 1) puts into lists should raise Type error l = [1,2,3]; numarray.put(l,numarray.array([1,2,0]),[0,0,0]); l => [1, 2, 3] 2) repr for zero length arrays needs to show type and other info. 3) rip out _align and _contiguous self-test hacks 4) improved object array handling (e.g., where and the like) 5) average function 6) change MemoryError to ValueError for ones(-5) 7) document matrixmultiply 8) support for __array__ protocol? 9) Documentation fix for p34 row vector usage. 10) Numeric to numarray conversion guide 11) Better tests Most of these are not likely to get immediate attention as our focus now is on integrating scipy. To the extent they make it easier to do, their priority may be raised. There are a lot of "should"s but we have limited resources just like anyone else; we can't do it all at once.] 
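The value reported in rank-0 item 4 above is what one gets by keeping only the low 32 bits of the input. The check below is a plain present-day Python sketch; wrap_to_int32 is a hypothetical helper, and whether numarray actually arrives at the number this way is an assumption, not something stated in the thread.

```python
def wrap_to_int32(value):
    """Keep the low 32 bits and reinterpret them as a signed integer."""
    low = value & 0xFFFFFFFF
    return low - 0x100000000 if low >= 0x80000000 else low


reported = 1316134912                 # value quoted in the list above
wrapped = wrap_to_int32(10000000000000)

print(wrapped == reported)
```

If the assumption holds, the item is a silent overflow on conversion rather than a rank-0 problem as such.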
From jmiller at stsci.edu Thu Oct 14 06:11:22 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Oct 14 06:11:22 2004 Subject: [Numpy-discussion] character arrays supported by C API? In-Reply-To: References: <1095253587.4624.380.camel@halloween.stsci.edu> Message-ID: <1097759076.4219.39.camel@halloween.stsci.edu> On Thu, 2004-10-14 at 04:20, Faheem Mitha wrote: > On Wed, 15 Sep 2004, Todd Miller wrote: > > > On Wed, 2004-09-15 at 00:52, Faheem Mitha wrote: > >> Dear People, > >> > >> Are character arrays supported by the Numarray C API? My impression from > >> the documentation is no, but I would appreciate a confirmation. Thanks. > >> > >> Faheem. > > > > Yes and no. CharArray is not as well supported from C as NumArray; > > there are no easy to call functions which will convert a nested sequence > > of strings into a CharArray. > > > > However, it is possible to call the Python functions in the CharArray > > module from C, and a pre-existing CharArray is a PyArrayObject so it > > can be manipulated in C as a struct; it's shape and strides are > > visible, it's itemsize is the length of the string, etc. > > > > What is it you want to do? What functions do you think would help? > > Hi. Sorry about the slow reply. > > What I want to do is extremely simple. I want to convert (in C++) a C++ > character array to a CharArray. The simplest way of doing this would be to > create an array of the appropriate size, and write character strings into > it element by element. > > So, a utility function which creates a character array of appropriate > dimensions would be useful. Also a utility function which convert a list > of strings into a Character Array would also be desirable. > > Currently I am having to work around this limitation by returning lists of > strings back to Python. I'd prefer to not have to do that. That's a sensible addition, but right now, such a function does not exist, and I don't have time to add it myself. 
The way to achieve this without C-API support by CharArray is to do a
Python callback. The steps in C would be roughly:

0. Import the numarray.strings module. PyImport_ImportModule().

1. Get the module's dictionary object. PyModule_GetDict().

2. Get a pointer to CharArray by looking it up in the dictionary.
PyDict_GetItemString().

3. Construct an argument tuple which contains the constructor
parameters. Py_BuildValue().

4. Call the constructor using the arg tuple. The return value is the
CharArray. PyObject_CallFunction().

Similar steps are done for NumArray in the current C-API in newarray.ch
in NA_NewAllFromBuffer().

Regards,
Todd

From akulla at comcast.net Thu Oct 14 06:44:20 2004
From: akulla at comcast.net (akulla at comcast.net)
Date: Thu Oct 14 06:44:20 2004
Subject: [Numpy-discussion] Slow operation of nd_image.generic_filter
Message-ID: <101420041338.1510.416E8157000325EE000005E622007456720E04049A050E@comcast.net>

Hi all,

Could it be that the execution of the following function lasts more than
25 seconds, for an array of shape (256, 480)?

...
def myFunc(anArray, winSize=5):
    return numarray.nd_image.generic_filter(\
        input=anArray,
        function=lambda win: win.mean(),
        size=winSize,
        mode='constant')
...

Python 2.3, numarray 1.0 (XP, P4)

Regards,
Alban

From falted at pytables.org Fri Oct 15 04:27:55 2004
From: falted at pytables.org (Francesc Alted)
Date: Fri Oct 15 04:27:55 2004
Subject: [Numpy-discussion] numarray and ATLAS
Message-ID: <200410151318.40035.falted@pytables.org>

Hi,

Perhaps this is a too recurrent subject, but I'm having problems when
making numarray to use ATLAS instead of the mini-lapack included.

I've installed ATLAS 3.6.0 on my pentium IV machine. I've made it a
completely featured LAPACK by following the instructions in:

http://math-atlas.sourceforge.net/errata.html#completelp

and I'm pretty sure that the resulting library works.
Now, after exporting USE_LAPACK and set the appropiate directory for lapack_dirs in addons.py, the compilation went well (however, I can see that lapack_litemodule.c is still being compiled, and I don't know if that's normal or not). The command I've used to install is: $ python setup.py install --gencode --home=/users/exp/alted/bin-i686 And the error that happens during the test phase follows: $ python Python 2.3.4 (#1, Jul 22 2004, 20:47:54) [GCC 3.3.2 20031022 (Red Hat Linux 3.3.2-1)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numarray.testall as testall >>> testall.test() numarray: ((0, 1199), (0, 1199)) numarray.records: (0, 48) numarray.strings: (0, 176) numarray.memmap: (0, 82) numarray.objects: (0, 105) numarray.memorytest: (0, 16) numarray.examples.convolve: ((0, 20), (0, 20), (0, 20), (0, 20)) numarray.convolve: (0, 52) Traceback (most recent call last): File "", line 1, in ? File "/users/exp/alted/bin-i686/lib/python/numarray/testall.py", line 24, in test result = eval(p+".test()") File "", line 0, in ? File "/users/exp/alted/bin-i686/lib/python/numarray/fft/FFT.py", line 326, in test import dtest File "/users/exp/alted/bin-i686/lib/python/numarray/fft/dtest.py", line 238, in ? import numarray.random_array as random_array File "/users/exp/alted/bin-i686/lib/python/numarray/random_array/__init__.py", line 7, in ? from RandomArray2 import * File "/users/exp/alted/bin-i686/lib/python/numarray/random_array/RandomArray2.py", line 3, in ? import numarray.linear_algebra as linalg File "/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/__init__.py", line 1, in ? from LinearAlgebra2 import * File "/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/LinearAlgebra2.py", line 23, in ? 
    import lapack_lite2
ImportError: /users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/lapack_lite2.so: undefined symbol: dgesdd_

I've checked that dgesdd symbol exists on my liblapack.a:

$ strings ~/bin-i686/lib/atlas/liblapack.a | grep dgesdd
dgesdd.o/ 1097832195 2514 515 100644 13788 `

but not a dgesdd_, as you can see.

I'm missing something?

--
Francesc Alted

From falted at pytables.org Fri Oct 15 10:07:40 2004
From: falted at pytables.org (Francesc Alted)
Date: Fri Oct 15 10:07:40 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <200410151318.40035.falted@pytables.org>
References: <200410151318.40035.falted@pytables.org>
Message-ID: <200410151903.41288.falted@pytables.org>

Hi,

Despite the fact that some errors arise, I've checked the numarray
version linked against ATLAS, and it seems like it doesn't get the
expected ATLAS boost:

>>> import timeit
>>> t1 = timeit.Timer("m3=numarray.dot(m1,m2)", "import numarray;dim1=500;m1=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32);m2=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32)")
>>> t1.repeat(3,10)
[3.7274820804595947, 3.8542821407318115, 3.7117569446563721]

However, Numeric seems to get it:

>>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');Numeric.reshape(m2,(dim1,dim1))")
>>> t3.repeat(3,10)
[0.0093162059783935547, 0.0096318721771240234, 0.0092968940734863281]

i.e. almost 300 times faster than numarray.

Is anyone getting the acceleration boost with numarray & ATLAS?

Cheers,

On Friday 15 October 2004 13:18, Francesc Alted wrote:
> Hi,
>
> Perhaps this is a too recurrent subject, but I'm having problems when
> making numarray to use ATLAS instead of the mini-lapack included.
>
> I've installed ATLAS 3.6.0 on my pentium IV machine.
I've made it a > completely featured LAPACK by following the instructions in: > > http://math-atlas.sourceforge.net/errata.html#completelp > > and I'm pretty sure that the resulting library works. Now, after exporting > USE_LAPACK and set the appropiate directory for lapack_dirs in addons.py, > the compilation went well (however, I can see that lapack_litemodule.c is > still being compiled, and I don't know if that's normal or not). The command > I've used to install is: > > $ python setup.py install --gencode --home=/users/exp/alted/bin-i686 > > And the error that happens during the test phase follows: > > $ python > Python 2.3.4 (#1, Jul 22 2004, 20:47:54) > [GCC 3.3.2 20031022 (Red Hat Linux 3.3.2-1)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import numarray.testall as testall > >>> testall.test() > numarray: ((0, 1199), (0, 1199)) > numarray.records: (0, 48) > numarray.strings: (0, 176) > numarray.memmap: (0, 82) > numarray.objects: (0, 105) > numarray.memorytest: (0, 16) > numarray.examples.convolve: ((0, 20), (0, 20), (0, 20), (0, 20)) > numarray.convolve: (0, 52) > Traceback (most recent call last): > File "", line 1, in ? > File "/users/exp/alted/bin-i686/lib/python/numarray/testall.py", line 24, in test > result = eval(p+".test()") > File "", line 0, in ? > File "/users/exp/alted/bin-i686/lib/python/numarray/fft/FFT.py", line 326, in test > import dtest > File "/users/exp/alted/bin-i686/lib/python/numarray/fft/dtest.py", line 238, in ? > import numarray.random_array as random_array > File "/users/exp/alted/bin-i686/lib/python/numarray/random_array/__init__.py", line 7, in ? > from RandomArray2 import * > File "/users/exp/alted/bin-i686/lib/python/numarray/random_array/RandomArray2.py", line 3, in ? > import numarray.linear_algebra as linalg > File "/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/__init__.py", line 1, in ? 
> from LinearAlgebra2 import * > File "/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/LinearAlgebra2.py", line 23, in ? > import lapack_lite2 > ImportError: > /users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/lapack_lite2.so: > undefined symbol: dgesdd_ > > I've checked that dgesdd symbol exists on my liblapack.a: > > $ strings ~/bin-i686/lib/atlas/liblapack.a | grep dgesdd > dgesdd.o/ 1097832195 2514 515 100644 13788 ` > > but not a dgesdd_, as you can see. > > I'm missing something? > -- Francesc Alted From dd55 at cornell.edu Fri Oct 15 14:18:41 2004 From: dd55 at cornell.edu (Darren Dale) Date: Fri Oct 15 14:18:41 2004 Subject: [Numpy-discussion] how to deal with large arrays Message-ID: <200410151714.38492.dd55@cornell.edu> Hello, I have two 2D arrays, Q is 3-by-q and R is r-by-3. At each q, I need to sum q(r) over R, so I take a dot product RQ and then sum along one axis to get a 1-by-q result. I'm doing this with dot products because it is much faster than the equivalent for or while loop. The intermediate r-by-q array can get very large though (200MB in my case), so I was wondering if there is a better way to go about it? If not, I can slice up R and deal with it one chunk at a time, then the intermediate arrays fit within the available system resources. Would somebody offer a suggestion of how to do this intelligently? Should the intermediate array be about the size of the processor cache, some fraction of the available memory, or is there something else I need to consider? Thank you, Darren From tim.hochberg at cox.net Fri Oct 15 15:11:05 2004 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Oct 15 15:11:05 2004 Subject: [Numpy-discussion] how to deal with large arrays In-Reply-To: <200410151714.38492.dd55@cornell.edu> References: <200410151714.38492.dd55@cornell.edu> Message-ID: <41704A3C.5080802@cox.net> Darren Dale wrote: >Hello, > >I have two 2D arrays, Q is 3-by-q and R is r-by-3. 
At each q, I need to sum
>q(r) over R, so I take a dot product RQ and then sum along one axis to get a
>1-by-q result.
>
>I'm doing this with dot products because it is much faster than the equivalent
>for or while loop. The intermediate r-by-q array can get very large though
>(200MB in my case), so I was wondering if there is a better way to go about
>it?
>
>

I think so. I believe you are doing something like this:

result_1 = na.sum(na.dot(R,Q), 0)

I'm fairly certain (but I urge you to double check), that this reduces to:

result_2 = na.dot(na.sum(R, 0), Q)

which will take up much less intermediate storage and be faster to boot.
In more quasi-mathematical notations:

result_1 => sum_i sum_j R_ij Q_jk
          = sum_j sum_i R_ij Q_jk
          = sum_j Q_jk sum_i R_ij => result_2

A quick test seems to confirm this:

import numarray as na
from numarray import random_array
q = 10
r = 12
R = random_array.random((r,3))
Q = random_array.random((3,q))
x1 = na.sum(na.dot(R,Q), 0)
x2 = na.dot(na.sum(R, 0), Q)
print na.allclose(x1, x2)

-tim

>If not, I can slice up R and deal with it one chunk at a time, then the
>intermediate arrays fit within the available system resources. Would somebody
>offer a suggestion of how to do this intelligently? Should the intermediate
>array be about the size of the processor cache, some fraction of the
>available memory, or is there something else I need to consider?
>
>Thank you,
>Darren
>

From dd55 at cornell.edu Fri Oct 15 16:29:03 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Fri Oct 15 16:29:03 2004
Subject: [Numpy-discussion] how to deal with large arrays
In-Reply-To: <41704A3C.5080802@cox.net>
References: <200410151714.38492.dd55@cornell.edu> <41704A3C.5080802@cox.net>
Message-ID: <200410151927.54005.dd55@cornell.edu>

Thank you for your response, Tim,

On Friday 15 October 2004 06:07 pm, Tim Hochberg wrote:
> Darren Dale wrote:
> >Hello,
> >
> >I have two 2D arrays, Q is 3-by-q and R is r-by-3. At each q, I need to
> > sum q(r) over R, so I take a dot product RQ and then sum along one axis
> > to get a 1-by-q result.
> >
> >I'm doing this with dot products because it is much faster than the
> > equivalent for or while loop. The intermediate r-by-q array can get very
> > large though (200MB in my case), so I was wondering if there is a better
> > way to go about it?
>
> I'm fairly certain (but I urge you to double check), that this reduces to:
>
> result_2 = na.dot(na.sum(R, 0), Q)
>

Yes. As usual, I left out a bit of information that turned out to be
important. See below

A modified test:

from numarray import *
from numarray import random_array
q = 10
r = 12
R = random_array.random((r,3))
Q = random_array.random((3,q))
x1 = sum( exp(1j*dot(R,Q)), 0) #note complex argument to exp()
x2 = exp(1j*dot(sum(R, 0), Q))
print allclose(x1, x2)

The complex arithmetic changes things. I am still learning how to keep my
code efficient.
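The reason the shortcut stops working can be sketched in plain present-day Python, with nested lists standing in for numarray arrays. Summing R first is valid only while the per-row quantity is linear in R; once exp() is applied to each row's phase before summing, the two orders give different answers. The helper names (dot_row, phase_chunked) and the small sizes are hypothetical, chosen only for the illustration. The last function shows the chunking idea from the original question, which keeps the intermediate storage bounded without changing the result.

```python
import cmath
import random

random.seed(0)
r, q = 12, 5
R = [[random.random() for _ in range(3)] for _ in range(r)]   # r-by-3
Q = [[random.random() for _ in range(q)] for _ in range(3)]   # 3-by-q


def dot_row(row, k):
    """Inner product of one length-3 row with column k of Q."""
    return sum(row[j] * Q[j][k] for j in range(3))


# Column sums of R, i.e. the analogue of sum(R, 0).
R_sum = [sum(row[j] for row in R) for j in range(3)]

# Linear case: summing rows first gives the same answer.
linear_full = [sum(dot_row(row, k) for row in R) for k in range(q)]
linear_short = [dot_row(R_sum, k) for k in range(q)]

# Nonlinear case: exp() is applied per row, so the shortcut no longer holds.
phase_full = [sum(cmath.exp(1j * dot_row(row, k)) for row in R)
              for k in range(q)]
phase_short = [cmath.exp(1j * dot_row(R_sum, k)) for k in range(q)]


def phase_chunked(chunk_rows):
    """Accumulate the phase sum over blocks of rows of R."""
    total = [0j] * q
    for start in range(0, r, chunk_rows):
        block = R[start:start + chunk_rows]
        for k in range(q):
            total[k] += sum(cmath.exp(1j * dot_row(row, k)) for row in block)
    return total


print(all(abs(a - b) < 1e-9 for a, b in zip(linear_full, linear_short)))
print(any(abs(a - b) > 1e-6 for a, b in zip(phase_full, phase_short)))
print(all(abs(a - b) < 1e-9 for a, b in zip(phase_full, phase_chunked(5))))
```

With arrays, the same chunking would slice R into blocks of rows, compute the block's r'-by-q intermediate, reduce it, and add it to a running 1-by-q total; the block size is then a tuning parameter rather than a correctness concern.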
The following code is actually almost as fast as using the large dot
product; apparently I had some other sinks in my original tests:

phase = zeros(len(Q[0]),'d')
for i in range(len(Q[0])):
    phase[i] = phase[i] + sum(exp(1j*dot(R,Q[:,i])), 0)

If q=1000 and r=2500, the for loop takes about 13% longer than the dot
product method. Incredibly, if q=10,000 and r=2500, the for loop is 17%
faster. So I am going to use it instead. Apparently I had some other time
sink in my original test.

from numarray import *
from numarray import random_array
from time import clock
q = 10000
r = 2500
R = random_array.random((r,3))
Q = random_array.random((3,q))
t0 = clock()
x1 = sum(exp(1j*dot(R,Q)), 0) #note complex argument to exp()
t1 = clock()
dt1 = t1-t0
phase = zeros(len(Q[0]),'d')
for i in range(len(Q[0])):
    phase[i] = phase[i] + sum(exp(1j*dot(R,Q[:,i])), 0)
t2 = clock()
dt2 = t2-t1
print (dt2-dt1)/dt1

--
Darren

From falted at pytables.org Sat Oct 16 04:29:02 2004
From: falted at pytables.org (Francesc Alted)
Date: Sat Oct 16 04:29:02 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <200410151903.41288.falted@pytables.org>
References: <200410151318.40035.falted@pytables.org> <200410151903.41288.falted@pytables.org>
Message-ID: <200410161327.47485.falted@pytables.org>

On Friday 15 October 2004 19:03, Francesc Alted wrote:
> >>> import timeit
> >>> t1 = timeit.Timer("m3=numarray.dot(m1,m2)", "import numarray;dim1=500;m1=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32);m2=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32)")
> >>> t1.repeat(3,10)
> [3.7274820804595947, 3.8542821407318115, 3.7117569446563721]
>
> However, Numeric seems to get it:
>
> >>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');Numeric.reshape(m2,(dim1,dim1))")
> >>> t3.repeat(3,10)
> [0.0093162059783935547,
0.0096318721771240234, 0.0092968940734863281] > > i.e. almost 300 faster than numarray Ooops! The Numeric test had a bug on it. The correct test would be: >>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');m1=Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');m2=Numeric.reshape(m2,(dim1,dim1))") >>> t3.repeat(3,10) [0.47363090515136719, 0.47403502464294434, 0.47770595550537109] which is 8 times faster, more or less, than numarray (or Numeric) without ATLAS. Just to clarify things ;) -- Francesc Alted From aisaac at american.edu Sat Oct 16 15:53:01 2004 From: aisaac at american.edu (Alan G Isaac) Date: Sat Oct 16 15:53:01 2004 Subject: [Numpy-discussion] documentation error In-Reply-To: <1097609991.30171.556.camel@halloween.stsci.edu> References: <1097550159.2568.5.camel@localhost.localdomain><1097595580.24491.4.camel@freyer.sfo.csun.edu><1097609991.30171.556.camel@halloween.stsci.edu> Message-ID: On 12 Oct 2004, Todd Miller apparently wrote: > Go here: > http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse > then "Submit New", and set the "category" to "documentation. Done. Thanks, Alan Isaac From aisaac at american.edu Sat Oct 16 15:53:02 2004 From: aisaac at american.edu (Alan G Isaac) Date: Sat Oct 16 15:53:02 2004 Subject: [Numpy-discussion] matrixmultiply: return type Message-ID: Being new to numerical Python applications, I was a little puzzled/concerned when I read http://sourceforge.net/tracker/index.php?func=detail&aid=984368&group_id=1369&atid=450446 I *think* the answer is: matrixmultiply will always return an array. Is there a stable view about what type of object will be returned by matrixmultiply? Currently, to my initial surprise, it returns an array when the arguments are matrices. Is this stable? Might an optional argument to specify the return type be desirable? 
Thank you, Alan Isaac From jmiller at stsci.edu Sat Oct 16 18:27:04 2004 From: jmiller at stsci.edu (Todd Miller) Date: Sat Oct 16 18:27:04 2004 Subject: [Numpy-discussion] documentation error In-Reply-To: References: <1097550159.2568.5.camel@localhost.localdomain> <1097595580.24491.4.camel@freyer.sfo.csun.edu> <1097609991.30171.556.camel@halloween.stsci.edu> Message-ID: <1097976412.3744.159.camel@localhost.localdomain> On Sat, 2004-10-16 at 17:17, Alan G Isaac wrote: > On 12 Oct 2004, Todd Miller apparently wrote: > > Go here: > > http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse > > then "Submit New", and set the "category" to "documentation. > > > Done. > > Thanks, > Alan Isaac As it turns out, I misdirected you. The above link is for numarray bugs. This link is for Numeric bugs: http://sourceforge.net/tracker/?group_id=1369&atid=101369 I moved the diagonal doc bug report to the Numeric bugs tracker. Regards, Todd From jmiller at stsci.edu Sat Oct 16 18:50:04 2004 From: jmiller at stsci.edu (Todd Miller) Date: Sat Oct 16 18:50:04 2004 Subject: [Numpy-discussion] numarray and ATLAS In-Reply-To: <200410161327.47485.falted@pytables.org> References: <200410151318.40035.falted@pytables.org> <200410151903.41288.falted@pytables.org> <200410161327.47485.falted@pytables.org> Message-ID: <1097977801.3744.184.camel@localhost.localdomain> On Sat, 2004-10-16 at 07:27, Francesc Alted wrote: > A Divendres 15 Octubre 2004 19:03, Francesc Alted va escriure: > > >>> import timeit > > >>> t1 = timeit.Timer("m3=numarray.dot(m1,m2)", "import numarray;dim1=500;m1=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32);m2=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32)") > > >>> t1.repeat(3,10) > > [3.7274820804595947, 3.8542821407318115, 3.7117569446563721] > > > > However, Numeric seems to get it: > > > > >>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, 
typecode='f');Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');Numeric.reshape(m2,(dim1,dim1))") > > >>> t3.repeat(3,10) > > [0.0093162059783935547, 0.0096318721771240234, 0.0092968940734863281] > > > > i.e. almost 300 faster than numarray > > Ooops! The Numeric test had a bug on it. The correct test would be: > > >>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');m1=Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');m2=Numeric.reshape(m2,(dim1,dim1))") > >>> t3.repeat(3,10) > [0.47363090515136719, 0.47403502464294434, 0.47770595550537109] > > which is 8 times faster, more or less, than numarray (or Numeric) without > ATLAS. > > Just to clarify things ;) Hi Francesc, I don't think numarray dot() will pick up any boost at all from ATLAS because it's not written to do it. Besides that, there are two performance problems I know of with numarray's dot() which may dominate or dilute any ATLAS benefits: 1. dot() requires array creation. 2. dot() requires array copies. Because it has a class hierarchy and a memory buffer object, numarray is at a disadvantage for (1). (2) just hasn't been optimized yet for noncontiguous arrays which (I think) are always present when dot() starts with two contiguous array parameters. Regards, Todd From stephen.walton at csun.edu Sun Oct 17 17:35:03 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Sun Oct 17 17:35:03 2004 Subject: [Numpy-discussion] New LAPACK and ScaLAPACK planned Message-ID: <1098059497.5110.5.camel@localhost.localdomain> From volume 4 #37 of the NA-Digest mailing list. I hope this is of enough interest to this list to justify the cross post. From dongarra at cs.utk.edu Fri Oct 15 04:10:44 2004 From: dongarra at cs.utk.edu (Jack Dongarra) Date: Fri, 15 Oct 2004 04:10:44 -0400 Subject: New Release of LAPACK and ScaLAPACK Planned Message-ID: New Release of LAPACK and ScaLAPACK planned. 
We are pleased to announce that we recently received NSF funding for new releases of the LAPACK and ScaLAPACK linear algebra libraries. The proposal pointed out the new and better algorithms that have been developed by many people in the community since the first releases of these libraries, as well as more obvious gaps and possible improvements. The proposal listed a large number of activities, which we now need to prioritize. There are a number of design decisions that still need to be made, for which we are interested in your input. For this purpose, we would like to remind you of a web page to collect your input that we originally announced on NA-Digest while we were preparing the proposal: http://icl.cs.utk.edu/lapack-survey.html In addition to the questions on that form, we are interested in your opinion on all aspects of the proposal, a copy of which you may find at: http://www.cs.berkeley.edu/~demmel/Sca-LAPACK-Proposal.pdf Thanks, Jim Demmel and Jack Dongarra -- Stephen Walton Dept.
of Physics & Astronomy, CSU Northridge From falted at pytables.org Mon Oct 18 01:30:01 2004 From: falted at pytables.org (Francesc Alted) Date: Mon Oct 18 01:30:01 2004 Subject: [Numpy-discussion] numarray and ATLAS In-Reply-To: <1097977801.3744.184.camel@localhost.localdomain> References: <200410151318.40035.falted@pytables.org> <200410161327.47485.falted@pytables.org> <1097977801.3744.184.camel@localhost.localdomain> Message-ID: <200410181029.14879.falted@pytables.org> Hi Todd, A Diumenge 17 Octubre 2004 03:50, Todd Miller va escriure: > I don't think numarray dot() will pick up any boost at all from ATLAS > because it's not written to do it. Besides that, there are two > performance problems I know of with numarray's dot() which may dominate > or dilute any ATLAS benefits: > > 1. dot() requires array creation. Yes, but my guess is that for large arrays, this time should be negligible compared with the multiplication time. > 2. dot() requires array copies. Mmm, you mean even for well-behaved arrays? Sorry, but I don't understand why. May I ask if there is any plan to complete a better integration of external LAPACK libraries in numarray or this is considered low priority? Never mind, I don't need this functionality right now. It's just that I'm preparing a series of 'hands-on' sessions about Python and Scientific Computing, and I was trying to understand the current advantages and limitations of numarray compared with NumPy.
Cheers, -- Francesc Alted From jmiller at stsci.edu Mon Oct 18 04:53:01 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Oct 18 04:53:01 2004 Subject: [Numpy-discussion] numarray and ATLAS In-Reply-To: <200410181029.14879.falted@pytables.org> References: <200410151318.40035.falted@pytables.org> <200410161327.47485.falted@pytables.org> <1097977801.3744.184.camel@localhost.localdomain> <200410181029.14879.falted@pytables.org> Message-ID: <1098100329.3741.96.camel@localhost.localdomain> On Mon, 2004-10-18 at 04:29, Francesc Alted wrote: > Hi Todd, > > A Diumenge 17 Octubre 2004 03:50, Todd Miller va escriure: > > I don't think numarray dot() will pick up any boost at all from ATLAS > > because it's not written to do it. Besides that, there are two > > performance problems I know of with numarray's dot() which may dominate > > or dilute any ATLAS benefits: > > > > 1. dot() requires array creation. > > Yes, but my guess is that for large arrays, this time should be negligible > compared with the multiplication time. > Probably true. I should measure this. For small computations, it's an issue. > > 2. dot() requires array copies. > > Mmm, you mean even for well-behaved arrays? Sorry, but I don't understand > why. I looked at this some this morning, trying to figure out why this is a problem only for numarray. It turns out that Numeric strides its arrays to get around the copy. When I implemented numarray, I chose not to stride because I thought it would be too slow... Recently I realized that one input array to dot() is *always* transposed and therefore likely noncontiguous and therefore copied. I think it's now possible to simply port the Numeric code so I'll look into that. > May I ask if there is any plan to complete a better integration of external > LAPACK libraries in numarray or this is considered low priority? Perry may answer this. I have no immediate plans for it... it does sound like enough people need this that it should be done. 
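The remark earlier in this message, that a transposed operand of dot() is likely noncontiguous, can be illustrated with a small plain-Python model of strided array layout. This is only a sketch: the helper names below are hypothetical, it does not use numarray or Numeric, and it is not how either package is implemented. It just shows that reversing the strides of a row-major array gives a layout that is no longer row-major contiguous, which is the case that forces a copy when code insists on contiguous input.

```python
def c_strides(shape, itemsize):
    """Byte strides of a C-contiguous (row-major) array of the given shape."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def transpose(shape, strides):
    """Model a transpose: reverse shape and strides, move no data."""
    return tuple(reversed(shape)), tuple(reversed(strides))


def is_c_contiguous(shape, strides, itemsize):
    """True if the strides match the row-major layout for this shape."""
    return strides == c_strides(shape, itemsize)


shape = (500, 300)      # illustrative shape, not taken from the thread
itemsize = 4            # bytes per element, as for Float32
strides = c_strides(shape, itemsize)
t_shape, t_strides = transpose(shape, strides)

print(strides)                                          # (1200, 4)
print(t_shape, t_strides)                               # (300, 500) (4, 1200)
print(is_c_contiguous(shape, strides, itemsize))        # True
print(is_c_contiguous(t_shape, t_strides, itemsize))    # False
```

Code that walks the transposed operand through its strides, as Numeric is described as doing above, avoids the copy at the price of non-unit-stride memory access.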
Regards, Todd From perry at stsci.edu Mon Oct 18 05:21:02 2004 From: perry at stsci.edu (Perry Greenfield) Date: Mon Oct 18 05:21:02 2004 Subject: [Numpy-discussion] numarray and ATLAS In-Reply-To: <1098100329.3741.96.camel@localhost.localdomain> References: <200410151318.40035.falted@pytables.org> <200410161327.47485.falted@pytables.org> <1097977801.3744.184.camel@localhost.localdomain> <200410181029.14879.falted@pytables.org> <1098100329.3741.96.camel@localhost.localdomain> Message-ID: On Oct 18, 2004, at 7:52 AM, Todd Miller wrote: > On Mon, 2004-10-18 at 04:29, Francesc Alted wrote: > >> May I ask if there is any plan to complete a better integration of >> external >> LAPACK libraries in numarray or this is considered low priority? > > Perry may answer this. I have no immediate plans for it... it does > sound like enough people need this that it should be done. > Like Todd says, it does sound like this needs to be done. I think it takes a back seat to doing the scipy integration in general, but will need to be addressed soon thereafter. Perry From frank.horowitz at csiro.au Mon Oct 18 23:33:03 2004 From: frank.horowitz at csiro.au (Frank Horowitz) Date: Mon Oct 18 23:33:03 2004 Subject: [Numpy-discussion] Numeric Underflow Exceptions: Recommendations? Message-ID: <1098167541.8538.48.camel@localhost> Hi all, Using Numeric 23.5 I've been bitten by the dreaded 'floating point underflow throws an "OverflowError: math range error" instead of silently returning zero' bug. My setup is Debian unstable (Sid) on an i386, and I am using Debian's binary package "python-numeric". I understand from googling past discussions that this is (used to be?) phase-of-the-moon stuff, depending mostly upon architecture, options at libm compilation time of libc6. Several references to a trick of adding "-lieee" to the link list succeeding in taming the bug were mentioned around the era of Python2.0. 
My questions are these: Is there some higher level way of dealing with underflow now in Numeric? Or am I going to have to track down wherever "-lieee" has disappeared to in Debian, and recompile Numeric in the hopes that that still cures the problem? Any other tricks up people's sleeves for dealing with this? (I already know about exp_safe in Fernando Perez' IPython/numutils.py, BTW. I'm kind of hoping for a library level fix though, since my code is littered with "Numeric.exp()" calls.) TIA for any help you might be able to provide! Cheers, Frank Horowitz From falted at pytables.org Tue Oct 19 01:35:05 2004 From: falted at pytables.org (Francesc Alted) Date: Tue Oct 19 01:35:05 2004 Subject: [Numpy-discussion] numarray and ATLAS In-Reply-To: <1098100329.3741.96.camel@localhost.localdomain> References: <200410151318.40035.falted@pytables.org> <200410181029.14879.falted@pytables.org> <1098100329.3741.96.camel@localhost.localdomain> Message-ID: <200410191034.08018.falted@pytables.org> A Dilluns 18 Octubre 2004 13:52, Todd Miller va escriure: > > > 1. dot() requires array creation. > > > > Yes, but my guess is that for large arrays, this time should be negligible > > compared with the multiplication time. > > > > Probably true. I should measure this. For small computations, it's an > issue. Well, for small arrays ATLAS (or any other optimized LAPACK library) can't probably do much better than lapack lite, so I think you should not worry about this anyway. > Perry may answer this. I have no immediate plans for it... it does > sound like enough people need this that it should be done. Ok. Thanks for information, -- Francesc Alted From flin at broadpark.no Wed Oct 20 02:23:30 2004 From: flin at broadpark.no (Frank Lindseth) Date: Wed Oct 20 02:23:30 2004 Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003 Message-ID: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> Hi, I need numeric in a python2.4 / win32 project. 
Is there a binary installer somewhere? I tried to compile it from source but ran into the following problem (see below): Where are the libs supposed to come from? - Frank C:\users\frankl\download\Numeric-23.5>c:\Python24\python.exe setup.py install running install running build running build_py running build_ext building 'lapack_lite' extension C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe /DLL /nologo /INCREMENTAL:NO /LIBPATH:/usr/lib/atlas /LIBPATH:c:\Python24\libs /LIBPATH:c:\Python24\PCBuild lapack.lib cblas.lib f77blas.lib atlas.lib g2c.lib /EXPORT:initlapack_lite build\temp.win32-2.4\Release\Src\lapack_litemodule.obj /OUT:build\lib.win32-2.4\lapack_lite.pyd /IMPLIB:build\temp.win32-2.4\Release\Src\lapack_lite.lib LINK : fatal error LNK1181: cannot open input file 'lapack.lib' error: command '"C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe"' failed with exit status 1181 C:\users\frankl\download\Numeric-23.5> From stephen.walton at csun.edu Wed Oct 20 09:22:37 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Wed Oct 20 09:22:37 2004 Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003 In-Reply-To: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> Message-ID: <1098288859.7182.11.camel@sunspot.csun.edu> On Wed, 2004-10-20 at 11:17 +0200, Frank Lindseth wrote: > LINK : fatal error LNK1181: cannot open input file 'lapack.lib' Edit setup.py, setting the variables library_dirs_list and libraries_list to empty lists, and try again. List: shouldn't this be the default? Right now Numeric looks for ATLAS by default.
-- Stephen Walton, Professor of Physics and Astronomy, California State University, Northridge stephen.walton at csun.edu From flin at broadpark.no Wed Oct 20 11:46:27 2004 From: flin at broadpark.no (flin at broadpark.no) Date: Wed Oct 20 11:46:27 2004 Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003 In-Reply-To: <1098288859.7182.11.camel@sunspot.csun.edu> References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> <1098288859.7182.11.camel@sunspot.csun.edu> Message-ID: <1098297610.4176b10a3b4d2@webmail.broadpark.no> Thank you for the reply Stephen, I did as you suggested: library_dirs_list = [] libraries_list = [] #library_dirs_list = ['/usr/lib/atlas'] #libraries_list = ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c'] but it still won't install (see below) Any suggestions? C:\users\frankl\download\Numeric-23.5>c:\Python24\python.exe setup.py install running install running build running build_py running build_ext building 'lapack_lite' extension C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe /DLL /nologo /INCREMENTAL:NO /LIBPATH:c:\Python24\libs /LIBPATH:c:\Python24\PCBuild /EXPORT:initlapack_lite build\temp.win32-2.4\Release\Src\lapack_litemodule.obj /OUT:build\lib.win32-2.4\lapack_lite.pyd /IMPLIB:build\temp.win32-2.4\Release\Src\lapack_lite.lib Creating library build\temp.win32-2.4\Release\Src\lapack_lite.lib and object build\temp.win32-2.4\Release\Src\lapack_lite.exp lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgeev_ referenced in function _lapack_lite_dgeev lapack_litemodule.obj : error LNK2019: unresolved external symbol _dsyevd_ referenced in function _lapack_lite_dsyevd lapack_litemodule.obj : error LNK2019: unresolved external symbol _zheevd_ referenced in function _lapack_lite_zheevd lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgelsd_ referenced in function _lapack_lite_dgelsd lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgesv_
referenced in function _lapack_lite_dgesv lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgesdd_ referenced in function _lapack_lite_dgesdd lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgetrf_ referenced in function _lapack_lite_dgetrf lapack_litemodule.obj : error LNK2019: unresolved external symbol _dpotrf_ referenced in function _lapack_lite_dpotrf lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgeev_ referenced in function _lapack_lite_zgeev lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgelsd_ referenced in function _lapack_lite_zgelsd lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgesv_ referenced in function _lapack_lite_zgesv lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgesdd_ referenced in function _lapack_lite_zgesdd lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgetrf_ referenced in function _lapack_lite_zgetrf lapack_litemodule.obj : error LNK2019: unresolved external symbol _zpotrf_ referenced in function _lapack_lite_zpotrf build\lib.win32-2.4\lapack_lite.pyd : fatal error LNK1120: 14 unresolved externals error: command '"C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe"' failed with exit status 1120 C:\users\frankl\download\Numeric-23.5> Quoting Stephen Walton : > On Wed, 2004-10-20 at 11:17 +0200, Frank Lindseth wrote: > > > LINK : fatal error LNK1181: cannot open input file 'lapack.lib' > > Edit setup.py, setting the variables library_dirs_list and > libraries_list to empty lists, and try again. > > List: shouldn't this be the default? Right now Numeric looks for ATLAS > by default.
> > -- > Stephen Walton, Professor of Physics and Astronomy, > California State University, Northridge > stephen.walton at csun.edu > > From stephen.walton at csun.edu Wed Oct 20 12:09:00 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Wed Oct 20 12:09:00 2004 Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003 In-Reply-To: <1098297610.4176b10a3b4d2@webmail.broadpark.no> References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> <1098288859.7182.11.camel@sunspot.csun.edu> <1098297610.4176b10a3b4d2@webmail.broadpark.no> Message-ID: <1098299055.7182.33.camel@sunspot.csun.edu> On Wed, 2004-10-20 at 20:40 +0200, flin at broadpark.no wrote: > Thank you for the reply Stephen, > I did as you suggested: > library_dirs_list = [] > libraries_list = [] > #library_dirs_list = ['/usr/lib/atlas'] > #libraries_list = ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c'] > > but it still won't install (see below) > Any suggestions? I'm guessing you still have files left over from last time. On Unix, you can run the 'makeclean.sh' script. On Windows, manually deleting the directories listed in that script (they are all called build) should do the trick. Then try the 'setup.py build' again. -- Stephen Walton, Professor of Physics and Astronomy, California State University, Northridge stephen.walton at csun.edu From flin at broadpark.no Wed Oct 20 15:59:45 2004 From: flin at broadpark.no (flin at broadpark.no) Date: Wed Oct 20 15:59:45 2004 Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003 In-Reply-To: <1098299055.7182.33.camel@sunspot.csun.edu> References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> <1098288859.7182.11.camel@sunspot.csun.edu> <1098297610.4176b10a3b4d2@webmail.broadpark.no> <1098299055.7182.33.camel@sunspot.csun.edu> Message-ID: <1098312989.4176ed1d83de1@webmail.broadpark.no> Thanks again Stephen. Still no success.
I deleted the whole Numeric-directory-tree, unzipped a newly downloaded src-file, edited the setup.py as you suggested, tried to run the installer, same error. I'm not sure what to do next? (why can't somebody just make a binary installer for python2.4, after all it's in beta now...) - Frank -------- Quoting Stephen Walton : > On Wed, 2004-10-20 at 20:40 +0200, flin at broadpark.no wrote: > > Thank you for the reply Stephen, > > I did as you suggested: > > library_dirs_list = [] > > libraries_list = [] > > #library_dirs_list = ['/usr/lib/atlas'] > > #libraries_list = ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c'] > > > > but it still won't install (see below) > > Any suggestions? > > I'm guessing you still have files left over from last time. On Unix, > you can run the 'makeclean.sh' script. On Windows, manually deleting > the directories listed in that script (they are all called build) should > do the trick. Then try the 'setup.py build' again. > > -- > Stephen Walton, Professor of Physics and Astronomy, > California State University, Northridge > stephen.walton at csun.edu > > From stephen.walton at csun.edu Wed Oct 20 16:47:52 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Wed Oct 20 16:47:52 2004 Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003 In-Reply-To: <1098312989.4176ed1d83de1@webmail.broadpark.no> References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> <1098288859.7182.11.camel@sunspot.csun.edu> <1098297610.4176b10a3b4d2@webmail.broadpark.no> <1098299055.7182.33.camel@sunspot.csun.edu> <1098312989.4176ed1d83de1@webmail.broadpark.no> Message-ID: <1098315982.7159.2.camel@freyer.sfo.csun.edu> On Wed, 2004-10-20 at 15:56, flin at broadpark.no wrote: > Thanks again Stephen. > Still no success. Sorry. Being a Linux user I'm afraid I can't help much. > I'm not sure what to do next? Download SciPy from http://www.scipy.org/?
It is much more than you actually need, being all of Scientific Python as well as Numeric, but at least it's an all-in-one installer. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From mdehoon at ims.u-tokyo.ac.jp Wed Oct 20 21:40:04 2004 From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon) Date: Wed Oct 20 21:40:04 2004 Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003 In-Reply-To: <1098312989.4176ed1d83de1@webmail.broadpark.no> References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> <1098288859.7182.11.camel@sunspot.csun.edu> <1098297610.4176b10a3b4d2@webmail.broadpark.no> <1098299055.7182.33.camel@sunspot.csun.edu> <1098312989.4176ed1d83de1@webmail.broadpark.no> Message-ID: <41772BE1.5020403@ims.u-tokyo.ac.jp> flin at broadpark.no wrote: > Thanks again Stephen. > Still no success. > I deleted the whole Numeric-directory-tree, > unzipped a newly downloaded src-file, > edited the setup.py as you suggested, > tried to run the installer, > same error. Previously I managed to compile Numeric for Python 2.4 on Windows, using the MinGW compiler and Atlas. If you still need it, I can send you the binaries. > > I'm not sure what to do next? > (why can't somebody just make a binary installer for python2.4, > after all it's in beta now...) There is a bug in Python 2.4 that prevents users from running bdist_wininst to create a binary installer. python setup.py install fails too. See bug 1021756 on sourceforge. --Michiel, U Tokyo.
-- Michiel de Hoon, Assistant Professor University of Tokyo, Institute of Medical Science Human Genome Center 4-6-1 Shirokane-dai, Minato-ku Tokyo 108-8639 Japan http://bonsai.ims.u-tokyo.ac.jp/~mdehoon From stark at tuebingen.mpg.de Wed Oct 20 23:49:23 2004 From: stark at tuebingen.mpg.de (Sebastian Stark) Date: Wed Oct 20 23:49:23 2004 Subject: [Numpy-discussion] Re: numarray and ATLAS Message-ID: <200410210846.09275.stark@tuebingen.mpg.de> > Perhaps this is a too recurrent subject, but I"m having problems when > making numarray to use ATLAS instead of the mini-lapack included. I had to change lapack_libs and lapack_dirs in addons.py to read: lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm'] lapack_dirs = ['/usr/local/lib/ATLAS'] I have all my .a files in /usr/local/lib/ATLAS so I can control which ones I'm actually linking against. mosel ~ % ls -l /usr/local/lib/ATLAS total 14608 -rw-r--r-- 1 root staff 7952316 Oct 20 10:03 libatlas.a -rw-r--r-- 1 root staff 277592 Oct 20 10:03 libcblas.a -rw-r--r-- 1 root staff 261060 Oct 20 10:45 libf2c.a -rw-r--r-- 1 root staff 353278 Oct 20 10:03 libf77blas.a -rw-r--r-- 1 root staff 5734736 Oct 20 10:42 liblapack.a -rw-r--r-- 1 root staff 324968 Oct 20 10:03 libtstatlas.a -Sebastian (and yes, I get a significant speed boost from ATLAS) -- Sebastian Stark -- http://www.kyb.tuebingen.mpg.de/~stark Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tuebingen Phone: +49 7071 601 555 -- Fax: +49 7071 601 552 From stark at tuebingen.mpg.de Wed Oct 20 23:56:14 2004 From: stark at tuebingen.mpg.de (Sebastian Stark) Date: Wed Oct 20 23:56:14 2004 Subject: [Numpy-discussion] indexing on uninitialized arrays Message-ID: <200410210852.02285.stark@tuebingen.mpg.de> In matlab I can do: >> x = [] x = [] >> x(2) = 1.4 x = 0 1.4000 >> x(2,4) = 2.9 x = 0 1.4000 0 0 0 0 0 2.9000 which means x expands as necessary depending on "how far" my indexing goes. 
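The grow-on-assignment behaviour described above can be sketched in plain Python. This is only an illustration under stated assumptions: the class name `GrowingArray2D` and its attributes are hypothetical, it is not based on numarray, and indices are 0-based here, unlike MATLAB's 1-based indexing.

```python
class GrowingArray2D:
    """Hypothetical 2-D container that zero-pads itself on out-of-range assignment."""

    def __init__(self):
        self.rows = []    # list of equal-length row lists
        self.ncols = 0

    def _grow(self, nrows, ncols):
        # Widen the existing rows first, then append new all-zero rows.
        if ncols > self.ncols:
            for row in self.rows:
                row.extend([0.0] * (ncols - self.ncols))
            self.ncols = ncols
        while len(self.rows) < nrows:
            self.rows.append([0.0] * self.ncols)

    def __setitem__(self, index, value):
        i, j = index
        self._grow(i + 1, j + 1)
        self.rows[i][j] = value

    def __getitem__(self, index):
        i, j = index
        return self.rows[i][j]

    @property
    def shape(self):
        return (len(self.rows), self.ncols)


x = GrowingArray2D()
x[0, 1] = 1.4      # MATLAB: x(2) = 1.4
x[1, 3] = 2.9      # MATLAB: x(2,4) = 2.9
print(x.shape)     # (2, 4)
print(x.rows)      # [[0.0, 1.4, 0.0, 0.0], [0.0, 0.0, 0.0, 2.9]]
```

Growing eagerly in `__setitem__` keeps the sketch short. Every growth step touches all existing rows, so a real array type would probably want to over-allocate rather than resize on each assignment.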
Now I'm thinking about how to realize this with numarray. I could imagine to define a derived array type "SelfInflatingArray" which catches the IndexError exception and does the right thing then. Any better ideas? -Sebastian -- Sebastian Stark -- http://www.kyb.tuebingen.mpg.de/~stark Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tuebingen Phone: +49 7071 601 555 -- Fax: +49 7071 601 552 From falted at pytables.org Thu Oct 21 00:32:31 2004 From: falted at pytables.org (Francesc Alted) Date: Thu Oct 21 00:32:31 2004 Subject: [Numpy-discussion] Re: numarray and ATLAS In-Reply-To: <200410210846.09275.stark@tuebingen.mpg.de> References: <200410210846.09275.stark@tuebingen.mpg.de> Message-ID: <200410210929.18477.falted@pytables.org> A Dijous 21 Octubre 2004 08:46, Sebastian Stark va escriure: > > > Perhaps this is a too recurrent subject, but I"m having problems when > > making numarray to use ATLAS instead of the mini-lapack included. > > I had to change lapack_libs and lapack_dirs in addons.py to read: > > lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm'] > lapack_dirs = ['/usr/local/lib/ATLAS'] I've done something similar: lapack_libs = ['lapack', 'cblas', 'f77blas', 'atlas'] lapack_dirs = ['/usr/local/atlas/Linux_P4SSE2_full/lib'] Mmm, I can see that you have added 'f2c'. However, I don't have it installed. Could that be the cause that tests would not pass in my case? > (and yes, I get a significant speed boost from ATLAS) Great, it's good to know that. 
Thank you very much for your feedback, -- Francesc Alted From rkern at ucsd.edu Thu Oct 21 02:05:51 2004 From: rkern at ucsd.edu (Robert Kern) Date: Thu Oct 21 02:05:51 2004 Subject: [Numpy-discussion] Re: numarray and ATLAS In-Reply-To: <200410210929.18477.falted@pytables.org> References: <200410210846.09275.stark@tuebingen.mpg.de> <200410210929.18477.falted@pytables.org> Message-ID: <41777422.8040205@ucsd.edu> Francesc Alted wrote: > A Dijous 21 Octubre 2004 08:46, Sebastian Stark va escriure: > >>>Perhaps this is a too recurrent subject, but I"m having problems when >>>making numarray to use ATLAS instead of the mini-lapack included. >> >>I had to change lapack_libs and lapack_dirs in addons.py to read: >> >> lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm'] >> lapack_dirs = ['/usr/local/lib/ATLAS'] > > > I've done something similar: > > lapack_libs = ['lapack', 'cblas', 'f77blas', 'atlas'] > lapack_dirs = ['/usr/local/atlas/Linux_P4SSE2_full/lib'] > > Mmm, I can see that you have added 'f2c'. However, I don't have it > installed. Could that be the cause that tests would not pass in my case? If you are compiling with gcc, add 'g2c' after 'f77blas'. It's g77's FORTRAN runtime library. -- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From falted at pytables.org Thu Oct 21 02:33:28 2004 From: falted at pytables.org (Francesc Alted) Date: Thu Oct 21 02:33:28 2004 Subject: [Numpy-discussion] Re: numarray and ATLAS In-Reply-To: <41777422.8040205@ucsd.edu> References: <200410210846.09275.stark@tuebingen.mpg.de> <200410210929.18477.falted@pytables.org> <41777422.8040205@ucsd.edu> Message-ID: <200410211126.42729.falted@pytables.org> A Dijous 21 Octubre 2004 10:32, Robert Kern va escriure: > > Mmm, I can see that you have added 'f2c'. However, I don't have it > > installed. Could that be the cause that tests would not pass in my case? 
> > If you are compiling with gcc, add 'g2c' after 'f77blas'. It's g77's > FORTRAN runtime library. Yeah, that made the trick!. So for a gcc compiler, this works just fine: lapack_libs = ['lapack', 'f77blas', 'g2c', 'cblas', 'atlas', 'm'] Many thanks!, -- Francesc Alted From stephen.walton at csun.edu Thu Oct 21 10:59:05 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Thu Oct 21 10:59:05 2004 Subject: [Numpy-discussion] Counting array elements Message-ID: <1098381332.8249.12.camel@freyer.sfo.csun.edu> Is there some simple way of counting the number of array elements which satisfy a certain condition? It is easy to do A[A<=1].sum() to sum all the values of A which are less than 1, but there doesn't seem to be a count() method. I tried (A<=1).sum() but this throws an exception at numarray 1.1. If I try sum(A<=value) I have to nest multiple sums if A has rank greater than 1, plus the sum overflows if A is large, apparently because boolean gets treated as Int8. (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)). You get zero.) The following works: array(A<=1024,type=Int32).sum() but is awkward. Am I missing an obvious better alternative? If not, I'm going to file an RFE :-) . -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge
From Chris.Barker at noaa.gov Thu Oct 21 11:33:03 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Oct 21 11:33:03 2004 Subject: [Numpy-discussion] Re: numarray and ATLAS In-Reply-To: <41777422.8040205@ucsd.edu> References: <200410210846.09275.stark@tuebingen.mpg.de> <200410210929.18477.falted@pytables.org> <41777422.8040205@ucsd.edu> Message-ID: <4177FFF6.40006@noaa.gov> Robert Kern wrote: > Francesc Alted wrote: >>> I had to change lapack_libs and lapack_dirs in addons.py to read: >>> lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm'] >>> lapack_dirs = ['/usr/local/lib/ATLAS'] >> I've done something similar: >> >> lapack_libs = ['lapack', 'cblas', 'f77blas', 'atlas'] >> lapack_dirs = ['/usr/local/atlas/Linux_P4SSE2_full/lib'] For what it's worth, this is what worked for me on Gentoo Linux: lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'm'] -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From jmiller at stsci.edu Thu Oct 21 11:33:46 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Oct 21 11:33:46 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098381332.8249.12.camel@freyer.sfo.csun.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> Message-ID: <1098383430.3644.4.camel@halloween.stsci.edu> On Thu, 2004-10-21 at 13:55, Stephen Walton wrote: > Is there some simple way of counting the number of array elements which > satisfy a certain condition? It is easy to do > > A[A<=1].sum() > > to sum all the values of A which are less than 1, but there doesn't seem > to be a count() method. I tried > > (A<=1).sum() > > but this throws an exception at numarray 1.1. If I try This works now in CVS and will be part of numarray-1.2.
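As an aside on the Int8 overflow mentioned in the question, the following plain-Python sketch shows why summing 1024 true values in an 8-bit accumulator can come out as zero. It does not use numarray: `wrap_int8`, `sum_int8` and `count_true` are hypothetical helpers, the first two only mimic signed 8-bit wraparound, and the last accumulates in an ordinary Python integer, which is roughly the effect of asking for a wider type such as Int32.

```python
def wrap_int8(value):
    """Wrap an integer into the signed 8-bit range [-128, 127]."""
    return (value + 128) % 256 - 128


def sum_int8(flags):
    """Sum booleans with an accumulator that wraps like a signed 8-bit int."""
    total = 0
    for flag in flags:
        total = wrap_int8(total + int(flag))
    return total


def count_true(flags):
    """Count true elements using an unbounded Python integer."""
    return sum(1 for flag in flags if flag)


values = range(1024)
mask = [v <= 1024 for v in values]   # every element is True

print(sum_int8(mask))     # 0, because 1024 is a multiple of 256
print(count_true(mask))   # 1024
```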
Another more tedious approach which works for numarray-1.1 is: (A <= 1).astype('Int32').sum() > sum(A<=value) > > I have to nest multiple sums if A has rank greater than 1, plus the sum > overflows if A is large, apparently because boolean gets treated as > Int8. (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)). You get > zero.) The following works: > > array(A<=1024,type=Int32).sum() > > but is awkward. Am I missing an obvious better alternative? If not, > I'm going to file an RFE :-) . I don't think there's any need for an RFE, provided you're satisfied with (A<=1).sum(). Regards, Todd From rkern at ucsd.edu Thu Oct 21 12:22:20 2004 From: rkern at ucsd.edu (Robert Kern) Date: Thu Oct 21 12:22:20 2004 Subject: [Numpy-discussion] argmin and unsigned types Message-ID: <41780BE5.4070009@ucsd.edu> argmin locates the minimum by finding the maximum of the negative of the input. Unfortunately, for unsigned arrays, the negative has nothing to do with the actual numerical negative. Example: >>> from numarray import * >>> a = arange(10).astype(UInt8) >>> print a [0 1 2 3 4 5 6 7 8 9] >>> print -a [ 0 255 254 253 252 251 250 249 248 247] >>> argmin(a) 1 We need a separate argmin to handle these arrays properly. -- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From jmiller at stsci.edu Thu Oct 21 15:04:04 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Oct 21 15:04:04 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098383430.3644.4.camel@halloween.stsci.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <1098383430.3644.4.camel@halloween.stsci.edu> Message-ID: <1098396116.3644.129.camel@halloween.stsci.edu> On Thu, 2004-10-21 at 14:30, Todd Miller wrote: > On Thu, 2004-10-21 at 13:55, Stephen Walton wrote: > > Is there some simple way of counting the number of array elements which > > satisfy a certain condition? 
It is easy to do > > > > A[A<=1].sum() > > > > to sum all the values of A which are less than 1, but there doesn't seem > > to be a count() method. I tried > > > > (A<=1).sum() > > > > but this throws an exception at numarray 1.1. If I try > > This works now in CVS and will be part of numarray-1.2. Stephen tried this and it turns out my earlier statement was untrue, (A<=1).sum() doesn't do anything reasonable, even in CVS. The problem is that sum() is written (without direct C support) to conserve storage. As a result, it doesn't do implicit > Another more > tedious approach which works for numarray-1.1 is: > > (A <= 1).astype('Int32').sum() > There's also a prettier approach that works for 1.1 that I forgot about: (A <= 1).sum('Int32') > > sum(A<=value) > > > > I have to nest multiple sums if A has rank greater than 1, plus the sum > > overflows if A is large, apparently because boolean gets treated as > > Int8. (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)). You get > > zero.) The following works: > > > > array(A<=1024,type=Int32).sum() > > > > but is awkward. Am I missing an obvious better alternative? If not, > > I'm going to file an RFE :-) . > > I don't think there's any need for an RFE, provided you're satisfied > with (A<=1).sum(). > > Regards, > Todd > > > > ------------------------------------------------------- > This SF.net email is sponsored by: IT Product Guide on ITManagersJournal > Use IT products in your business? Tell us what you think of them. Give us > Your Opinions, Get Free ThinkGeek Gift Certificates! 
Click to find out more > http://productguide.itmanagersjournal.com/guidepromo.tmpl > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion --
From jmiller at stsci.edu Thu Oct 21 16:41:29 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Oct 21 16:41:29 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098396569.28351.0.camel@halloween.stsci.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <1098383430.3644.4.camel@halloween.stsci.edu> <1098396116.3644.129.camel@halloween.stsci.edu> <1098396569.28351.0.camel@halloween.stsci.edu> Message-ID: <1098401959.3744.34.camel@localhost.localdomain> On Thu, 2004-10-21 at 18:09, Todd Miller wrote: > On Thu, 2004-10-21 at 18:01, Todd Miller wrote: > > On Thu, 2004-10-21 at 14:30, Todd Miller wrote: > > > On Thu, 2004-10-21 at 13:55, Stephen Walton wrote: > > > > Is there some simple way of counting the number of array elements which > > > > satisfy a certain condition? It is easy to do > > > > > > > > A[A<=1].sum() > > > > > > > > to sum all the values of A which are less than 1, but there doesn't seem > > > > to be a count() method. I tried > > > > > > > > (A<=1).sum() > > > > > > > > but this throws an exception at numarray 1.1. If I try > > > > > > This works now in CVS and will be part of numarray-1.2. > > > > Stephen tried this and it turns out my earlier statement was untrue, > > (A<=1).sum() doesn't do anything reasonable, even in CVS. The problem > > is that sum() is written (without direct C support) to conserve > > storage. As a result, it doesn't do implicit up-casting. I'm pretty sure this was a conscious and discussed choice (this is actually the 2nd time sum() has been wrong). IMHO, the typing for sum() should be revised because it is too dangerous the way it is now.
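To make the overflow concrete, here is a minimal pure-Python sketch (not numarray code) of what happens when a count is accumulated in an 8-bit signed integer, which is what the thread suggests happens for Bool arrays. The helper names `wrap_int8` and `sum_int8` are invented for illustration, and the wrap-around model is an assumption about the accumulator, not something taken from the numarray source.

```python
def wrap_int8(value):
    """Wrap an integer into the signed 8-bit range [-128, 127]."""
    return (value + 128) % 256 - 128


def sum_int8(values):
    """Sum values while wrapping the running total to 8 bits after each step."""
    total = 0
    for v in values:
        total = wrap_int8(total + v)
    return total


# 32x32 grid of ones stands in for (A <= 1024) on arange(1024, shape=(32, 32)).
flags = [[1] * 32 for _ in range(32)]

# First reduction: each column holds 32 ones, which still fits in 8 bits.
column_sums = [sum_int8(col) for col in zip(*flags)]

# Second reduction: 32 * 32 = 1024 is a multiple of 256, so the total wraps.
narrow_total = sum_int8(column_sums)

# Accumulating in an unbounded (wide) integer gives the true count.
wide_total = sum(sum(row) for row in flags)

print(column_sums[0], narrow_total, wide_total)
```

Under these assumptions the nested sum reproduces the zero reported earlier in the thread, while a wider accumulator, which is what (A <= 1).sum('Int32') asks for, returns the real count.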
Regards, Todd From nwagner at mecha.uni-stuttgart.de Fri Oct 22 02:17:16 2004 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Fri Oct 22 02:17:16 2004 Subject: [Numpy-discussion] Problems with complex matrices Message-ID: <4178CEFF.2050608@mecha.uni-stuttgart.de> Hi all, Another bug is revealed Traceback (most recent call last): File "complex_it.py", line 6, in ? res=dot(A,x)-r File "/usr/lib/python2.3/site-packages/Numeric/dotblas/__init__.py", line 55, in dot if multiarray.array(a).shape == () or multiarray.array(b).shape == (): TypeError: a float is required Nils -------------- next part -------------- A non-text attachment was scrubbed... Name: complex_it.py Type: text/x-python Size: 139 bytes Desc: not available URL: From Sebastien.deMentendeHorne at electrabel.com Fri Oct 22 02:44:46 2004 From: Sebastien.deMentendeHorne at electrabel.com (Sebastien.deMentendeHorne at electrabel.com) Date: Fri Oct 22 02:44:46 2004 Subject: [Numpy-discussion] Problems with complex matrices Message-ID: <035965348644D511A38C00508BF7EAEB145CB168@seacex03.eib.electrabel.be> gmres returns a tuple so you should have used res = dot(A, x[0]) - r seb > -----Original Message----- > From: Nils Wagner [mailto:nwagner at mecha.uni-stuttgart.de] > Sent: vendredi 22 octobre 2004 11:13 > To: SciPy Users List; numpy-discussion at lists.sourceforge.net > Subject: [Numpy-discussion] Problems with complex matrices > > > Hi all, > > Another bug is revealed > > Traceback (most recent call last): > File "complex_it.py", line 6, in ? 
> res=dot(A,x)-r > File > "/usr/lib/python2.3/site-packages/Numeric/dotblas/__init__.py", > line 55, in dot > if multiarray.array(a).shape == () or > multiarray.array(b).shape == (): > TypeError: a float is required > > Nils > > From Chris.Barker at noaa.gov Fri Oct 22 11:07:32 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Fri Oct 22 11:07:32 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098392607.8249.20.camel@freyer.sfo.csun.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> Message-ID: <41794B47.4090909@noaa.gov> Stephen Walton wrote: > There is a difference between the sum() Ufunc and the sum() method which > is not mentioned in the documentation: the function works along an > axis, while the method works on the whole array. That is, A.sum() and > A.flat.sum() are equivalent regardless of the rank of A. Bummer. I was hoping this was a move to a more object-oriented style, rather than different functionality. Also, it's pretty confusing terminology, particularly if it's not documented! Why not .SumAll() or something? -Chris -- Christopher Barker, Ph.D. 
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rowen at u.washington.edu Fri Oct 22 11:20:36 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Fri Oct 22 11:20:36 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <41794B47.4090909@noaa.gov> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> Message-ID: At 11:02 AM -0700 2004-10-22, Chris Barker wrote: >Stephen Walton wrote: > >> There is a difference between the sum() Ufunc and the sum() method which >> is not mentioned in the documentation: the function works along an >> axis, while the method works on the whole array. That is, A.sum() and >> A.flat.sum() are equivalent regardless of the rank of A. > > >Bummer. I was hoping this was a move to a more object-oriented >style, rather than different functionality. Also, it's pretty >confusing terminology, particularly if it's not documented! Why not >.SumAll() or something? I agree. Numarray is already confusing enough without identically named functions and methods that do different things. (nElements and size are another pet peeve, with size used in several places and nElements appearing exactly once. Though I am grateful to whoever added size as a workalike for nElements; formerly you had to know what kind of array you had before you knew how to find out how many elements it had.) 
-- Russell

From strawman at astraw.com Fri Oct 22 11:25:58 2004
From: strawman at astraw.com (Andrew Straw)
Date: Fri Oct 22 11:25:58 2004
Subject: [Numpy-discussion] floating point exception weirdness
In-Reply-To: <411A08FA.7000601@astraw.com>
References: <4119BBFC.6020304@astraw.com> <1092221365.3752.32.camel@localhost.localdomain> <411A08FA.7000601@astraw.com>
Message-ID: <41795006.1040807@astraw.com>

I've isolated a bug I first reported on this mailing list in August. I've now confined it to a small code snippet using entirely open-source software (previously I saw it while using Intel's IPP). In a nutshell, importing numarray.ieeespecial triggers a floating point exception (which kills my program) when I call Numeric's singular_value_decomposition() function:

import Numeric
from LinearAlgebra import singular_value_decomposition
if want_FPE:
    import numarray.ieeespecial
A = [[-5.7, 2.2, -0.53, 46.0],
     [-2.3, -5.5, -1.0, 1091.0],
     [5.9, 1.4, -0.1, -142.0],
     [-1.3, 5.7, -1.5, 2673.0]]
A = Numeric.array(A)
u,s,v = singular_value_decomposition(A) # FPE triggered here

Here's my setup:

$ python
Python 2.3.4 (#2, Sep 24 2004, 08:39:09)
[GCC 3.3.4 (Debian 1:3.3.4-12)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import Numeric >>> Numeric.__version__ '23.6' >>> import numarray >>> numarray.__version__ '1.2a' $ gcc -v Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.4/specs Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux Thread model: posix gcc version 3.3.4 (Debian 1:3.3.4-13) Now, for the clue: the above error is ONLY triggered when I compile Numeric to use system blas and friends, not when I use lapack_lite included with Numeric. This leads me to suspect it is related to the SSE2 unit -- I have Debian sarge's atlas3-base, atlas3-see, atlas3-sse2, blas, lapack, lapack3, and refblas3 packages installed on my P4 machine. So, to propose a hypothesis: numarray.ieeespecial sets the FPE bit in the SSE2 hardware, but for some reason this does not raise SIGFPE. However, when the next call that touches SSE2 happens, the kernel sees that error bit and throws the signal. Does this explanation make sense? Is it easy to fix? Cheers! 
Andrew From jmiller at stsci.edu Fri Oct 22 14:19:17 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Oct 22 14:19:17 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> Message-ID: <1098479844.29804.260.camel@halloween.stsci.edu> On Fri, 2004-10-22 at 14:17, Russell E Owen wrote: > At 11:02 AM -0700 2004-10-22, Chris Barker wrote: > >Stephen Walton wrote: > > > >> There is a difference between the sum() Ufunc and the sum() method which > >> is not mentioned in the documentation: the function works along an > >> axis, while the method works on the whole array. That is, A.sum() and > >> A.flat.sum() are equivalent regardless of the rank of A. > > > > > >Bummer. I was hoping this was a move to a more object-oriented > >style, rather than different functionality. Also, it's pretty > >confusing terminology, particularly if it's not documented! Why not > >.SumAll() or something? sumAll() would certainly be better. Unless there are objections, I'll rename the current sum() method to sumAll() and re-write sum() to give a deprecation warning before calling sumAll(). Eventually, it'll go away altogether. I reviewed the discussion of the sum() result type from a year ago: "[Numpy-discussion] sum and mean methods behaviour". We discussed sum() in depth and AFIK I implemented the recommendations. The results need to be documented. By default, sum() now uses the maximum type of the type family of the array, so families Bool, Integer, UnsignedInteger, Float, or Complex result in max types Bool, Int64, UInt64, Float64, Complex64. I'm not sure why we segregated Bool and it looks like a mistake to me now. I'm thinking the Bool "family" should just go away and be re-classified as UnsignedInteger. 
These ideas are captured by the numerictypes.MaximumType() function which is also potentially useful for any reduction. > I agree. Numarray is already confusing enough without identically > named functions and methods that do different things. True enough. This'll be fixed. > (nElements and > size are another pet peeve, with size used in several places and > nElements appearing exactly once. Though I am grateful to whoever > added size as a workalike for nElements; formerly you had to know > what kind of array you had before you knew how to find out how many > elements it had.) I'm not sure what you mean here. When I grepped, I got 52 hits for nelements() in the numarray source, let alone what users have done with it. Right now, IMHO, it's not clearly broken and there are bigger fish to fry. Regards, Todd From stephen.walton at csun.edu Fri Oct 22 14:37:05 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Fri Oct 22 14:37:05 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> Message-ID: <1098480955.11372.19.camel@freyer.sfo.csun.edu> On Fri, 2004-10-22 at 11:17, Russell E Owen wrote about the sum() Ufunc vs. the sum() method: > Numarray is already confusing enough without identically > named functions and methods that do different things When I went through the Numarray docs and made suggestions for improvements (see the list I posted at Sourceforge), I didn't make any comments about functional changes, only what the documentation said. Since the sum() method is documented using 1-D arrays, you can't tell that it in fact behaves differently than the sum() Ufunc. On reflection, I also agree that the Ufuncs and methods should behave the same way. Why do you say 'numarray is confusing'? What in the docs would help un-confuse it, in your view? -- Stephen Walton Dept. 
of Physics & Astronomy, Cal State Northridge -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From rowen at u.washington.edu Fri Oct 22 14:48:03 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Fri Oct 22 14:48:03 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> Message-ID: At 5:17 PM -0400 2004-10-22, Todd Miller wrote: >On Fri, 2004-10-22 at 14:17, Russell E Owen wrote: >> I agree. Numarray is already confusing enough without identically >> named functions and methods that do different things. > >True enough. This'll be fixed. Great! >> (nElements and >> size are another pet peeve, with size used in several places and >> nElements appearing exactly once. Though I am grateful to whoever >> added size as a workalike for nElements; formerly you had to know >> what kind of array you had before you knew how to find out how many >> elements it had.) > >I'm not sure what you mean here. When I grepped, I got 52 hits for >nelements() in the numarray source, let alone what users have done with >it. Right now, IMHO, it's not clearly broken and there are bigger fish >to fry. Since you ask... I'm counting the number of implementations in the public interface of the numarray package. There are four implementations of size (including the numarray array method, which is simply a synonym for nelements), but only one implementation of nelements. When I started using numarray, the following was true: * numarray had a function named size. 
* numarray.ma had the same function * numarray.ma arrays had method size * All of these worked the same way: size(array, axis=None) size returns the number of elements in an array or along the specified axis. BUT numarray arrays had no method size. Instead there was a method nelements, which did the same thing as size, but had no "axis" argument. This was very confusing, and I got tripped up badly because I was trying to count array elements and was using both "normal" numarray arrays and masked arrays. I filed PR 934514 and some kind soul patched the problem by making size a synonym for nelements. There is a bit of residual mess because the new size does not have the axis argument. And then there's the historical clutter of two ways to do the same thing, but presumably one just lives with that. Though it seems a bit strange to me not to deprecate nelements and stop using it internally. -- Russell From Fernando.Perez at colorado.edu Fri Oct 22 14:50:04 2004 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Fri Oct 22 14:50:04 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> Message-ID: <41797FFE.8090802@colorado.edu> Todd Miller wrote: > sumAll() would certainly be better. > > Unless there are objections, I'll rename the current sum() method to > sumAll() and re-write sum() to give a deprecation warning before calling > sumAll(). Eventually, it'll go away altogether. silly, minor nit: can we avoid mixed case names? Either sum_all or SumAll? I'm not too fond of CamelCase, but camelCase looks even worse to me :) As I said, it's just a minor nit. I don't know if there's an official naming policy for numarray, so please don't get angry at me if my comment is out of place. 
Best, f From Chris.Barker at noaa.gov Fri Oct 22 15:12:01 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Fri Oct 22 15:12:01 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> Message-ID: <4179853F.8040800@noaa.gov> Todd Miller wrote: > By default, sum() now uses the maximum type of the type family of the > array, so families Bool, Integer, UnsignedInteger, Float, or Complex > result in max types Bool, Int64, UInt64, Float64, Complex64. I'm not > sure why we segregated Bool and it looks like a mistake to me now. I'm > thinking the Bool "family" should just go away and be re-classified as > UnsignedInteger. Well, I think that the idea of a bool being different than an int is often useful. In this case, we want Bool to behave like an integer, so that we can use some version of sum() to add up all the true values. This is handy, but maybe we need more complete support for boolean arrays, rather than getting rid of them. For instance, there could be a NumTrue() function or method, for this case. I would probably maintain the easy conversion of a Bool array to an Int array, for when you really do need to do math with them. We'd want a compete set, many of which already exist. A few off the top of my head: sometrue alltrue numtrue Maybe mirrors for false: somefalse allfalse numfalse What else would be needed? My vote would be for all of these to be methods of a Bool array, but I'm partial to methods over functions anyway. On the other hand, Python itself is sub classing Bool from integer, so maybe there's little point. -Chris -- Christopher Barker, Ph.D. 
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From aisaac at american.edu Fri Oct 22 15:14:07 2004 From: aisaac at american.edu (Alan G Isaac) Date: Fri Oct 22 15:14:07 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu><417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu><41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> Message-ID: On 22 Oct 2004, Todd Miller apparently wrote: > sumAll() would certainly be better. > Unless there are objections, I'll rename the current sum() method to > sumAll() and re-write sum() to give a deprecation warning before calling > sumAll(). Eventually, it'll go away altogether. Just two thoughts from a new user. i. I agree that .sumAll is better than the current name confusion. ii. even better, I propose, would be for .sum to take an axis argument, with default matching the sum function, and possible value axis="all". For the transition, the axis argument can be required. fwiw, Alan Isaac From rowen at u.washington.edu Fri Oct 22 15:19:02 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Fri Oct 22 15:19:02 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098480955.11372.19.camel@freyer.sfo.csun.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098480955.11372.19.camel@freyer.sfo.csun.edu> Message-ID: At 2:35 PM -0700 2004-10-22, Stephen Walton wrote: >On Fri, 2004-10-22 at 11:17, Russell E Owen wrote about the sum() Ufunc >vs. 
the sum() method:
>
>> Numarray is already confusing enough without identically
>> named functions and methods that do different things
>
>When I went through the Numarray docs and made suggestions for
>improvements (see the list I posted at Sourceforge), I didn't make any
>comments about functional changes, only what the documentation said.
>Since the sum() method is documented using 1-D arrays, you can't tell
>that it in fact behaves differently than the sum() Ufunc. On
>reflection, I also agree that the Ufuncs and methods should behave the
>same way.
>
>Why do you say 'numarray is confusing'? What in the docs would help
>un-confuse it, in your view?

OK, since I seem to be in a grumpy mood today, here are some examples (probably nothing new here):

- I'll expose my ignorance, but I find the take stuff and fancy indexing nearly incomprehensible. I've tried to follow the examples (several times--i.e. every time I need to do something fancy), but generally I either flail around until I find something that works, or give up and write a C extension.

- I'd like to write C/C++ code that would work on multiple array types. This seems a natural use of C++ templates, but that doesn't seem to be "how it's done". I hate to think how the internal code is managing this without being a horrible spaghetti of code repeated for each array type. The nd_image package is the closest I've come to finding source code that makes any sense to me in this area. But it uses so many custom-defined specialized functions that I figured it was just too much work to figure out w/out a manual (and risky to rely on these functions since they are internal to the package). So I gave up and just support the one data type I really need now. Very disappointing.

- Important functions are sometimes buried in a non-obvious (to me) sub-package. For example: try to find the location at which an array has a minimum value (if there's more than one such point, pick any).
You'd think it'd be a standard numarray function, wouldn't you? After all, you can ask for the minimum value. Now try to find it. Well, I started out by trying to figure out how to get argmin to do the job. Horrible. Fortunately I finally found minimum_position buried in nd_image. - Masked arrays are not integrated. Thus a lot of important filtering and stuff simply cannot be done on masked data without writing custom extensions. For instance I'd like to do a median-filter that ignores masked data (taking the median of non-masked data only). - For 2-d images x and y are reversed. I know this isn't going to change, but it is a headache every time I have to write new image processing code. - I keep wanting more support for dealing with arrays of indices, e.g. "give me all the indices for which this is true", then use that to process the data in an array. Numarray seems to do that kind of operation in an entirely different way, suggesting I'm not "with it" on the underlying philosophy. Unfortunately no really good examples come to mind at the moment (it's been awhile since I've created new code using numarray), though I was fairly well convinced that if I had enough support for this I could code an efficient radial profile function w/out using a C extension. -- Russell From perry at stsci.edu Fri Oct 22 16:50:01 2004 From: perry at stsci.edu (Perry Greenfield) Date: Fri Oct 22 16:50:01 2004 Subject: [Numpy-discussion] In case there are any questions about numarray... In-Reply-To: Message-ID: Todd and I will be away most of next week at a conference and will likely not have a chance to respond to questions about numarray or continue the current discussions about the proper numarray interface or improvements to the documentation. 
Perry From aisaac at american.edu Fri Oct 22 19:17:02 2004 From: aisaac at american.edu (Alan G Isaac) Date: Fri Oct 22 19:17:02 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <4179853F.8040800@noaa.gov> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu><4179853F.8040800@noaa.gov> Message-ID: More new user feedback ... On Fri, 22 Oct 2004, Chris Barker apparently wrote: > Well, I think that the idea of a bool being different than > an int is often useful. Yes. E.g., applications to directed graphs. > we can use some version of sum() to add up all the > true values. Unclear, but given the existence of sometrue, it seems natural enough to let sum treat a Bool as an integer. Products work naturally, of course. > I would probably maintain > the easy conversion of a Bool array to an Int array, for when you really > do need to do math with them. I would rephrase this. Boolean arrays have a naturally different math, which it would be nice to have supported. It would also be nice to easily convert to Int, when that representation captures the math needed. > We'd want a compete set, many of which already exist. A few off the top > of my head: > sometrue > alltrue > numtrue I'd just let sum handle numtrue. > Maybe mirrors for false: > somefalse, allfalse, numfalse I'd just rely on alltrue, sometrue, and (size less sum) for these. 
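The idea that the "false" mirrors can be derived from the existing "true" reductions can be sketched in plain Python. This is an illustration only: a list of bools stands in for a Bool array, the built-ins `any`, `all`, `sum`, and `len` stand in for sometrue, alltrue, sum, and size, and the variable names are invented here.

```python
# A plain list of bools stands in for a Bool array.
mask = [True, False, True, True, False]

num_true = sum(mask)               # True counts as 1, so this plays "numtrue"
num_false = len(mask) - num_true   # "size less sum" plays "numfalse"
some_true = any(mask)              # at least one element is true
all_true = all(mask)               # every element is true
some_false = not all_true          # "somefalse" is the negation of alltrue
all_false = not some_true          # "allfalse" is the negation of sometrue

print(num_true, num_false, some_true, all_true, some_false, all_false)
```

No separate family of "false" reductions is needed in this sketch; each one is a one-line expression over the "true" versions.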
fwiw, Alan From stephen.walton at csun.edu Fri Oct 22 22:23:02 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Fri Oct 22 22:23:02 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098480955.11372.19.camel@freyer.sfo.csun.edu> Message-ID: <1098508579.3403.6.camel@localhost.localdomain> I had no idea my innocent question would generate so much discussion. Mindful that Perry and Todd are at ADASS in Pasadena next week: On Fri, 2004-10-22 at 15:18 -0700, Russell E Owen wrote: > At 2:35 PM -0700 2004-10-22, Stephen Walton wrote: > > > >Why do you say 'numarray is confusing'? What in the docs would help > >un-confuse it, in your view? > > - I'll expose my ignorance, but I find the take stuff and fancy > indexing nearly incomprehensible. I agree. It took me much experimentation to figure out exactly how it worked. I'd appreciate it very much if you would look at my suggested rewrite of this section of the documentation at http://sourceforge.net/tracker/index.php?func=detail&aid=1047889&group_id=1369&atid=101369 and give me any further thoughts for clarification (post them as comments to the bug report itself). > - I'd like to write C/C++ code that would work on multiple array > types. I can't help much here, other than to say that C and C++ are pretty low level languages, not well suited for this level of abstraction. > - Important functions are sometimes buried in a non-obvious (to me) > sub-package. > For example: try to find that location at which an array has a > minimum value The current index to the documentation seems to include only the function names but not concepts, which is a problem. 
I myself was trying to remember how to do type conversion; there is no entry in the index for 'conversion' or 'coercion' and I finally grepped my local copy of the HTML files to re-find astype(). > - Masked arrays are not integrated. I haven't tried these yet personally, but I agree that such a feature is a very important one. IRAF got partway along on this but didn't finish it either. Having said that, my workaround/technique for both MATLAB and numarray is to simply put NaNs in the places where there is no valid data and do something like sum(sum(A(~isnan(A)))) This is MATLAB syntax of course. Something similar in numarray would go a long way to helping me. For example, I have full disk solar images and I'd like to be able to operate on just the sunspot pixels, or just the sky pixels, in a straightforward way. > - For 2-d images x and y are reversed. Are you referring to the fact that C and numarray are row major and Fortran is column major? Or to how images get displayed in the various plot packages? > - I keep wanting more support for dealing with arrays of indices, > e.g. "give me all the indices for which this is true", then use that > to process the data in an array. Numarray seems to do that kind of > operation in an entirely different way, suggesting I'm not "with it" > on the underlying philosophy. There are two ways to do this, both of which work. For example: A=arange(25) sum(A[A<=7]) will work just as you expect. A bool array used as an index picks out those values for which the bool is True. Essentially identical syntax now works in MATLAB too. If you want an index array instead: >>> index=where(A<7) >>> A[index] will do the trick.
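The two selection styles just described (a bool array used directly as an index, and an index array obtained from where) can be mimicked in plain Python to show that they pick out the same elements. This is a sketch only: lists stand in for numarray arrays, and the variable names are made up for illustration.

```python
# Plain-Python stand-in for A = arange(25); not numarray itself.
A = list(range(25))

# Style 1: a boolean mask, one entry per element of A.
mask = [a < 7 for a in A]
picked_by_mask = [a for a, keep in zip(A, mask) if keep]

# Style 2: an index list, the positions where the mask is True
# (the role played by where(A<7) in the message above).
index = [i for i, keep in enumerate(mask) if keep]
picked_by_index = [A[i] for i in index]

# Both styles select the same values, so reductions agree too.
total_small = sum(picked_by_mask)
```

The mask is as long as the data, while the index list is only as long as the selection, which matters when few elements qualify.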
For arrays of rank greater than 1: >>> A=arange(25,shape=(5,5)) >>> where(A<7) (array([0, 0, 0, 0, 0, 1, 1]), array([0, 1, 2, 3, 4, 0, 1])) which is a tuple of two arrays that can be used to index A: >>> ind1,ind2=where(A<7) >>> A[ind1,ind2] array([0, 1, 2, 3, 4, 5, 6]) >>> A[ind1,ind2]=[6,5,4,3,2,1,0] # assignment works too >>> A array([[ 6, 5, 4, 3, 2], [ 1, 0, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]]) Does this help? -- Stephen Walton Physics & Astronomy CSUN From verveer at embl-heidelberg.de Sat Oct 23 04:14:04 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Sat Oct 23 04:14:04 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098480955.11372.19.camel@freyer.sfo.csun.edu> Message-ID: <9633F2FA-24E4-11D9-B9D4-000D932805AC@embl-heidelberg.de> I thought I just give my point of view on this, since I do believe we should give these some thought. On Oct 23, 2004, at 12:18 AM, Russell E Owen wrote: > OK, since I seem to be in a grumpy mood today, here are some examples > (probably nothing new here): > - I'll expose my ignorance, but I find the take stuff and fancy > indexing nearly incomprehensible. I've tried to follow the examples > (several times--i.e. every time I need to do something fancy), but > generally I either flail around until I find something that works, or > give up and write a C extension. I agree, it is very complicated, I always have trouble getting understanding what is going on when I use take and indexing. More documentation may help. > - I'd like to write C/C++ code that would work on multiple array > types. This seems a natural use of C++ templates, but that doesn't > seem to be "how it's done". 
I hate to think how the internal code is > managing this without being a horrible sphaghetti of code repeated for > each array type. This is a good point. If you look at examples for implementing something in C, you always see that the code only handles a single data type, usually converting all input to double type. That is not always a good way to write an extension if you want it to be of generic use (e.g. the FFT module does not handle 32 bits floating point well, which is a problem for big arrays). Some support in writing functions that handle multiple data types would be good. > The nd_image package is the closest I've come to finding source code > that makes any sense to me in this areay. But it uses so many > custom-defined specialized functions that I figured it was just too > much work to figure out w/out a manual (and risky to rely on these > functions since they are internal to the package). The internal nd_image C functions are indeed not exported and should not be used to implement extensions. That is going to stay that way since I do not plan to document these, and in any case, exposing such functions is not the purpose of the module. On the other hand, some of the techniques use may be generally useful. I could try to factor some of the functions and macros out and write something up on the use of these to write extensions that handle multiple data types. > So I gave up and just support the one data type I really need now. > Very disappointing. Yes, it should be easier to do this, I agree. Using C macros as a 'poor man' templating system is in fact not too complicated (although pretty ugly). Another approach that I have tried to use in nd_image is to provide generic functions that take a python or a C function to implement functionality. For instance to implement an arbitrary filter function in nd_image you only need to implement a function that calculates the filter at one point. 
You then call a generic filter function that does the heavy lifting of dealing with multiple array types, iterating over the array, dealing with borders and such, applying the function at each array element. The filter function can be in python, but can also be a C function, communicated by a CObject. Maybe some of these type functions could be provided with the numarray package. This could simplify writing extensions a lot. Would there be interest for a package of such functions? If there is I could think about it a bit more, and propose (and implement) something in the form of an extension. > - Important functions are sometimes buried in a non-obvious (to me) > sub-package. > > For example: try to find that location at which an array has a minimum > value (if there's more than one such point, pick any). You'd think > it'd be a standard numarray function, wouldn't you? After all, you can > ask for the minimum value. Now try to find it. Agreed, this bothered me too. > Well, I started out by trying to figure out how to get argmin to do > the job. Horrible. > > Fortunately I finally found minimum_position buried in nd_image. It is there because numarray did not provide it... But it is also there because it offers much functionality that would not be appropriate for the main package. It is part of the object measurement functions. A simpler, possibly more efficient routine should maybe be part of the main package. > - Masked arrays are not integrated. Thus a lot of important filtering > and stuff simply cannot be done on masked data without writing custom > extensions. For instance I'd like to do a median-filter that ignores > masked data (taking the median of non-masked data only). I agree very much! To be honest, I do not like the ma package much. I don't like the idea of having to use a separate package with a different array type that duplicates the functionality in the main package. 
I think it would be much better if all functions (where it makes sense) in numarray would accept an optional mask argument. To me it makes more sense to provide the mask with the operation, not as part of the array like in ma (a package like ma could still be layered on top.) I realize it would be a lot of work to make all numarray functions mask aware, but it is something to think about maybe. > - For 2-d images x and y are reversed. I know this isn't going to > change, but it is a headache every time I have to write new image > processing code. This is not really a problem I think, but you have to get used to it. If you treat the last dimension always as X and the first as Y, you have the same layout in memory as is usual in most image processing software. So X corresponds to axis=1 and Y to axis=0. Or use axis=-1 and axis=-2. Cheers, Peter From aisaac at american.edu Sat Oct 23 12:01:04 2004 From: aisaac at american.edu (Alan G Isaac) Date: Sat Oct 23 12:01:04 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: References: <1098381332.8249.12.camel@freyer.sfo.csun.edu><417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu><41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> Message-ID: On Fri, 22 Oct 2004 Alan G Isaac apparently wrote: > Just two thoughts from a new user. > i. I agree that .sumAll is better than the current name > confusion. > ii. even better, I propose, would be for .sum to take > an axis argument, with default matching the sum function, > and possible value axis="all". > For the transition, the axis argument can be required. 
That should have been: axis=None fwiw, Alan Isaac From stephen.walton at csun.edu Sun Oct 24 19:22:03 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Sun Oct 24 19:22:03 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <41797FFE.8090802@colorado.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> Message-ID: <1098670236.1907.21.camel@localhost.localdomain> On Fri, 2004-10-22 at 14:47, Fernando Perez wrote: > silly, minor nit: can we avoid mixed case names? Either sum_all or SumAll? I'm > not too fond of CamelCase, but camelCase looks even worse to me :) I agree with Fernando about CamelCase (which among other things seriously bites one when moving from case-sensitive to case-insensitive OS's). But I want to make a broader point: I don't think we need sumall. The methods and the functions should simply work the same way. If one wants sumall, use A.flat.sum() or, if you can't use the methods or attributes on your old version of Python, sum(ravel(A)). If you start writing sumall, then you'll need meanall, stdall, prodall, etc, etc. The flat attribute and ravel function/method already provide all the needed functionality. Just trying to save Todd some work. Steve
From verveer at embl-heidelberg.de Mon Oct 25 01:37:05 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Mon Oct 25 01:37:05 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098670236.1907.21.camel@localhost.localdomain> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> Message-ID: <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> On 25 Oct 2004, at 04:17, Stephen Walton wrote: > On Fri, 2004-10-22 at 14:47, Fernando Perez wrote: > >> silly, minor nit: can we avoid mixed case names? Either sum_all or >> SumAll? I'm >> not too fond of CamelCase, but camelCase looks even worse to me :) > > I agree with Fernando about CamelCase (which among other things > seriously bites one when moving from case-sensitive to case-insensitive > OS's). But I want to make a broader point: > > I don't think we need sumall. The methods and the functions should > simply work the same way. If one wants sumall, use A.flat.sum() or, if > you can't use the methods or attributes on your old version of Python, > sum(ravel(A)). If you start writing sumall, then you'll need meanall, > stdall, prodall, etc, etc. The flat attribute and ravel > function/method > already provide all the needed functionality. I think this may be inefficient, because ravel and flat may make a copy of the data. Also I think using flat/ravel in such a way is plain ugly and a complex way to do it. But I do agree that it is not a good idea to introduce another set of names. In my opinion functions that calculate a statistic like sum should return the total in the first place, rather than over a single axis.
But I guess it is too late to change that for sum, because of backward compatibility. Cheers, Peter From stephen.walton at csun.edu Mon Oct 25 09:20:02 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Mon Oct 25 09:20:02 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> Message-ID: <1098721171.19183.12.camel@sunspot.csun.edu> On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote: > On 25 Oct 2004, at 04:17, Stephen Walton wrote: > > > > I don't think we need sumall. The methods and the functions should > > simply work the same way. If one wants sumall, use A.flat.sum() or, if > > you can't use the methods or attributes on your old version of Python, > > sum(ravel(A)). > > I think this may be inefficient, because ravel and flat may make a copy > of the data. Also I think using flat/ravel in such a way is plain ugly > and a complex way to do it. You may be right about the copying, I couldn't say. I don't think sum(ravel(A)) looks any worse than sum(sum(sum(A))) for a rank 3 array, but ugly is in the eye of the beholder. > In my opinion functions that calculate a statistic like sum > should return the total in the first place, rather then over a single > axis. It depends on the data. I use rank-2 arrays which are images and are therefore homogeneous. Even there, though, I often want the sum of all rows or all columns. For heterogeneous data (e.g., columns of different Y's as a function of X), the present sum() makes sense. In other words, we will always need ways to sum over just one dimension and over all dimensions. 
By analogy with MATLAB (I'm guessing), sum() in Numeric and numarray does a one-D sum. -- Stephen Walton, Professor of Physics and Astronomy, California State University, Northridge stephen.walton at csun.edu From tim.hochberg at cox.net Mon Oct 25 09:32:01 2004 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Oct 25 09:32:01 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098721171.19183.12.camel@sunspot.csun.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> Message-ID: <417D2A3C.7010108@cox.net> Stephen Walton wrote: >On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote: > > >>On 25 Oct 2004, at 04:17, Stephen Walton wrote: >> >> >>>I don't think we need sumall. The methods and the functions should >>>simply work the same way. If one wants sumall, use A.flat.sum() or, if >>>you can't use the methods or attributes on your old version of Python, >>>sum(ravel(A)). >>> >>> >>I think this may be inefficient, because ravel and flat may make a copy >>of the data. Also I think using flat/ravel in such a way is plain ugly >>and a complex way to do it. >> >> > >You may be right about the copying, I couldn't say. I don't think >sum(ravel(A)) looks any worse than sum(sum(sum(A))) for a rank 3 array, >but ugly is in the eye of the beholder. > > I'm not sure how feasible it is, but I'd much rather an efficient, non-copying, 1-D view of an noncontiguous array (from an enhanced version of flat or ravel or whatever) than a bunch of extra methods. The former allows all of the standard methods to just work efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min, etc]. 
Making special whole array methods for everything just leads to method explosion. -tim > > >>In my opinion functions that calculate a statistic like sum >>should return the total in the first place, rather than over a single >>axis. >> >> > >It depends on the data. I use rank-2 arrays which are images and are >therefore homogeneous. Even there, though, I often want the sum of all >rows or all columns. For heterogeneous data (e.g., columns of different >Y's as a function of X), the present sum() makes sense. In other words, >we will always need ways to sum over just one dimension and over all >dimensions. By analogy with MATLAB (I'm guessing), sum() in Numeric and >numarray does a one-D sum. > > > From stephen.walton at csun.edu Mon Oct 25 09:35:06 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Mon Oct 25 09:35:06 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098721171.19183.12.camel@sunspot.csun.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> Message-ID: <1098722079.19183.22.camel@sunspot.csun.edu> On Mon, 2004-10-25 at 09:19 -0700, Stephen Walton wrote: > On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote: > > > I think this may be inefficient, because ravel and flat may make a copy > > of the data. Also I think using flat/ravel in such a way is plain ugly > > and a complex way to do it. > > You may be right about the copying, I couldn't say. I just looked at the source (numeric-1.1/Lib/generic.py). The comment to the ravel() function states that it returns a view, not a copy; but it calls reshape() which does make a copy if the input array is not contiguous.
I just tested this: A=arange(25,shape=(5,5)) A.transpose() # now A is not contiguous v=ravel(A) A[2,2]=-17 v # verifies that v did not change. So, in the above, it does look like ravel() made a copy, and your fears about inefficiency are warranted. Another test shows that changing ravel(A) to A.flat above also results in a copy. Mayhaps we need sumall() after all. -- Stephen Walton, Professor of Physics and Astronomy, California State University, Northridge stephen.walton at csun.edu From verveer at embl-heidelberg.de Mon Oct 25 09:44:04 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Mon Oct 25 09:44:04 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098721171.19183.12.camel@sunspot.csun.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> Message-ID: <0BC8D972-26A5-11D9-9F77-000A95C92C8E@embl-heidelberg.de> On 25 Oct 2004, at 18:19, Stephen Walton wrote: > On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote: >> On 25 Oct 2004, at 04:17, Stephen Walton wrote: >>> >>> I don't think we need sumall. The methods and the functions should >>> simply work the same way. If one wants sumall, use A.flat.sum() or, >>> if >>> you can't use the methods or attributes on your old version of >>> Python, >>> sum(ravel(A)). >> >> I think this may be inefficient, because ravel and flat may make a >> copy >> of the data. Also I think using flat/ravel in such a way is plain ugly >> and a complex way to do it. > > You may be right about the copying, I couldn't say. I don't think > sum(ravel(A)) looks any worse than sum(sum(sum(A))) for a rank 3 array, > but ugly is in the eye of the beholder. 
It does not look worse, I agree with that! But I would argue it should have been sum(A) in the first place to sum over all axes... The sumall would not have been needed, and summing over one axis (or a subset of axes) could have been implemented as an optional argument to sum(). > >> In my opinion functions that calculate a statistic like sum >> should return the total in the first place, rather than over a single >> axis. > > It depends on the data. I use rank-2 arrays which are images and are > therefore homogeneous. Even there, though, I often want the sum of all > rows or all columns. For heterogeneous data (e.g., columns of > different > Y's as a function of X), the present sum() makes sense. In other > words, > we will always need ways to sum over just one dimension and over all > dimensions. By analogy with MATLAB (I'm guessing), sum() in Numeric > and > numarray does a one-D sum. I agree it is a useful feature, and it should still be possible to do that using an optional axis argument. Even better, I would love to be able to sum over several axes in one go; I find the one-dimensional character of reduce limiting, but I digress. In any case, I suppose we will stick with the current behaviour for backwards compatibility.
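The behaviour argued for above (one function that totals everything by default and reduces along a single axis on request, in the spirit of the axis=None suggestion made earlier in the thread) might look like the following. This is a hypothetical plain-Python sketch for rank-2 nested lists only; the name total and its signature are assumptions for illustration, not numarray's or Numeric's sum().

```python
def total(rows, axis=None):
    """Sum a rectangular list of lists (a rank-2 stand-in for an array).

    axis=None sums every element; axis=0 sums down each column;
    axis=1 sums along each row.
    """
    if axis is None:
        return sum(sum(row) for row in rows)
    if axis == 0:
        # zip(*rows) regroups the values column by column.
        return [sum(col) for col in zip(*rows)]
    if axis == 1:
        return [sum(row) for row in rows]
    raise ValueError("axis must be None, 0 or 1 for a rank-2 input")


image = [[1, 2, 3],
         [4, 5, 6]]

grand_total = total(image)           # every element
column_sums = total(image, axis=0)   # one value per column
row_sums = total(image, axis=1)      # one value per row
```

With a default like this, no separate sumall, meanall or prodall is needed; the one-axis reductions stay available through the argument.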
Cheers, Peter From verveer at embl-heidelberg.de Mon Oct 25 09:47:01 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Mon Oct 25 09:47:01 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098722079.19183.22.camel@sunspot.csun.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <1098722079.19183.22.camel@sunspot.csun.edu> Message-ID: <60595242-26A5-11D9-9F77-000A95C92C8E@embl-heidelberg.de> On 25 Oct 2004, at 18:34, Stephen Walton wrote: > On Mon, 2004-10-25 at 09:19 -0700, Stephen Walton wrote: >> On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote: >> >>> I think this may be inefficient, because ravel and flat may make a >>> copy >>> of the data. Also I think using flat/ravel in such a way is plain >>> ugly >>> and a complex way to do it. >> >> You may be right about the copying, I couldn't say. > > I just looked at the source (numeric-1.1/Lib/generic.py). The comment > to the ravel() function states that it returns a view, not a copy; but > it calls reshape() which does make a copy if the input array is not > contiguous. I just tested this: > > A=arange(25,shape=(5,5)) > A.transpose() # now A is not contiguous > v=ravel(A) > A[2,2]=-17 > v # verifies that v did not change. > > So, in the above, it does look like ravel() made a copy, and your fears > about inefficiency are warranted. Another test shows that changing > ravel(A) to A.flat above also results in a copy. Mayhaps we need > sumall() after all. Yes, we do I guess, but I do not like such things creeping into an otherwise elegant package if I may be frank... 
Peter From strang at nmr.mgh.harvard.edu Mon Oct 25 09:53:00 2004 From: strang at nmr.mgh.harvard.edu (Gary Strangman) Date: Mon Oct 25 09:53:00 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <417D2A3C.7010108@cox.net> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> Message-ID: > I'm not sure how feasible it is, but I'd much rather an efficient, > non-copying, 1-D view of an noncontiguous array (from an enhanced version of > flat or ravel or whatever) than a bunch of extra methods. The former allows > all of the standard methods to just work efficiently using sum(ravel(A)) or > sum(A.flat) [ and max and min, etc]. Making special whole array methods for > everything just leads to method eplosion. I completely agree with this ... an efficient flat/ravel would seem to solve many of the issues being raised. Forgive the potentially naive question here, but is there any reason such an efficient, enhanced view can't be implemented for the .flat method? I like the concept of .flat, but I regularly call functions with arguments that may-or-may-not be contiguous. For robustness, such functions _must_ be coded with ravel() because .flat fails for non-contiguous arrays. I never fully understood why there were two ways of "flattening" in the first place. 
Gary -------------------------------------------------------------- Gary Strangman, PhD | Director, Neural Systems Group Office: 617-724-0662 | Massachusetts General Hospital Fax: 617-726-4078 | 149 13th Street, Ste 10018 | Charlestown, MA 02129 From verveer at embl-heidelberg.de Mon Oct 25 10:09:05 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Mon Oct 25 10:09:05 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> Message-ID: <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> On 25 Oct 2004, at 18:51, Gary Strangman wrote: > >> I'm not sure how feasible it is, but I'd much rather an efficient, >> non-copying, 1-D view of an noncontiguous array (from an enhanced >> version of flat or ravel or whatever) than a bunch of extra methods. >> The former allows all of the standard methods to just work >> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min, >> etc]. Making special whole array methods for everything just leads to >> method eplosion. > > I completely agree with this ... an efficient flat/ravel would seem to > solve many of the issues being raised. Forgive the potentially naive > question here, but is there any reason such an efficient, enhanced > view can't be implemented for the .flat method? I believe it is not possible without copying data. The strides between elements of a noncontiguous array are not always the same, so you cannot efficiently view it as a 1D array. > I like the concept of .flat, but I regularly call functions with > arguments that may-or-may-not be contiguous. 
For robustness, such > functions _must_ be coded with ravel() because .flat fails for > non-contiguous arrays. Functions should be coded in the first place to take multi-dimensional nature into account in my opinion. One of the points of numarray is that it is multi-dimensional. If a function can work over multiple dimensions, but it only works for 1D arrays, it is broken in my opinion. In my opinion sum() _is_ broken, and introducing a separate sum_all() is an ugly hack. > I never fully understood why there were two ways of "flattening" in > the first place. I suppose it is for efficiency reasons, flat may not always works, but if it does, it is efficient since it would not need to copy any data. Peter From Chris.Barker at noaa.gov Mon Oct 25 10:10:20 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Mon Oct 25 10:10:20 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098508579.3403.6.camel@localhost.localdomain> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098480955.11372.19.camel@freyer.sfo.csun.edu> <1098508579.3403.6.camel@localhost.localdomain> Message-ID: <417D3309.9070302@noaa.gov> A few comments on a number of posts in this thread: Stephen Walton wrote: >>- I'd like to write C/C++ code that would work on multiple array >>types. > > I can't help much here, other than to say that C and C++ are pretty low > level languages, not well suited for this level of abstraction. Well, this is certainly true for C, but not so much for C++. I'm not expert, but C++ templates could be very handy here. When the numarray projects was just getting started, there was some discussion about using a template-based array package as the base, perhaps Blitz++. 
I still think this was a great idea, but I think the biggest issue at the time was that templates were still not consistently well supported by the wide variety of compilers that numarray should work with. Personally I think that anything supported by gcc should be fine, as anyone can use gcc on virtually any platform, if they want. Anyway, it's too late to re-write numarray, but maybe a numarray <--> blitz++ conversion package would make it easy to write numarray extensions with blitz++. Perhaps even integrate it with Boost.Python. Another option would be to write a template-based wrapper around the existing Numarray objects. By the way, my other issue with extensions is the difficulty of writing extensions that support non-contiguous arrays, in addition to multiple data types. It seems someone smarter than me could use C++ classes to solve this one as well. Peter Verveer wrote: > But I do agree that it is not a good idea to introduce another set of > names. In my opinion functions that calculate a statistic like sum > should return the total in the first place, rather than over a single > axis. Absolutely not! I'm far more likely to want it over a single axis, it's the core of "vectorizing" your code. If the data all mean the same thing, why aren't you storing them in a 1-d array? That being said, it should be easy to do various reductions over all axes, which I think .flat() does nicely. I thought .flat() never made a copy: am I wrong? Stephen Walton wrote: > It depends on the data. I use rank-2 arrays which are images and are > therefore homogeneous. OK, good example.... I take back some of what I said above! > By analogy with MATLAB (I'm guessing), sum() in Numeric and > numarray does a one-D sum. Except MATLAB does it worse. If your 2-d array happens to have only one row, you get the sum over that..yecch!
Tim Hochberg wrote: > I'm not sure how feasible it is, but I'd much rather an efficient, > non-copying, 1-D view of an noncontiguous array (from an enhanced > version of flat or ravel or whatever) than a bunch of extra methods. The > former allows all of the standard methods to just work efficiently using > sum(ravel(A)) or sum(A.flat) [ and max and min, etc]. Making special > whole array methods for everything just leads to method eplosion. here! here! I thought that was exactly what .flat() was for. Shows what I know! -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rowen at u.washington.edu Mon Oct 25 10:33:02 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Mon Oct 25 10:33:02 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> Message-ID: At 7:08 PM +0200 2004-10-25, Peter Verveer wrote: >On 25 Oct 2004, at 18:51, Gary Strangman wrote: > >> >>> I'm not sure how feasible it is, but I'd much rather an >>>efficient, non-copying, 1-D view of an noncontiguous array (from >>>an enhanced version of flat or ravel or whatever) than a bunch of >>>extra methods. The former allows all of the standard methods to >>>just work efficiently using sum(ravel(A)) or sum(A.flat) [ and max >>>and min, etc]. 
Making special whole array methods for everything
>>>just leads to method explosion.
>>
>> I completely agree with this ... an efficient flat/ravel would
>>seem to solve many of the issues being raised. Forgive the
>>potentially naive question here, but is there any reason such an
>>efficient, enhanced view can't be implemented for the .flat method?
>
>I believe it is not possible without copying data. The strides
>between elements of a noncontiguous array are not always the same,
>so you cannot efficiently view it as a 1D array.

How about providing an iterator that counts through all the elements of an array (e.g. arr.itervalues())? So long as C extensions could efficiently make use of such an iterator, I think it'd do the job.

One could also imagine:
- arr.iteritems(), which returned (index, value) for each item
- a mask argument: a boolean array the same shape as the data array; True means elide the corresponding value from the data array
- general support for indexing

More generally, I agree that sum should work the same as a function and a method, and that an extra axis argument could be a good thing (it is so common elsewhere, e.g. size). I'd be tempted to break backwards compatibility to fix this, since numarray is still new and the current situation is very confusing.
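The iterator idea above can be sketched in plain Python. This is a toy model only: StridedView, itervalues and iteritems are hypothetical names modelled on the proposal, not an existing numarray API; a flat list stands in for the array's buffer, strides are counted in elements rather than bytes, and the mask is simplified to a dict keyed by index tuple.

```python
from itertools import product


class StridedView:
    """Toy N-D view over a flat buffer (hypothetical, for illustration only)."""

    def __init__(self, buffer, shape, strides, offset=0):
        self.buffer = buffer      # flat list standing in for raw memory
        self.shape = shape        # number of elements along each axis
        self.strides = strides    # step, in elements, along each axis
        self.offset = offset      # buffer position of element (0, ..., 0)

    def _position(self, index):
        # Buffer position of the element at an N-D index.
        return self.offset + sum(i * s for i, s in zip(index, self.strides))

    def iteritems(self, mask=None):
        """Yield (index, value) pairs in row-major order.

        mask, if given, is a dict mapping index tuples to booleans;
        True means skip (elide) that element, as in the proposal.
        """
        for index in product(*(range(n) for n in self.shape)):
            if mask is not None and mask.get(index, False):
                continue
            yield index, self.buffer[self._position(index)]

    def itervalues(self, mask=None):
        """Yield only the values, in the same order as iteritems()."""
        for _, value in self.iteritems(mask):
            yield value


# A 3x4 array stored row by row in a flat buffer ...
buf = list(range(12))
# ... and a noncontiguous view of its two middle columns (shape 3x2).
view = StridedView(buf, shape=(3, 2), strides=(4, 1), offset=1)

print(list(view.itervalues()))   # [1, 2, 5, 6, 9, 10]
print(sum(view.itervalues()))    # 33, computed without copying the buffer
```

The builtin sum, min and max all accept such an iterator directly, which is the appeal of the idea; the cost is one Python-level step per element.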
-- Russell

From strang at nmr.mgh.harvard.edu Mon Oct 25 10:38:01 2004
From: strang at nmr.mgh.harvard.edu (Gary Strangman)
Date: Mon Oct 25 10:38:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
Message-ID:

>> I completely agree with this ... an efficient flat/ravel would seem to
>> solve many of the issues being raised. Forgive the potentially naive
>> question here, but is there any reason such an efficient, enhanced view
>> can't be implemented for the .flat method?
>
> I believe it is not possible without copying data. The strides between
> elements of a noncontiguous array are not always the same, so you cannot
> efficiently view it as a 1D array.

And it gets even worse for different-stride slices of N-D arrays (though I'm not yet ready to say it's impossible to do without copying). Maybe it's just me, but it does seem somewhat non-pythonic for a function/method to break for an inefficient case, instead of dropping back to less efficient (i.e., copying) behavior.

> Functions should be coded in the first place to take multi-dimensional nature
> into account in my opinion. One of the points of numarray is that it is
> multi-dimensional. If a function can work over multiple dimensions, but it
> only works for 1D arrays, it is broken in my opinion. In my opinion sum()
> _is_ broken, and introducing a separate sum_all() is an ugly hack.

+1.
;-) Hence the thought to make flattening a single "enhanced" method/fcn ... to essentially eliminate the need for such ugly hacks.

Typically, my functions accept N-D arguments, and can operate over a user-selected subset of these dimensions. I may pass a whole array, or every other column, or whatever. Judging from the history of this thread, I think a .flat that is as-efficient-as-possible and also robust to all forms of non-contiguity would benefit many, while also reducing the learning-curve issues associated with .flat vs ravel().

As for where/when/how to introduce .newandimprovedflat, welllllll, that's for another thread. ;-)

Gary

--------------------------------------------------------------
Gary Strangman, PhD | Director, Neural Systems Group
Office: 617-724-0662 | Massachusetts General Hospital
Fax: 617-726-4078 | 149 13th Street, Ste 10018
 | Charlestown, MA 02129

From verveer at embl-heidelberg.de Mon Oct 25 10:42:03 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 10:42:03 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <417D3309.9070302@noaa.gov>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098480955.11372.19.camel@freyer.sfo.csun.edu> <1098508579.3403.6.camel@localhost.localdomain> <417D3309.9070302@noaa.gov>
Message-ID: <1A9085AC-26AD-11D9-9F77-000A95C92C8E@embl-heidelberg.de>

> Stephen Walton wrote:
>>> - I'd like to write C/C++ code that would work on multiple array
>>> types.
>> I can't help much here, other than to say that C and C++ are pretty
>> low
>> level languages, not well suited for this level of abstraction.
>
> Well, this is certainly true for C, but not so much for C++. I'm not
> expert, but C++ templates could be very handy here.
When the numarray
> project was just getting started, there was some discussion about
> using a template-based array package as the base, perhaps Blitz++. I
> still think this was a great idea, but I think the biggest issue at the
> time was that templates were still not consistently well supported by
> the wide variety of compilers that numarray should work with.
> Personally I think that anything supported by gcc should be fine, as
> anyone can use gcc on virtually any platform, if they want.

I think having the option of using C++ would be cool. But as soon as we would 'require' it, I would not develop for numarray anymore. C++ is a big pain in my opinion, although I do agree that a well written templating system like Blitz++ is nice if you actually use C++.

> Anyway, it's too late to re-write numarray, but maybe a numarray <-->
> blitz++ conversion package would make it easy to write numarray
> extensions with blitz++. Perhaps even integrate it with Boost.Python.
> Another option would be to write a template-based wrapper around the
> existing Numarray objects.

Yes, it would be nice to have the option. There is no reason why there could not be a C++ API, including the use of templates, layered on top of the current C API for those people who would like to use it.

> By the way, my other issue with extensions is the difficulty of
> writing extensions that support discontinuous arrays, in addition to
> multiple data types. It seems someone smarter than me could use C++
> classes to solve this one as well.

I had to deal with that problem too in nd_image. It is doable, albeit ugly if you depend on plain C. Probably C++ could do it differently and more nicely; Blitz++ possibly does. Again, not for me.

> Peter Verveer wrote:
>
>> But I do agree that it is not a good idea to introduce another set of
>> names. In my opinion functions that calculate a statistic like sum
>> should return the total in the first place, rather than over a single
>> axis.
>
> Absolutely not! I'm far more likely to want it over a single axis,
> it's the core of "vectorizing" your code. If the data all mean the
> same thing, why aren't you storing it in a 1-d array?

I agree that it is important, I am just saying that both are very common operations. Why not support operations over an axis by an optional argument? You will often have to specify which axis you want anyway.

> That being said, it should be easy to do various reductions over all
> axes, which I think .flat() does nicely. I thought .flat() never made
> a copy: am I wrong?

Unfortunately, flattening an array is not always possible without copying, due to the fact that arrays may not be contiguous in memory.

> Tim Hochberg wrote:
>> I'm not sure how feasible it is, but I'd much rather an efficient,
>> non-copying, 1-D view of a noncontiguous array (from an enhanced
>> version of flat or ravel or whatever) than a bunch of extra methods.
>> The former allows all of the standard methods to just work
>> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min,
>> etc]. Making special whole array methods for everything just leads to
>> method explosion.
>
> Hear! Hear! I thought that was exactly what .flat() was for. Shows
> what I know!

It is however not feasible, I think, to do it efficiently. It seems to me that a set of functions is necessary to do things like sum, minimum and so on, that work on the whole array. I would also prefer they are not methods. Introducing a whole array of sum_all() like functions is also not great.
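The point about flattening without a copy can be made concrete with a small, hedged sketch. It assumes the usual strided layout (element position = offset + sum of index times stride) with strides counted in elements rather than bytes; element_offsets and single_stride are illustrative helper names, not numarray functions. A copy-free 1-D view needs one constant step between consecutive elements, and the sketch checks whether such a step exists.

```python
from itertools import product


def element_offsets(shape, strides, offset=0):
    """Buffer offsets of all elements of a strided view, in row-major order."""
    return [offset + sum(i * s for i, s in zip(index, strides))
            for index in product(*(range(n) for n in shape))]


def single_stride(offsets):
    """Return the constant step between consecutive offsets, or None."""
    steps = {b - a for a, b in zip(offsets, offsets[1:])}
    return steps.pop() if len(steps) == 1 else None


# A contiguous 3x4 array: offsets 0..11, so a 1-D view with stride 1 exists.
full = element_offsets(shape=(3, 4), strides=(4, 1))
print(single_stride(full))        # 1

# The first two columns of that array: the step is 1 inside a row but 3
# when moving to the next row, so no single stride describes the view.
cols = element_offsets(shape=(3, 2), strides=(4, 1))
print(cols, single_stride(cols))  # [0, 1, 4, 5, 8, 9] None

# Every other column happens to be regular (step 2), so some
# noncontiguous views could still be flattened without a copy.
alt = element_offsets(shape=(3, 2), strides=(4, 2))
print(single_stride(alt))         # 2
```

So the answer depends on the particular view: only those whose offsets form an arithmetic progression can be exposed as a plain 1-D array over the same buffer.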
Cheers, Peter

From verveer at embl-heidelberg.de Mon Oct 25 11:04:01 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 11:04:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To:
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
Message-ID: <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de>

On 25 Oct 2004, at 19:32, Russell E Owen wrote:

> At 7:08 PM +0200 2004-10-25, Peter Verveer wrote:
>> On 25 Oct 2004, at 18:51, Gary Strangman wrote:
>>
>>>
>>>> I'm not sure how feasible it is, but I'd much rather an efficient,
>>>> non-copying, 1-D view of a noncontiguous array (from an enhanced
>>>> version of flat or ravel or whatever) than a bunch of extra
>>>> methods. The former allows all of the standard methods to just work
>>>> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min,
>>>> etc]. Making special whole array methods for everything just leads
>>>> to method explosion.
>>>
>>> I completely agree with this ... an efficient flat/ravel would seem
>>> to solve many of the issues being raised. Forgive the potentially
>>> naive question here, but is there any reason such an efficient,
>>> enhanced view can't be implemented for the .flat method?
>>
>> I believe it is not possible without copying data. The strides
>> between elements of a noncontiguous array are not always the same, so
>> you cannot efficiently view it as a 1D array.
>
> How about providing an iterator that counts through all the elements
> of an array (e.g. arr.itervalues()).
So long as C extensions could
> efficiently make use of such an iterator, I think it'd do the job.

It would still be slower, because you would need a function call at each element that returns a value. Not a problem if you do a lot of work at each element, but if you are just adding values you want a custom written C function. You can do it at the C level with macros or so (I do that in nd_image), but that would not help at the Python level.

> One could also imagine:
> - arr.iteritems(), which returned (index, value) for each item
> - a mask argument: a boolean array the same shape as the data array;
> True means elide the corresponding value from the data array
> - general support for indexing

Essentially you are suggesting to expose iterators at the Python level that iterate over an array in some predefined way. That is possible, but I doubt it will be efficient.

At the C level however, it might be worth thinking about as a way of easing writing functions in C. I proposed to do it the other way around in an earlier mail: providing a set of generic functions that take a Python or a C function to be applied at each element. I most likely will implement something in that direction, but I should give your idea also some thought.

> More generally, I agree that sum should work the same as a function
> and a method, and that an extra axis argument could be a good thing
> (it is so common elsewhere, e.g. size). I'd be tempted to break
> backwards compatibility to fix this, since numarray is still new and
> the current situation is very confusing.

I would absolutely vote for such a change, simply because we would like a range of such functions, e.g. minimum, maximum, and so on. Even if we have to leave sum() as it is, I think we should have the alternatives; we would just have to come up with an alternative name for sum(). In fact I would consider volunteering to implement these functions.
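A generic reduction that takes a Python function plus an optional axis argument might look roughly like the following. It is a minimal pure-Python sketch under stated assumptions: nested lists stand in for arrays, reduce_array, iter_flat and elementwise are made-up names, and the axis=None default is chosen only to keep the example short (which default is right is exactly what this thread is debating). It says nothing about how numarray would implement this in C.

```python
import operator
from functools import reduce


def iter_flat(data):
    """Yield every scalar of a nested list in row-major order."""
    for item in data:
        if isinstance(item, list):
            yield from iter_flat(item)
        else:
            yield item


def elementwise(func, a, b):
    """Apply a binary function to two equally shaped nested lists."""
    if isinstance(a, list):
        return [elementwise(func, x, y) for x, y in zip(a, b)]
    return func(a, b)


def reduce_array(func, data, axis=None):
    """Reduce a nested list with a binary function.

    axis=None reduces over the whole array and returns a scalar;
    an integer reduces along that axis only.
    """
    if axis is None:
        return reduce(func, iter_flat(data))
    if axis == 0:
        return reduce(lambda a, b: elementwise(func, a, b), data)
    return [reduce_array(func, sub, axis - 1) for sub in data]


image = [[1, 2, 3],
         [4, 5, 6]]

print(reduce_array(operator.add, image))          # 21, whole array
print(reduce_array(operator.add, image, axis=0))  # [5, 7, 9], down the columns
print(reduce_array(min, image, axis=1))           # [1, 4], along each row
```

Because the reducing function is a parameter, sum, minimum and maximum share one code path here; as noted above, calling a Python function per element is what makes this approach slow compared with a dedicated C loop.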
Peter

From tim.hochberg at cox.net Mon Oct 25 14:03:03 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Mon Oct 25 14:03:03 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
Message-ID: <417D69CD.7070604@cox.net>

Peter Verveer wrote:

>
> On 25 Oct 2004, at 19:32, Russell E Owen wrote:
>
>> At 7:08 PM +0200 2004-10-25, Peter Verveer wrote:
>>
>>> On 25 Oct 2004, at 18:51, Gary Strangman wrote:
>>>
>>>>
>>>>> I'm not sure how feasible it is, but I'd much rather an
>>>>> efficient, non-copying, 1-D view of a noncontiguous array (from
>>>>> an enhanced version of flat or ravel or whatever) than a bunch of
>>>>> extra methods. The former allows all of the standard methods to
>>>>> just work efficiently using sum(ravel(A)) or sum(A.flat) [ and max
>>>>> and min, etc]. Making special whole array methods for everything
>>>>> just leads to method explosion.
>>>>
>>>>
>>>> I completely agree with this ... an efficient flat/ravel would
>>>> seem to solve many of the issues being raised. Forgive the
>>>> potentially naive question here, but is there any reason such an
>>>> efficient, enhanced view can't be implemented for the .flat method?
>>>
>>>
>>> I believe it is not possible without copying data. The strides
>>> between elements of a noncontiguous array are not always the same,
>>> so you cannot efficiently view it as a 1D array.
>>
>>
>> How about providing an iterator that counts through all the elements
>> of an array (e.g. arr.itervalues()). So long as C extensions could
>> efficiently make use of such an iterator, I think it'd do the job.
>
>
> It would still be slower, because you would need a function call at
> each element that returns a value. Not a problem if you do a lot of
> work at each element, but if you are just adding values you want a
> custom written C function. You can do it at the C level with macros or
> so, (I do that in nd_image) but that would not help at the python level.
>
>> One could also imagine:
>> - arr.iteritems(), which returned (index, value) for each item
>> - a mask argument: a boolean array the same shape as the data array;
>> True means elide the corresponding value from the data array
>> - general support for indexing
>
>
> Essentially you are suggesting to expose iterators at the python level
> that iterate over an array in some predefined way. That is possible,
> but I doubt it will be efficient.
>
> At the C level however, it might be worth thinking about as a way of
> easing writing functions in C. I proposed to do it the other way
> around in an earlier mail: providing a set of generic functions that
> take a python or a C function to be applied at each element. I most
> likely will implement something in that direction, but I should give
> your idea also some thought.
>
>> More generally, I agree that sum should work the same as a function
>> and a method, and that an extra axis argument could be a good thing
>> (it is so common elsewhere, e.g. size). I'd be tempted to break
>> backwards compatibility to fix this, since numarray is still new and
>> the current situation is very confusing.
>
>
> I would absolutely vote for such a change. Simply because we would
> like a range of such functions, e.g. minimum, maximum, and so on.
Even
> if we have to leave sum() as it is, I think we should have the
> alternatives, we would just have to come up with an alternative name
> for sum(). In fact I would consider volunteering implementing these
> functions.

Why the need to break backwards compatibility? If one is going to reimplement sum, et al. so as to operate on an arbitrary set of axes, there's no reason one couldn't maintain the current behaviour as the default. All that is required is to allow axis to be a number (current behaviour), a tuple (reduce across the designated axes) or some special value to sum over all (None?, "all"?).

Having two sum functions with different names is not particularly better than the current proposal of a method and a function.

-tim

From verveer at embl-heidelberg.de Mon Oct 25 15:48:03 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 15:48:03 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <417D69CD.7070604@cox.net>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <417D69CD.7070604@cox.net>
Message-ID:

On Oct 25, 2004, at 11:02 PM, Tim Hochberg wrote:

> Peter Verveer wrote:
>
>>
>> On 25 Oct 2004, at 19:32, Russell E Owen wrote:
>>
>>> At 7:08 PM +0200 2004-10-25, Peter Verveer wrote:
>>>
>>>> On 25 Oct 2004, at 18:51, Gary Strangman wrote:
>>>>
>>>>>
>>>>>> I'm not sure how feasible it is, but I'd much rather an
>>>>>> efficient, non-copying, 1-D view of a noncontiguous array (from
>>>>>> an enhanced version of flat or ravel or whatever)
than a bunch of
>>>>>> extra methods. The former allows all of the standard methods to
>>>>>> just work efficiently using sum(ravel(A)) or sum(A.flat) [ and
>>>>>> max and min, etc]. Making special whole array methods for
>>>>>> everything just leads to method explosion.
>>>>>
>>>>>
>>>>> I completely agree with this ... an efficient flat/ravel would
>>>>> seem to solve many of the issues being raised. Forgive the
>>>>> potentially naive question here, but is there any reason such an
>>>>> efficient, enhanced view can't be implemented for the .flat
>>>>> method?
>>>>
>>>>
>>>> I believe it is not possible without copying data. The strides
>>>> between elements of a noncontiguous array are not always the same,
>>>> so you cannot efficiently view it as a 1D array.
>>>
>>>
>>> How about providing an iterator that counts through all the elements
>>> of an array (e.g. arr.itervalues()). So long as C extensions could
>>> efficiently make use of such an iterator, I think it'd do the job.
>>
>>
>> It would still be slower, because you would need a function call at
>> each element that returns a value. Not a problem if you do a lot of
>> work at each element, but if you are just adding values you want a
>> custom written C function. You can do it at the C level with macros or
>> so, (I do that in nd_image) but that would not help at the python
>> level.
>>
>>> One could also imagine:
>>> - arr.iteritems(), which returned (index, value) for each item
>>> - a mask argument: a boolean array the same shape as the data array;
>>> True means elide the corresponding value from the data array
>>> - general support for indexing
>>
>>
>> Essentially you are suggesting to expose iterators at the python
>> level that iterate over an array in some predefined way. That is
>> possible, but I doubt it will be efficient.
>>
>> At the C level however, it might be worth thinking about as a way of
>> easing writing functions in C.
I proposed to do it the other way
>> around in an earlier mail: providing a set of generic functions that
>> take a python or a C function to be applied at each element. I most
>> likely will implement something in that direction, but I should give
>> your idea also some thought.
>>
>>> More generally, I agree that sum should work the same as a function
>>> and a method, and that an extra axis argument could be a good thing
>>> (it is so common elsewhere, e.g. size). I'd be tempted to break
>>> backwards compatibility to fix this, since numarray is still new and
>>> the current situation is very confusing.
>>
>>
>> I would absolutely vote for such a change. Simply because we would
>> like a range of such functions, e.g. minimum, maximum, and so on.
>> Even if we have to leave sum() as it is, I think we should have the
>> alternatives, we would just have to come up with an alternative name
>> for sum(). In fact I would consider volunteering implementing these
>> functions.
>
> Why the need to break backwards compatibility? If one is going to
> reimplement sum, et al so as to operate on an arbitrary set of axes
> there's no reason one couldn't maintain the current behaviour as the
> default.

It seems to me that the behavior one would expect for a function like that would be to apply the operation to the whole array, not along an axis. What would you expect as a new user if you call a minimum() function? A single value that is the minimum. So that is the logical choice for the default behavior, I would think.

> All that is required is to allow axis to be a number (current
> behaviour), a tuple (reduce across the designated axes) or some
> special value to sum over all (None?, "all"?).

Yes, that would be the idea anyway. The question is what should be the default behavior for this type of function, something I think we should not decide based on the current behavior of a single existing function, but based on what makes the most sense.
That is obviously something that can be discussed...

>
> Having two sum functions with different names is not particularly
> better than the current proposal of a method and a function.

This is certainly true. I would prefer breaking compatibility...

Peter
From Chris.Barker at noaa.gov Tue Oct 26 09:21:08 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Tue Oct 26 09:21:08 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To:
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <417D69CD.7070604@cox.net>
Message-ID: <417E7907.9060107@noaa.gov>

Peter Verveer wrote:
> On Oct 25, 2004, at 11:02 PM, Tim Hochberg wrote:
>> Why the need to break backwards compatability? If one is going to
>> reimplement sum, et al so as to operate on an arbitrary set of axes
>> there's no reason one couldn't maintain the current behaviour as the
>> default.

Great idea!

> It seems to me that the behavior one would expect for a function like
> that, would be to apply the operation to the whole array. Not along an
> axis. What would you expect as a new user if you call a minimum()
> function? A single value that is the minimum. So that is the logical
> choice for the default behavior, I would think.

Nope. I'd expect it to be along an axis, by default the last one. To me, that's what vectorization is all about. Maybe this is because of my MATLAB (and now Numeric) background, but it makes the most sense to me that a method either returns an array of the same rank, or "reducing" methods return an array of rank reduced by one. Having a method return the same rank answer, no matter the rank of the input, is weird to me.

This all depends on how you use arrays.
I can see that if you tend to use a 2-d array to store an image, that the single minimum would seem logical, but for many other uses, each dimension has an independent meaning.

> Yes, that would be the idea anyway. The question is what should be the
> default behavior for this type of functions, something I think we should
> not decide based on the current behavior of a single existing function,
> but based on what makes the most sense. That is obviously something that
> can be discussed...

Yup, but frankly, this isn't about just one function, it's really about all the reductions: min, max, sum, etc, etc.

I think the rule of thumb is not to break backward compatibility unless there is a compelling reason, and given that it's not clear what is most "natural" in this case, keeping the default the same makes the most sense.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov

From verveer at embl-heidelberg.de Tue Oct 26 11:20:02 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Tue Oct 26 11:20:02 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <417E7907.9060107@noaa.gov>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <417D69CD.7070604@cox.net> <417E7907.9060107@noaa.gov>
Message-ID: <8629C0DC-277B-11D9-8DC3-000D932805AC@embl-heidelberg.de>

On Oct 26, 2004, at 6:19 PM, Chris Barker wrote:

> Peter Verveer
wrote:
>> It seems to me that the behavior one would expect for a function like
>> that, would be to apply the operation to the whole array. Not along
>> an axis. What would you expect as a new user if you call a minimum()
>> function? A single value that is the minimum. So that is the logical
>> choice for the default behavior, I would think.
>
> nope. I'd expect it to be along an axis, by default the last one.

I still do not agree completely with that. I will elaborate more below, because I also do not agree anymore with my own earlier writings :-). But I see your point that this type of operation can be natural depending on what you are doing. Sometimes a single value does make sense, sometimes not; I think we can agree on that.

>> Yes, that would be the idea anyway. The question is what should be
>> the default behavior for this type of functions, something I think we
>> should not decide based on the current behavior of a single existing
>> function, but based on what makes the most sense. That is obviously
>> something that can be discussed...
>
> yup, but frankly, this isn't about just one function, it's really
> about all the reductions: min, max, sum, etc, etc.

Actually no. It seems that sum() is a special case, along with a few others. Again: I elaborate on the general case below.

> I think the rule of thumb is not to break backward compatibility
> unless there is a compelling reason, and given that it's not clear
> what is most "natural" in this case, keeping the default the same
> makes the most sense.

I agree. In contrast to what I have said before, I think we should keep it as it is, for compatibility.

Now to elaborate on the general problem, please correct me if I get something wrong. I will use the minimum function as an example and come back to sum() later. If you look at a minimum operation then there are three different things you might like to do:

1) An element by element minimum: minimum(a1, a2). This is the current behaviour.
Like all binary ufuncs of this type, it operates on pairs of arrays. So by default it does not do reduction or calculate a single minimum. For most ufuncs that is the natural behavior anyway.

2) A reduction: minimum.reduce(a1). The reduce method of ufuncs is generally used for reductions. Having to use .reduce makes clear what you are doing. Although a bit odd at first sight, I think it is a clever way to overload ufunc names with different functionality.

3) The minimum of the array: In numarray you do a1.min(). I think in Numeric, you have to do something like minimum.reduce(a1.flat), correct me if I am wrong. Not nice in either case...

Note that calling a binary ufunc with a single argument will give an error: minimum(a1) raises a TypeError. That seems to be a good decision, because people seem to have different ideas of what should happen: I would expect the minimum of the array, others expect a reduction. Generally I guess it was a wise decision not to change the meaning of a function depending on whether it has one or two arguments.

The sum() function is an alias to add.reduce. There are a few more of these aliases (e.g. product). I would still say that it is a bit unfortunate, since not everybody may immediately realize that these functions are in fact reductions. I wonder if one would not be better off without these functions at all, after all you can access the functionality through .reduce(). If you mind the extra typing, just define your own alias. Can't we shift them into numarray.numeric? Just a thought...

In any case, clearly these functions need to stay around as they are for compatibility reasons. It is far more productive to add the functionality that a few people already proposed: allow reductions over multiple axes. I would welcome that, I always found 1D reductions a bit limited anyway. Obviously you can do sequential 1D reductions, but that can be quite inefficient.
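The three meanings of "minimum" distinguished above can be told apart with a small pure-Python model on nested lists. This is only a sketch: it does not use numarray or Numeric, it handles 2-D input only, and the helper names minimum_reduce and minimum_all are made up for illustration.

```python
def minimum(a1, a2):
    """1) Element-by-element minimum of two equally shaped 2-D lists."""
    return [[min(x, y) for x, y in zip(r1, r2)] for r1, r2 in zip(a1, a2)]


def minimum_reduce(a, axis=0):
    """2) Reduction of a 2-D list along one axis (models minimum.reduce)."""
    if axis == 0:
        return [min(col) for col in zip(*a)]   # collapse the rows
    if axis == 1:
        return [min(row) for row in a]         # collapse the columns
    raise ValueError("axis must be 0 or 1 for a 2-D list")


def minimum_all(a):
    """3) Minimum of the whole array, built from two 1-D reductions."""
    return min(minimum_reduce(a, axis=1))


a1 = [[4, 9, 2], [7, 1, 8]]
a2 = [[5, 3, 6], [0, 2, 9]]

print(minimum(a1, a2))           # [[4, 3, 2], [0, 1, 8]]
print(minimum_reduce(a1, 0))     # [4, 1, 2]
print(minimum_reduce(a1, 1))     # [2, 1]
print(minimum_all(a1))           # 1
```

Note that only the third form returns a single value; the first two return arrays, which is where the differing expectations in this thread come from.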
As proposed, the axis argument would take maybe a list of dimensions, and 'all' or None. I would like to propose an additional possibility: like minimum.reduce(), we could have a minimum.all() function that reduces over all dimensions (with a potentially much more efficient implementation). We don't need a sum_all(a1) then, you would use add.all(a1). I guess this would be easily prototyped using sequential reductions, one can worry about efficiency later.

Sorry for the long story...

Cheers, Peter

From haase at msg.ucsf.edu Wed Oct 27 09:59:02 2004
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Wed Oct 27 09:59:02 2004
Subject: [Numpy-discussion] bug? in len(arr.flat)
Message-ID: <200410270958.20025.haase@msg.ucsf.edu>

Hi,
I have a (UInt16) 3d data stack and want to get to its underlying buffer (to (later) feed it into memmap) ...
I noticed that len(pr2._flat) is half of len(pr2._data) - like it doesn't multiply itemsize in.

>>> pr2.shape
(40, 512, 512)
>>> pr2.flat.shape
(10485760)
>>> 512*512*40
10485760
>>> len(pr2.flat)
10485760
>>> pr2.flat._itemsize
2
>>> len(pr2._data)
20971520
>>> pr2._byteoffset
0

Is this a bug or am I misunderstanding?

Thanks,
Sebastian Haase

From strawman at astraw.com Thu Oct 28 19:21:02 2004
From: strawman at astraw.com (Andrew Straw)
Date: Thu Oct 28 19:21:02 2004
Subject: [Numpy-discussion] floating point exception weirdness
In-Reply-To: <41795006.1040807@astraw.com>
References: <4119BBFC.6020304@astraw.com> <1092221365.3752.32.camel@localhost.localdomain> <411A08FA.7000601@astraw.com> <41795006.1040807@astraw.com>
Message-ID: <4181A8CC.2040807@astraw.com>

Just a small addendum, (which I hope will spur on bug-fixing once Todd et al. are back from the conference -- let me know if I should file a sourceforge bug report):

Numeric is not necessary to trigger the bug in the below code -- numarray is sufficient on its own.
Furthermore, I can confirm that merely removing the "atlas3-sse2" Debian package from my system causes the code, whether or not numarray.ieeespecial is imported, to run without being killed by an FPE.

Andrew Straw wrote:

> I've isolated a bug I first reported on this mailing list in August.
> I've now confined it to a small code snippet using entirely
> open-source software (previously I saw it while using Intel's IPP).
> In a nutshell, importing numarray.ieeespecial triggers a floating
> point exception (which kills my program) when I call Numeric's
> singular_value_decomposition() function:
>
> import Numeric
> from LinearAlgebra import singular_value_decomposition
>
> if want_FPE:
>     import numarray.ieeespecial
>
> A= [[-5.7, 2.2, -0.53, 46.0],
>     [-2.3, -5.5, -1.0, 1091.0],
>     [5.9, 1.4, -0.1, -142.0],
>     [-1.3, 5.7, -1.5, 2673.0]]
> A=Numeric.array(A)
> u,s,v = singular_value_decomposition(A) # FPE triggered here
>
> Here's my setup:
>
> $ python
> Python 2.3.4 (#2, Sep 24 2004, 08:39:09)
> [GCC 3.3.4 (Debian 1:3.3.4-12)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import Numeric
> >>> Numeric.__version__
> '23.6'
> >>> import numarray
> >>> numarray.__version__
> '1.2a'
>
> $ gcc -v
> Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.4/specs
> Configured with: ../src/configure -v
> --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang
> --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info
> --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared
> --with-system-zlib --enable-nls --without-included-gettext
> --enable-__cxa_atexit --enable-clocale=gnu --enable-debug
> --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
> Thread model: posix
> gcc version 3.3.4 (Debian 1:3.3.4-13)
>
> Now, for the clue: the above error is ONLY triggered when I compile
> Numeric to use system blas and friends, not when I use lapack_lite
> included with Numeric.
> This leads me to suspect it is related to the
> SSE2 unit -- I have Debian sarge's atlas3-base, atlas3-sse,
> atlas3-sse2, blas, lapack, lapack3, and refblas3 packages installed on
> my P4 machine.
>
> So, to propose a hypothesis: numarray.ieeespecial sets the FPE bit in
> the SSE2 hardware, but for some reason this does not raise SIGFPE.
> However, when the next call that touches SSE2 happens, the kernel sees
> that error bit and throws the signal. Does this explanation make
> sense? Is it easy to fix?
>
> Cheers!
> Andrew

From stevech1097 at yahoo.com.au Thu Oct 28 21:56:30 2004
From: stevech1097 at yahoo.com.au (Steve Chaplin)
Date: Thu Oct 28 21:56:30 2004
Subject: [Numpy-discussion] Re: floating point exception weirdness (Andrew Straw)
In-Reply-To:
References:
Message-ID: <1099025806.2742.23.camel@f1>

> Just a small addendum, (which I hope will spur on bug-fixing once Todd
> et al. are back from the conference -- let me know if I should file a
> sourceforge bug report):

I've not read all this thread so I don't know the full background. But I had a floating point / SSE problem using numarray. It turned out to be a glibc not numarray problem and was solved by upgrading glibc.
http://sources.redhat.com/bugzilla/show_bug.cgi?id=10
There was also a SourceForge bug report but I can't locate it.
Regards Steve From jmiller at stsci.edu Fri Oct 29 06:27:11 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Oct 29 06:27:11 2004 Subject: [Numpy-discussion] bug? in len(arr.flat) In-Reply-To: <200410270958.20025.haase@msg.ucsf.edu> References: <200410270958.20025.haase@msg.ucsf.edu> Message-ID: <1099056380.4904.12.camel@localhost.localdomain> On Wed, 2004-10-27 at 12:58, Sebastian Haase wrote: > Hi, > I have a (UInt16) 3d data stack and want to get to it's underlying buffer (to > (later) feed it into memmap) ... > I noticed that len(pr2._flat) is half of len(pr2._data) - like it doesn't > multiply itemsize in. > >>> pr2.shape > (40, 512, 512) > >>> pr2.flat.shape > (10485760) > >>> 512*512*40 > 10485760 > >>> len(pr2.flat) > 10485760 > >>> pr2.flat._itemsize > 2 > >>> len(pr2._data) > 20971520 > >>> pr2._byteoffset > 0 > > Is this a bug No. > or am I missunderstanding ? Yes. _data is "an object which supports the buffer protocol". In this context, it is effectively a string and thus the product of the total number of elements and the itemsize. (We'll ignore for now the fact that not every array uses the entire buffer.) In contrast, shape(.flat) is only the total number of elements and is independent of itemsize. Regards, Todd From haase at msg.ucsf.edu Fri Oct 29 09:03:25 2004 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri Oct 29 09:03:25 2004 Subject: [Numpy-discussion] bug? in len(arr.flat) In-Reply-To: <1099056380.4904.12.camel@localhost.localdomain> References: <200410270958.20025.haase@msg.ucsf.edu> <1099056380.4904.12.camel@localhost.localdomain> Message-ID: <200410290902.25410.haase@msg.ucsf.edu> Of course ! sorry I forgot. Thanks, Sebastian On Friday 29 October 2004 06:26 am, Todd Miller wrote: > On Wed, 2004-10-27 at 12:58, Sebastian Haase wrote: > > Hi, > > I have a (UInt16) 3d data stack and want to get to it's underlying buffer > > (to (later) feed it into memmap) ... 
> > I noticed that len(pr2._flat) is half of len(pr2._data) - like it doesn't
> > multiply itemsize in.
> >
> > >>> pr2.shape
> > (40, 512, 512)
> > >>> pr2.flat.shape
> > (10485760)
> > >>> 512*512*40
> > 10485760
> > >>> len(pr2.flat)
> > 10485760
> > >>> pr2.flat._itemsize
> > 2
> > >>> len(pr2._data)
> > 20971520
> > >>> pr2._byteoffset
> > 0
> >
> > Is this a bug
>
> No.
>
> > or am I missunderstanding ?
>
> Yes. _data is "an object which supports the buffer protocol". In this
> context, it is effectively a string and thus the product of the total
> number of elements and the itemsize. (We'll ignore for now the fact
> that not every array uses the entire buffer.) In contrast, shape(.flat)
> is only the total number of elements and is independent of itemsize.
>
> Regards,
> Todd

From jmiller at stsci.edu Fri Oct 29 11:19:14 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct 29 11:19:14 2004
Subject: [Numpy-discussion] Counting array elements
Message-ID: <1099073854.4904.321.camel@localhost.localdomain>

I have returned from our astronomical data systems conference and I am going to take a short cut and summarize what I saw as the key developments of this thread. I apologize for not responding sooner and individually but the web-mail system I use isn't effective for conducting any kind of discussion. You guys did a great job sorting this out this week. I marked my key points with **.
The rest is probably only for people with a lot of patience.

** I've finally come to terms with the fact that functions are the right way to do numarray rather than methods. The arguments in the Numeric manual are no more persuasive now than they ever were, but Stephen Walton's remarks about method explosion finally convinced me that the "real" reason for doing functions is that using methods combines every new feature under the umbrella of a single namespace, the NumArray class. Using functions lets us partition things into modules which can be used selectively and makes a more extensible and understandable system. Thanks Stephen.

A couple people remarked that using .flat might solve everything with something like a.flat.sum() or sum(ravel(a)). This gets to the original motivation for the sum() method, which was the codification of a simple and storage efficient technique for reducing noncontiguous arrays. The first point is that a non-contiguous array cannot generally be reshaped without making a copy. The basic idea of the sum() method is to do *two* reductions: the first, along a single axis, results in a smaller contiguous array. In the case of astronomical images which are generally square or at least non-degenerate, the reduction result is a *much* smaller array. The second reduction handles all the remaining dimensions since .flat is guaranteed to work because the array is contiguous. The end result is a complete sum() without writing additional ufuncs or making an array copy.

There was understandable confusion about why .flat is sometimes allowed to fail. Since it is an attribute, we thought it inappropriate to make it return a copy of the source array and chose instead to raise an exception. In contrast, it is reasonable for the ravel() function to return a completely different array, so it always works. (I just noticed that ravel() is not named flat()).
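The two-reduction idea described above can be modelled in pure Python with a flat buffer plus a shape and strides, which is roughly how a non-contiguous view is addressed. This is a sketch under stated assumptions, not numarray's implementation: the function name, the element-based strides and the 2-D restriction are all illustrative choices.

```python
def strided_sum_two_stage(buf, shape, strides, offset=0):
    """Sum a 2-D strided view of a flat buffer using two reductions.

    `strides` are counted in elements (an assumption of this sketch).
    Returns the total and the small intermediate array.
    """
    rows, cols = shape
    row_stride, col_stride = strides
    # Stage 1: reduce along axis 0. The result is a new, small list that
    # is contiguous no matter how the source view was strided.
    partial = [
        sum(buf[offset + i * row_stride + j * col_stride] for i in range(rows))
        for j in range(cols)
    ]
    # Stage 2: the intermediate is contiguous, so a flat reduction is safe
    # and no copy of the (much larger) source view was ever made.
    return sum(partial), partial


buf = list(range(24))                 # a 4 x 6 array stored row-major
# View of every other column: shape (4, 3), strides (6, 2) in elements.
total, partial = strided_sum_two_stage(buf, (4, 3), (6, 2))
print(partial)                        # [36, 44, 52]
print(total)                          # 132
```

For a 512 x 512 image the intermediate holds 512 values instead of 262144, which is the storage saving the paragraph above refers to.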
Some of our more contemporary thinkers suggested using iterators to produce a .flat which always works. If anyone has an idea how to make this work with good performance, please let me know; I don't. ** Tim Hochberg pointed out that we can overload the reduction (and not accumulation?) axis parameter with an "all" or a tuple describing a sequence of axes to reduce along. My perception was that there was a consensus behind this and in any case I'm in agreement with Tim. Alan Isaac pointed out that None might be better here than "all" and I agree. At this point, I think sumAll() is dead, the sum() method will be deprecated, and the reductions should be expanded as Tim suggested. ** Peter Verveer made some comments about the expectations of a naive user regarding reductions, namely that "all" should be the default. My own experience bears this out, and I am torn about what to do here. Chris Barker pointed out the need for backward compatibility with Numeric, and given the current numarray goal of supporting SciPy, this need is growing stronger and more complex. SciPy uses yet another axis convention. If anyone has any ideas how to handle these multiple conventions with elegance, let me know. A number of people commented on our naming conventions, an issue which we have side stepped for the moment with sumAll(). My impression is that, for better or worse, numarray uses the lowerUpper() version of Camel case. I think this is very much a matter of personal taste and don't claim to have any. My guess is that numarray is probably inconsistent at the moment, in part because lowerUpper() often degenerates into merely lower() which degenerates into confusion. 
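The overloaded axis parameter summarised in the key points above (an integer, a tuple of axes, or None for "all") can be prototyped with sequential single-axis reductions. The sketch below works on nested lists and is not taken from numarray; reduce_axes, reduce_axis and combine are hypothetical names, and efficiency is ignored.

```python
from functools import reduce
import operator


def ndim(a):
    """Nesting depth of a regular nested list."""
    n = 0
    while isinstance(a, list):
        n += 1
        a = a[0]
    return n


def combine(op, x, y):
    """Apply op element-wise to two equally shaped nested lists or scalars."""
    if isinstance(x, list):
        return [combine(op, xi, yi) for xi, yi in zip(x, y)]
    return op(x, y)


def reduce_axis(op, a, axis):
    """Reduce nested list `a` along one non-negative axis."""
    if axis == 0:
        return reduce(lambda x, y: combine(op, x, y), a)
    return [reduce_axis(op, sub, axis - 1) for sub in a]


def reduce_axes(op, a, axis=0):
    """`axis` may be an int, a tuple of ints, or None meaning every axis."""
    n = ndim(a)
    if axis is None:
        axes = range(n)
    elif isinstance(axis, int):
        axes = [axis]
    else:
        axes = axis
    # Normalise negative axes, then reduce the highest axis first so the
    # numbering of the axes still to be reduced stays valid.
    for ax in sorted({ax % n for ax in axes}, reverse=True):
        a = reduce_axis(op, a, ax)
    return a


a = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]          # shape (2, 2, 2)
print(reduce_axes(operator.add, a, 0))            # [[6, 8], [10, 12]]
print(reduce_axes(operator.add, a, -1))           # [[3, 7], [11, 15]]
print(reduce_axes(operator.add, a, (0, 2)))       # [14, 22]
print(reduce_axes(operator.add, a, None))         # 36
```

With this signature a separate sumAll() becomes unnecessary, since axis=None covers it.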
Regards, Todd From verveer at embl-heidelberg.de Sat Oct 30 08:39:28 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Sat Oct 30 08:39:28 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1099073854.4904.321.camel@localhost.localdomain> References: <1099073854.4904.321.camel@localhost.localdomain> Message-ID: > ** Peter Verveer made some comments about the expectations of a naive > user regarding reductions, namely that "all" should be the default. > My > own experience bears this out, and I am torn about what to do here. > Chris Barker pointed out the need for backward compatibility with > Numeric, and given the current numarray goal of supporting SciPy, > this > need is growing stronger and more complex. SciPy uses yet another axis > convention. If anyone has any ideas how to handle these multiple > conventions with elegance, let me know. Numarray should probably be either completely compatible in every small detail, or we could take the opportunity to change what we believe was the wrong choice. Not sure what is really best, although personally feel breaking compatibility is fine if the result is better. Is there not already a sub-package numeric within numarray that provides Numeric compatibility? Such a package could at least provide wrappers with compatible behavior for people who need that. Peter From tim.hochberg at cox.net Sat Oct 30 11:49:36 2004 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sat Oct 30 11:49:36 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1099073854.4904.321.camel@localhost.localdomain> References: <1099073854.4904.321.camel@localhost.localdomain> Message-ID: <4183E208.6050001@cox.net> Todd Miller wrote: [SNIP] >** Tim Hochberg pointed out that we can overload the reduction (and not >accumulation?) > It seems possible. It's probably marginally useful at best. However, it might be worth doing if not too painful, just so that the accumulate and reduce signatures match. 
>axis parameter with an "all" or a tuple describing a
>sequence of axes to reduce along. My perception was that there was a
>consensus behind this and in any case I'm in agreement with Tim. Alan
>Isaac pointed out that None might be better here than "all" and I
>agree.
>
Using None to mean ALL seems a little perverse to me, but I'll grant that using an existing singleton makes things simpler. I'll just point out that it would also be possible to define an ALL singleton and use that.

Very tangential: it's too bad that '...' can't be typed more places: the natural spelling for ALL is [...] as in:

add.reduce(a, axis=[...])

Sadly, that won't work.

>At this point, I think sumAll() is dead, the sum() method will
>be deprecated, and the reductions should be expanded as Tim suggested.
>
>** Peter Verveer made some comments about the expectations of a naive
>user regarding reductions, namely that "all" should be the default. My
>own experience bears this out, and I am torn about what to do here.
>
>
I suspect that one's experience here depends on your typical problem domain. If one does a lot of 2D work ALL would seem to be the natural choice. If you use a lot of arrays of vectors, as I do, -1 is the natural choice. At this point I can't recall a case where ALL would have been the natural choice for me.

In addition to backwards compatibility, one argument for not using ALL as the default is that it makes little or no sense for accumulate. Having the default for reduce be ALL, but that for accumulate be -1 (for instance) would be confusing.

>Chris Barker pointed out the need for backward compatibility with
>Numeric,
>
I'd think that the importance of backward compatibility with not just Numeric, but with Numarray itself has been underrated. Changing the default for reduce / sum is particularly insidious since many uses will fail silently, producing the wrong answer, but continuing to run.
This means that all instances of sum, product and reduce will need to be inspected and corrected. Having 10k LOC that use Numarray, I'll be a bit irked if this gets changed without a better justification than what I've seen thus far. >and given the current numarray goal of supporting SciPy, this >need is growing stronger and more complex. SciPy uses yet another axis >convention. If anyone has any ideas how to handle these multiple >conventions with elegance, let me know. > > Could you describe the SciPy axis convention: I'm not familiar with it. [SNIP] -tim From gazzar at email.com Sun Oct 31 04:22:01 2004 From: gazzar at email.com (Gary Ruben) Date: Sun Oct 31 04:22:01 2004 Subject: [Numpy-discussion] vector cross product Message-ID: <20041031121856.E2DDC1CE304@ws1-6.us4.outblaze.com> Not that I have a really urgent need, but is there a reason that nice, fast C-based vector operations aren't implemented in Numeric or numarray? I notice Fernando Perez has a cross product as a useful SciPy weave example on his site. I've also seen comments elsewhere about Numpy's lack of a cross product. eg. I'm using Konrad Hinsen's Scientific Python for the convenience value of his Vector class, which also provides a nice angle() method but it bothers me that it's implemented in native Python. The Vector type in vpython probably does it 'properly', but I don't use it just for the convenience since it adds an extra dependency to my code. comments? Gary R. -- ___________________________________________________________ Sign-up for Ads Free at Mail.com http://promo.mail.com/adsfreejump.htm From perry at stsci.edu Sun Oct 31 09:22:28 2004 From: perry at stsci.edu (Perry Greenfield) Date: Sun Oct 31 09:22:28 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1099073854.4904.321.camel@localhost.localdomain> Message-ID: Todd Miller wrote: > > There was understandable confusion about why .flat is sometimes allowed > to fail. 
Since it is an attribute, we thought it inappropriate to make > it return a copy of the source array and chose instead to raise an > exception. In contrast, it is reasonable for the ravel() function to > return a completely different array, so it always works. (I just > noticed that ravel() is not named flat()). Some of our more > contemporary thinkers suggested using iterators to produce a .flat which > always works. If anyone has an idea how to make this work with good > performance, please let me know; I don't. > This aspect of flat can be considered a wart. There are three different desired behaviors depending on who you talk to. For efficiency reasons, some only want flat (and even ravel) to work if the array is already contiguous; that is, they don't want copies unless they ask for them. Others want it to always work, producing a copy if necessary but otherwise for it to return a view. Yet others always want a copy. So, are three different versions needed? Or options to a function? The drawback of .flat (as an attribute) is there is only one choice for behavior. For a function (or a method) we could modify the behavior with a keyword argument. Personally, I would rather .flat always work, even if it means returning a copy. Is there any consensus on how this problem should be handled? > ** Peter Verveer made some comments about the expectations of a naive > user regarding reductions, namely that "all" should be the default. My > own experience bears this out, and I am torn about what to do here. > Chris Barker pointed out the need for backward compatibility with > Numeric, and given the current numarray goal of supporting SciPy, this > need is growing stronger and more complex. SciPy uses yet another axis > convention. If anyone has any ideas how to handle these multiple > conventions with elegance, let me know. > I find this issue particularly vexing as well. Let's be clear about this, scipy changes the behavior of Numeric to produce a new flavor. 
What should numarray do? Follow the scipy behavior or the Numeric behavior? Or should there be a scipy/numarray flavor vs the more Numeric compatible numarray? Note, we never intended numarray to be 100% compatible with Numeric since there were aspects we thought should be changed (e.g., scalar/array type coercions). Yet there appear to be two camps of the Numeric community. Some sort of survey may be in order here. Is scipy where all the new growth is now? Should we just adopt the axis convention used there? I'd very much prefer not proliferate any more flavors of behavior and just settle on one. > A number of people commented on our naming conventions, an issue which > we have side stepped for the moment with sumAll(). My impression is > that, for better or worse, numarray uses the lowerUpper() version of > Camel case. I think this is very much a matter of personal taste and > don't claim to have any. My guess is that numarray is probably > inconsistent at the moment, in part because lowerUpper() often > degenerates into merely lower() which degenerates into confusion. > How much of the public interface uses camelCase? I don't think all that much if any. It seems to me the inclination of scipy is to avoid it and I'm happy with that. The internal implementation is a different issue, and there I think Todd is right that it probably is somewhat inconsistent on that front. Perry From perry at stsci.edu Sun Oct 31 09:30:28 2004 From: perry at stsci.edu (Perry Greenfield) Date: Sun Oct 31 09:30:28 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: Message-ID: Peter Verveer wrote: > Numarray should probably be either completely compatible in every small > detail, or we could take the opportunity to change what we believe was Well, as I mentioned before having numarray match Numeric in every small detail is not going to happen (and even there, which flavor? the original Numeric or the scipy version?). 
We've been pretty clear about where incompatibilities were deliberate. But on the other hand, that leaves many other choices that could be revisited if enough people support them. The problem is that no matter what is done, I suspect some people are going to be inconvenienced since there is already (without numarray) a split in the community because of scipy.

> the wrong choice. Not sure what is really best, although personally
> feel breaking compatibility is fine if the result is better. Is there
> not already a sub-package numeric within numarray that provides Numeric
> compatibility? Such a package could at least provide wrappers with
> compatible behavior for people who need that.
>
At the moment the numeric module provides more Numeric compatibility (but not complete). In matplotlib we use a module called numerix to provide a uniform interface to both Numeric and numarray (along with prohibitions on use of certain features that don't exist in the other). We are looking at scipy_base now, which will undoubtedly highlight similar cases where we will suggest internal reorganization to do the same sort of thing that was done for matplotlib.

For those that intend to use numarray only now and forever, one is free to use all the features they desire. But there still is the behavior issue of those things that are currently incompatible like the axis issue.

Perry

From tim.hochberg at cox.net Sun Oct 31 14:24:01 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sun Oct 31 14:24:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <4183F168.3060205@ucsd.edu>
References: <1099073854.4904.321.camel@localhost.localdomain> <4183E208.6050001@cox.net> <4183F168.3060205@ucsd.edu>
Message-ID: <418564AE.6050206@cox.net>

Robert Kern wrote:

> Tim Hochberg wrote:
>
>> Could you describe the SciPy axis convention: I'm not familiar with it.
>
>
> axis=-1

OK, so Numarray (currently) and Numeric use axis=0, SciPy uses axis=-1 and there is some desire to use axis=ALL instead. One advantage of ALL is that it breaks everyone's code equally, so there wouldn't be any charges of favoritism <0.8 wink>.

I can't come up with any way to reconcile the three, but I can suggest a transition strategy whatever the decision. Supply an option so that one can require axis arguments to all calls to reduce. Then it's relatively easy to track down all the reduce calls and fix the ones that are broken. Something like numarray.setRequireReduceAxisArg(True).

FWIW, it wouldn't bother me much to use SciPy's default here: supporting SciPy is a worthwhile goal and I think SciPy's choice here is a reasonable one.

Another alternative that wouldn't bother me much is "In the face of ambiguity, refuse the temptation to guess". That is, always require axis arguments for multidimensional arrays. While not backwards compatible, this would make the transition relatively easy, since uses that might fail would raise exceptions.

-tim

From rkern at ucsd.edu Sun Oct 31 16:01:04 2004
From: rkern at ucsd.edu (Robert Kern)
Date: Sun Oct 31 16:01:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <418564AE.6050206@cox.net>
References: <1099073854.4904.321.camel@localhost.localdomain> <4183E208.6050001@cox.net> <4183F168.3060205@ucsd.edu> <418564AE.6050206@cox.net>
Message-ID: <41857B53.5010308@ucsd.edu>

Tim Hochberg wrote:
> Robert Kern wrote:
>
>> Tim Hochberg wrote:
>>
>>> Could you describe the SciPy axis convention: I'm not familiar with it.
>>
>> axis=-1
>
> OK, so Numarray (currently) and Numeric use axis=0,

Well, sometimes. :-)

> SciPy uses axis=-1

I should note that this convention is for Scipy-defined functions. With one unfortunate exception (cumsum), Scipy does not overwrite Numeric's axis default for Numeric-defined functions.
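Tim Hochberg's transition idea above (an option that makes an omitted axis an error on multidimensional input) might look roughly like the following. It is a pure-Python sketch on 2-D nested lists: set_require_reduce_axis_arg and add_reduce are stand-in names, no such switch is known to exist in numarray, and the legacy default of axis=0 is taken from the discussion.

```python
_REQUIRE_AXIS = False
_MISSING = object()          # sentinel: distinguishes "not given" from None


def set_require_reduce_axis_arg(flag):
    """Hypothetical global switch modelled on the suggestion above."""
    global _REQUIRE_AXIS
    _REQUIRE_AXIS = bool(flag)


def ndim(a):
    n = 0
    while isinstance(a, list):
        n += 1
        a = a[0]
    return n


def add_reduce(a, axis=_MISSING):
    """Sum a 1-D or 2-D nested list along `axis`."""
    if axis is _MISSING:
        if _REQUIRE_AXIS and ndim(a) > 1:
            raise TypeError("reduce on a multidimensional array "
                            "requires an explicit axis argument")
        axis = 0                                 # assumed legacy default
    if ndim(a) == 1:
        return sum(a)
    if axis in (0, -2):
        return [sum(col) for col in zip(*a)]
    if axis in (1, -1):
        return [sum(row) for row in a]
    raise ValueError("axis out of range for a 2-D list")


a = [[1, 2, 3], [4, 5, 6]]
print(add_reduce(a, axis=0))     # [5, 7, 9]
print(add_reduce(a, axis=-1))    # [6, 15]  a different, equally plausible answer

set_require_reduce_axis_arg(True)
try:
    add_reduce(a)                # now fails loudly instead of guessing
except TypeError as exc:
    print("caught:", exc)
set_require_reduce_axis_arg(False)
```

The two printed results show why a silent change of default is dangerous: both are valid sums of the same data, so nothing fails until someone checks the numbers.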
-- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From faheem at email.unc.edu Fri Oct 1 10:19:02 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Fri Oct 1 10:19:02 2004 Subject: [Numpy-discussion] random number facilities in numarray and main Python libs In-Reply-To: <982cfc7f.8876956d.8220100@expms6.cites.uiuc.edu> References: <982cfc7f.8876956d.8220100@expms6.cites.uiuc.edu> Message-ID: On Fri, 1 Oct 2004, Bruce Southey wrote: > Hi, > > I presume that you have R and can build the standalone library. I have > attached my SWIG Smath.i , the SWIG Smath_wrap.c and the > Smath.py files. With these last two files, you shouldn't need SWIG. > > Note that I have not touched the void functions here as I have yet to check > how these work in SWIG. Also, there are a few function in the R header that > are only headers. Eventually someone has to fixed these and add suitable > documentation in some package. I'm not sure what you mean by void functions. > If you have SWIG you can directly use the Smath.i file - while SWIG can take > a .h file directly it would not work in Python. So I just edited the header > file into a .i file. > > The following is my process using Linux (I don't know about other platforms): > > 0) Have swig installed and built the R math library > 1) $ swig -python Smath.i > 2) $ gcc -c Smath_wrap.c -I/usr/local/include/python2.3 > -I/home/bsouthey/Rproject/R-1.9.1/src/nmath > -I/home/bsouthey/Rproject/R-1.9.1/include > 3) $ ld -shared Smath_wrap.o -o _Smath.so -lm -lRmath > -L/home/bsouthey/Rproject/R-1.9.1/src/nmath/standalone > > Of course you must change the include (-I) and library (-L) paths to where > python lives and standard alone Rmath library lives. Thanks. I'm particularly interested in knowing how you interface with the random number generator at the top (Python) level. Can you supply an example? 
Specifically, I'm looking for the following method. 1) When C/C++ code called, reads seed from python random state. 2) Does its stuff. 3) Writes seed back to python level when it exits. R has this built it, but here one needs to build ones own mechanism. This is complicated by the fact that Numarray and the base Python random library use different RNG mechanisms, so one has to chose which one to use. Which one did you use? Faheem. From jmiller at stsci.edu Fri Oct 1 10:21:04 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Oct 1 10:21:04 2004 Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric and amd64] Message-ID: <1096651226.9400.25.camel@halloween.stsci.edu> -- -------------- next part -------------- An embedded message was scrubbed... From: unknown sender Subject: no subject Date: no date Size: 38 URL: From fccoelho at fiocruz.br Fri Oct 1 13:06:10 2004 From: fccoelho at fiocruz.br (=?iso-8859-1?q?Fl=E1vio_Code=E7o_Coelho?=) Date: Fri, 1 Oct 2004 17:06:10 +0000 Subject: [Matplotlib-users] warning: Numeric and amd64 Message-ID: <200410011706.10524.fccoelho@fiocruz.br> Hi, look at this: >>> from RandomArray import * >>> normal(2,2,10) array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]) This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a 32bit P4 and it ran fine. Has anyone else seen this before? For those that didn't understand, the normal function as called above, is supposed to give me ten samples form a normal distribution with mean = 2 and standard deviation = 2 luckily: >>> from numarray.random_array import * >>> normal(2,2,10) array([-0.04525638, 4.31467819, -0.17468357, 5.29377031, 0.84202135, 5.29593539, 4.69651532, 1.61354655, 1.10839236, 1.7743317 ]) If anybody still needed a reason for switching to numarray, there you go! I anybody here subscribes the numeric or numarray mailing lists (i.e. if they even exist) could you please forward this message to them? 
Flavio
From jdhunter at ace.bsd.uchicago.edu Fri Oct 1 10:33:02 2004 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Fri Oct 1 10:33:02 2004 Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric and amd64] In-Reply-To: <1096651226.9400.25.camel@halloween.stsci.edu> (Todd Miller's message of "01 Oct 2004 13:20:26 -0400") References: <1096651226.9400.25.camel@halloween.stsci.edu> Message-ID: >>>>> "Todd" == Todd Miller writes: >>>> from RandomArray import * >>>> normal(2,2,10) Todd> array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]) I get this too on a 64bit Opteron 250. The root of the problem appears to be >>> from RandomArray import standard_normal >>> standard_normal(10) array([ 5.31046164e-315, 1.57997427e-314, 5.16421382e-315, 5.22924144e-315, 1.59247813e-314, 1.58920141e-314, 5.23691141e-315, 5.24305935e-315, 5.20686204e-315, 1.58739568e-314]) But MLab.randn, which uses a different approach, works fine. I have this gnawing feeling I've seen this before, but I can't remember ....
JDH From a.schmolck at gmx.net Fri Oct 1 11:34:01 2004 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Fri Oct 1 11:34:01 2004 Subject: [Numpy-discussion] dot!=matrixmultiply bug when dotblas is present In-Reply-To: <4159BCA5.6090101@colorado.edu> (Fernando Perez's message of "Tue, 28 Sep 2004 13:33:57 -0600") References: <4159BCA5.6090101@colorado.edu> Message-ID: Fernando Perez writes: > Hi all, > > I found something today a bit unpleasant: if you install numeric without > any BLAS support, 'matrixmultiply is dot==True', so they are fully > interchangeable. However, to my surprise, if you build numeric with the blas > optimizations, they are NOT identical. Oops, my bad (I submitted the patch and while pretty much all the real coding was done by Richard Everson this is my oversight). > The reason is a bug in Numeric.py. After defining dot, the code reads: > > #This is obsolete, don't use in new code > matrixmultiply = dot On the other hand, it gently nudges people to no longer use the obsoleted matrixmultiply ;) > In [4]: timing 1,dot,a,b > ------> timing(1,dot,a,b) > Out[4]: 0.55591500000000005 > > In [5]: timing 1,matrixmultiply,a,b > ------> timing(1,matrixmultiply,a,b) > Out[5]: 68.142640999999998 > > In [6]: _/__ > Out[6]: 122.57744619231356 > > Pretty significant difference... Yup, someone should incorporate optional atlas dot support into numarray if it hasn't happened already (won't be me, IIRC it took some convincing to get this into Numeric and I won't be using numarray for anything real in the near future). 
cheers, alex
From stephen.walton at csun.edu Fri Oct 1 11:37:01 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Fri Oct 1 11:37:01 2004 Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric and amd64] In-Reply-To: References: <1096651226.9400.25.camel@halloween.stsci.edu> Message-ID: <1096655567.2678.2.camel@localhost.localdomain> On Fri, 2004-10-01 at 09:43, John Hunter wrote: > The root of the problem appears to be > > >>> from RandomArray import standard_normal > >>> standard_normal(10) > array([ 5.31046164e-315, 1.57997427e-314, > I've have this gnawing feeling I've seen this before, but I can't > remember .... Those values look suspiciously like what one sees if one reads a big-endian Float as little-endian or vice versa. I saw similar numbers recently when using pytables on a big-endian HDF5 (which generated a bug report for numarray if you recall). Is the Opteron big-endian?
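A rough way to probe the byte-order guess above, using only Python's standard struct module. This is an illustrative sketch, not code from Numeric or RandomArray, and the helper names are made up for the purpose. It compares two reinterpretations: a byte-swapped double, and a 32-bit float bit pattern sitting in the low half of a 64-bit double with the upper half zero. For ordinary inputs only the second lands in the 5e-315 to 1.6e-314 range printed by standard_normal(10), so the reported magnitudes are at least consistent with a float/double size mix-up as well as with the byte-order idea being worth checking.

```python
import struct


def float_bits_as_double(x):
    """Put the 32-bit pattern of float x in the low half of a 64-bit double."""
    low = struct.unpack("<I", struct.pack("<f", x))[0]      # float32 bits as an integer
    return struct.unpack("<d", struct.pack("<Q", low))[0]   # upper 32 bits stay zero


def double_to_float_bits(d):
    """Opposite direction: read the low 32 bits of double d as a float32."""
    bits = struct.unpack("<Q", struct.pack("<d", d))[0] & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def byte_swapped(d):
    """Reinterpret a little-endian double as big-endian (the byte-order hypothesis)."""
    return struct.unpack(">d", struct.pack("<d", d))[0]


if __name__ == "__main__":
    for x in (1.0, -0.5, 0.25):
        print(x, "->", float_bits_as_double(x))
    # Two of the values reported in the thread, mapped back to float32 candidates.
    for d in (5.31046164e-315, 1.57997427e-314):
        print(d, "->", double_to_float_bits(d))
    print("byte-swapped 0.3:", byte_swapped(0.3))
```

Mapping the two reported values back through double_to_float_bits gives small, ordinary-looking float32 numbers, which is what one would hope to see from a normal generator. That does not by itself identify the faulty code path.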
From Fernando.Perez at colorado.edu Fri Oct 1 11:51:00 2004 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Fri Oct 1 11:51:00 2004 Subject: [Numpy-discussion] dot!=matrixmultiply bug when dotblas is present In-Reply-To: References: <4159BCA5.6090101@colorado.edu> Message-ID: <415DA6D7.4070407@colorado.edu> Alexander Schmolck wrote: > Fernando Perez writes: > > >>Hi all, >> >>I found something today a bit unpleasant: if you install numeric without >>any BLAS support, 'matrixmultiply is dot==True', so they are fully >>interchangeable. However, to my surprise, if you build numeric with the blas >>optimizations, they are NOT identical. > > > Oops, my bad (I submitted the patch and while pretty much all the real coding > was done by Richard Everson this is my oversight). No prob. It's been fixed in Numeric 23.5, so no more worries. >>Pretty significant difference...
> > > Yup, someone should incorporate optional atlas dot support into numarray if it > hasn't happened already (won't be me, IIRC it took some convincing to get this > into Numeric and I won't be using numarray for anything real in the near > future). I'll leave that question to the numarray guys, I have no idea where it stands in terms of blas/atlas support. I certainly hope it has it or that this optimization can be brought in, as it makes a huge difference for the large array case. Best, f From perry at stsci.edu Fri Oct 1 11:57:02 2004 From: perry at stsci.edu (Perry Greenfield) Date: Fri Oct 1 11:57:02 2004 Subject: [Numpy-discussion] dot!=matrixmultiply bug when dotblas is present In-Reply-To: <415DA6D7.4070407@colorado.edu> References: <4159BCA5.6090101@colorado.edu> <415DA6D7.4070407@colorado.edu> Message-ID: <52083A9C-13DB-11D9-B931-000A95B68E50@stsci.edu> On Oct 1, 2004, at 2:49 PM, Fernando Perez wrote: > Alexander Schmolck schrieb: >>> Pretty significant difference... >> Yup, someone should incorporate optional atlas dot support into >> numarray if it >> hasn't happened already (won't be me, IIRC it took some convincing to >> get this >> into Numeric and I won't be using numarray for anything real in the >> near >> future). > > I'll leave that question to the numarray guys, I have no idea where it > stands in terms of blas/atlas support. I certainly hope it has it or > that this optimization can be brought in, as it makes a huge > difference for the large array case. > > Best, > > f I'm not sure when it will get done, but we are working on the early stages of getting scipy working with numarray. You should see visible signs of that within a month (i.e., at least some parts of scipy working with numarray). It will probably take months to finish though. 
Perry
From pearu at cens.ioc.ee Fri Oct 1 12:44:58 2004 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Fri Oct 1 12:44:58 2004 Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric and amd64] In-Reply-To: <1096651226.9400.25.camel@halloween.stsci.edu> Message-ID: On 1 Oct 2004, Todd Miller wrote: > look at this: > > >>> from RandomArray import * > > >>> normal(2,2,10) > array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]) > > This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a > 32bit P4 and it ran fine. > Has anyone else seen this before? Yes. I just fixed a similar issue in scipy.stats.rand module. Below is the corresponding patch for Numeric Src/ranlibmodule.c that fixes the issue for Opteron. Regards, Pearu
*** ranlibmodule.c	Fri Oct  1 22:29:57 2004
--- ranlibmodule.c.orig	Fri Oct  1 22:12:13 2004
***************
*** 47,49 ****
      case 0:
!       *out_ptr = (double) ((float (*)(void)) fun)();
        break;
--- 47,49 ----
      case 0:
!       *out_ptr = (double) ((double (*)()) fun)();
        break;
***************
*** 81,83 ****
    case 1:
!     if( !PyArg_ParseTuple(args, "lf|i", &int_arg, &float_arg, &n) ) {
        return NULL;
--- 81,83 ----
    case 1:
!     if( !PyArg_ParseTuple(args, "if|i", &int_arg, &float_arg, &n) ) {
        return NULL;
***************
*** 213,215 ****
  
!   if( !PyArg_ParseTuple(args, "lO|i", &num_trials, &priors_object, &n) ) {
      return NULL;
--- 213,215 ----
  
!   if( !PyArg_ParseTuple(args, "iO|i", &num_trials, &priors_object, &n) ) {
      return NULL;
From jmiller at stsci.edu Fri Oct 1 13:35:07 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Oct 1 13:35:07 2004 Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric and amd64] In-Reply-To: References: Message-ID: <1096662489.15037.1.camel@halloween.stsci.edu> Thanks Pearu. For some unknown reason, numarray.random_array already had the fixes, but I applied the patch to Numeric CVS.
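One possible reading of the format-string half of the patch ("lf|i" on the ranlibmodule.c side against "if|i" on the .orig side), sketched with Python's struct module rather than C. It assumes, since the hunks do not show the declarations, that int_arg and num_trials are C longs. On an LP64 platform such as Linux on amd64 a long is 8 bytes while an int is 4, so a 4-byte store into an 8-byte variable would leave the upper half untouched. The slot and helper names below are hypothetical stand-ins, not Numeric code.

```python
import struct


def native_sizes():
    """Sizes in bytes of C int and C long as seen by this Python build."""
    return {"int": struct.calcsize("i"), "long": struct.calcsize("l")}


def partial_store(value, stale=b"\xff" * 8):
    """Write `value` as a 4-byte int into an 8-byte slot, then read all 8 bytes.

    `stale` stands in for whatever bytes were already in the slot.
    Little-endian layout is assumed, as on x86-64.
    """
    slot = bytearray(stale)
    struct.pack_into("<i", slot, 0, value)       # only the low 4 bytes are overwritten
    return struct.unpack("<q", bytes(slot))[0]   # read back as an 8-byte integer


if __name__ == "__main__":
    print(native_sizes())
    print(partial_store(10))                 # differs from 10 when the slot held stale bytes
    print(partial_store(10, b"\x00" * 8))    # gives 10 only because the slot was zeroed
```

On a 32-bit build the two sizes reported by native_sizes() coincide, which would fit the report that the same code ran fine on a P4.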
Regards, Todd On Fri, 2004-10-01 at 15:38, Pearu Peterson wrote: > On 1 Oct 2004, Todd Miller wrote: > > > look at this: > > > > >>> from RandomArray import * > > > > >>> normal(2,2,10) > > array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]) > > > > This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a > > 32bit P4 and it ran fine. > > Has anyone else seen this before? > > Yes. I just fixed a similar issue in scipy.stats.rand module. Below is the > corresponding patch for Numeric Src/ranlibmodule.c that fixes the issue > for Opteron. > > Regards, > Pearu > > *** ranlibmodule.c Fri Oct 1 22:29:57 2004 > --- ranlibmodule.c.orig Fri Oct 1 22:12:13 2004 > *************** > *** 47,49 **** > case 0: > ! *out_ptr = (double) ((float (*)(void)) fun)(); > break; > --- 47,49 ---- > case 0: > ! *out_ptr = (double) ((double (*)()) fun)(); > break; > *************** > *** 81,83 **** > case 1: > ! if( !PyArg_ParseTuple(args, "lf|i", &int_arg, &float_arg, &n) ) { > return NULL; > --- 81,83 ---- > case 1: > ! if( !PyArg_ParseTuple(args, "if|i", &int_arg, &float_arg, &n) ) { > return NULL; > *************** > *** 213,215 **** > > ! if( !PyArg_ParseTuple(args, "lO|i", &num_trials, &priors_object, &n) ) { > return NULL; > --- 213,215 ---- > > ! if( !PyArg_ParseTuple(args, "iO|i", &num_trials, &priors_object, &n) ) { > return NULL;
From faheem at email.unc.edu Fri Oct 1 22:28:41 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Fri Oct 1 22:28:41 2004 Subject: [Numpy-discussion] numarray.random_array number generation in C code Message-ID: Dear People, I want to write some C++ code to link with Python, using the Boost.Python interface. I need to generate random numbers in the C++ code, and I was wondering about the best way of doing this. Note that it is important that the random number generation interoperate seamlessly with Python, in the sense that the behavior of the calls to the RNG is the same whether calls are made at the C level or the Python level. I hope the reasons why this is important are obvious. I was thinking that the method should go like this. 1) When the C/C++ code is called, it reads the seed from the Python random state. 2) It does its stuff. 3) It writes the seed back to the Python level when it exits. After doing a little investigation of the numarray.random_array python library and associated extension modules, it seems possible that the answer is simpler than I had supposed. However, I would appreciate it if someone would tell me if my understanding is incorrect in some places. Summary: It seems that I can just call all the C entry point routines defined in ranlib.h, without worrying about getting or setting seeds. Rationale: The structure of this random number facility has three parts, all files in Packages/RandomArray2/Src. 1) low-level C routines: Packages/RandomArray2/Src/com.c and Packages/RandomArray2/Src/ranlib.c. com.c: basic RNG stuff; getting and setting seeds etc. ranlib.c: Random number generator algorithms for different distributions etc. 2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c.
This interfaces the stuff in com.c and ranlib.c. 3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py. This wraps the C interface. In most cases it does not do much else besides some basic argument error checking. From my perspective, the important thing is that the random number seed is only defined at C level as a static object, all the RNG stuff happens at C level, and the Python code just calls the C code as necessary. (I'm sketchy about the details of what is defined as the seed etc.) This is in contrast with the R RNG facility (the only other RNG facility I am familiar with), which uses the calls GetRNGstate() and PutRNGstate() to read and write the seed, which is defined at R level. Therefore, the upshot is that the C routines in ranlib.h read and write the same seed as the python level functions do, so no special action is necessary with regard to the seed. Is this correct? In any case, it would be nice if something like the above was documented, so lost souls like myself don't have to go trawling through the source code to figure out what is going on. Of course it is nice that the source code is available, otherwise even that would be impossible. R documents this stuff in the "Writing R Extensions" manual, online at http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray manual could have a small section about this too. Regards, Faheem.
From fccoelho at gmail.com Mon Oct 4 07:59:12 2004 From: fccoelho at gmail.com (Flavio Coelho) Date: Mon Oct 4 07:59:12 2004 Subject: [Numpy-discussion] Bug Compiling Numeric on amd64 Message-ID: Hi, look at this: >>> from RandomArray import * >>> normal(2,2,10) array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]) This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a 32bit P4 and it ran fine. Has anyone else seen this before?
luckily: >>> from numarray.random_array import * >>> normal(2,2,10) array([-0.04525638, 4.31467819, -0.17468357, 5.29377031, 0.84202135, 5.29593539, 4.69651532, 1.61354655, 1.10839236, 1.7743317 ]) Both modules were compiled on my gentoo box with: gcc version 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6) any comments? Flavio -- I use Linux daily to UP my productivity -- Microsoft, UP yours! From jmiller at stsci.edu Mon Oct 4 09:21:32 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Oct 4 09:21:32 2004 Subject: [Numpy-discussion] Bug Compiling Numeric on amd64 In-Reply-To: References: Message-ID: <1096906220.7641.55.camel@localhost.localdomain> On Mon, 2004-10-04 at 10:48, Flavio Coelho wrote: > Hi, > > look at this: > > >>> from RandomArray import * > > >>> normal(2,2,10) > array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]) > > This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a 32bit > P4 and it ran fine. > Has anyone else seen this before? > This was discussed here briefly last week after I forwarded your post from matplotlib-users. Pearu Peterson posted a patch which he had already performed for SciPy and I applied it to Numeric on Source Forge. Thanks for raising the issue. Regards, Todd > > luckily: > > >>> from numarray.random_array import * > > >>> normal(2,2,10) > array([-0.04525638, 4.31467819, -0.17468357, 5.29377031, 0.84202135, > 5.29593539, 4.69651532, 1.61354655, 1.10839236, 1.7743317 ]) > > Both modules were compiled on my gentoo box with: > > gcc version 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6) > > any comments? > > Flavio -- From Fernando.Perez at colorado.edu Mon Oct 4 10:59:49 2004 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Mon Oct 4 10:59:49 2004 Subject: [Numpy-discussion] Small bug in MA with arrays of rank > 1 Message-ID: <41618DFD.7030106@colorado.edu> Hi all, a while back I noticed a small problem with MA for rank 2 (and larger) arrays. 
Here's a simple example: In [1]: a=RA.random((3,3)) In [2]: a Out[2]: array([[ 0.002542, 0.70301 , 0.705466], [ 0.467305, 0.381492, 0.655857], [ 0.103372, 0.776988, 0.466528]]) In [3]: import MA In [4]: a Out[4]: [[ 0.002542, 0.70301 , 0.705466,] [ 0.467305, 0.381492, 0.655857,] [ 0.103372, 0.776988, 0.466528,]] The bug is that the commas at the end of each line are coming _before_ the closing bracket, instead of after. This seemingly trivial problem turns out to be pretty serious for me, because I use this string representation to export python arrays into Mathematica files, by simply replacing [] with {} (and playing some other tricks). Unfortunately, this bug means I can't use MA, which is otherwise great because of the way it gracefully handles the case where you accidentally say A when A is some monster array. With MA, instead of your CPU getting killed for 10 minutes, you get a nice summary of A's dimensions and typecode. Anyway, it would be great if one of the gurus had a chance to fix this one. Best, f
From graik at web.de Tue Oct 5 10:44:13 2004 From: graik at web.de (Raik Grünberg) Date: Tue Oct 5 10:44:13 2004 Subject: [Numpy-discussion] Numeric to numarray experiences Message-ID: <200410051941.29807.graik@web.de> Hi there, I've just translated a package for molecular modelling, which makes extensive use of Numeric, from Numeric to numarray. The outcome is somewhat negative - for now we are basically going to postpone the transition - the reasons might be interesting for the list and the numarray developers out there (who are doing a brave job!). Speed: A typical task in our package is the least-squares fitting of a large array of coordinate frames (N1 x N2 x 3) onto a set of reference or average coordinates (using a sub-set of coordinates for the matching). The example I looked at (500 x 876 x 3 items) took 1.3 s with Numeric and 4.7 s with numarray.
The main culprits for the slow-down were:
* compress() - factor 10
* average() - factor 7 (average() is missing from numarray and I hence had to write a little function myself)
* LinearAlgebra.singular_value_decomposition() - factor 10
but a lot of extra time is also spent in ufunc.py and various numarraycore.py routines. Memory efficiency: I hoped numarray would solve some of the Out-of-memory problems that I get with Numeric but it turns out that it is rather less memory efficient for my kind of applications. Slicing an array that takes up 800MB on disc just about runs through with Numeric (and heavy swapping) but gives an Out-of-memory with numarray. Suggestions: OK, it's easy to make clever comments without contributing any real work...
- compress(), take(), etc, really need some optimization
- a C-coded average() routine would be helpful
- faster LinearAlgebra routines are necessary
Our sysadmin noted that unlike Numeric, numarray is not using any external math libraries (like LAPACK) that have been speed-optimized for decades and are available in CPU-optimized variants (e.g. ATLAS). It's probably difficult to match this efficiency with any new code ... Greetings Raik PS: I didn't find any useful HowTo for the translation from Numeric to numarray. The practical issues were the different nonzero() return value, the more restrictive boolean comparison, that take doesn't support 'O' arrays any longer, and the missing average(). -- ----------------------------------------------------- Raik Grünberg | Bioinformatique Structurale | Institut Pasteur | Paris, France -----------------------------------------------------
From southey at uiuc.edu Tue Oct 5 11:33:27 2004 From: southey at uiuc.edu (Bruce Southey) Date: Tue Oct 5 11:33:27 2004 Subject: [Numpy-discussion] numarray.random_array number generation in C code Message-ID: <6d0c2265.8aa2891d.81a0300@expms6.cites.uiuc.edu> Hi, It is rather hard to suggest anything without more detail on what you want to actually do.
As you describe it, why do you need the 'seed' returned? It would only make sense if you were going in and out of Python multiple times - a somewhat undesirable situation due to the overhead costs. I see at least three options: 1) Do everything in Python/numarray. 2) Do parts in Python and the rest in C/C++. For example, pass a matrix of random numbers to your code from Python. The 'seed' never needs to leave Python. 3) Do it all in C/C++ - pass the 'seed' into your code that includes the random number generator(s) - there is C/C++ code around for this. Do your stuff and then return the 'seed' back with whatever else is required. You can email me privately if you want. Bruce ---- Original message ---- >Date: Sat, 2 Oct 2004 01:23:21 -0400 (EDT) >From: Faheem Mitha >Subject: [Numpy-discussion] numarray.random_array number generation in C code >To: numpy-discussion > > >Dear People, > >I want to write some C++ code to link with Python, using the >Boost.Python interface. I need to generate random numbers in the C++ >code, and I was wondering as to the best way of doing this. > >Note that it is important that the random number generation interoperate >seamlessly with Python, in the sense that the behavior of the calls to >the RNG is the same whether calls are made at the C level or the Python >level. I hope the reasons why this is important are obvious. > >I was thinking that the method should go like this. > >1) When C/C++ code called, reads seed from python random state. > >2) Does its stuff. > >3) Writes seed back to python level when it exits. > >After doing a little investigation of the numarray.random_array python >library and associated extension modules, it seems possible that the >answer is simpler than I had supposed. However, I would appreciate it if >someone would tell me if my understanding is incorrect in some places.
> >Summary: It seems that I can just call all the C entry point routines >defined in ranlib.h, without worrying about getting or setting seeds. > >Rationale: > >The structure of this random number facility has three parts, all files in >Packages/RandomArray2/Src. > >1) low-level C routines: Packages/RandomArray2/Src/com.c and >Packages/RandomArray2/Src/ranlib.c. > >com.c: basic RNG stuff; getting and setting seeds etc. >ranlib.c: Random number generator algorithms for different distributions >etc. > >2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c. > >This interfaces the stuff in com.c and ranlib.c. > >3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py. > >This wraps the C interface. In most cases it does not do much else besides >some basic argument error checking. > >From my perspective, the important thing is that the random number seed is >only defined at C level as a static object, all the RNG stuff happens at C >level, and the Python code just calls the C code as necessary. (I'm >sketchy about the details of what is defined as the seed etc.) > >This is in contrast with the R RNG facility (the only other RNG facility I >am familiar with), which uses macros SetRNGstate() and GetRNGstate() to >read and write the seed, which is defined at R level. > >Therefore, the upshot is that the C routines in ranlib.h read and write >the same seed as the python level functions do, so no special action is >necessary with regard to the seed. > >Is this correct? > >In any case, it would be nice if something like the above was documented, >so lost souls like myself don't have to go trawling through the source >code to figure out what is going on. Of course it is nice that the source >code is available, otherwise even that would be impossible. > >R documents this stuff in the "Writing R Extensions" manual, online at >http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray >manual could have a small section about this too. 
> > Regards, Faheem.
From stephen.walton at csun.edu Tue Oct 5 12:20:01 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Tue Oct 5 12:20:01 2004 Subject: [Numpy-discussion] Numeric to numarray experiences In-Reply-To: <200410051941.29807.graik@web.de> References: <200410051941.29807.graik@web.de> Message-ID: <1097003873.13715.17.camel@freyer.sfo.csun.edu> On Tue, 2004-10-05 at 10:41, Raik Grünberg wrote: > Our sysadmin noted that unlike Numeric, numarray is not using any external > math libraries (like LAPACK) that have been speed-optimized for decades and > are available in CPU-optimized variants (e.g. ATLAS). It's probably difficult > to match this efficiency with any new code ... This is a key point. Have a look at addons.py in numarray, some previous comments on this list, and build numarray with the line env USE_LAPACK=1 python setup.py build after editing addons.py appropriately. You should see a major speed improvement. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge
From dd55 at cornell.edu Tue Oct 5 13:02:01 2004 From: dd55 at cornell.edu (Darren Dale) Date: Tue Oct 5 13:02:01 2004 Subject: [Numpy-discussion] Numeric to numarray experiences In-Reply-To: <1097003873.13715.17.camel@freyer.sfo.csun.edu> References: <200410051941.29807.graik@web.de> <1097003873.13715.17.camel@freyer.sfo.csun.edu> Message-ID: <200410051600.38254.dd55@cornell.edu> On Tuesday 05 October 2004 03:17 pm, Stephen Walton wrote: > On Tue, 2004-10-05 at 10:41, Raik Grünberg wrote: > > Our sysadmin noted that unlike Numeric, numarray is not using any > > external math libraries (like LAPACK) that have been speed-optimized for > > decades and are available in CPU-optimized variants (e.g. ATLAS). It's > > probably difficult to match this efficiency with any new code ... > > This is a key point. Have a look at addons.py in numarray, some > previous comments on this list, and build numarray with the line > > env USE_LAPACK=1 python setup.py build > > after editing addons.py appropriately. You should see a major speed > improvement. I would kindly suggest updating the numarray documentation. In the section on installation, it is easy to overlook the option to compile against existing libraries. That is explained in section 16, which appears to be out of date. The code listed in Packages/LinearAlgebra2/setup.py has been moved to addons.py, correct?
-- Darren
From jmiller at stsci.edu Tue Oct 5 13:37:42 2004 From: jmiller at stsci.edu (Todd Miller) Date: Tue Oct 5 13:37:42 2004 Subject: [Numpy-discussion] Numeric to numarray experiences In-Reply-To: <200410051600.38254.dd55@cornell.edu> References: <200410051941.29807.graik@web.de> <1097003873.13715.17.camel@freyer.sfo.csun.edu> <200410051600.38254.dd55@cornell.edu> Message-ID: <1097008567.27149.140.camel@halloween.stsci.edu> On Tue, 2004-10-05 at 16:00, Darren Dale wrote: > On Tuesday 05 October 2004 03:17 pm, Stephen Walton wrote: > > On Tue, 2004-10-05 at 10:41, Raik Grünberg wrote: > > > Our sysadmin noted that unlike Numeric, numarray is not using any > > > external math libraries (like LAPACK) that have been speed-optimized for > > > decades and are available in CPU-optimized variants (e.g. ATLAS). It's > > > probably difficult to match this efficiency with any new code ... > > > > This is a key point. Have a look at addons.py in numarray, some > > previous comments on this list, and build numarray with the line > > > > env USE_LAPACK=1 python setup.py build > > > > after editing addons.py appropriately. You should see a major speed > > improvement. > > I would kindly suggest updating the numarray documentation. Thanks, will do. > In the section on > installation, it is easy to overlook the option to compile against existing > libraries. That is explained in section 16, which appears to be out of date. > The code listed in Packages/LinearAlgebra2/setup.py has been moved to > addons.py, correct? That's correct.
Regards, Todd From faheem at email.unc.edu Tue Oct 5 15:44:36 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Tue Oct 5 15:44:36 2004 Subject: [Numpy-discussion] numarray.random_array number generation in C code In-Reply-To: <6d0c2265.8aa2891d.81a0300@expms6.cites.uiuc.edu> References: <6d0c2265.8aa2891d.81a0300@expms6.cites.uiuc.edu> Message-ID: On Tue, 5 Oct 2004, Bruce Southey wrote: > Hi, > It is rather hard to suggest anything without more detail on what you want to > actually do. I could give you more details if you were interested. > As you describe it, why do you need the 'seed' returned? It would only > make sense if you were going in and out of Python multiple times - a > somewhat undesirable situation due to the overhead costs. Not really. One might (and I frequently do) want to run the same function (which in this case might be all in C++ code), interactively with different parameters. The kind of thing that I'm doing is akin to exploratory data analysis, and the specific code in question is a stochastic search algorithm. Doing all this in C++ would not be very interactive. Also, one often wants to postprocess data output using Python scripts. This involves multiple calls to C++ code, and would be impossible to do using C++, since one has to call other Python libraries. > I see at least three options: > 1) Do everything in Python/numarray. That's my current situation. > 2) Do parts in Python and the other in C/C++. > For example, pass a matrix of random numbers to your code from Python. The > 'seed' never needs to leave Python. This doesn't work very well unless you know in advance how many random numbers are needed (not the case, for example, for stochastic search algorithms), and in any case is a rather clumsy way to do things. No offense intended. > 3) Do it all in C/C++ - pass the 'seed' into your code that includes the > random number generator(s) - there is C/C++ code around for this. 
Do you stuff > and then return the 'seed' back with whatever else is required. Yes, but part of the point of mixed programming is that you have an interpreted front end which can easily hook into other routines. Also, in this case, you would not be passing the seed in, since there is nothing to pass it in from. One would simply call system time or something similar to obtain the seed. > You can email me privately if you want. I'll keep sending this to the list unless someone objects, since I think this is of some general interest. Really, my main question was to whether my understanding of how to use the Numarray random number facilities in C was correct or not. Faheem. From stephen.walton at csun.edu Tue Oct 5 16:15:31 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Tue Oct 5 16:15:31 2004 Subject: [Numpy-discussion] Numeric to numarray experiences In-Reply-To: References: <200410051941.29807.graik@web.de> <1097003873.13715.17.camel@freyer.sfo.csun.edu> Message-ID: <1097018077.22092.15.camel@freyer.sfo.csun.edu> On Tue, 2004-10-05 at 16:00, Flavio Coelho wrote: > I wrote > > env USE_LAPACK=1 python setup.py build > > > > after editing addons.py appropriately. You should see a major speed > > improvement. > > > > > If that is the case, why is it not the default?, at least when LAPACK > is installed? Well, I won't pretend to speak for the developers on this one. But I strongly suspect it is just too hard to find all possible LAPACK distributions; the default numarray setup should be self contained even if somewhat slower. The current version of Numeric also defaults to its own built-in BLAS and requires editing setup.py to use a different one. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From perry at stsci.edu Tue Oct 5 17:30:58 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Oct 5 17:30:58 2004 Subject: [Numpy-discussion] Numeric to numarray experiences In-Reply-To: <1097018077.22092.15.camel@freyer.sfo.csun.edu> Message-ID: Steve Walton wrote: > On Tue, 2004-10-05 at 16:00, Flavio Coelho wrote: > > I wrote > > > env USE_LAPACK=1 python setup.py build > > > > > > after editing addons.py appropriately. You should see a major speed > > > improvement. > > > > > > > > > If that is the case, why is it not the default?, at least when LAPACK > > is installed? > > Well, I won't pretend to speak for the developers on this one. But I > strongly suspect it is just too hard to find all possible LAPACK > distributions; the default numarray setup should be self contained even > if somewhat slower. The current version of Numeric also defaults to its > own built-in BLAS and requires editing setup.py to use a different one. > Well, it's been a while, and Todd handled that aspect of porting those from Numeric, but if I recall correctly, the situation was the same there, and I think Steve is correct. It was to provide the basic functionality as part of the distribution without requiring other installations. If you needed better performance, you jump through a couple more hoops. But requiring it to use LAPACK makes life more difficult for those who were looking for a self contained and easy to install solution. Perry From perry at stsci.edu Tue Oct 5 17:40:51 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Oct 5 17:40:51 2004 Subject: [Numpy-discussion] Numeric to numarray experiences In-Reply-To: <200410051941.29807.graik@web.de> Message-ID: I hadn't seen this until now. It's hard for us to understand exactly the reasons for the slower performance with such large arrays. 
Could you send us the code and an indication of what inputs and parameters were used so we could try to figure out why some of these problems exist (we can check the specific functions you mention, but I want to make sure you aren't iterating over array slices or such). It's not obvious to me why you are having out of memory errors and this may help. Perry Greenfield > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net > [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Raik > Grünberg > Sent: Tuesday, October 05, 2004 1:41 PM > To: numpy-discussion at lists.sourceforge.net > Subject: [Numpy-discussion] Numeric to numarray experiences > > > Hi there, > > I've just translated a package for molecular modelling, which > makes extensive > use of Numeric, from Numeric to numarray. The outcome is somewhat > negative - > for now we are basically going to postpone the transition - the > reasons might > be interesting for the list and the numarray developers out > there (who are > doing a brave job!). > > Speed: > A typical task in our package is the least-squares fitting of a > large array of > coordinate frames ( N1 x N2 x 3) onto a set of reference or average > coordinates (using a sub-set of coordinates for the matching). > The example I > looked at (500 x 876 x 3 items) took 1.3 s with Numeric and 4.7 s with > numarray. The main culprits for the slow-down were: > * compress() - factor 10 > * average() - factor 7 (average() is missing from Numeric and I > hence had to > write a little function myself) > * LinearAlgebra.singular_value_decomposition() - factor 10 > but a lot of extra time is also spent in ufunc.py and various > numarraycore.py > routines. > > Memory efficiency: > I hoped numarray would solve some of the Out-of-memory problems > that I get > with Numeric but it turns out that it is rather less memory > efficient for my > kind of applications.
Slicing an array that takes up 800MB on > disc just about > runs through with Numeric (and heavy swapping) but gives an Out-of-memory > with numarray. > > Suggestions: > OK, it's easy to make clever comments without contributing any > real work... > - compress(), take(), etc, really need some optimization > - a C-coded average() routine would be helpful > - faster LinearAlgebra routines are necessary > > Our sysadmin noted that unlike Numeric, numarray is not using any > external > math libraries (like LAPACK) that have been speed-optimized for > decades and > are available in CPU-optimized variants (e.g. ATLAS). It's > probably difficult > to match this efficiency with any new code ... > > Greetings > Raik > > PS: > I didn't find any useful HowTo for the translation from Numeric > to numarray. > The practical issues were the different nonzero() return value, the more > restrictive boolean comparison, that take doesn't support 'O' arrays any > longer, and the missing average(). > > -- > ----------------------------------------------------- > Raik Grünberg | Bioinformatique Structurale > | Institut Pasteur > | Paris, France > -----------------------------------------------------
From perry at stsci.edu Tue Oct 5 18:14:00 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Oct 5 18:14:00 2004 Subject: [Numpy-discussion] numarray.random_array number generation in C code In-Reply-To: Message-ID: Faheem Mitha wrote: > Dear People, > > I want to write some C++ code to link with Python, using the > Boost.Python interface. I need to generate random numbers in the C++ > code, and I was wondering as to the best way of doing this. > > Note that it is important that the random number generation interoperate > seamlessly with Python, in the sense that the behavior of the calls to > the RNG is the same whether calls are made at the C level or the Python > level. I hope the reasons why this is important are obvious. > > I was thinking that the method should go like this. > > 1) When C/C++ code called, reads seed from python random state. > > 2) Does its stuff. > > 3) Writes seed back to python level when it exits. > > After doing a little investigation of the numarray.random_array python > library and associated extension modules, it seems possible that the > answer is simpler than I had supposed. However, I would appreciate it if > someone would tell me if my understanding is incorrect in some places. > > Summary: It seems that I can just call all the C entry point routines > defined in ranlib.h, without worrying about getting or setting seeds. > > Rationale: > > The structure of this random number facility has three parts, all > files in > Packages/RandomArray2/Src. > > 1) low-level C routines: Packages/RandomArray2/Src/com.c and > Packages/RandomArray2/Src/ranlib.c. > > com.c: basic RNG stuff; getting and setting seeds etc.
> ranlib.c: Random number generator algorithms for different distributions > etc. > > 2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c. > > This interfaces the stuff in com.c and ranlib.c. > > 3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py. > > This wraps the C interface. In most cases it does not do much > else besides > some basic argument error checking. > > From my perspective, the important thing is that the random > number seed is > only defined at C level as a static object, all the RNG stuff > happens at C > level, and the Python code just calls the C code as necessary. (I'm > sketchy about the details of what is defined as the seed etc.) > > This is in contrast with the R RNG facility (the only other RNG > facility I > am familiar with), which uses macros SetRNGstate() and GetRNGstate() to > read and write the seed, which is defined at R level. > > Therefore, the upshot is that the C routines in ranlib.h read and write > the same seed as the python level functions do, so no special action is > necessary with regard to the seed. > > Is this correct? > > In any case, it would be nice if something like the above was documented, > so lost souls like myself don't have to go trawling through the source > code to figure out what is going on. Of course it is nice that the source > code is available, otherwise even that would be impossible. > > R documents this stuff in the "Writing R Extensions" manual, online at > http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray > manual could have a small section about this too. > > Regards, Faheem. > I'm not sure I understand what you want to do. Do you want to link directly to the extension code from your C++ code? If so I'm wondering why. It would make the most sense if the C++ code needed obtain small numbers of random numbers in some iterative loop, and you wish to use the same random number library that that numarray is using. 
Otherwise, I would normally obtain the random number array in python, then call the C++ extension. Perhaps I didn't read carefully enough. Normally linking to an extension module involves some hacks that I'm not sure were done for the randomarray module (the gory details are in the python docs for extension modules), Todd can check on that, I'm not sure I will have time (a superficial check seems to indicate that it doesn't support direct linking, though one could link to the underlying library I suppose). As an aside, it is likely that a better module can be done as some have suggested, we just took what Numeric had at the time. Doing that is not a high priority with us at the moment (anyone else want to tackle that?). Right now integration with scipy is our biggest priority so things like this will have to take a back seat for a while. Furthermore, we did what we needed to to port these modules from Numeric, but that didn't necessarily make us experts in how they worked. I wish we were, but we've generally been directing our energy elsewhere. I'd presume that the sensible way for the module to work is to initialize its seed from a time-based seed in the absence of any other seed initialization, and to keep the seed state in the extension module, but I could be wrong. Perry From faheem at email.unc.edu Tue Oct 5 18:41:02 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Tue Oct 5 18:41:02 2004 Subject: [Numpy-discussion] numarray.random_array number generation in C code In-Reply-To: References: Message-ID: On Tue, 5 Oct 2004, Perry Greenfield wrote: > I'm not sure I understand what you want to do. Do you want to link > directly to the extension code from your C++ code? Yes. > If so I'm wondering why. It would make the most sense if the C++ code > needed obtain small numbers of random numbers in some iterative loop, > and you wish to use the same random number library that that numarray is > using. 
I need to obtain an arbitrary (not known in advance) number of random numbers in the C++ code. I'm thinking of using the same random number library mostly because I assumed that using the same seed across the python/C interface would be supported. This is how it works in R (the only other place I have used this). Also, I had been using the same routines in the Python code I'm trying to convert to C++, so it would be a relatively smooth transfer. If I was to use a pure C/C++ library, I'd have to worry about copying the seed back and forth between Python and C. Is this what I'll have to do then? > Otherwise, I would normally obtain the random number array in python, > then call the C++ extension. Yes, this is what everyone suggests. But in my case, the number of random variates required is not known in advance. I get the feeling this situation does not arise very often for most people, but I work with stochastic processes which terminate according to some stopping criterion, and that is the standard situation in this case. Also generating these numbers in Python would give rise to serious performance issues. > Perhaps I didn't read carefully enough. Normally linking to an extension > module involves some hacks that I'm not sure were done for the > randomarray module (the gory details are in the python docs for > extension modules), Todd can check on that, I'm not sure I will have > time (a superficial check seems to indicate that it doesn't support > direct linking, though one could link to the underlying library I > suppose). Hmm. Well, this is unwelcome news. You mean I cannot link to ranlib.so? I assumed that including the ranlib.h header and linking my C++ module against ranlib.so would be enough. I suppose that was too optimistic. > As an aside, it is likely that a better module can be done as some > have suggested, we just took what Numeric had at the time. Doing that > is not a high priority with us at the moment (anyone else want to > tackle that?). 
Right now integration with scipy is our biggest > priority so things like this will have to take a back seat for > a while. > Furthermore, we did what we needed to to port these modules from > Numeric, but that didn't necessarily make us experts in how they > worked. I wish we were, but we've generally been directing our > energy elsewhere. I'd presume that the sensible way for the module > to work is to initialize its seed from a time-based seed in the > absence of any other seed initialization, and to keep the seed > state in the extension module, but I could be wrong. Yes. That is how R does it, anyway. Specifically, you declare the seed static, and then it persists across the Python/C interface. That is what I thought you had in the numarray code. Would it be hard to make it work like this? I'm no expert either. Faheem. From southey at uiuc.edu Wed Oct 6 07:01:38 2004 From: southey at uiuc.edu (Bruce Southey) Date: Wed Oct 6 07:01:38 2004 Subject: [Numpy-discussion] numarray.random_array number generation in C code Message-ID: Hi, My understanding is that you can use the Ranlib, R math, and GNU Scientific libraries in the manner you suggest or directly include the random number generator in your code. Usually you define the seed that should provide the same psuedo-random number stream every time these are used. If you don't use a seed then it is usually impossible to get the same stream of psuedo-random numbers. So I do not understand what you need to keep the same random number state. Not to mention that the common generators do repeat, some sooner than others. In your response to Perry, you indicate that you do not need an array of random numbers but rather the stream of random numbers. This is very different and I think you need to refine your algorithm to identify what parts need to be C/C++ and what need to be in Python/numarray. 
Since you currently have Python code, I would profile it to see what parts actually need extending - some times Python is rather surprising on how quick some things can be done (like using dictionaries). Providing those parts may be more fruitful to you than my vague responses. Regards Bruce ---- Original message ---- >Date: Tue, 5 Oct 2004 18:43:48 -0400 (EDT) >From: Faheem Mitha >Subject: Re: [Numpy-discussion] numarray.random_array number generation in C code >To: Bruce Southey >Cc: numpy-discussion > > > >On Tue, 5 Oct 2004, Bruce Southey wrote: > >> Hi, >> It is rather hard to suggest anything without more detail on what you want to >> actually do. > >I could give you more details if you were interested. > >> As you describe it, why do you need the 'seed' returned? It would only >> make sense if you were going in and out of Python multiple times - a >> somewhat undesirable situation due to the overhead costs. > >Not really. One might (and I frequently do) want to run the same function >(which in this case might be all in C++ code), interactively with >different parameters. The kind of thing that I'm doing is akin to >exploratory data analysis, and the specific code in question is a >stochastic search algorithm. Doing all this in C++ would not be very >interactive. Also, one often wants to postprocess data output using Python >scripts. This involves multiple calls to C++ code, and would be impossible >to do using C++, since one has to call other Python libraries. > > > I see at least three options: > >> 1) Do everything in Python/numarray. > >That's my current situation. > >> 2) Do parts in Python and the other in C/C++. >> For example, pass a matrix of random numbers to your code from Python. The >> 'seed' never needs to leave Python. > >This doesn't work very well unless you know in advance how many random >numbers are needed (not the case, for example, for stochastic search >algorithms), and in any case is a rather clumsy way to do things. 
No >offense intended. > >> 3) Do it all in C/C++ - pass the 'seed' into your code that includes the >> random number generator(s) - there is C/C++ code around for this. Do you stuff >> and then return the 'seed' back with whatever else is required. > >Yes, but part of the point of mixed programming is that you have an >interpreted front end which can easily hook into other routines. Also, in >this case, you would not be passing the seed in, since there is nothing to >pass it in from. One would simply call system time or something similar to >obtain the seed. > >> You can email me privately if you want. > >I'll keep sending this to the list unless someone objects, since I think >this is of some general interest. > >Really, my main question was to whether my understanding of how to use the >Numarray random number facilities in C was correct or not. > > Faheem. From jmiller at stsci.edu Wed Oct 6 23:47:31 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Oct 6 23:47:31 2004 Subject: [Numpy-discussion] numarray.random_array number generation in C code In-Reply-To: References: Message-ID: <1097073394.31512.76.camel@halloween.stsci.edu> On Tue, 2004-10-05 at 21:10, Perry Greenfield wrote: > Faheem Mitha wrote: > > > Dear People, > > > > I want to write some C++ code to link with Python, using the > > Boost.Python interface. I need to generate random numbers in the C++ > > code, and I was wondering as to the best way of doing this. > > > > Note that it is important that the random number generation interoperate > > seamlessly with Python, in the sense that the behavior of the calls to > > the RNG is the same whether calls are made at the C level or the Python > > level. I hope the reasons why this is important are obvious. > > > > I was thinking that the method should go like this. > > > > 1) When C/C++ code called, reads seed from python random state. > > > > 2) Does its stuff. > > > > 3) Writes seed back to python level when it exits. 
> > > > After doing a little investigation of the numarray.random_array python > > library and associated extension modules, it seems possible that the > > answer is simpler than I had supposed. However, I would appreciate it if > > someone would tell me if my understanding is incorrect in some places. > > > > Summary: It seems that I can just call all the C entry point routines > > defined in ranlib.h, without worrying about getting or setting seeds. > > > > Rationale: > > > > The structure of this random number facility has three parts, all > > files in > > Packages/RandomArray2/Src. > > > > 1) low-level C routines: Packages/RandomArray2/Src/com.c and > > Packages/RandomArray2/Src/ranlib.c. > > > > com.c: basic RNG stuff; getting and setting seeds etc. > > ranlib.c: Random number generator algorithms for different distributions > > etc. > > > > 2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c. > > > > This interfaces the stuff in com.c and ranlib.c. > > > > 3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py. > > > > This wraps the C interface. In most cases it does not do much > > else besides > > some basic argument error checking. > > > > From my perspective, the important thing is that the random > > number seed is > > only defined at C level as a static object, all the RNG stuff > > happens at C > > level, and the Python code just calls the C code as necessary. (I'm > > sketchy about the details of what is defined as the seed etc.) > > > > This is in contrast with the R RNG facility (the only other RNG > > facility I > > am familiar with), which uses macros SetRNGstate() and GetRNGstate() to > > read and write the seed, which is defined at R level. > > > > Therefore, the upshot is that the C routines in ranlib.h read and write > > the same seed as the python level functions do, so no special action is > > necessary with regard to the seed. > > > > Is this correct? 
> > > > In any case, it would be nice if something like the above was documented, > > so lost souls like myself don't have to go trawling through the source > > code to figure out what is going on. Of course it is nice that the source > > code is available, otherwise even that would be impossible. > > > > R documents this stuff in the "Writing R Extensions" manual, online at > > http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray > > manual could have a small section about this too. > > > > Regards, Faheem. > > > I'm not sure I understand what you want to do. Do you want to link > directly to the extension code from your C++ code? If so I'm wondering > why. It would make the most sense if the C++ code needed obtain > small numbers of random numbers in some iterative loop, and you wish > to use the same random number library that that numarray is using. > Otherwise, I would normally obtain the random number array > in python, then call the C++ extension. Perhaps I didn't read carefully > enough. Normally linking to an extension module involves some hacks > that I'm not sure were done for the randomarray module (the gory > details are in the python docs for extension modules), Todd can > check on that, I checked and there's no C level export of the ranlib interface, at least not in the "hacked" sense of an extension module C-API where the linkage is made indirect via an API pointer and bizarre macros. > I'm not sure I will have time (a superficial check > seems to indicate that it doesn't support direct linking, though > one could link to the underlying library I suppose). Ordinary C linkage to numarray.random_array.ranlib2 may be supported since as an extension it is also a shared library, but I've never tried it myself and I wonder if it would actually work. If anyone has tried something like that I'd be interested in hearing how it turned out. Without a really compelling reason, I'd avoid it myself. 
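As a side note on the three-step scheme quoted above (read the generator state, do the work, write the state back), here is a minimal sketch of that round trip. It uses the `random` module from the main Python library rather than numarray's ranlib, so it only illustrates the pattern and says nothing about how ranlib stores its seed. The names `run_with_shared_state` and `draw_until_small` are hypothetical, and `draw_until_small` merely stands in for compiled code that consumes a number of variates not known in advance.

```python
import random


def draw_until_small(state, threshold=0.1):
    """Stand-in for extension code: draw until a value falls below threshold.

    The number of draws is not known in advance, as in a stochastic search.
    Returns the draws together with the generator state after the last draw.
    """
    local = random.Random()
    local.setstate(state)           # continue from the caller's state
    draws = []
    while True:
        x = local.random()
        draws.append(x)
        if x < threshold:
            break
    return draws, local.getstate()


def run_with_shared_state(rng, work):
    """Hypothetical helper implementing the read / work / write-back steps."""
    state = rng.getstate()          # 1) read the state at the Python level
    result, new_state = work(state) # 2) do the work
    rng.setstate(new_state)         # 3) write the state back on exit
    return result


if __name__ == "__main__":
    rng = random.Random(2004)
    draws = run_with_shared_state(rng, draw_until_small)
    print(len(draws), "variates consumed; next Python-level draw:", rng.random())
```

If the state is written back faithfully, the caller's generator continues exactly where the worker stopped, which is the interoperability property asked for in this thread.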
Regards, Todd From dd55 at cornell.edu Sun Oct 10 12:51:58 2004 From: dd55 at cornell.edu (Darren Dale) Date: Sun Oct 10 12:51:58 2004 Subject: [Numpy-discussion] ieeespecial Message-ID: <200410101547.18413.dd55@cornell.edu> Hello, I am getting invalid numeric result exceptions when dividing a complex array by zero. Is this the desired behavior? Also, while trying to find a way around the above problem, I ran ieeespecial.test and got the following output. I am running numarray 1.1 on python 2.3.3. Todd, this might be correlated with the numerix package in matplotlib. I tried importing numarray and ieeespecial without matplotlib and the ieeespecial.test was successful. Thanks, Darren In [31]: ieeespecial.test() Out[31]: inf ***************************************************************** Failure in example: inf # the repr() of inf may vary from platform to platform from line #6 of numarray.ieeespecial Expected: inf Got: Out[31]: nan ***************************************************************** Failure in example: nan # the repr() of nan may vary from platform to platform from line #8 of numarray.ieeespecial Expected: nan Got: Out[31]: (array([0, 2]), array([0, 3])) ***************************************************************** Failure in example: getinf(b) from line #20 of numarray.ieeespecial Expected: (array([0, 2]), array([0, 3])) Got: Out[31]: array([[ 999., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 999.], [ 12., 13., 14., 15.]]) ***************************************************************** Failure in example: a from line #26 of numarray.ieeespecial Expected: array([[ 999., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 999.], [ 12., 13., 14., 15.]]) Got: Out[31]: (array([0, 1, 2]), array([1, 2, 3])) ***************************************************************** Failure in example: getnan(a) from line #35 of numarray.ieeespecial Expected: (array([0, 1, 2]), array([1, 2, 3])) Got: 
***************************************************************** 1 items had failures: 5 of 11 in numarray.ieeespecial ***Test Failed*** 5 failures. Out[31]: (5, 11) -- Darren From dd55 at cornell.edu Sun Oct 10 13:57:43 2004 From: dd55 at cornell.edu (Darren Dale) Date: Sun Oct 10 13:57:43 2004 Subject: [Numpy-discussion] ieeespecial In-Reply-To: <200410101547.18413.dd55@cornell.edu> References: <200410101547.18413.dd55@cornell.edu> Message-ID: <200410101653.51172.dd55@cornell.edu> On Sunday 10 October 2004 03:47 pm, Darren Dale wrote: > Hello, > > I am getting invalid numeric result exceptions when dividing a complex > array by zero. Is this the desired behavior? > > Also, while trying to find a way around the above problem, I ran > ieeespecial.test and got the following output. I am running numarray 1.1 on > python 2.3.3. Todd, this might be correlated with the numerix package in > matplotlib. I tried importing numarray and ieeespecial without matplotlib > and the ieeespecial.test was successful. > On a related note, ieeespecial.getnan appears to be incompatible with complex arrays, see below. I didnt mention in my last email that I built numarray for my existing blas/lapack libraries, will this change the behavior on my system from the default? Thanks, Darren >>> from numarray import * >>> from numarray.ieeespecial import * >>> b=arange(10,typecode=Complex64) >>> a=b/0 Warning: Encountered invalid numeric result(s) in divide >>> a array([ nan +nanj, nan +nanj, nan +nanj, nan +nanj, nan +nanj, nan +nanj, nan +nanj, nan +nanj, nan +nanj, nan +nanj]) >>> getnan(a) Traceback (most recent call last): File "", line 1, in ? 
File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 117, in getnan return _spec.index(a, _spec.NAN) File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 95, in index return _na.nonzero(mask(a, msk)) File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 87, in mask f = _na.ieeemask(a, m) File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 883, in _cache_miss2 mode, win1, win2, wout, cfunc, ufargs = \ File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 929, in _setup convtype1, convtype2, outtype, ucfunc \ File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 471, in _typematch newInputSignature = (self._typePromoter(intype, atypelist),)*2 File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 498, in _typePromoter raise TypeError("unable to find type to promote to") TypeError: unable to find type to promote to >>> getnan(a.real) (array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),) >>> From aisaac at american.edu Sun Oct 10 15:57:18 2004 From: aisaac at american.edu (Alan G Isaac) Date: Sun Oct 10 15:57:18 2004 Subject: [Numpy-discussion] documentation error Message-ID: In the Numeric manual, there are two different definitions of the 'diagonal' function. The second definition appears to be incorrect. p.39: diagonal(a, k=0, axis1=0, axis2 = 1) returns the entries along the k th diagonal of a (k is an offset from the main diagonal). This is designed for 2d arrays. For larger arrays, it will return the diagonal of each 2d sub-array. p.44 diagonal(a, offset=0, axis1=0, axis2=1) The diagonal function takes an array a, and returns an array of rank 1 containing all of the elements of a such that the difference between their indices along the specified axes is equal to the specified offset. With the default values, this corresponds to all of the elements of the diagonal of a along the last two axes.
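To make the two manual descriptions concrete, here is a small pure-Python sketch of the k-th diagonal of a 2d array stored as nested lists. It assumes the usual convention that element a[i][j] lies on diagonal k when j - i == k. The helper name `diagonal_2d` is made up for this note and is not Numeric's implementation.

```python
def diagonal_2d(rows, k=0):
    """Return the k-th diagonal of a 2d nested list.

    k = 0 is the main diagonal, k > 0 lies above it, k < 0 below it.
    An element rows[i][j] is included when j - i == k.
    """
    n_cols = len(rows[0]) if rows else 0
    return [row[i + k] for i, row in enumerate(rows) if 0 <= i + k < n_cols]


if __name__ == "__main__":
    a = [[0, 1, 2],
         [3, 4, 5],
         [6, 7, 8]]
    print(diagonal_2d(a))       # [0, 4, 8]
    print(diagonal_2d(a, 1))    # [1, 5]
    print(diagonal_2d(a, -1))   # [3, 7]
```

For arrays of rank above 2, the defaults axis1=0 and axis2=1 name the first two axes, which may be why the p.44 wording ("the last two axes") looks wrong. That reading is an inference from the signatures quoted above, not something the manual states.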
fwiw, Alan Isaac From jmiller at stsci.edu Sun Oct 10 17:43:34 2004 From: jmiller at stsci.edu (Todd Miller) Date: Sun Oct 10 17:43:34 2004 Subject: [Numpy-discussion] ieeespecial In-Reply-To: <200410101653.51172.dd55@cornell.edu> References: <200410101547.18413.dd55@cornell.edu> <200410101653.51172.dd55@cornell.edu> Message-ID: <1097454870.3741.48.camel@localhost.localdomain> On Sun, 2004-10-10 at 16:53, Darren Dale wrote: > On Sunday 10 October 2004 03:47 pm, Darren Dale wrote: > > Hello, > > > > I am getting invalid numeric result exceptions when dividing a complex > > array by zero. Is this the desired behavior? > > > > Also, while trying to find a way around the above problem, I ran > > ieeespecial.test and got the following output. I am running numarray 1.1 on > > python 2.3.3. Todd, this might be correlated with the numerix package in > > matplotlib. I tried importing numarray and ieeespecial without matplotlib > > and the ieeespecial.test was successful. > > > > On a related note, ieeespecial.getnan appears to be incompatible with complex > arrays, see below. Thanks for pointing this out. It's an oversight in the implementation of ieeespecial and I'll fix it. > I didnt mention in my last email that I built numarray for > my existing blas/lapack libraries, will this change the behavior on my system > from the default? Regarding ieeespecial and complex division by zero, I am pretty sure blas/lapack linkage is irrelevant. But... I very rarely link with an external blas/lapack, so if there is an issue, I'm unlikely to have come across it myself. Still, off the top of my head, blas/lapack is unrelated. 
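One possible explanation for the nan results, offered as a sketch rather than a reading of numcomplex.h: if complex division follows the textbook formula, then dividing a finite value by 0+0j makes both numerator components and the denominator zero, so each component becomes 0/0, which IEEE 754 defines as nan, whereas a nonzero real divided by zero gives inf. Plain Python raises ZeroDivisionError for float division by zero, so the hypothetical helper `ieee_div` below emulates the IEEE default results, and `naive_complex_div` is likewise only illustrative.

```python
import math


def ieee_div(x, y):
    """Float division with IEEE 754 default results instead of an exception."""
    if y != 0.0:
        return x / y
    if x == 0.0 or math.isnan(x):
        return math.nan                      # 0/0 is an invalid operation
    # nonzero / zero: signed infinity (sign of x times sign of the zero)
    return math.copysign(math.inf, x) * math.copysign(1.0, y)


def naive_complex_div(a, b):
    """Textbook a / b for complex a and b, computed component by component."""
    denom = b.real * b.real + b.imag * b.imag
    re = a.real * b.real + a.imag * b.imag
    im = a.imag * b.real - a.real * b.imag
    return ieee_div(re, denom), ieee_div(im, denom)


if __name__ == "__main__":
    print(ieee_div(3.0, 0.0))                 # real case: inf
    print(naive_complex_div(3 + 0j, 0j))      # complex case: (nan, nan)
```

Under this assumption the real and complex cases differ because the complex formula multiplies the numerator by the zero divisor before dividing, so no nonzero numerator survives to produce an inf.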
Regards, Todd > Thanks, > Darren > > >>> from numarray import * > >>> from numarray.ieeespecial import * > >>> b=arange(10,typecode=Complex64) > >>> a=b/0 > Warning: Encountered invalid numeric result(s) in divide > >>> a > array([ nan +nanj, > nan +nanj, > nan +nanj, > nan +nanj, > nan +nanj, > nan +nanj, > nan +nanj, > nan +nanj, > nan +nanj, > nan +nanj]) > >>> getnan(a) > Traceback (most recent call last): > File "", line 1, in ? > File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 117, > ingetnan > return _spec.index(a, _spec.NAN) > File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 95, in > index > return _na.nonzero(mask(a, msk)) > File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 87, in > mask > f = _na.ieeemask(a, m) > File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 883, in > _cache_miss2 > mode, win1, win2, wout, cfunc, ufargs = \ > File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 929, in > _setup > convtype1, convtype2, outtype, ucfunc \ > File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 471, in > _typematch > newInputSignature = (self._typePromoter(intype, atypelist),)*2 > File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 498, in > _typePromoter > raise TypeError("unable to find type to promote to") > TypeError: unable to find type to promote to > > >>> getnan(a.real) > (array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),) > >>> From dd55 at cornell.edu Sun Oct 10 18:08:10 2004 From: dd55 at cornell.edu (Darren Dale) Date: Sun Oct 10 18:08:10 2004 Subject: [Numpy-discussion] ieeespecial In-Reply-To: <1097454560.3741.41.camel@localhost.localdomain> References: <200410101547.18413.dd55@cornell.edu> <1097454560.3741.41.camel@localhost.localdomain> Message-ID: <200410102103.42221.dd55@cornell.edu> On Sunday 10 October 2004 08:29 pm, you wrote: > On Sun, 2004-10-10 at 15:47, Darren Dale wrote: > > Hello, > > > > I am getting invalid numeric result 
exceptions when dividing a complex > > array by zero. Is this the desired behavior? > > This is what I would have expected, and examining the definition I have > for complex division in numarray/Include/numarray/numcomplex.h, I don't > see a problem. The definition should probably be checked by an extra > set of eyes. Looks OK to me. Hi Todd, Sorry, I wasnt clear. I was wondering if it should raise a divide by zero exception and return an inf, as the real datatypes do, instead of an invalid numeric result and a nan. As it stands now, we have to handle divide by zero differently for different data types, if we need to filter/replace such values. Thanks, Darren From jmiller at stsci.edu Sun Oct 10 18:44:38 2004 From: jmiller at stsci.edu (Todd Miller) Date: Sun Oct 10 18:44:38 2004 Subject: [Numpy-discussion] ieeespecial In-Reply-To: <200410101547.18413.dd55@cornell.edu> References: <200410101547.18413.dd55@cornell.edu> Message-ID: <1097454560.3741.41.camel@localhost.localdomain> On Sun, 2004-10-10 at 15:47, Darren Dale wrote: > Hello, > > I am getting invalid numeric result exceptions when dividing a complex array > by zero. Is this the desired behavior? This is what I would have expected, and examining the definition I have for complex division in numarray/Include/numarray/numcomplex.h, I don't see a problem. The definition should probably be checked by an extra set of eyes. Looks OK to me. > Also, while trying to find a way around the above problem, I ran > ieeespecial.test and got the following output. I am running numarray 1.1 on > python 2.3.3. Todd, this might be correlated with the numerix package in > matplotlib. I tried importing numarray and ieeespecial without matplotlib and > the ieeespecial.test was successful. > I tried this with an ordinary Python shell and ieeespecial.test() completed without errors. Looking at your test output, I noticed it was skewed, and guessed there was an I/O synchronization issue messing up doctest. 
I tried the same test under IPython w/o matplotlib and duplicated your results, so I think the problem is an IPython/doctest issue. Regards, Todd > Thanks, > > Darren > > > In [31]: ieeespecial.test() > Out[31]: inf > ***************************************************************** > Failure in example: > inf # the repr() of inf may vary from platform to platform > from line #6 of numarray.ieeespecial > Expected: inf > Got: > Out[31]: nan > ***************************************************************** > Failure in example: > nan # the repr() of nan may vary from platform to platform > from line #8 of numarray.ieeespecial > Expected: nan > Got: > Out[31]: (array([0, 2]), array([0, 3])) > ***************************************************************** > Failure in example: getinf(b) > from line #20 of numarray.ieeespecial > Expected: (array([0, 2]), array([0, 3])) > Got: > Out[31]: > array([[ 999., 1., 2., 3.], > [ 4., 5., 6., 7.], > [ 8., 9., 10., 999.], > [ 12., 13., 14., 15.]]) > ***************************************************************** > Failure in example: a > from line #26 of numarray.ieeespecial > Expected: > array([[ 999., 1., 2., 3.], > [ 4., 5., 6., 7.], > [ 8., 9., 10., 999.], > [ 12., 13., 14., 15.]]) > Got: > Out[31]: (array([0, 1, 2]), array([1, 2, 3])) > ***************************************************************** > Failure in example: getnan(a) > from line #35 of numarray.ieeespecial > Expected: (array([0, 1, 2]), array([1, 2, 3])) > Got: > ***************************************************************** > 1 items had failures: > 5 of 11 in numarray.ieeespecial > ***Test Failed*** 5 failures. > Out[31]: (5, 11) -- From aisaac at american.edu Sun Oct 10 18:59:17 2004 From: aisaac at american.edu (Alan G Isaac) Date: Sun Oct 10 18:59:17 2004 Subject: [Numpy-discussion] location of tutorial Message-ID: p.29 of the Numeric manual refers to http://www.python.org/doc/tut/functional.html which no longer exists. 
I suggest substituting http://docs.python.org/tut/tut.html fwiw, Alan Isaac From jmiller at stsci.edu Mon Oct 11 04:28:51 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Oct 11 04:28:51 2004 Subject: [Numpy-discussion] ieeespecial In-Reply-To: <200410102103.42221.dd55@cornell.edu> References: <200410101547.18413.dd55@cornell.edu> <1097454560.3741.41.camel@localhost.localdomain> <200410102103.42221.dd55@cornell.edu> Message-ID: <1097493501.2619.26.camel@localhost.localdomain> On Sun, 2004-10-10 at 21:03, Darren Dale wrote: > On Sunday 10 October 2004 08:29 pm, you wrote: > > On Sun, 2004-10-10 at 15:47, Darren Dale wrote: > > > Hello, > > > > > > I am getting invalid numeric result exceptions when dividing a complex > > > array by zero. Is this the desired behavior? > > > > > > This is what I would have expected, and examining the definition I have > > for complex division in numarray/Include/numarray/numcomplex.h, I don't > > see a problem. The definition should probably be checked by an extra > > set of eyes. Looks OK to me. > > Hi Todd, > > Sorry, I wasn't clear. I was wondering if it should raise a divide by zero > exception and return an inf, as the real data types do, instead of an invalid > numeric result and a nan. As it stands now, we have to handle divide by zero > differently for different data types, if we need to filter/replace such > values. Numarray's error handling system is pretty flexible, and can raise exceptions on divide by zero if configured properly, or can ignore them altogether. See section 4.9 in the numarray-1.1 manual here: http://prdownloads.sourceforge.net/numpy/numarray-1.1.pdf?download It's an interesting question regarding the inf vs. nan. Looking at the complex division macro (NUM_CDIV) in numcomplex.h, I don't understand why we're getting nans now and not infs; it might be a bug in the macro, but I don't see it. 
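One possible reading of the nan result, as a minimal sketch only. It assumes NUM_CDIV follows the textbook formula (a+bj)/(c+dj) = ((ac+bd) + (bc-ad)j)/(c*c+d*d); the macro is not quoted in this thread, so that is an assumption, and the helper names ieee_div and textbook_cdiv are hypothetical. With a zero divisor both numerators and the denominator are zero, so each part is 0/0 (invalid, nan) rather than x/0 (divide by zero, inf).

```python
import math

def ieee_div(x, y):
    """Divide two floats following IEEE 754 results instead of raising."""
    try:
        return x / y
    except ZeroDivisionError:
        if x == 0.0 or math.isnan(x):
            return math.nan  # 0/0 is an "invalid" operation: nan
        # x/0 with x != 0 is "divide by zero": signed infinity
        sign = math.copysign(1.0, x) * math.copysign(1.0, y)
        return math.copysign(math.inf, sign)

def textbook_cdiv(a, b, c, d):
    """(a + bj) / (c + dj) via the textbook formula (assumed, not NUM_CDIV itself)."""
    denom = c * c + d * d
    real = ieee_div(a * c + b * d, denom)
    imag = ieee_div(b * c - a * d, denom)
    return real, imag

real_case = ieee_div(3.0, 0.0)                    # real division by zero
complex_case = textbook_cdiv(3.0, 0.0, 0.0, 0.0)  # (3+0j) / (0+0j)
print(real_case, complex_case)
```

If the macro does work this way, getting inf for complex operands would need an explicit zero-divisor branch rather than a fix to the arithmetic.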
Regards, Todd From stephen.walton at csun.edu Mon Oct 11 20:16:55 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Mon Oct 11 20:16:55 2004 Subject: [Numpy-discussion] documentation error In-Reply-To: References: Message-ID: <1097550159.2568.5.camel@localhost.localdomain> On Sun, 2004-10-10 at 11:33, Alan G Isaac wrote: > In the Numeric manual, there are two different defintions of the > 'diagonal' function. The second definition appears to be incorrect. > > p.39: > diagonal(a, k=0, axis1=0, axis2 = 1) > p.44 > diagonal(a, offset=0, axis1=0, axis2=1) Are you sure? On my system, it appears that the second definition is correct in both Numeric 23.3 and numarray 1.1. From a.schmolck at gmx.net Tue Oct 12 02:40:55 2004 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Tue Oct 12 02:40:55 2004 Subject: [Numpy-discussion] A disconnected numarray rant Message-ID: Hi, I'm taking a 1 month break from computers (i.e. I will be completely off-line), and I have to catch a train in an hour; but I've recently bitten the bullet and made a matrix class I've been using for some time work with numarray; I've written down a number of things that occured to me while I was doing it, including some things which I think are bugs in numarray, so I thought at least posting the bugs would be a useful service; the rest is very raw and essentially unedited cut-and-paste of these notes -- sorry about that and I hope it doesn't contain anything particularly offensive. P.S. just dumped the code for the matrix class (nummat) at http://www.dcs.ex.ac.uk/~aschmolc/Stuff/ 'as The following are my notes: Things that fairly clearly seem to be bugs: - numarray.Int32 etc. 
can't be pickled - ``a = array(1+0j); a.imag = a.real * 10`` => IndexError - array(0, type=Float64) + 1e3000 => `inf` with right error modes but array(0, type=Float32) + 1e3000 => `OverflowError` - numarray.array(10)/numarray.array(0) => 0 - numarray.array(10000000000000L) => array(1316134912) - numarray.where(0,1,0) => array([0]) - l = [1,2,3]; numarray.put(l,numarray.array([1,2,0]),[0,0,0]); l => [1, 2, 3] a = array([1,2,3]); numarray.put(a,numarray.array([1,2,0]),[0,0,0]); a => array([0, 0, 0]) - repr(numarray.array([],typecode='i')) (etc. etc.) => "numarray.array([])" - getattr(array([1,2,3]), '_aligned') => SystemError - obscure: numarray.where(0, matrix(568, convert_scalars=True),2) => ValueError (tries __len__ which fails, as len(array(568)) also fails) Numeric incompatiblilities (that are either undocumented or bug-like) - numarray.array('a', typecode='O') => TypeError (object arrays) - for extra fun try: numarray.array(1, type=numarray.Object) -=> RuntimeError something entirely different - nonzero is completely incompatible - shape(None) etc. no longer works (IMHO a bug) - cross_correlate & average missing - left_shift et al missing - numarray.sqrt(a,a) is None (*not* the result, as it used to be) - num.put(a, [0,1,2,3], [10,20]) style behavior seems unavailable (without numarray.numeric) put(array([[ 0., 1., 2.], [ 3., 4., 5.]]), [1, 4], [10,40]) fails - boolean testing (not even bool(array(0)) works; I'm not sure this is good) - Generally different handling of rank0-arrays; e.g. ``type(num.array(1.0) + 0) is float``; one potentially very nasty gotcha are inplace operations (e.g. a**=2) which have totally different semantics for python scalars and rank0 arrays, which, unlike Attribute errors on ``a.shape``, can lead to nasty bugs in corner cases (e.g. 
when a reduction just infrequently yields scalar ``a``) -- I think this should be mentioned in a gotchas section (another possible entry would be the need to use .copy() to **save** memory on slicing and 1xN, Nx1 matrices versus vectors (people are not used to thinking properly about rank from mathematical training or matlab exposure)). - asarray downcasts arrays (e.g.: asarray(array([1.,2.,3.]),'i')) - numarray.ones(-5) => MemoryError (ValueError would be nicer) - numarray.ones(2.0), numarray.ones([2]) fail (cf. numarray.range(2.0)) b=num.array([[1,2,3,4],[5,6,7,8]]*2) assert eq(num.diagonal(b), [1,6,3,8]) assert eq(num.diagonal(b, -1), [5,2,7]) c = num.array([b,b]) assert eq(num.diagonal(c,1), [[2,7,4], [2,7,4]]) - no a.toscalar() !!! - matrixmultiply in the docs - what's the point of swapaxes (i.e. why not have a generalized in-place transpose?) - what's the point of innerproduct? - indexing by a list is different from indexing by tuple (I haven't had time to look closely at the docs whether that's intentional) - doesn't know about Numeric's bizzarre '\x0b' typecode - numarray.sqrt.reduce([]) raises (sensibly) TypeError, not ValueError - len(array(1)) or array(1)[0] won't work anymore (understandable, but should be documented) - (should maximim, minimum reduce to -inf and inf?) - is not a very helpful repr; should be possible to get to the ufunc itself - as in Numeric numarray.maximum.reduce(numarray.array([0,-0.])) => -0.0 - __array__ protocol no longer supported (how can a non-derived class convert itself efficiently to an array?) Documentation Gotchas - p. 34 IMO row vector is used incorrectly; row and column vectors are really matrices (i.e. have rank 2) so ``array([[1,2,3]])`` would be a row vector - No proper explanation of differences between Numeric and numarray, or numarray.numeric module differences to proper (e.g. argmin) - No migration and best-practice advice (e.g. 
there should be a standard way for packages which work with both numarray and numeric as backends to let the user choose his preference; how about setting an environment var NumPy or something?) Waffle ------ - there *really* ought to be an array equality function (with optional tolerance); it's quite difficult to get right for are normal user (nans; zero-size arrays etc.) and it's often required, especially for testing - rank preserving reduction seems useful as an option would be nice -- e.g. to subtract out or divide by the reduced portion (which currently won't e.g. work for columns without adding a unit-dimension by hand). Design The (AFAICS) benefit-free but downside-rich introduction of `type` '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' Is there any reason that Typecode objects that compare as desired to the relevant strings ("i", "d") wouldn't have done? Now there is an explosion and confusion of interfaces -- some numpy code will now only except type(code)s as "typecode" keyword parameter (even in numarray! see numarray.mlab!) and other stuff Never mind that type already is a highly overused word in the python world. The big method bloat. ''''''''''''''''''''' As it says in the Numeric manual introductions there were "good reasons" for "very few array methods" -- now there are **56** public methods and 8 public attributes (public == not starting with '_'); of those 56 methods about 11 are accessors and of the rest about half are redundant or worse (i.e. they either also exist as numarray functions (argmin, argmax, diagonal, ...) or they really ought to be functions (mean, stddev) or they are quite confusing (``a.min``, ``a.max`` which behave quite differenlty from ``a.argmin`` and ``a.argmax``, never mind ``numarray.minimum``) or simply utterly pointless (``a.nelements`` == ``a.size``)). - argmin, argmax : what's wrong with numarray.argmin, numarray.argmax??? Why do argmin/argmax and max/min have completely different interfaces??? 
If there really is a need for these (there isn't) anything a.min and a.max should be called a.flatmin, a.flatmax - diagonal, mean, nelements, nonzero, ... - perversely the **only** function that I can think off that could have sensibly become a method hasn't: ``put`` (it used to work only on arrays under Numeric and not without reason, so making it a method would have been sensible; numarray.put of course also "works" on non-arrays, it just doesn't do anything with them) Test Code ''''''''' numtest.py doesn't inspire full confidence (it's about 1000 lines of actual code but it doesn't seem that clearly structured and AFAICT contains no single loop (and that despite the diversity of shapes, types etc. that exist in numarray -- why not try something slightly more systematic?)). From aisaac at american.edu Tue Oct 12 07:03:18 2004 From: aisaac at american.edu (Alan G Isaac) Date: Tue Oct 12 07:03:18 2004 Subject: [Numpy-discussion] documentation error In-Reply-To: <1097550159.2568.5.camel@localhost.localdomain> References: <1097550159.2568.5.camel@localhost.localdomain> Message-ID: > On Sun, 2004-10-10 at 11:33, Alan G Isaac wrote: >> In the Numeric manual, there are two different defintions of the >> 'diagonal' function. The second definition appears to be incorrect. On Mon, 11 Oct 2004, Stephen Walton apparently wrote: > Are you sure? On my system, it appears that the second definition is > correct in both Numeric 23.3 and numarray 1.1. You did not quote the problematic portion: The diagonal function takes an array a, and returns an array of rank 1 ... With the default values, this corresponds to all of the elements of the diagonal of a along the last two axes. 
Contrast: >>> import Numeric >>> Numeric.__version__ '23.1' >>> x=[[[1,2],[3,4]],[[5,6],[7,8]]] >>> Numeric.diagonal(x) array([[1, 4], [5, 8]]) fwiw, Alan Isaac From stephen.walton at csun.edu Tue Oct 12 08:42:04 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Tue Oct 12 08:42:04 2004 Subject: [Numpy-discussion] documentation error In-Reply-To: References: <1097550159.2568.5.camel@localhost.localdomain> Message-ID: <1097595580.24491.4.camel@freyer.sfo.csun.edu> On Tue, 2004-10-12 at 07:00, Alan G Isaac wrote: > On Mon, 11 Oct 2004, Stephen Walton apparently wrote: > > Are you sure? On my system, it appears that the second definition is > > correct in both Numeric 23.3 and numarray 1.1. > > > You did not quote the problematic portion: > The diagonal function takes an array a, and returns > an array of rank 1 ... Ah, I thought you were referring to the fact that, in the first version in the documentation, the second, named argument is given as "k" but in the second version it is "offset". A look at the source reveals the second keyword name is the correct one. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From aisaac at american.edu Tue Oct 12 12:25:01 2004 From: aisaac at american.edu (Alan G Isaac) Date: Tue Oct 12 12:25:01 2004 Subject: [Numpy-discussion] documentation error In-Reply-To: <1097595580.24491.4.camel@freyer.sfo.csun.edu> References: <1097550159.2568.5.camel@localhost.localdomain><1097595580.24491.4.camel@freyer.sfo.csun.edu> Message-ID: > On Tue, 2004-10-12 at 07:00, Alan G Isaac wrote: >> You did not quote the problematic portion: >> The diagonal function takes an array a, and returns >> an array of rank 1 ... 
On Tue, 12 Oct 2004, Stephen Walton apparently wrote: > A look at the source reveals the > second keyword name is the correct one. OK then, we have a double problem. The first version gives the correct description but uses the wrong keyword. The second version gives the wrong description but uses the correct keyword. So, how do we file a documentation bug? Cheers, Alan Isaac From perry at stsci.edu Tue Oct 12 12:31:17 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Oct 12 12:31:17 2004 Subject: [Numpy-discussion] documentation error In-Reply-To: Message-ID: > So, how do we file a documentation bug? > > Cheers, > Alan Isaac > I'd say just like any other kind of bug. Perry From jmiller at stsci.edu Tue Oct 12 12:40:19 2004 From: jmiller at stsci.edu (Todd Miller) Date: Tue Oct 12 12:40:19 2004 Subject: [Numpy-discussion] documentation error In-Reply-To: References: <1097550159.2568.5.camel@localhost.localdomain> <1097595580.24491.4.camel@freyer.sfo.csun.edu> Message-ID: <1097609991.30171.556.camel@halloween.stsci.edu> On Tue, 2004-10-12 at 12:40, Alan G Isaac wrote: > > On Tue, 2004-10-12 at 07:00, Alan G Isaac wrote: > >> You did not quote the problematic portion: > >> The diagonal function takes an array a, and returns > >> an array of rank 1 ... > > > > On Tue, 12 Oct 2004, Stephen Walton apparently wrote: > > A look at the source reveals the > > second keyword name is the correct one. > > > OK then, we have a double problem. > The first version gives the correct description > but uses the wrong keyword. > The second version gives the wrong description > but uses the correct keyword. > > So, how do we file a documentation bug? > Go here: http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse then "Submit New", and set the "category" to "documentation". 
Regards, Todd > Cheers, > Alan Isaac -- From pearu at scipy.org Wed Oct 13 06:02:48 2004 From: pearu at scipy.org (Pearu Peterson) Date: Wed Oct 13 06:02:48 2004 Subject: [Numpy-discussion] ANN: SciPy 0.3.2 Released Message-ID: Hi, Scipy 0.3.2 has been released and binaries are available from the scipy.org site: http://www.scipy.org Scipy 0.3.2 is a bug fix release of Scipy 0.3 including the following new features: - wxPython 2.5 support - reading/writing dense/sparse matrices in Matrix Market format - iterative solvers, new functions sqrtm, hessenberg - Constrained Optimization BY Linear Approximation - discrete Boltzmann, Planck, Levy distributions - Scipy tests pass now also on 64-bit systems and Mac OSX etc. The complete release notes can be found here: http://www.scipy.org/download/scipy_release_notes_0.3.2.html Best regards, Pearu BTW Scipy is: ------------- Scipy is an open source library of scientific tools for Python. Scipy supplements the popular Numeric module, gathering a variety of high level science and engineering modules together as a single package. Scipy includes modules for graphics and plotting, optimization, integration, special functions, signal and image processing, genetic algorithms, ODE solvers, and others. 
From jmiller at stsci.edu Wed Oct 13 14:35:08 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Oct 13 14:35:08 2004 Subject: [Numpy-discussion] A disconnected numarray rant In-Reply-To: References: Message-ID: <1097703239.631.923.camel@halloween.stsci.edu> Hi Alexander, Thanks for taking the time to provide us with feedback. I've responded to many of your points below. [and in the interest of keeping the text bloat down, I've interjected my own comments in brackets--Perry] On Tue, 2004-10-12 at 05:37, Alexander Schmolck wrote: > Hi, > > I'm taking a 1 month break from computers (i.e. I will be completely > off-line), and I have to catch a train in an hour; but I've recently > bitten > the bullet and made a matrix class I've been using for some time work > with > numarray; I've written down a number of things that occured to me > while I was > doing it, including some things which I think are bugs in numarray, so > I > thought at least posting the bugs would be a useful service; the rest > is very > raw and essentially unedited cut-and-paste of these notes -- sorry > about that > and I hope it doesn't contain anything particularly offensive. > > P.S. just dumped the code for the matrix class (nummat) at > http://www.dcs.ex.ac.uk/~aschmolc/Stuff/ > > 'as > > The following are my notes: > > > Things that fairly clearly seem to be bugs: > - numarray.Int32 etc. can't be pickled Known limitation, but OK. Arrays can be pickled, as can Numeric typecodes so I'm not sure how critical this omission is. 
> - ``a = array(1+0j); a.imag = a.real * 10`` => IndexError > - array(0, type=Float64) + 1e3000 => `inf` with right error modes > but array(0, type=Float32) + 1e3000 => `OverflowError` > - numarray.array(10)/numarray.array(0) => 0 > - numarray.array(10000000000000L) => array(1316134912) > - numarray.where(0,1,0) => array([0]) There seems to be an infinity of rank-0 issues and so little justification for having them that at one point we considered ripping them out altogether. Noted, but low priority. [Amen. If I had known the problems that rank-0 zero arrays would cause I think I would have excluded them. I'm not sure I see the need for them now that coercion rules have changed and helper functions to change scalars into rank-1 len-1 arrays which serve almost all other purposes. I'm interested in seeing what real purpose they serve now (I understand the backward compatibility issue, but backward compatibility is not the be all and end all for numarray; more on that later)] > - l = [1,2,3]; numarray.put(l,numarray.array([1,2,0]),[0,0,0]); l > => [1, 2, 3] Should raise a TypeError I guess. > a = array([1,2,3]); > numarray.put(a,numarray.array([1,2,0]),[0,0,0]); a => array([0, 0, 0]) I don't see what's wrong here. > - repr(numarray.array([],typecode='i')) (etc. etc.) => > "numarray.array([])" Zero length arrays are rather like rank-0 arrays: low priority. Agreed... this is a small wart. > - getattr(array([1,2,3]), '_aligned') => SystemError Interesting. I've been thinking about ripping out the _align and _contiguous self-test hacks for a long time. You've made up my mind. > - obscure: numarray.where(0, matrix(568, convert_scalars=True),2) > => > ValueError (tries __len__ which fails, as len(array(568)) also > fails) I think this may boil down to "no where() for object arrays". numarray.where() can't handle object arrays and there is no numarray.objects.where(). Not implemented yet. 
> Numeric incompatiblilities (that are either undocumented or bug-like) The best Numeric compatibility in numarray comes from: import numarray.numeric as Numeric It's still not perfect, but it is more compatible than ordinary numarray. > - numarray.array('a', typecode='O') => TypeError (object arrays) > - for extra fun try: numarray.array(1, type=numarray.Object) -=> > RuntimeError > something entirely different Object arrays in numarray do not have the synergy they have in Numeric. In particular, numarray.array() can't create them, only numarray.objects.array(). [At the time we added object arrays, we noticed that they were not safe in Numeric; that is, Numeric was not properly handling reference counts of objects in arrays for at least some operations and it was possible to segfault object arrays. This may have changed since then; we haven't had a chance to check the current status. But the point is that handling object arrays safely is a lot more than just loading them with object pointers. Any function that can set values in arrays needs to handle their refcounts, and that isn't all that trivial. We took a short cut of using a Python implementation for object arrays that doesn't have all the old functionality, but also didn't have the problems that they did at the time.] > - nonzero is completely incompatible numarray.numeric covers this. numarray's nonzero() is more powerful, capable of handling multidimensional arrays, so it returns a tuple of values rather than a single value. It's unfortunate that we chose to use the name nonzero() for the "new" function; it has the right interface and the wrong name. Keep in mind though, our compatibility goals have grown immensely since we started. > - shape(None) etc. no longer works (IMHO a bug) This may be related to the object array synergy. I think numarray.asarray() is the problem here, since it doesn't know how to create object arrays. 
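To make the nonzero() difference described above concrete, here is a rough sketch on plain Python lists. It is not numarray or Numeric code; nonzero_1d and nonzero_2d are hypothetical names, and the rank-1-only behaviour of the older function is assumed here. The second function mirrors the tuple-of-index-arrays convention that getinf() and getnan() show earlier in this digest.

```python
def nonzero_1d(seq):
    """Older style (assumed): one flat list of indices, rank-1 input only."""
    return [i for i, v in enumerate(seq) if v]

def nonzero_2d(rows):
    """Newer style for rank 2: a tuple holding one index list per axis."""
    hits = [(i, j) for i, row in enumerate(rows) for j, v in enumerate(row) if v]
    if not hits:
        return ([], [])
    # Transpose the (row, column) pairs into per-axis index lists.
    return tuple([list(axis) for axis in zip(*hits)])

flat_idx = nonzero_1d([0, 3, 0, 5])
row_idx, col_idx = nonzero_2d([[1, 0, 0], [0, 0, 2]])
print(flat_idx, row_idx, col_idx)
```

Code that expects a single index array from a rank-1 input would have to unpack the first element of the tuple under the newer convention.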
> - cross_correlate & average missing I think cross_correlate is in numarray.convolve.correlate. It was a conscious choice not to put it in core numarray. Average has never been implemented and should be, especially since it has different semantics than the mean() method. > - left_shift et al missing These were renamed lshift and rshift. Note that << works fine. Synonyms should probably be added. > - numarray.sqrt(a,a) is None (*not* the result, as it used to be) What do you want here? What we have now is, IMO, correct. [Amen. This was intentionally changed from Numeric.] > - num.put(a, [0,1,2,3], [10,20]) style behavior seems unavailable > (without numarray.numeric) I wasn't exactly sure what the expected behavior was for this, but guessed is was some kind of repeat. If that's what the behavior was, Perry and I don't really like it. Besides, numarray.numeric.put *is* Numeric.put, modulo numarray underpinnings. > put(array([[ 0., 1., 2.], [ 3., 4., 5.]]), [1, 4], [10,40]) > fails numarray.put() does have different semantics for multi-dimensional destinations... you need multi-dimensional indexes (i.e. a tuple of index arrays). Again, there's now numarray.numeric.put(). > - boolean testing (not even bool(array(0)) works; I'm not sure this is > good) [I am. This was a clear and explicit decision to not replicate Numeric behavior. I'm convinced that it is the right decision. There is just too much confusion about what the truth value of an array should be. Helper functions should be used to make it unambiguous.] > - Generally different handling of rank0-arrays; e.g. > ``type(num.array(1.0) + > 0) is float``; one potentially very nasty gotcha are inplace > operations > (e.g. a**=2) which have totally different semantics for python > scalars and > rank0 arrays, which, unlike Attribute errors on ``a.shape``, can > lead to > nasty bugs in corner cases (e.g. 
when a reduction just infrequently > yields > scalar ``a``) -- I think this should be mentioned in a gotchas > section We have areduce() for this case, which always returns an array. > (another possible entry would be the need to use .copy() to **save** > memory > on slicing and 1xN, Nx1 matrices versus vectors (people are not used > to > thinking properly about rank from mathematical training or matlab > exposure)). [You will need to elaborate about what you mean here. E.g., as to the first: I'm guessing you mean when a slice is taken and then the original array is deleted. But it isn't clear.] > - asarray downcasts arrays (e.g.: asarray(array([1.,2.,3.]),'i')) True enough. Is there some reason why the method should silently succeed (I know we wanted that) and the function should not? > - numarray.ones(-5) => MemoryError (ValueError would be nicer) Easy to change. > - numarray.ones(2.0), This fails, and that's fine by me. The idea of floating point shapes seems bogus. > numarray.ones([2]) AFIK, this works, and should work. > fail (cf. numarray.range(2.0)) IMHO, arange() is a special case and not really equivalent to numarray.ones(). > b=num.array([[1,2,3,4],[5,6,7,8]]*2) > assert eq(num.diagonal(b), [1,6,3,8]) > assert eq(num.diagonal(b, -1), [5,2,7]) > c = num.array([b,b]) > assert eq(num.diagonal(c,1), [[2,7,4], [2,7,4]]) > - no a.toscalar() !!! a.toscalar() is written a[()] in numarray. [This is one method that shouldn't be there IMO. What would people expect it to do for arrays with len>1 ?] > - matrixmultiply in the docs OK. > - what's the point of swapaxes (i.e. why not have a generalized > in-place > transpose?) It's a very common function in implementation of numarray/Numeric. [In many cases it is far easier to use than an generalized transpose (which does exist, but requires all axes to be explicitly given)] > - what's the point of innerproduct? Compatibility. [For a while the flavor is: "dammit, why aren't you compatible?" 
Now it's: "dammit, why are you compatible?"] > - indexing by a list is different from indexing by tuple (I haven't > had time > to look closely at the docs whether that's intentional) It's intentional. Indexing by a list is "array" indexing. Indexing by a tuple is not. Thus, a 3D array by [1,2,3] is pulling out 2D blocks, while (1,2,3) is pulling out a single scalar. [In particular, tuples have a special meaning for indexing; this distinction is unavoidable since it is a Python language issue.] > - doesn't know about Numeric's bizzarre '\x0b' typecode Me either. Should we add this? [Not unless there is a good reason. What's it for? Why are you using it (particularly since you called it bizarre)?] > - numarray.sqrt.reduce([]) raises (sensibly) TypeError, not ValueError Got lucky I guess. > - len(array(1)) or array(1)[0] won't work anymore (understandable, but > should be documented) OK. > - (should maximim, minimum reduce to -inf and inf?) Don't they? > - is not > a very helpful repr; should be possible to get to the ufunc itself Doesn't this comment fly in the face of Python itself? [I imagine it is possible, but why? repr(dir) doesn't give you a usable function creator, nor does it work in Numeric.] > - as in Numeric numarray.maximum.reduce(numarray.array([0,-0.])) => > -0.0 Talk about fine points... noted. I think the problem is that 0.0 == -0.0, so there's no way for the reduction to get it right without adding special code to look for this case, and that isn't gonna happen without a strong case being made. [Again, a very good case needs to be made for handling this. I doubt that it is important to many, and as Todd mentions, not easy to handle.] > - __array__ protocol no longer supported (how can a non-derived class > convert > itself efficiently to an array?) Maybe an old-timer can explain how this worked for Numeric. I think this is only partially implemented in numarray and that maybe we need to add a check for an __array__() method to numarray.array(). 
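A rough sketch of the kind of __array__() check mentioned above, using plain lists as a stand-in for real arrays. Both the Samples class and the to_array function are hypothetical illustrations, not numarray or Numeric code, and the real protocol may take extra arguments that are omitted here.

```python
class Samples:
    """Hypothetical non-array container that can export its own data."""

    def __init__(self, values):
        self._values = list(values)

    def __array__(self):
        # Stand-in: a real implementation would return an array object.
        return list(self._values)

def to_array(obj):
    """Hypothetical converter that prefers the object's __array__() hook."""
    hook = getattr(obj, "__array__", None)
    if callable(hook):
        return hook()
    # Generic fallback: walk the object element by element.
    return list(obj)

converted = to_array(Samples([1.0, 2.0, 3.0]))
fallback = to_array((4, 5))
print(converted, fallback)
```

The point of the hook is that the object decides how to hand over its data in one step, instead of the converter iterating over it.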
> Documentation Gotchas > - p. 34 IMO row vector is used incorrectly; row and column vectors are > really > matrices (i.e. have rank 2) so ``array([[1,2,3]])`` would be a > row vector Sounds reasonable. > - No proper explanation of differences between Numeric and numarray, > or > numarray.numeric module differences to proper (e.g. argmin) If there is, I don't know where it is. Noted, but I'm not really an encyclopedia of these facts myself. > - No migration and best-practice advice (e.g. there should be a > standard way > for packages which work with both numarray and numeric as backends > to let > the user choose his preference; how about setting an environment var > NumPy > or something?) We're just working this out ourselves. [Let me elaborate more. We haven't really had much experience yet porting tons of Numeric code (MA is about the only example). We are working on scipy now so I expect that in a few months we will know much better what the most important porting issues are. At the moment, this is better documented by others.] > Waffle [meaning?] > ------ > > - there *really* ought to be an array equality function (with optional > tolerance); it's quite difficult to get right for are normal user > (nans; > zero-size arrays etc.) and it's often required, especially for > testing You're right. Want submit one? [Make sure it isn't dependent on the underlying C compiler's libraries for testing floating point special values!] > - rank preserving reduction seems useful as an option would be nice -- > e.g. to > subtract out or divide by the reduced portion (which currently won't > e.g. > work for columns without adding a unit-dimension by hand). Sounds like an interesting idea, but also method bloat. 
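As a starting point for the array equality function requested above, here is a minimal sketch for flat sequences in pure Python. The name seq_equal and its rtol, atol and nan_equal parameters are assumptions chosen for illustration, not an existing numarray API. It avoids the C library for special values by using only math.isnan and math.isinf, and it treats two zero-size inputs as equal.

```python
import math

def seq_equal(a, b, rtol=0.0, atol=0.0, nan_equal=False):
    """Element-wise equality of two flat sequences with optional tolerance."""
    a, b = list(a), list(b)
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if math.isnan(x) or math.isnan(y):
            # nan never equals anything unless the caller opts in.
            if not (nan_equal and math.isnan(x) and math.isnan(y)):
                return False
        elif math.isinf(x) or math.isinf(y):
            # Infinities only match an infinity of the same sign.
            if x != y:
                return False
        elif abs(x - y) > atol + rtol * abs(y):
            return False
    return True

print(seq_equal([1.0, 2.0], [1.0, 2.0 + 1e-9], atol=1e-6))
```

A version for real arrays would also need to compare shapes, not just lengths; that part is left out here.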
> Design > > The (AFAICS) benefit-free but downside-rich introduction of `type` > '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' > > Is there any reason that Typecode objects that compare as desired to > the > relevant strings ("i", "d") wouldn't have done? Now there is an > explosion > and confusion of interfaces -- some numpy code will now only except > type(code)s as "typecode" keyword parameter (even in numarray! see > numarray.mlab!) and other stuff > > Never mind that type already is a highly overused word in the python > world. Personally, I like type because it's succinct and we have type objects, not single character codes. More importantly, Perry likes type, and the bottom line is that it's his shot to call and he's called it. [We wrestled with this a while. Given that the representation of the type had changed from a character code, typecode is clearly misleading and inappropriate. It is there only for backward compatibility; for new code to be used under numarray only, people shouldn't use it. Type certainly seemed by far the most descriptive and accurate term. It does have the drawback of overloading the type function. Other considerations were things like atype, but type is what we went with.] > The big method bloat. > ''''''''''''''''''''' > > As it says in the Numeric manual introductions there were "good > reasons" for I actually don't buy the reasons myself. Some methods are natural, convenient, and good so I need to hear more voices arguing this point before I'll budge. Clearly there is *some* bloat, but identifying what to ax is more difficult. I suppose we could do a vote to clean this up. > "very few array methods" -- now there are **56** public methods and > 8 public > attributes (public == not starting with '_'); of those 56 methods > about 11 > are accessors and of the rest about half are redundant or worse > (i.e. they > either also exist as numarray functions (argmin, argmax, diagonal, > ...) 
or Which of the public attributes do you have a problem with? Which accessors? > they really ought to be functions (mean, stddev) or they are quite > confusing The need for these is common so I thought it would be good to add them. Functions could be added as well. > (``a.min``, ``a.max`` These require tricks to get right so we added them. The doc-strings explain what they do. > which behave quite differenlty from ``a.argmin`` and > ``a.argmax``, Good point. These are inconsistent with min and max, which were added independently at a later date. I'm thinking we should deprecate the argmin and argmax methods, which I added hoping to do polymorphism for strings and records and if I recall correctly never did anyway. IMHO, min(), max(), mean(), and stddev() are simple, useful, and should remain. > never mind ``numarray.minimum``) or min != minimum, and because it is a little tricky to get right, we codified it as a method. > simply utterly pointless > (``a.nelements`` == ``a.size``)). I added nelements() because I needed it and didn't know about a.size()... simple as that. a.size() came later for compatibility only. [I'll argue that nelements is far clearer in meaning. What does size mean? Total bytes? Total number of elements? Sorry, I disagree on this one.] > If there really is a need for these (there isn't) if anything a.min > and a.max > should be called a.flatmin, a.flatmax flatmin is certainly clear, but the min/max docstrings also explain it with no fuzz. > - diagonal, mean, nelements, nonzero, ... nonzero(), and diagonal() I could care less about so they can probably be deprecated and removed. I like mean(). 
> - perversely the **only** function that I can think off that could > have > sensibly become a method hasn't: ``put`` (it used to work only on > arrays > under Numeric and not without reason, so making it a method would > have > been sensible; numarray.put of course also "works" on non-arrays, > it just > doesn't do anything with them) Well, we need the numarray.put() function for compatibility, and there's already a more succinct syntax for put(), which is array based indexing so I don't see any point in adding a put() method. > Test Code > ''''''''' > numtest.py doesn't inspire full confidence (it's about 1000 lines of > actual > code but it doesn't seem that clearly structured and AFAICT contains > no > single loop (and that despite the diversity of shapes, types etc. > that exist > in numarray -- why not try something slightly more systematic?)) Testing could certainly be better. unittest might work better for this kind of thing than doctest. I agree that we should test for a wider variety of shapes, types, sizes, and behaviors but it takes time and effort to do it so it hasn't been done yet. There's little doubt we'd find bugs and the system would be better for it. [On the other hand, is it the most important thing to do next? Any volunteers to improve the test suite? It may not be the most complete and systematic one out there, but it's at least as good as the one for Numeric ;-)] There's a lot of input here. We'll see what we can do. Thanks again. Regards, Todd [A few more editorial comments. When we started numarray, compatibility was not high on the list of priorities, so the initial implementation didn't focus on it. A number of the problems you point out reflect that origin. While it is more important, it isn't the only guide. We seek compatibility when there is no strong reason to be incompatible. 
But there are a number of issues where we definitely wanted different behavior (if it were to be completely compatible, we wouldn't have bothered in the first place; we needed some changes).

Given the odd corners you've run into, it makes me curious to see the code that generated this, particularly with regard to rank-0 arrays. If I get a chance I'll take a look at the link you provided. I wonder if it is typical of what other users will encounter or not. I guess our experience in porting scipy will give us a better indication.

To summarize what we see as work that should be done to address the points made:

rank-0 issues:

1) a.imag doesn't work
2) array(0, type=Float64) + 1e3000 => `inf` with right error modes, but
   array(0, type=Float32) + 1e3000 => `OverflowError`
3) numarray.array(10)/numarray.array(0) => 0
4) numarray.array(10000000000000L) => array(1316134912)
5) numarray.where(0,1,0) => array([0])
6) documentation of behavior (how to turn into scalar, that len and [0] indexing don't work, etc.)

Others

1) puts into lists should raise TypeError:
   l = [1,2,3]; numarray.put(l,numarray.array([1,2,0]),[0,0,0]); l => [1, 2, 3]
2) repr for zero length arrays needs to show type and other info.
3) rip out _align and _contiguous self-test hacks
4) improved object array handling (e.g., where and the like)
5) average function
6) change MemoryError to ValueError for ones(-5)
7) document matrixmultiply
8) support for __array__ protocol?
9) Documentation fix for p34 row vector usage.
10) Numeric to numarray conversion guide
11) Better tests

Most of these are not likely to get immediate attention as our focus now is on integrating scipy. To the extent they make it easier to do, their priority may be raised. There are a lot of "should"s but we have limited resources just like anyone else; we can't do it all at once.]
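
The value reported in rank-0 item 4 is what one gets by keeping only the low 32 bits of the input. A quick check in plain Python (an aside, not numarray code), assuming the default integer type was a 32-bit signed int:

```python
value = 10000000000000      # the long passed to numarray.array()
low32 = value % 2**32       # keep only the low 32 bits

# Reinterpret the result as a signed 32-bit integer (two's complement).
if low32 >= 2**31:
    low32 -= 2**32

print(low32)                # 1316134912
```

So the item is consistent with a silent truncation rather than an overflow error.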
From jmiller at stsci.edu Thu Oct 14 06:11:22 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Oct 14 06:11:22 2004 Subject: [Numpy-discussion] character arrays supported by C API? In-Reply-To: References: <1095253587.4624.380.camel@halloween.stsci.edu> Message-ID: <1097759076.4219.39.camel@halloween.stsci.edu> On Thu, 2004-10-14 at 04:20, Faheem Mitha wrote: > On Wed, 15 Sep 2004, Todd Miller wrote: > > > On Wed, 2004-09-15 at 00:52, Faheem Mitha wrote: > >> Dear People, > >> > >> Are character arrays supported by the Numarray C API? My impression from > >> the documentation is no, but I would appreciate a confirmation. Thanks. > >> > >> Faheem. > > > > Yes and no. CharArray is not as well supported from C as NumArray; > > there are no easy to call functions which will convert a nested sequence > > of strings into a CharArray. > > > > However, it is possible to call the Python functions in the CharArray > > module from C, and a pre-existing CharArray is a PyArrayObject so it > > can be manipulated in C as a struct; it's shape and strides are > > visible, it's itemsize is the length of the string, etc. > > > > What is it you want to do? What functions do you think would help? > > Hi. Sorry about the slow reply. > > What I want to do is extremely simple. I want to convert (in C++) a C++ > character array to a CharArray. The simplest way of doing this would be to > create an array of the appropriate size, and write character strings into > it element by element. > > So, a utility function which creates a character array of appropriate > dimensions would be useful. Also a utility function which convert a list > of strings into a Character Array would also be desirable. > > Currently I am having to work around this limitation by returning lists of > strings back to Python. I'd prefer to not have to do that. That's a sensible addition, but right now, such a function does not exist, and I don't have time to add it myself. 
The way to achieve this without C-API support by CharArray is to do a Python callback. The steps in C would be roughly:

0. Import the numarray.strings module. PyImport_ImportModule().
1. Get the module's dictionary object. PyModule_GetDict().
2. Get a pointer to CharArray by looking it up in the dictionary. PyDict_GetItemString().
3. Construct an argument tuple which contains the constructor parameters. Py_BuildValue().
4. Call the constructor using the arg tuple. The return value is the CharArray. PyObject_CallFunction().

Similar steps are done for NumArray in the current C-API in newarray.ch in NA_NewAllFromBuffer().

Regards,
Todd

From akulla at comcast.net Thu Oct 14 06:44:20 2004
From: akulla at comcast.net (akulla at comcast.net)
Date: Thu Oct 14 06:44:20 2004
Subject: [Numpy-discussion] Slow operation of nd_image.generic_filter
Message-ID: <101420041338.1510.416E8157000325EE000005E622007456720E04049A050E@comcast.net>

Hi all,

Could it be that the execution of the following function lasts more than 25 seconds, for an array of shape (256, 480)?

...
def myFunc(anArray, winSize=5):
    return numarray.nd_image.generic_filter(\
        input=anArray, function=lambda win: win.mean(),
        size=winSize, mode='constant')
...

Python 2.3, numarray 1.0 (XP, P4)

Regards,
Alban

From falted at pytables.org Fri Oct 15 04:27:55 2004
From: falted at pytables.org (Francesc Alted)
Date: Fri Oct 15 04:27:55 2004
Subject: [Numpy-discussion] numarray and ATLAS
Message-ID: <200410151318.40035.falted@pytables.org>

Hi,

Perhaps this is too recurrent a subject, but I'm having problems getting numarray to use ATLAS instead of the included mini-lapack.

I've installed ATLAS 3.6.0 on my pentium IV machine. I've made it a completely featured LAPACK by following the instructions in:

http://math-atlas.sourceforge.net/errata.html#completelp

and I'm pretty sure that the resulting library works.
Now, after exporting USE_LAPACK and set the appropiate directory for lapack_dirs in addons.py, the compilation went well (however, I can see that lapack_litemodule.c is still being compiled, and I don't know if that's normal or not). The command I've used to install is: $ python setup.py install --gencode --home=/users/exp/alted/bin-i686 And the error that happens during the test phase follows: $ python Python 2.3.4 (#1, Jul 22 2004, 20:47:54) [GCC 3.3.2 20031022 (Red Hat Linux 3.3.2-1)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numarray.testall as testall >>> testall.test() numarray: ((0, 1199), (0, 1199)) numarray.records: (0, 48) numarray.strings: (0, 176) numarray.memmap: (0, 82) numarray.objects: (0, 105) numarray.memorytest: (0, 16) numarray.examples.convolve: ((0, 20), (0, 20), (0, 20), (0, 20)) numarray.convolve: (0, 52) Traceback (most recent call last): File "", line 1, in ? File "/users/exp/alted/bin-i686/lib/python/numarray/testall.py", line 24, in test result = eval(p+".test()") File "", line 0, in ? File "/users/exp/alted/bin-i686/lib/python/numarray/fft/FFT.py", line 326, in test import dtest File "/users/exp/alted/bin-i686/lib/python/numarray/fft/dtest.py", line 238, in ? import numarray.random_array as random_array File "/users/exp/alted/bin-i686/lib/python/numarray/random_array/__init__.py", line 7, in ? from RandomArray2 import * File "/users/exp/alted/bin-i686/lib/python/numarray/random_array/RandomArray2.py", line 3, in ? import numarray.linear_algebra as linalg File "/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/__init__.py", line 1, in ? from LinearAlgebra2 import * File "/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/LinearAlgebra2.py", line 23, in ? 
    import lapack_lite2
ImportError: /users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/lapack_lite2.so: undefined symbol: dgesdd_

I've checked that the dgesdd symbol exists in my liblapack.a:

$ strings ~/bin-i686/lib/atlas/liblapack.a | grep dgesdd
dgesdd.o/ 1097832195 2514 515 100644 13788 `

but not a dgesdd_, as you can see.

Am I missing something?

--
Francesc Alted

From falted at pytables.org Fri Oct 15 10:07:40 2004
From: falted at pytables.org (Francesc Alted)
Date: Fri Oct 15 10:07:40 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <200410151318.40035.falted@pytables.org>
References: <200410151318.40035.falted@pytables.org>
Message-ID: <200410151903.41288.falted@pytables.org>

Hi,

Despite the fact that some errors arise, I've checked the numarray version linked against ATLAS, and it seems like it doesn't get the expected ATLAS boost:

>>> import timeit
>>> t1 = timeit.Timer("m3=numarray.dot(m1,m2)", "import numarray;dim1=500;m1=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32);m2=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32)")
>>> t1.repeat(3,10)
[3.7274820804595947, 3.8542821407318115, 3.7117569446563721]

However, Numeric seems to get it:

>>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');Numeric.reshape(m2,(dim1,dim1))")
>>> t3.repeat(3,10)
[0.0093162059783935547, 0.0096318721771240234, 0.0092968940734863281]

i.e. almost 300 faster than numarray

Is anyone getting the acceleration boost with numarray & ATLAS?

Cheers,

On Friday 15 October 2004 13:18, Francesc Alted wrote:
> Hi,
>
> Perhaps this is a too recurrent subject, but I'm having problems when
> making numarray to use ATLAS instead of the mini-lapack included.
>
> I've installed ATLAS 3.6.0 on my pentium IV machine.
I've made it a > completely featured LAPACK by following the instructions in: > > http://math-atlas.sourceforge.net/errata.html#completelp > > and I'm pretty sure that the resulting library works. Now, after exporting > USE_LAPACK and set the appropiate directory for lapack_dirs in addons.py, > the compilation went well (however, I can see that lapack_litemodule.c is > still being compiled, and I don't know if that's normal or not). The command > I've used to install is: > > $ python setup.py install --gencode --home=/users/exp/alted/bin-i686 > > And the error that happens during the test phase follows: > > $ python > Python 2.3.4 (#1, Jul 22 2004, 20:47:54) > [GCC 3.3.2 20031022 (Red Hat Linux 3.3.2-1)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import numarray.testall as testall > >>> testall.test() > numarray: ((0, 1199), (0, 1199)) > numarray.records: (0, 48) > numarray.strings: (0, 176) > numarray.memmap: (0, 82) > numarray.objects: (0, 105) > numarray.memorytest: (0, 16) > numarray.examples.convolve: ((0, 20), (0, 20), (0, 20), (0, 20)) > numarray.convolve: (0, 52) > Traceback (most recent call last): > File "", line 1, in ? > File "/users/exp/alted/bin-i686/lib/python/numarray/testall.py", line 24, in test > result = eval(p+".test()") > File "", line 0, in ? > File "/users/exp/alted/bin-i686/lib/python/numarray/fft/FFT.py", line 326, in test > import dtest > File "/users/exp/alted/bin-i686/lib/python/numarray/fft/dtest.py", line 238, in ? > import numarray.random_array as random_array > File "/users/exp/alted/bin-i686/lib/python/numarray/random_array/__init__.py", line 7, in ? > from RandomArray2 import * > File "/users/exp/alted/bin-i686/lib/python/numarray/random_array/RandomArray2.py", line 3, in ? > import numarray.linear_algebra as linalg > File "/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/__init__.py", line 1, in ? 
> from LinearAlgebra2 import * > File "/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/LinearAlgebra2.py", line 23, in ? > import lapack_lite2 > ImportError: > /users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/lapack_lite2.so: > undefined symbol: dgesdd_ > > I've checked that dgesdd symbol exists on my liblapack.a: > > $ strings ~/bin-i686/lib/atlas/liblapack.a | grep dgesdd > dgesdd.o/ 1097832195 2514 515 100644 13788 ` > > but not a dgesdd_, as you can see. > > I'm missing something? > -- Francesc Alted From dd55 at cornell.edu Fri Oct 15 14:18:41 2004 From: dd55 at cornell.edu (Darren Dale) Date: Fri Oct 15 14:18:41 2004 Subject: [Numpy-discussion] how to deal with large arrays Message-ID: <200410151714.38492.dd55@cornell.edu> Hello, I have two 2D arrays, Q is 3-by-q and R is r-by-3. At each q, I need to sum q(r) over R, so I take a dot product RQ and then sum along one axis to get a 1-by-q result. I'm doing this with dot products because it is much faster than the equivalent for or while loop. The intermediate r-by-q array can get very large though (200MB in my case), so I was wondering if there is a better way to go about it? If not, I can slice up R and deal with it one chunk at a time, then the intermediate arrays fit within the available system resources. Would somebody offer a suggestion of how to do this intelligently? Should the intermediate array be about the size of the processor cache, some fraction of the available memory, or is there something else I need to consider? Thank you, Darren From tim.hochberg at cox.net Fri Oct 15 15:11:05 2004 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Oct 15 15:11:05 2004 Subject: [Numpy-discussion] how to deal with large arrays In-Reply-To: <200410151714.38492.dd55@cornell.edu> References: <200410151714.38492.dd55@cornell.edu> Message-ID: <41704A3C.5080802@cox.net> Darren Dale wrote: >Hello, > >I have two 2D arrays, Q is 3-by-q and R is r-by-3. 
At each q, I need to sum
>q(r) over R, so I take a dot product RQ and then sum along one axis to get a
>1-by-q result.
>
>I'm doing this with dot products because it is much faster than the equivalent
>for or while loop. The intermediate r-by-q array can get very large though
>(200MB in my case), so I was wondering if there is a better way to go about
>it?
>
>
I think so. I believe you are doing something like this:

    result_1 = na.sum(na.dot(R,Q), 0)

I'm fairly certain (but I urge you to double check), that this reduces to:

    result_2 = na.dot(na.sum(R, 0), Q)

which will take up much less intermediate storage and be faster to boot. In more quasi-mathematical notations:

    result_1 => sum_i sum_j R_ij Q_jk
              = sum_j sum_i R_ij Q_jk
              = sum_j Q_jk sum_i R_ij
             => result_2

A quick test seems to confirm this:

    import numarray as na
    from numarray import random_array
    q = 10
    r = 12
    R = random_array.random((r,3))
    Q = random_array.random((3,q))
    x1 = na.sum(na.dot(R,Q), 0)
    x2 = na.dot(na.sum(R, 0), Q)
    print na.allclose(x1, x2)

-tim

>If not, I can slice up R and deal with it one chunk at a time, then the
>intermediate arrays fit within the available system resources. Would somebody
>offer a suggestion of how to do this intelligently? Should the intermediate
>array be about the size of the processor cache, some fraction of the
>available memory, or is there something else I need to consider?
>
>Thank you,
>Darren
>

From dd55 at cornell.edu Fri Oct 15 16:29:03 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Fri Oct 15 16:29:03 2004
Subject: [Numpy-discussion] how to deal with large arrays
In-Reply-To: <41704A3C.5080802@cox.net>
References: <200410151714.38492.dd55@cornell.edu> <41704A3C.5080802@cox.net>
Message-ID: <200410151927.54005.dd55@cornell.edu>

Thank you for your response, Tim,

On Friday 15 October 2004 06:07 pm, Tim Hochberg wrote:
> Darren Dale wrote:
> >Hello,
> >
> >I have two 2D arrays, Q is 3-by-q and R is r-by-3. At each q, I need to
> > sum q(r) over R, so I take a dot product RQ and then sum along one axis
> > to get a 1-by-q result.
> >
> >I'm doing this with dot products because it is much faster than the
> > equivalent for or while loop. The intermediate r-by-q array can get very
> > large though (200MB in my case), so I was wondering if there is a better
> > way to go about it?
>
> I'm fairly certain (but I urge you to double check), that this reduces to:
>
> result_2 = na.dot(na.sum(R, 0), Q)
>

Yes. As usual, I left out a bit of information that turned out to be important. See below.

A modified test:

    from numarray import *
    from numarray import random_array
    q = 10
    r = 12
    R = random_array.random((r,3))
    Q = random_array.random((3,q))
    x1 = sum( exp(1j*dot(R,Q)), 0) #note complex argument to exp()
    x2 = exp(1j*dot(sum(R, 0), Q))
    print allclose(x1, x2)

The complex arithmetic changes things. I am still learning how to keep my code efficient.
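
To make the difference between the two tests concrete, here is a small pure-Python check. Nested lists stand in for numarray arrays, and `dot` and `column_sums` are ad hoc helper names, so treat it as a sketch only. Swapping the sum and the dot product is valid because both are linear; once exp() sits between them, the sum of exponentials is no longer the exponential of the sum.

```python
import cmath


def dot(A, B):
    """Matrix product of nested lists A (n-by-m) and B (m-by-p)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def column_sums(M):
    """Sum a nested-list matrix along its first axis."""
    return [sum(col) for col in zip(*M)]


R = [[0.1, 0.2, 0.3],
     [0.4, 0.5, 0.6],
     [0.7, 0.8, 0.9],
     [1.0, 1.1, 1.2]]          # r-by-3
Q = [[0.5, 1.5],
     [2.5, 3.5],
     [4.5, 5.5]]               # 3-by-q

# Linear case: sum over rows of R.Q equals (column sums of R).Q
lhs = column_sums(dot(R, Q))
rhs = dot([column_sums(R)], Q)[0]
linear_ok = all(abs(x - y) < 1e-12 for x, y in zip(lhs, rhs))

# Nonlinear case: exp() is applied before the sum on one side only
lhs_exp = column_sums([[cmath.exp(1j * v) for v in row] for row in dot(R, Q)])
rhs_exp = [cmath.exp(1j * v) for v in rhs]
exp_ok = all(abs(x - y) < 1e-12 for x, y in zip(lhs_exp, rhs_exp))

print(linear_ok)
print(exp_ok)
```

The first check passes and the second fails, which is why the r-by-q intermediate (or a slice of it at a time) cannot be avoided in the complex-exponential version.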
The following code is actually almost as fast as using the large dot product; apparently I had some other sinks in my original tests:

    phase = zeros(len(Q[0]),'d')
    for i in range(len(Q[0])):
        phase[i] = phase[i] + sum(exp(1j*dot(R,Q[:,i])), 0)

If q=1000 and r=2500, the for loop takes about 13% longer than the dot product method. Incredibly, if q=10,000 and r=2500, the for loop is 17% faster. So I am going to use it instead. Apparently I had some other time sink in my original test.

    from numarray import *
    from numarray import random_array
    from time import clock
    q = 10000
    r = 2500
    R = random_array.random((r,3))
    Q = random_array.random((3,q))
    t0 = clock()
    x1 = sum(exp(1j*dot(R,Q)), 0) #note complex argument to exp()
    t1 = clock()
    dt1 = t1-t0
    phase = zeros(len(Q[0]),'d')
    for i in range(len(Q[0])):
        phase[i] = phase[i] + sum(exp(1j*dot(R,Q[:,i])), 0)
    t2 = clock()
    dt2 = t2-t1
    print (dt2-dt1)/dt1

--
Darren

From falted at pytables.org Sat Oct 16 04:29:02 2004
From: falted at pytables.org (Francesc Alted)
Date: Sat Oct 16 04:29:02 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <200410151903.41288.falted@pytables.org>
References: <200410151318.40035.falted@pytables.org> <200410151903.41288.falted@pytables.org>
Message-ID: <200410161327.47485.falted@pytables.org>

On Friday 15 October 2004 19:03, Francesc Alted wrote:
> >>> import timeit
> >>> t1 = timeit.Timer("m3=numarray.dot(m1,m2)", "import numarray;dim1=500;m1=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32);m2=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32)")
> >>> t1.repeat(3,10)
> [3.7274820804595947, 3.8542821407318115, 3.7117569446563721]
>
> However, Numeric seems to get it:
>
> >>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');Numeric.reshape(m2,(dim1,dim1))")
> >>> t3.repeat(3,10)
> [0.0093162059783935547,
0.0096318721771240234, 0.0092968940734863281]
>
> i.e. almost 300 faster than numarray

Ooops! The Numeric test had a bug in it. The correct test would be:

>>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');m1=Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');m2=Numeric.reshape(m2,(dim1,dim1))")
>>> t3.repeat(3,10)
[0.47363090515136719, 0.47403502464294434, 0.47770595550537109]

which is 8 times faster, more or less, than numarray (or Numeric) without ATLAS.

Just to clarify things ;)

--
Francesc Alted

From aisaac at american.edu Sat Oct 16 15:53:01 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Sat Oct 16 15:53:01 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <1097609991.30171.556.camel@halloween.stsci.edu>
References: <1097550159.2568.5.camel@localhost.localdomain><1097595580.24491.4.camel@freyer.sfo.csun.edu><1097609991.30171.556.camel@halloween.stsci.edu>
Message-ID:

On 12 Oct 2004, Todd Miller apparently wrote:
> Go here:
> http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse
> then "Submit New", and set the "category" to "documentation".

Done.

Thanks,
Alan Isaac

From aisaac at american.edu Sat Oct 16 15:53:02 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Sat Oct 16 15:53:02 2004
Subject: [Numpy-discussion] matrixmultiply: return type
Message-ID:

Being new to numerical Python applications, I was a little puzzled/concerned when I read
http://sourceforge.net/tracker/index.php?func=detail&aid=984368&group_id=1369&atid=450446

I *think* the answer is: matrixmultiply will always return an array.

Is there a stable view about what type of object will be returned by matrixmultiply? Currently, to my initial surprise, it returns an array when the arguments are matrices. Is this stable? Might an optional argument to specify the return type be desirable?
Thank you, Alan Isaac From jmiller at stsci.edu Sat Oct 16 18:27:04 2004 From: jmiller at stsci.edu (Todd Miller) Date: Sat Oct 16 18:27:04 2004 Subject: [Numpy-discussion] documentation error In-Reply-To: References: <1097550159.2568.5.camel@localhost.localdomain> <1097595580.24491.4.camel@freyer.sfo.csun.edu> <1097609991.30171.556.camel@halloween.stsci.edu> Message-ID: <1097976412.3744.159.camel@localhost.localdomain> On Sat, 2004-10-16 at 17:17, Alan G Isaac wrote: > On 12 Oct 2004, Todd Miller apparently wrote: > > Go here: > > http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse > > then "Submit New", and set the "category" to "documentation. > > > Done. > > Thanks, > Alan Isaac As it turns out, I misdirected you. The above link is for numarray bugs. This link is for Numeric bugs: http://sourceforge.net/tracker/?group_id=1369&atid=101369 I moved the diagonal doc bug report to the Numeric bugs tracker. Regards, Todd From jmiller at stsci.edu Sat Oct 16 18:50:04 2004 From: jmiller at stsci.edu (Todd Miller) Date: Sat Oct 16 18:50:04 2004 Subject: [Numpy-discussion] numarray and ATLAS In-Reply-To: <200410161327.47485.falted@pytables.org> References: <200410151318.40035.falted@pytables.org> <200410151903.41288.falted@pytables.org> <200410161327.47485.falted@pytables.org> Message-ID: <1097977801.3744.184.camel@localhost.localdomain> On Sat, 2004-10-16 at 07:27, Francesc Alted wrote: > A Divendres 15 Octubre 2004 19:03, Francesc Alted va escriure: > > >>> import timeit > > >>> t1 = timeit.Timer("m3=numarray.dot(m1,m2)", "import numarray;dim1=500;m1=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32);m2=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32)") > > >>> t1.repeat(3,10) > > [3.7274820804595947, 3.8542821407318115, 3.7117569446563721] > > > > However, Numeric seems to get it: > > > > >>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, 
typecode='f');Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');Numeric.reshape(m2,(dim1,dim1))") > > >>> t3.repeat(3,10) > > [0.0093162059783935547, 0.0096318721771240234, 0.0092968940734863281] > > > > i.e. almost 300 faster than numarray > > Ooops! The Numeric test had a bug on it. The correct test would be: > > >>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');m1=Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');m2=Numeric.reshape(m2,(dim1,dim1))") > >>> t3.repeat(3,10) > [0.47363090515136719, 0.47403502464294434, 0.47770595550537109] > > which is 8 times faster, more or less, than numarray (or Numeric) without > ATLAS. > > Just to clarify things ;) Hi Francesc, I don't think numarray dot() will pick up any boost at all from ATLAS because it's not written to do it. Besides that, there are two performance problems I know of with numarray's dot() which may dominate or dilute any ATLAS benefits: 1. dot() requires array creation. 2. dot() requires array copies. Because it has a class hierarchy and a memory buffer object, numarray is at a disadvantage for (1). (2) just hasn't been optimized yet for noncontiguous arrays which (I think) are always present when dot() starts with two contiguous array parameters. Regards, Todd From stephen.walton at csun.edu Sun Oct 17 17:35:03 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Sun Oct 17 17:35:03 2004 Subject: [Numpy-discussion] New LAPACK and ScaLAPACK planned Message-ID: <1098059497.5110.5.camel@localhost.localdomain> From volume 4 #37 of the NA-Digest mailing list. I hope this is of enough interest to this list to justify the cross post. From dongarra at cs.utk.edu Fri Oct 15 04:10:44 2004 From: dongarra at cs.utk.edu (Jack Dongarra) Date: Fri, 15 Oct 2004 04:10:44 -0400 Subject: New Release of LAPACK and ScaLAPACK Planned Message-ID: New Release of LAPACK and ScaLAPACK planned. 
We are pleased to announce that we recently received NSF funding for new releases of the LAPACK and ScaLAPACK linear algebra libraries. The proposal pointed out the new and better algorithms that have been developed by many people in the community since the first releases of these libraries, as well as more obvious gaps and possible improvements. The proposal listed a large number of activities, which we now need to prioritize. There are a number of design decisions that still need to be made, for which we are interested in your input. For this purpose, we would like to remind you of a web page to collect your input that we originally announced on NA-Digest while we were preparing the proposal: http://icl.cs.utk.edu/lapack-survey.html In addition to the questions on that form, we are interested in your opinion on all aspects of the proposal, a copy of which you may find at: http://www.cs.berkeley.edu/~demmel/Sca-LAPACK-Proposal.pdf Thanks, Jim Demmel and Jack Dongarra --=20 Stephen Walton Dept. 
of Physics & Astronomy, CSU Northridge

From falted at pytables.org Mon Oct 18 01:30:01 2004
From: falted at pytables.org (Francesc Alted)
Date: Mon Oct 18 01:30:01 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <1097977801.3744.184.camel@localhost.localdomain>
References: <200410151318.40035.falted@pytables.org> <200410161327.47485.falted@pytables.org> <1097977801.3744.184.camel@localhost.localdomain>
Message-ID: <200410181029.14879.falted@pytables.org>

Hi Todd,

On Sunday 17 October 2004 03:50, Todd Miller wrote:
> I don't think numarray dot() will pick up any boost at all from ATLAS
> because it's not written to do it. Besides that, there are two
> performance problems I know of with numarray's dot() which may dominate
> or dilute any ATLAS benefits:
>
> 1. dot() requires array creation.

Yes, but my guess is that for large arrays, this time should be negligible compared with the multiplication time.

> 2. dot() requires array copies.

Mmm, you mean even for well-behaved arrays? Sorry, but I don't understand why.

May I ask if there is any plan to complete a better integration of external LAPACK libraries in numarray, or is this considered low priority?

Never mind, I don't need this functionality right now. It's just that I'm preparing a series of 'hands-on' sessions about Python and Scientific Computing, and I was trying to understand the current advantages and limitations of numarray compared with NumPy.
Cheers, -- Francesc Alted From jmiller at stsci.edu Mon Oct 18 04:53:01 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Oct 18 04:53:01 2004 Subject: [Numpy-discussion] numarray and ATLAS In-Reply-To: <200410181029.14879.falted@pytables.org> References: <200410151318.40035.falted@pytables.org> <200410161327.47485.falted@pytables.org> <1097977801.3744.184.camel@localhost.localdomain> <200410181029.14879.falted@pytables.org> Message-ID: <1098100329.3741.96.camel@localhost.localdomain> On Mon, 2004-10-18 at 04:29, Francesc Alted wrote: > Hi Todd, > > A Diumenge 17 Octubre 2004 03:50, Todd Miller va escriure: > > I don't think numarray dot() will pick up any boost at all from ATLAS > > because it's not written to do it. Besides that, there are two > > performance problems I know of with numarray's dot() which may dominate > > or dilute any ATLAS benefits: > > > > 1. dot() requires array creation. > > Yes, but my guess is that for large arrays, this time should be negligible > compared with the multiplication time. > Probably true. I should measure this. For small computations, it's an issue. > > 2. dot() requires array copies. > > Mmm, you mean even for well-behaved arrays? Sorry, but I don't understand > why. I looked at this some this morning, trying to figure out why this is a problem only for numarray. It turns out that Numeric strides its arrays to get around the copy. When I implemented numarray, I chose not to stride because I thought it would be too slow... Recently I realized that one input array to dot() is *always* transposed and therefore likely noncontiguous and therefore copied. I think it's now possible to simply port the Numeric code so I'll look into that. > May I ask if there is any plan to complete a better integration of external > LAPACK libraries in numarray or this is considered low priority? Perry may answer this. I have no immediate plans for it... it does sound like enough people need this that it should be done. 
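The point about transposition and copying can be made concrete with a small pure-Python sketch. It is only an illustration under stated assumptions: the names StridedView, matrix, transpose and dot are hypothetical and are not numarray or Numeric API. A row-major matrix is kept in a flat list; transposing swaps shape and strides without touching the data, which leaves the view noncontiguous, and a stride-aware dot() can read it in place instead of copying it first.

```python
class StridedView:
    """A 2-D view onto a flat Python list (hypothetical, for illustration)."""

    def __init__(self, data, shape, strides):
        self.data = data          # shared flat buffer, never copied here
        self.shape = shape        # (rows, cols)
        self.strides = strides    # step, in elements, along each axis

    def __getitem__(self, index):
        i, j = index
        return self.data[i * self.strides[0] + j * self.strides[1]]

    def is_contiguous(self):
        # Row-major contiguous: one column step moves one element,
        # one row step moves a whole row.
        return self.strides == (self.shape[1], 1)


def matrix(rows):
    """Build a contiguous row-major view from a list of rows."""
    n, m = len(rows), len(rows[0])
    flat = [x for row in rows for x in row]
    return StridedView(flat, (n, m), (m, 1))


def transpose(view):
    """Swap shape and strides; the buffer itself is shared."""
    return StridedView(view.data, view.shape[::-1], view.strides[::-1])


def dot(a, b):
    """Matrix product that indexes through strides, so no input is copied."""
    n, k = a.shape
    k2, m = b.shape
    if k != k2:
        raise ValueError("shapes are not aligned")
    return [[sum(a[i, p] * b[p, j] for p in range(k)) for j in range(m)]
            for i in range(n)]


a = matrix([[1, 2], [3, 4]])
b = matrix([[5, 6], [7, 8]])
bt = transpose(b)                 # represents [[5, 7], [6, 8]]
product = dot(a, bt)
```

Here bt shares b's buffer and is no longer contiguous, yet dot() still reads it correctly. The trade-off is that strided access gives up the sequential memory order that makes a contiguous copy fast to traverse.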
Regards, Todd From perry at stsci.edu Mon Oct 18 05:21:02 2004 From: perry at stsci.edu (Perry Greenfield) Date: Mon Oct 18 05:21:02 2004 Subject: [Numpy-discussion] numarray and ATLAS In-Reply-To: <1098100329.3741.96.camel@localhost.localdomain> References: <200410151318.40035.falted@pytables.org> <200410161327.47485.falted@pytables.org> <1097977801.3744.184.camel@localhost.localdomain> <200410181029.14879.falted@pytables.org> <1098100329.3741.96.camel@localhost.localdomain> Message-ID: On Oct 18, 2004, at 7:52 AM, Todd Miller wrote: > On Mon, 2004-10-18 at 04:29, Francesc Alted wrote: > >> May I ask if there is any plan to complete a better integration of >> external >> LAPACK libraries in numarray or this is considered low priority? > > Perry may answer this. I have no immediate plans for it... it does > sound like enough people need this that it should be done. > Like Todd says, it does sound like this needs to be done. I think it takes a back seat to doing the scipy integration in general, but will need to be addressed soon thereafter. Perry From frank.horowitz at csiro.au Mon Oct 18 23:33:03 2004 From: frank.horowitz at csiro.au (Frank Horowitz) Date: Mon Oct 18 23:33:03 2004 Subject: [Numpy-discussion] Numeric Underflow Exceptions: Recommendations? Message-ID: <1098167541.8538.48.camel@localhost> Hi all, Using Numeric 23.5 I've been bitten by the dreaded 'floating point underflow throws an "OverflowError: math range error" instead of silently returning zero' bug. My setup is Debian unstable (Sid) on an i386, and I am using Debian's binary package "python-numeric". I understand from googling past discussions that this is (used to be?) phase-of-the-moon stuff, depending mostly upon architecture, options at libm compilation time of libc6. Several references to a trick of adding "-lieee" to the link list succeeding in taming the bug were mentioned around the era of Python2.0. 
My questions are these: Is there some higher level way of dealing with underflow now in Numeric? Or am I going to have to track down wherever "-lieee" has disappeared to in Debian, and recompile Numeric in the hopes that that still cures the problem? Any other tricks up people's sleeves for dealing with this? (I already know about exp_safe in Fernando Perez' IPython/numutils.py, BTW. I'm kind of hoping for a library level fix though, since my code is littered with "Numeric.exp()" calls.) TIA for any help you might be able to provide! Cheers, Frank Horowitz From falted at pytables.org Tue Oct 19 01:35:05 2004 From: falted at pytables.org (Francesc Alted) Date: Tue Oct 19 01:35:05 2004 Subject: [Numpy-discussion] numarray and ATLAS In-Reply-To: <1098100329.3741.96.camel@localhost.localdomain> References: <200410151318.40035.falted@pytables.org> <200410181029.14879.falted@pytables.org> <1098100329.3741.96.camel@localhost.localdomain> Message-ID: <200410191034.08018.falted@pytables.org> A Dilluns 18 Octubre 2004 13:52, Todd Miller va escriure: > > > 1. dot() requires array creation. > > > > Yes, but my guess is that for large arrays, this time should be negligible > > compared with the multiplication time. > > > > Probably true. I should measure this. For small computations, it's an > issue. Well, for small arrays ATLAS (or any other optimized LAPACK library) can't probably do much better than lapack lite, so I think you should not worry about this anyway. > Perry may answer this. I have no immediate plans for it... it does > sound like enough people need this that it should be done. Ok. Thanks for information, -- Francesc Alted From flin at broadpark.no Wed Oct 20 02:23:30 2004 From: flin at broadpark.no (Frank Lindseth) Date: Wed Oct 20 02:23:30 2004 Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003 Message-ID: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> Hi, I need numeric in a python2.4 / win32 project. 
Is there a binary installer somewhere? I tried to compile it from source but ran into the following problem (see below): Where are the libs supposed to come from? - Frank
C:\users\frankl\download\Numeric-23.5>c:\Python24\python.exe setup.py install
running install
running build
running build_py
running build_ext
building 'lapack_lite' extension
C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe /DLL /nologo /INCREMENTAL:NO /LIBPATH:/usr/lib/atlas /LIBPATH:c:\Python24\libs /LIBPATH:c:\Python24\PCBuild lapack.lib cblas.lib f77blas.lib atlas.lib g2c.lib /EXPORT:initlapack_lite build\temp.win32-2.4\Release\Src\lapack_litemodule.obj /OUT:build\lib.win32-2.4\lapack_lite.pyd /IMPLIB:build\temp.win32-2.4\Release\Src\lapack_lite.lib
LINK : fatal error LNK1181: cannot open input file 'lapack.lib'
error: command '"C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe"' failed with exit status 1181
C:\users\frankl\download\Numeric-23.5>
From stephen.walton at csun.edu Wed Oct 20 09:22:37 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Wed Oct 20 09:22:37 2004 Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003 In-Reply-To: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> Message-ID: <1098288859.7182.11.camel@sunspot.csun.edu> On Wed, 2004-10-20 at 11:17 +0200, Frank Lindseth wrote: > LINK : fatal error LNK1181: cannot open input file 'lapack.lib' Edit setup.py, setting the variables library_dirs_list and libraries_list to empty lists, and try again. List: shouldn't this be the default? Right now Numeric looks for ATLAS by default. 
-- Stephen Walton, Professor of Physics and Astronomy, California State University, Northridge stephen.walton at csun.edu From flin at broadpark.no Wed Oct 20 11:46:27 2004 From: flin at broadpark.no (flin at broadpark.no) Date: Wed Oct 20 11:46:27 2004 Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003 In-Reply-To: <1098288859.7182.11.camel@sunspot.csun.edu> References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> <1098288859.7182.11.camel@sunspot.csun.edu> Message-ID: <1098297610.4176b10a3b4d2@webmail.broadpark.no> Thank you for the reply, Stephen. I did as you suggested:
library_dirs_list = []
libraries_list = []
#library_dirs_list = ['/usr/lib/atlas']
#libraries_list = ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c']
but it still won't install (see below). Any suggestions?
C:\users\frankl\download\Numeric-23.5>c:\Python24\python.exe setup.py install
running install
running build
running build_py
running build_ext
building 'lapack_lite' extension
C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe /DLL /nologo /INCREMENTAL:NO /LIBPATH:c:\Python24\libs /LIBPATH:c:\Python24\PCBuild /EXPORT:initlapack_lite build\temp.win32-2.4\Release\Src\lapack_litemodule.obj /OUT:build\lib.win32-2.4\lapack_lite.pyd /IMPLIB:build\temp.win32-2.4\Release\Src\lapack_lite.lib
Creating library build\temp.win32-2.4\Release\Src\lapack_lite.lib and object build\temp.win32-2.4\Release\Src\lapack_lite.exp
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgeev_ referenced in function _lapack_lite_dgeev
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dsyevd_ referenced in function _lapack_lite_dsyevd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zheevd_ referenced in function _lapack_lite_zheevd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgelsd_ referenced in function _lapack_lite_dgelsd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgesv_ referenced in function _lapack_lite_dgesv
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgesdd_ referenced in function _lapack_lite_dgesdd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgetrf_ referenced in function _lapack_lite_dgetrf
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dpotrf_ referenced in function _lapack_lite_dpotrf
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgeev_ referenced in function _lapack_lite_zgeev
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgelsd_ referenced in function _lapack_lite_zgelsd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgesv_ referenced in function _lapack_lite_zgesv
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgesdd_ referenced in function _lapack_lite_zgesdd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgetrf_ referenced in function _lapack_lite_zgetrf
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zpotrf_ referenced in function _lapack_lite_zpotrf
build\lib.win32-2.4\lapack_lite.pyd : fatal error LNK1120: 14 unresolved externals
error: command '"C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe"' failed with exit status 1120
C:\users\frankl\download\Numeric-23.5>
Quoting Stephen Walton : > On Wed, 2004-10-20 at 11:17 +0200, Frank Lindseth wrote: > > > LINK : fatal error LNK1181: cannot open input file 'lapack.lib' > > Edit setup.py, setting the variables library_dirs_list and > libraries_list to empty lists, and try again. > > List: shouldn't this be the default? Right now Numeric looks for ATLAS > by default. 
> > -- > Stephen Walton, Professor of Physics and Astronomy, > California State University, Northridge > stephen.walton at csun.edu > > From stephen.walton at csun.edu Wed Oct 20 12:09:00 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Wed Oct 20 12:09:00 2004 Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003 In-Reply-To: <1098297610.4176b10a3b4d2@webmail.broadpark.no> References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> <1098288859.7182.11.camel@sunspot.csun.edu> <1098297610.4176b10a3b4d2@webmail.broadpark.no> Message-ID: <1098299055.7182.33.camel@sunspot.csun.edu> On Wed, 2004-10-20 at 20:40 +0200, flin at broadpark.no wrote: > Thank you for the replay Stephen, > I did as you suggested: > library_dirs_list = [] > libraries_list = [] > #library_dirs_list = ['/usr/lib/atlas'] > #libraries_list = ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c'] > > but it still woun't install (se below) > Any suggestions? I'm guessing you still have files left over from last time. On Unix, you can run the 'makeclean.sh' script. On Windows, manually deleting the directories listed in that script (they are all called build) should do the trick. Then try the 'setup.py build' again. -- Stephen Walton, Professor of Physics and Astronomy, California State University, Northridge stephen.walton at csun.edu From flin at broadpark.no Wed Oct 20 15:59:45 2004 From: flin at broadpark.no (flin at broadpark.no) Date: Wed Oct 20 15:59:45 2004 Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003 In-Reply-To: <1098299055.7182.33.camel@sunspot.csun.edu> References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> <1098288859.7182.11.camel@sunspot.csun.edu> <1098297610.4176b10a3b4d2@webmail.broadpark.no> <1098299055.7182.33.camel@sunspot.csun.edu> Message-ID: <1098312989.4176ed1d83de1@webmail.broadpark.no> Thanks again Stephen. Still no success. 
I deleted the whole Numeric-directory-tree, unzipped a newly downloaded src-file, edited the setup.py as you suggested, tried to run the installer, same error. I'm not sure what to do next? (why can't somebody just make a binary installer for python2.4, after all it's in beta now...) - Frank -------- Quoting Stephen Walton : > On Wed, 2004-10-20 at 20:40 +0200, flin at broadpark.no wrote: > > Thank you for the reply, Stephen. > > I did as you suggested: > > library_dirs_list = [] > > libraries_list = [] > > #library_dirs_list = ['/usr/lib/atlas'] > > #libraries_list = ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c'] > > > > but it still won't install (see below) > > Any suggestions? > > I'm guessing you still have files left over from last time. On Unix, > you can run the 'makeclean.sh' script. On Windows, manually deleting > the directories listed in that script (they are all called build) should > do the trick. Then try the 'setup.py build' again. > > -- > Stephen Walton, Professor of Physics and Astronomy, > California State University, Northridge > stephen.walton at csun.edu > > From stephen.walton at csun.edu Wed Oct 20 16:47:52 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Wed Oct 20 16:47:52 2004 Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003 In-Reply-To: <1098312989.4176ed1d83de1@webmail.broadpark.no> References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> <1098288859.7182.11.camel@sunspot.csun.edu> <1098297610.4176b10a3b4d2@webmail.broadpark.no> <1098299055.7182.33.camel@sunspot.csun.edu> <1098312989.4176ed1d83de1@webmail.broadpark.no> Message-ID: <1098315982.7159.2.camel@freyer.sfo.csun.edu> On Wed, 2004-10-20 at 15:56, flin at broadpark.no wrote: > Thanks again Stephen. > Still no success. Sorry. Being a Linux user I'm afraid I can't help much. > I'm not sure what to do next? Download SciPy from http://www.scipy.org/? 
It is much more than you actually need, being all of Scientific Python as well as Numeric, but at least it's an all-in-one installer. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From mdehoon at ims.u-tokyo.ac.jp Wed Oct 20 21:40:04 2004 From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon) Date: Wed Oct 20 21:40:04 2004 Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003 In-Reply-To: <1098312989.4176ed1d83de1@webmail.broadpark.no> References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> <1098288859.7182.11.camel@sunspot.csun.edu> <1098297610.4176b10a3b4d2@webmail.broadpark.no> <1098299055.7182.33.camel@sunspot.csun.edu> <1098312989.4176ed1d83de1@webmail.broadpark.no> Message-ID: <41772BE1.5020403@ims.u-tokyo.ac.jp> flin at broadpark.no wrote: > Thanks again Stephen. > Still no success. > I deleted the whole Numeric-directory-tree, > unzipped a newly downloaded src-file, > edited the setup.py as you suggested, > tried to run the installer, > same error. Previously I managed to compile Numeric for Python 2.4 on Windows, using the MinGW compiler and Atlas. If you still need it, I can send you the binaries. > > I'm not sure what to do next? > (why can't somebody just make a binary installer for python2.4, > after all it's in beta now...) There is a bug in Python 2.4 that prevents users from running bdist_wininst to create a binary installer. python setup.py install fails too. See bug 1021756 on sourceforge. --Michiel, U Tokyo. 
-- Michiel de Hoon, Assistant Professor University of Tokyo, Institute of Medical Science Human Genome Center 4-6-1 Shirokane-dai, Minato-ku Tokyo 108-8639 Japan http://bonsai.ims.u-tokyo.ac.jp/~mdehoon From stark at tuebingen.mpg.de Wed Oct 20 23:49:23 2004 From: stark at tuebingen.mpg.de (Sebastian Stark) Date: Wed Oct 20 23:49:23 2004 Subject: [Numpy-discussion] Re: numarray and ATLAS Message-ID: <200410210846.09275.stark@tuebingen.mpg.de> > Perhaps this is a too recurrent subject, but I'm having problems when > making numarray to use ATLAS instead of the mini-lapack included. I had to change lapack_libs and lapack_dirs in addons.py to read:
lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm']
lapack_dirs = ['/usr/local/lib/ATLAS']
I have all my .a files in /usr/local/lib/ATLAS so I can control which ones I'm actually linking against.
mosel ~ % ls -l /usr/local/lib/ATLAS
total 14608
-rw-r--r-- 1 root staff 7952316 Oct 20 10:03 libatlas.a
-rw-r--r-- 1 root staff 277592 Oct 20 10:03 libcblas.a
-rw-r--r-- 1 root staff 261060 Oct 20 10:45 libf2c.a
-rw-r--r-- 1 root staff 353278 Oct 20 10:03 libf77blas.a
-rw-r--r-- 1 root staff 5734736 Oct 20 10:42 liblapack.a
-rw-r--r-- 1 root staff 324968 Oct 20 10:03 libtstatlas.a
-Sebastian (and yes, I get a significant speed boost from ATLAS) -- Sebastian Stark -- http://www.kyb.tuebingen.mpg.de/~stark Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tuebingen Phone: +49 7071 601 555 -- Fax: +49 7071 601 552 From stark at tuebingen.mpg.de Wed Oct 20 23:56:14 2004 From: stark at tuebingen.mpg.de (Sebastian Stark) Date: Wed Oct 20 23:56:14 2004 Subject: [Numpy-discussion] indexing on uninitialized arrays Message-ID: <200410210852.02285.stark@tuebingen.mpg.de> In matlab I can do:
>> x = []
x = []
>> x(2) = 1.4
x = 0 1.4000
>> x(2,4) = 2.9
x =
0 1.4000 0 0
0 0 0 2.9000
which means x expands as necessary depending on "how far" my indexing goes. 
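One way to mimic this growth-on-assignment in plain Python is a small container that enlarges itself, padding with zeros, whenever an assignment falls outside its current bounds. The sketch below is only an illustration under stated assumptions: GrowingGrid is a hypothetical name, it uses 0-based indices (unlike MATLAB's 1-based ones), it ignores negative indices, and it stores nested lists rather than any numarray type.

```python
class GrowingGrid:
    """2-D grid of floats that grows on assignment (hypothetical sketch)."""

    def __init__(self):
        self.rows = []            # list of equally long row lists
        self.ncols = 0

    def _grow(self, nrows, ncols):
        # Widen the existing rows first, then append zero-filled rows.
        if ncols > self.ncols:
            for row in self.rows:
                row.extend([0.0] * (ncols - self.ncols))
            self.ncols = ncols
        while len(self.rows) < nrows:
            self.rows.append([0.0] * self.ncols)

    def __setitem__(self, index, value):
        i, j = index
        self._grow(i + 1, j + 1)  # no-op when (i, j) is already in bounds
        self.rows[i][j] = value

    def __getitem__(self, index):
        i, j = index
        return self.rows[i][j]


x = GrowingGrid()
x[0, 1] = 1.4      # MATLAB: x(2) = 1.4, treated here as row 1, column 2
x[1, 3] = 2.9      # MATLAB: x(2,4) = 2.9
```

After the two assignments x.rows holds the same 2-by-4 matrix as the MATLAB session. Growing on every out-of-range assignment is convenient but reallocates each time, so preallocating is usually cheaper when the final size is known in advance.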
Now I'm thinking about how to realize this with numarray. I could imagine defining a derived array type "SelfInflatingArray" which catches the IndexError exception and then does the right thing. Any better ideas? -Sebastian -- Sebastian Stark -- http://www.kyb.tuebingen.mpg.de/~stark Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tuebingen Phone: +49 7071 601 555 -- Fax: +49 7071 601 552 From falted at pytables.org Thu Oct 21 00:32:31 2004 From: falted at pytables.org (Francesc Alted) Date: Thu Oct 21 00:32:31 2004 Subject: [Numpy-discussion] Re: numarray and ATLAS In-Reply-To: <200410210846.09275.stark@tuebingen.mpg.de> References: <200410210846.09275.stark@tuebingen.mpg.de> Message-ID: <200410210929.18477.falted@pytables.org> A Dijous 21 Octubre 2004 08:46, Sebastian Stark va escriure: > > > Perhaps this is a too recurrent subject, but I'm having problems when > > making numarray to use ATLAS instead of the mini-lapack included. > > I had to change lapack_libs and lapack_dirs in addons.py to read: > > lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm'] > lapack_dirs = ['/usr/local/lib/ATLAS'] I've done something similar: lapack_libs = ['lapack', 'cblas', 'f77blas', 'atlas'] lapack_dirs = ['/usr/local/atlas/Linux_P4SSE2_full/lib'] Mmm, I can see that you have added 'f2c'. However, I don't have it installed. Could that be the reason the tests do not pass in my case? > (and yes, I get a significant speed boost from ATLAS) Great, it's good to know that. 
Thank you very much for your feedback, -- Francesc Alted From rkern at ucsd.edu Thu Oct 21 02:05:51 2004 From: rkern at ucsd.edu (Robert Kern) Date: Thu Oct 21 02:05:51 2004 Subject: [Numpy-discussion] Re: numarray and ATLAS In-Reply-To: <200410210929.18477.falted@pytables.org> References: <200410210846.09275.stark@tuebingen.mpg.de> <200410210929.18477.falted@pytables.org> Message-ID: <41777422.8040205@ucsd.edu> Francesc Alted wrote: > A Dijous 21 Octubre 2004 08:46, Sebastian Stark va escriure: > >>>Perhaps this is a too recurrent subject, but I"m having problems when >>>making numarray to use ATLAS instead of the mini-lapack included. >> >>I had to change lapack_libs and lapack_dirs in addons.py to read: >> >> lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm'] >> lapack_dirs = ['/usr/local/lib/ATLAS'] > > > I've done something similar: > > lapack_libs = ['lapack', 'cblas', 'f77blas', 'atlas'] > lapack_dirs = ['/usr/local/atlas/Linux_P4SSE2_full/lib'] > > Mmm, I can see that you have added 'f2c'. However, I don't have it > installed. Could that be the cause that tests would not pass in my case? If you are compiling with gcc, add 'g2c' after 'f77blas'. It's g77's FORTRAN runtime library. -- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From falted at pytables.org Thu Oct 21 02:33:28 2004 From: falted at pytables.org (Francesc Alted) Date: Thu Oct 21 02:33:28 2004 Subject: [Numpy-discussion] Re: numarray and ATLAS In-Reply-To: <41777422.8040205@ucsd.edu> References: <200410210846.09275.stark@tuebingen.mpg.de> <200410210929.18477.falted@pytables.org> <41777422.8040205@ucsd.edu> Message-ID: <200410211126.42729.falted@pytables.org> A Dijous 21 Octubre 2004 10:32, Robert Kern va escriure: > > Mmm, I can see that you have added 'f2c'. However, I don't have it > > installed. Could that be the cause that tests would not pass in my case? 
> > If you are compiling with gcc, add 'g2c' after 'f77blas'. It's g77's > FORTRAN runtime library. Yeah, that did the trick! So for a gcc compiler, this works just fine: lapack_libs = ['lapack', 'f77blas', 'g2c', 'cblas', 'atlas', 'm'] Many thanks!, -- Francesc Alted From stephen.walton at csun.edu Thu Oct 21 10:59:05 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Thu Oct 21 10:59:05 2004 Subject: [Numpy-discussion] Counting array elements Message-ID: <1098381332.8249.12.camel@freyer.sfo.csun.edu> Is there some simple way of counting the number of array elements which satisfy a certain condition? It is easy to do A[A<=1].sum() to sum all the values of A which are less than 1, but there doesn't seem to be a count() method. I tried (A<=1).sum() but this throws an exception at numarray 1.1. If I try sum(A<=value) I have to nest multiple sums if A has rank greater than 1, plus the sum overflows if A is large, apparently because boolean gets treated as Int8. (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)). You get zero.) The following works: array(A<=1024,type=Int32).sum() but is awkward. Am I missing an obvious better alternative? If not, I'm going to file an RFE :-) . -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From Chris.Barker at noaa.gov Thu Oct 21 11:33:03 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Oct 21 11:33:03 2004 Subject: [Numpy-discussion] Re: numarray and ATLAS In-Reply-To: <41777422.8040205@ucsd.edu> References: <200410210846.09275.stark@tuebingen.mpg.de> <200410210929.18477.falted@pytables.org> <41777422.8040205@ucsd.edu> Message-ID: <4177FFF6.40006@noaa.gov> Robert Kern wrote: > Francesc Alted wrote: >>> I had to change lapack_libs and lapack_dirs in addons.py to read: >>> lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm'] >>> lapack_dirs = ['/usr/local/lib/ATLAS'] >> I've done something similar: >> >> lapack_libs = ['lapack', 'cblas', 'f77blas', 'atlas'] >> lapack_dirs = ['/usr/local/atlas/Linux_P4SSE2_full/lib'] For what it's worth, this is what worked for me on Gentoo Linux: lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'm'] -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From jmiller at stsci.edu Thu Oct 21 11:33:46 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Oct 21 11:33:46 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098381332.8249.12.camel@freyer.sfo.csun.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> Message-ID: <1098383430.3644.4.camel@halloween.stsci.edu> On Thu, 2004-10-21 at 13:55, Stephen Walton wrote: > Is there some simple way of counting the number of array elements which > satisfy a certain condition? It is easy to do > > A[A<=1].sum() > > to sum all the values of A which are less than 1, but there doesn't seem > to be a count() method. I tried > > (A<=1).sum() > > but this throws an exception at numarray 1.1. If I try This works now in CVS and will be part of numarray-1.2. 
Another more tedious approach which works for numarray-1.1 is: (A <= 1).astype('Int32').sum() > sum(A<=value) > > I have to nest multiple sums if A has rank greater than 1, plus the sum > overflows if A is large, apparently because boolean gets treated as > Int8. (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)). You get > zero.) The following works: > > array(A<=1024,type=Int32).sum() > > but is awkward. Am I missing an obvious better alternative? If not, > I'm going to file an RFE :-) . I don't think there's any need for an RFE, provided you're satisfied with (A<=1).sum(). Regards, Todd From rkern at ucsd.edu Thu Oct 21 12:22:20 2004 From: rkern at ucsd.edu (Robert Kern) Date: Thu Oct 21 12:22:20 2004 Subject: [Numpy-discussion] argmin and unsigned types Message-ID: <41780BE5.4070009@ucsd.edu> argmin locates the minimum by finding the maximum of the negative of the input. Unfortunately, for unsigned arrays, the negative has nothing to do with the actual numerical negative. Example: >>> from numarray import * >>> a = arange(10).astype(UInt8) >>> print a [0 1 2 3 4 5 6 7 8 9] >>> print -a [ 0 255 254 253 252 251 250 249 248 247] >>> argmin(a) 1 We need a separate argmin to handle these arrays properly. -- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From jmiller at stsci.edu Thu Oct 21 15:04:04 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Oct 21 15:04:04 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098383430.3644.4.camel@halloween.stsci.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <1098383430.3644.4.camel@halloween.stsci.edu> Message-ID: <1098396116.3644.129.camel@halloween.stsci.edu> On Thu, 2004-10-21 at 14:30, Todd Miller wrote: > On Thu, 2004-10-21 at 13:55, Stephen Walton wrote: > > Is there some simple way of counting the number of array elements which > > satisfy a certain condition? 
It is easy to do > > > > A[A<=1].sum() > > > > to sum all the values of A which are less than 1, but there doesn't seem > > to be a count() method. I tried > > > > (A<=1).sum() > > > > but this throws an exception at numarray 1.1. If I try > > This works now in CVS and will be part of numarray-1.2. Stephen tried this and it turns out my earlier statement was untrue, (A<=1).sum() doesn't do anything reasonable, even in CVS. The problem is that sum() is written (without direct C support) to conserve storage. As a result, it doesn't do implicit > Another more > tedious approach which works for numarray-1.1 is: > > (A <= 1).astype('Int32').sum() > There's also a prettier approach that works for 1.1 that I forgot about: (A <= 1).sum('Int32') > > sum(A<=value) > > > > I have to nest multiple sums if A has rank greater than 1, plus the sum > > overflows if A is large, apparently because boolean gets treated as > > Int8. (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)). You get > > zero.) The following works: > > > > array(A<=1024,type=Int32).sum() > > > > but is awkward. Am I missing an obvious better alternative? If not, > > I'm going to file an RFE :-) . > > I don't think there's any need for an RFE, provided you're satisfied > with (A<=1).sum(). > > Regards, > Todd > > > > ------------------------------------------------------- > This SF.net email is sponsored by: IT Product Guide on ITManagersJournal > Use IT products in your business? Tell us what you think of them. Give us > Your Opinions, Get Free ThinkGeek Gift Certificates! 
Click to find out more > http://productguide.itmanagersjournal.com/guidepromo.tmpl > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- From jmiller at stsci.edu Thu Oct 21 15:08:52 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Oct 21 15:08:52 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098396116.3644.129.camel@halloween.stsci.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <1098383430.3644.4.camel@halloween.stsci.edu> <1098396116.3644.129.camel@halloween.stsci.edu> Message-ID: <1098396420.28271.0.camel@halloween.stsci.edu> On Thu, 2004-10-21 at 18:01, Todd Miller wrote: > On Thu, 2004-10-21 at 14:30, Todd Miller wrote: > > On Thu, 2004-10-21 at 13:55, Stephen Walton wrote: > > > Is there some simple way of counting the number of array elements which > > > satisfy a certain condition? It is easy to do > > > > > > A[A<=1].sum() > > > > > > to sum all the values of A which are less than 1, but there doesn't seem > > > to be a count() method. I tried > > > > > > (A<=1).sum() > > > > > > but this throws an exception at numarray 1.1. If I try > > > > This works now in CVS and will be part of numarray-1.2. > > Stephen tried this and it turns out my earlier statement was untrue, > (A<=1).sum() doesn't do anything reasonable, even in CVS. The problem > is that sum() is written (without direct C support) to conserve > storage. As a result, it doesn't do implicit > > Another more > > tedious approach which works for numarray-1.1 is: > > > > (A <= 1).astype('Int32').sum() > > > > There's also a prettier approach that works for 1.1 that I forgot about: > > (A <= 1).sum('Int32') > > > > sum(A<=value) > > > > > > I have to nest multiple sums if A has rank greater than 1, plus the sum > > > overflows if A is large, apparently because boolean gets treated as > > > Int8. 
(Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)). You get > > > zero.) The following works: > > > > > > array(A<=1024,type=Int32).sum() > > > > > > but is awkward. Am I missing an obvious better alternative? If not, > > > I'm going to file an RFE :-) . > > > > I don't think there's any need for an RFE, provided you're satisfied > > with (A<=1).sum(). > > > > Regards, > > Todd > > > > > > > > ------------------------------------------------------- > > This SF.net email is sponsored by: IT Product Guide on ITManagersJournal > > Use IT products in your business? Tell us what you think of them. Give us > > Your Opinions, Get Free ThinkGeek Gift Certificates! Click to find out more > > http://productguide.itmanagersjournal.com/guidepromo.tmpl > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- From jmiller at stsci.edu Thu Oct 21 15:11:23 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Oct 21 15:11:23 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098396116.3644.129.camel@halloween.stsci.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <1098383430.3644.4.camel@halloween.stsci.edu> <1098396116.3644.129.camel@halloween.stsci.edu> Message-ID: <1098396569.28351.0.camel@halloween.stsci.edu> On Thu, 2004-10-21 at 18:01, Todd Miller wrote: > On Thu, 2004-10-21 at 14:30, Todd Miller wrote: > > On Thu, 2004-10-21 at 13:55, Stephen Walton wrote: > > > Is there some simple way of counting the number of array elements which > > > satisfy a certain condition? It is easy to do > > > > > > A[A<=1].sum() > > > > > > to sum all the values of A which are less than 1, but there doesn't seem > > > to be a count() method. I tried > > > > > > (A<=1).sum() > > > > > > but this throws an exception at numarray 1.1. If I try > > > > This works now in CVS and will be part of numarray-1.2. 
> > Stephen tried this and it turns out my earlier statement was untrue, > (A<=1).sum() doesn't do anything reasonable, even in CVS. The problem > is that sum() is written (without direct C support) to conserve > storage. As a result, it doesn't do implicit > > Another more > > tedious approach which works for numarray-1.1 is: > > > > (A <= 1).astype('Int32').sum() > > > > There's also a prettier approach that works for 1.1 that I forgot about: > > (A <= 1).sum('Int32') > > > > sum(A<=value) > > > > > > I have to nest multiple sums if A has rank greater than 1, plus the sum > > > overflows if A is large, apparently because boolean gets treated as > > > Int8. (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)). You get > > > zero.) The following works: > > > > > > array(A<=1024,type=Int32).sum() > > > > > > but is awkward. Am I missing an obvious better alternative? If not, > > > I'm going to file an RFE :-) . > > > > I don't think there's any need for an RFE, provided you're satisfied > > with (A<=1).sum(). > > > > Regards, > > Todd > > > > > > > > ------------------------------------------------------- > > This SF.net email is sponsored by: IT Product Guide on ITManagersJournal > > Use IT products in your business? Tell us what you think of them. Give us > > Your Opinions, Get Free ThinkGeek Gift Certificates! 
Click to find out more > > http://productguide.itmanagersjournal.com/guidepromo.tmpl > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- From jmiller at stsci.edu Thu Oct 21 16:41:29 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Oct 21 16:41:29 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098396569.28351.0.camel@halloween.stsci.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <1098383430.3644.4.camel@halloween.stsci.edu> <1098396116.3644.129.camel@halloween.stsci.edu> <1098396569.28351.0.camel@halloween.stsci.edu> Message-ID: <1098401959.3744.34.camel@localhost.localdomain> On Thu, 2004-10-21 at 18:09, Todd Miller wrote: > On Thu, 2004-10-21 at 18:01, Todd Miller wrote: > > On Thu, 2004-10-21 at 14:30, Todd Miller wrote: > > > On Thu, 2004-10-21 at 13:55, Stephen Walton wrote: > > > > Is there some simple way of counting the number of array elements which > > > > satisfy a certain condition? It is easy to do > > > > > > > > A[A<=1].sum() > > > > > > > > to sum all the values of A which are less than 1, but there doesn't seem > > > > to be a count() method. I tried > > > > > > > > (A<=1).sum() > > > > > > > > but this throws an exception at numarray 1.1. If I try > > > > > > This works now in CVS and will be part of numarray-1.2. > > > > Stephen tried this and it turns out my earlier statement was untrue, > > (A<=1).sum() doesn't do anything reasonable, even in CVS. The problem > > is that sum() is written (without direct C support) to conserve > > storage. As a result, it doesn't do implicit up-casting. I'm pretty sure this was a conscious and discussed choice (this is actually the 2nd time sum() has been wrong). IMHO, the typing for sum() should be revised because it is too dangerous the way it is now. 
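To make the overflow in this thread concrete, here is a minimal sketch in plain Python rather than numarray. It assumes the accumulator behaves like a signed 8-bit C integer, which is what the thread suspects happens when a Bool array is summed without up-casting. `wrap_int8` and `sum_int8` are hypothetical helpers written for this illustration; they are not numarray functions.

```python
def wrap_int8(value):
    """Wrap an integer into the signed 8-bit range [-128, 127]."""
    return (value + 128) % 256 - 128


def sum_int8(flags):
    """Add up truth values in an accumulator that never widens past 8 bits."""
    total = 0
    for flag in flags:
        total = wrap_int8(total + int(flag))
    return total


# 1024 elements, all of which satisfy the condition (compare A <= 1024).
flags = [n <= 1024 for n in range(1024)]

narrow_count = sum_int8(flags)   # 8-bit accumulator: wraps around
wide_count = sum(flags)          # Python integers do not overflow

print(narrow_count, wide_count)
```

Under these assumptions the narrow accumulator ends at zero because 1024 is a multiple of 256, while the wide sum gives the real count. Asking for a wider result type, as in (A <= 1).sum('Int32'), corresponds to the second computation.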
Regards,
Todd

From nwagner at mecha.uni-stuttgart.de Fri Oct 22 02:17:16 2004
From: nwagner at mecha.uni-stuttgart.de (Nils Wagner)
Date: Fri Oct 22 02:17:16 2004
Subject: [Numpy-discussion] Problems with complex matrices
Message-ID: <4178CEFF.2050608@mecha.uni-stuttgart.de>

Hi all,

Another bug is revealed

Traceback (most recent call last):
  File "complex_it.py", line 6, in ?
    res=dot(A,x)-r
  File "/usr/lib/python2.3/site-packages/Numeric/dotblas/__init__.py", line 55, in dot
    if multiarray.array(a).shape == () or multiarray.array(b).shape == ():
TypeError: a float is required

Nils
-------------- next part --------------
A non-text attachment was scrubbed...
Name: complex_it.py
Type: text/x-python
Size: 139 bytes
Desc: not available
URL:

From Sebastien.deMentendeHorne at electrabel.com Fri Oct 22 02:44:46 2004
From: Sebastien.deMentendeHorne at electrabel.com (Sebastien.deMentendeHorne at electrabel.com)
Date: Fri Oct 22 02:44:46 2004
Subject: [Numpy-discussion] Problems with complex matrices
Message-ID: <035965348644D511A38C00508BF7EAEB145CB168@seacex03.eib.electrabel.be>

gmres returns a tuple, so you should have used

    res = dot(A, x[0]) - r

seb

> -----Original Message-----
> From: Nils Wagner [mailto:nwagner at mecha.uni-stuttgart.de]
> Sent: Friday, 22 October 2004 11:13
> To: SciPy Users List; numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] Problems with complex matrices
>
>
> Hi all,
>
> Another bug is revealed
>
> Traceback (most recent call last):
>   File "complex_it.py", line 6, in ?
>     res=dot(A,x)-r
>   File "/usr/lib/python2.3/site-packages/Numeric/dotblas/__init__.py", line 55, in dot
>     if multiarray.array(a).shape == () or multiarray.array(b).shape == ():
> TypeError: a float is required
>
> Nils
>
>

From Chris.Barker at noaa.gov Fri Oct 22 11:07:32 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri Oct 22 11:07:32 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098392607.8249.20.camel@freyer.sfo.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
Message-ID: <41794B47.4090909@noaa.gov>

Stephen Walton wrote:

> There is a difference between the sum() Ufunc and the sum() method which
> is not mentioned in the documentation: the function works along an
> axis, while the method works on the whole array. That is, A.sum() and
> A.flat.sum() are equivalent regardless of the rank of A.

Bummer. I was hoping this was a move to a more object-oriented style,
rather than different functionality. Also, it's pretty confusing
terminology, particularly if it's not documented! Why not .SumAll() or
something?

-Chris

--
Christopher Barker, Ph.D.
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rowen at u.washington.edu Fri Oct 22 11:20:36 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Fri Oct 22 11:20:36 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <41794B47.4090909@noaa.gov> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> Message-ID: At 11:02 AM -0700 2004-10-22, Chris Barker wrote: >Stephen Walton wrote: > >> There is a difference between the sum() Ufunc and the sum() method which >> is not mentioned in the documentation: the function works along an >> axis, while the method works on the whole array. That is, A.sum() and >> A.flat.sum() are equivalent regardless of the rank of A. > > >Bummer. I was hoping this was a move to a more object-oriented >style, rather than different functionality. Also, it's pretty >confusing terminology, particularly if it's not documented! Why not >.SumAll() or something? I agree. Numarray is already confusing enough without identically named functions and methods that do different things. (nElements and size are another pet peeve, with size used in several places and nElements appearing exactly once. Though I am grateful to whoever added size as a workalike for nElements; formerly you had to know what kind of array you had before you knew how to find out how many elements it had.) 
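For anyone skimming the thread, the difference being discussed looks roughly like this in plain Python, with nested lists standing in for a rank-2 array. `sum_along_axis0` and `sum_all` are hypothetical names chosen for illustration. They mimic the behaviour described above for the sum() function (assumed here to reduce along the first axis) and the sum() method; they are not numarray code.

```python
def sum_along_axis0(rows):
    """Sum down the columns, yielding one total per column."""
    return [sum(column) for column in zip(*rows)]


def sum_all(rows):
    """Sum every element, regardless of how the rows are arranged."""
    return sum(sum(row) for row in rows)


A = [[1, 2, 3],
     [4, 5, 6]]

column_totals = sum_along_axis0(A)   # one value per column
grand_total = sum_all(A)             # a single number

print(column_totals, grand_total)
```

The first call returns a list with one entry per column, the second a single scalar, which is why giving both operations the same name is easy to trip over.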
-- Russell

From strawman at astraw.com Fri Oct 22 11:25:58 2004
From: strawman at astraw.com (Andrew Straw)
Date: Fri Oct 22 11:25:58 2004
Subject: [Numpy-discussion] floating point exception weirdness
In-Reply-To: <411A08FA.7000601@astraw.com>
References: <4119BBFC.6020304@astraw.com> <1092221365.3752.32.camel@localhost.localdomain> <411A08FA.7000601@astraw.com>
Message-ID: <41795006.1040807@astraw.com>

I've isolated a bug I first reported on this mailing list in August.
I've now confined it to a small code snippet using entirely open-source
software (previously I saw it while using Intel's IPP). In a nutshell,
importing numarray.ieeespecial triggers a floating point exception
(which kills my program) when I call Numeric's
singular_value_decomposition() function:

    import Numeric
    from LinearAlgebra import singular_value_decomposition

    want_FPE = True  # set to False to skip the import that triggers the FPE
    if want_FPE:
        import numarray.ieeespecial

    A = [[-5.7, 2.2, -0.53, 46.0],
         [-2.3, -5.5, -1.0, 1091.0],
         [5.9, 1.4, -0.1, -142.0],
         [-1.3, 5.7, -1.5, 2673.0]]
    A = Numeric.array(A)
    u, s, v = singular_value_decomposition(A)  # FPE triggered here

Here's my setup:

    $ python
    Python 2.3.4 (#2, Sep 24 2004, 08:39:09)
    [GCC 3.3.4 (Debian 1:3.3.4-12)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import Numeric
    >>> Numeric.__version__
    '23.6'
    >>> import numarray
    >>> numarray.__version__
    '1.2a'

    $ gcc -v
    Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.4/specs
    Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
    Thread model: posix
    gcc version 3.3.4 (Debian 1:3.3.4-13)

Now, for the clue: the above error is ONLY triggered when I compile
Numeric to use the system blas and friends, not when I use the
lapack_lite included with Numeric. This leads me to suspect it is
related to the SSE2 unit -- I have Debian sarge's atlas3-base,
atlas3-sse, atlas3-sse2, blas, lapack, lapack3, and refblas3 packages
installed on my P4 machine.

So, to propose a hypothesis: numarray.ieeespecial sets the FPE bit in
the SSE2 hardware, but for some reason this does not raise SIGFPE.
However, when the next call that touches SSE2 happens, the kernel sees
that error bit and throws the signal.

Does this explanation make sense? Is it easy to fix?

Cheers!
Andrew

From jmiller at stsci.edu Fri Oct 22 14:19:17 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct 22 14:19:17 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To:
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov>
Message-ID: <1098479844.29804.260.camel@halloween.stsci.edu>

On Fri, 2004-10-22 at 14:17, Russell E Owen wrote:
> At 11:02 AM -0700 2004-10-22, Chris Barker wrote:
> >Stephen Walton wrote:
> >
> >> There is a difference between the sum() Ufunc and the sum() method which
> >> is not mentioned in the documentation: the function works along an
> >> axis, while the method works on the whole array. That is, A.sum() and
> >> A.flat.sum() are equivalent regardless of the rank of A.
> >
> >
> >Bummer. I was hoping this was a move to a more object-oriented
> >style, rather than different functionality. Also, it's pretty
> >confusing terminology, particularly if it's not documented! Why not
> >.SumAll() or something?

sumAll() would certainly be better.

Unless there are objections, I'll rename the current sum() method to
sumAll() and re-write sum() to give a deprecation warning before calling
sumAll(). Eventually, it'll go away altogether.

I reviewed the discussion of the sum() result type from a year ago:
"[Numpy-discussion] sum and mean methods behaviour". We discussed sum()
in depth and AFAIK I implemented the recommendations. The results need
to be documented.

By default, sum() now uses the maximum type of the type family of the
array, so families Bool, Integer, UnsignedInteger, Float, or Complex
result in max types Bool, Int64, UInt64, Float64, Complex64. I'm not
sure why we segregated Bool and it looks like a mistake to me now. I'm
thinking the Bool "family" should just go away and be re-classified as
UnsignedInteger.
These ideas are captured by the numerictypes.MaximumType() function which is also potentially useful for any reduction. > I agree. Numarray is already confusing enough without identically > named functions and methods that do different things. True enough. This'll be fixed. > (nElements and > size are another pet peeve, with size used in several places and > nElements appearing exactly once. Though I am grateful to whoever > added size as a workalike for nElements; formerly you had to know > what kind of array you had before you knew how to find out how many > elements it had.) I'm not sure what you mean here. When I grepped, I got 52 hits for nelements() in the numarray source, let alone what users have done with it. Right now, IMHO, it's not clearly broken and there are bigger fish to fry. Regards, Todd From stephen.walton at csun.edu Fri Oct 22 14:37:05 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Fri Oct 22 14:37:05 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> Message-ID: <1098480955.11372.19.camel@freyer.sfo.csun.edu> On Fri, 2004-10-22 at 11:17, Russell E Owen wrote about the sum() Ufunc vs. the sum() method: > Numarray is already confusing enough without identically > named functions and methods that do different things When I went through the Numarray docs and made suggestions for improvements (see the list I posted at Sourceforge), I didn't make any comments about functional changes, only what the documentation said. Since the sum() method is documented using 1-D arrays, you can't tell that it in fact behaves differently than the sum() Ufunc. On reflection, I also agree that the Ufuncs and methods should behave the same way. Why do you say 'numarray is confusing'? What in the docs would help un-confuse it, in your view? -- Stephen Walton Dept. 
of Physics & Astronomy, Cal State Northridge
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
URL:

From rowen at u.washington.edu Fri Oct 22 14:48:03 2004
From: rowen at u.washington.edu (Russell E Owen)
Date: Fri Oct 22 14:48:03 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu>
Message-ID:

At 5:17 PM -0400 2004-10-22, Todd Miller wrote:
>On Fri, 2004-10-22 at 14:17, Russell E Owen wrote:
>> I agree. Numarray is already confusing enough without identically
>> named functions and methods that do different things.
>
>True enough. This'll be fixed.

Great!

>> (nElements and
>> size are another pet peeve, with size used in several places and
>> nElements appearing exactly once. Though I am grateful to whoever
>> added size as a workalike for nElements; formerly you had to know
>> what kind of array you had before you knew how to find out how many
>> elements it had.)
>
>I'm not sure what you mean here. When I grepped, I got 52 hits for
>nelements() in the numarray source, let alone what users have done with
>it. Right now, IMHO, it's not clearly broken and there are bigger fish
>to fry.

Since you ask...

I'm counting the number of implementations in the public interface of
the numarray package. There are four implementations of size (including
the numarray array method, which is simply a synonym for nelements), but
only one implementation of nelements.

When I started using numarray, the following was true:
* numarray had a function named size.
* numarray.ma had the same function
* numarray.ma arrays had method size
* All of these worked the same way:
  size(array, axis=None)
  size returns the number of elements in an array or along the
  specified axis.

BUT numarray arrays had no method size. Instead there was a method
nelements, which did the same thing as size, but had no "axis" argument.

This was very confusing, and I got tripped up badly because I was trying
to count array elements and was using both "normal" numarray arrays and
masked arrays.

I filed PR 934514 and some kind soul patched the problem by making size
a synonym for nelements. There is a bit of residual mess because the new
size does not have the axis argument. And then there's the historical
clutter of two ways to do the same thing, but presumably one just lives
with that. Though it seems a bit strange to me not to deprecate
nelements and stop using it internally.

-- Russell

From Fernando.Perez at colorado.edu Fri Oct 22 14:50:04 2004
From: Fernando.Perez at colorado.edu (Fernando Perez)
Date: Fri Oct 22 14:50:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu>
Message-ID: <41797FFE.8090802@colorado.edu>

Todd Miller wrote:

> sumAll() would certainly be better.
>
> Unless there are objections, I'll rename the current sum() method to
> sumAll() and re-write sum() to give a deprecation warning before calling
> sumAll(). Eventually, it'll go away altogether.

silly, minor nit: can we avoid mixed case names? Either sum_all or
SumAll? I'm not too fond of CamelCase, but camelCase looks even worse to
me :)

As I said, it's just a minor nit. I don't know if there's an official
naming policy for numarray, so please don't get angry at me if my
comment is out of place.
Best, f From Chris.Barker at noaa.gov Fri Oct 22 15:12:01 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Fri Oct 22 15:12:01 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> Message-ID: <4179853F.8040800@noaa.gov> Todd Miller wrote: > By default, sum() now uses the maximum type of the type family of the > array, so families Bool, Integer, UnsignedInteger, Float, or Complex > result in max types Bool, Int64, UInt64, Float64, Complex64. I'm not > sure why we segregated Bool and it looks like a mistake to me now. I'm > thinking the Bool "family" should just go away and be re-classified as > UnsignedInteger. Well, I think that the idea of a bool being different than an int is often useful. In this case, we want Bool to behave like an integer, so that we can use some version of sum() to add up all the true values. This is handy, but maybe we need more complete support for boolean arrays, rather than getting rid of them. For instance, there could be a NumTrue() function or method, for this case. I would probably maintain the easy conversion of a Bool array to an Int array, for when you really do need to do math with them. We'd want a compete set, many of which already exist. A few off the top of my head: sometrue alltrue numtrue Maybe mirrors for false: somefalse allfalse numfalse What else would be needed? My vote would be for all of these to be methods of a Bool array, but I'm partial to methods over functions anyway. On the other hand, Python itself is sub classing Bool from integer, so maybe there's little point. -Chris -- Christopher Barker, Ph.D. 
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From aisaac at american.edu Fri Oct 22 15:14:07 2004 From: aisaac at american.edu (Alan G Isaac) Date: Fri Oct 22 15:14:07 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu><417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu><41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> Message-ID: On 22 Oct 2004, Todd Miller apparently wrote: > sumAll() would certainly be better. > Unless there are objections, I'll rename the current sum() method to > sumAll() and re-write sum() to give a deprecation warning before calling > sumAll(). Eventually, it'll go away altogether. Just two thoughts from a new user. i. I agree that .sumAll is better than the current name confusion. ii. even better, I propose, would be for .sum to take an axis argument, with default matching the sum function, and possible value axis="all". For the transition, the axis argument can be required. fwiw, Alan Isaac From rowen at u.washington.edu Fri Oct 22 15:19:02 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Fri Oct 22 15:19:02 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098480955.11372.19.camel@freyer.sfo.csun.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098480955.11372.19.camel@freyer.sfo.csun.edu> Message-ID: At 2:35 PM -0700 2004-10-22, Stephen Walton wrote: >On Fri, 2004-10-22 at 11:17, Russell E Owen wrote about the sum() Ufunc >vs. 
the sum() method:
>
>> Numarray is already confusing enough without identically
>> named functions and methods that do different things
>
>When I went through the Numarray docs and made suggestions for
>improvements (see the list I posted at Sourceforge), I didn't make any
>comments about functional changes, only what the documentation said.
>Since the sum() method is documented using 1-D arrays, you can't tell
>that it in fact behaves differently than the sum() Ufunc. On
>reflection, I also agree that the Ufuncs and methods should behave the
>same way.
>
>Why do you say 'numarray is confusing'? What in the docs would help
>un-confuse it, in your view?

OK, since I seem to be in a grumpy mood today, here are some examples
(probably nothing new here):

- I'll expose my ignorance, but I find the take stuff and fancy
indexing nearly incomprehensible. I've tried to follow the examples
(several times--i.e. every time I need to do something fancy), but
generally I either flail around until I find something that works, or
give up and write a C extension.

- I'd like to write C/C++ code that would work on multiple array
types. This seems a natural use of C++ templates, but that doesn't
seem to be "how it's done". I hate to think how the internal code is
managing this without being a horrible spaghetti of code repeated for
each array type.

The nd_image package is the closest I've come to finding source code
that makes any sense to me in this area. But it uses so many
custom-defined specialized functions that I figured it was just too
much work to figure out w/out a manual (and risky to rely on these
functions since they are internal to the package).

So I gave up and just support the one data type I really need now.
Very disappointing.

- Important functions are sometimes buried in a non-obvious (to me)
sub-package.

For example: try to find that location at which an array has a minimum
value (if there's more than one such point, pick any). You'd think
it'd be a standard numarray function, wouldn't you? After all, you can
ask for the minimum value. Now try to find it.

Well, I started out by trying to figure out how to get argmin to do
the job. Horrible.

Fortunately I finally found minimum_position buried in nd_image.

- Masked arrays are not integrated. Thus a lot of important filtering
and stuff simply cannot be done on masked data without writing custom
extensions. For instance I'd like to do a median-filter that ignores
masked data (taking the median of non-masked data only).

- For 2-d images x and y are reversed. I know this isn't going to
change, but it is a headache every time I have to write new image
processing code.

- I keep wanting more support for dealing with arrays of indices,
e.g. "give me all the indices for which this is true", then use that
to process the data in an array. Numarray seems to do that kind of
operation in an entirely different way, suggesting I'm not "with it"
on the underlying philosophy. Unfortunately no really good examples
come to mind at the moment (it's been awhile since I've created new
code using numarray), though I was fairly well convinced that if I had
enough support for this I could code an efficient radial profile
function w/out using a C extension.

-- Russell

From perry at stsci.edu Fri Oct 22 16:50:01 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Oct 22 16:50:01 2004
Subject: [Numpy-discussion] In case there are any questions about numarray...
In-Reply-To:
Message-ID:

Todd and I will be away most of next week at a conference and will
likely not have a chance to respond to questions about numarray or
continue the current discussions about the proper numarray interface or
improvements to the documentation.
Perry From aisaac at american.edu Fri Oct 22 19:17:02 2004 From: aisaac at american.edu (Alan G Isaac) Date: Fri Oct 22 19:17:02 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <4179853F.8040800@noaa.gov> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu><4179853F.8040800@noaa.gov> Message-ID: More new user feedback ... On Fri, 22 Oct 2004, Chris Barker apparently wrote: > Well, I think that the idea of a bool being different than > an int is often useful. Yes. E.g., applications to directed graphs. > we can use some version of sum() to add up all the > true values. Unclear, but given the existence of sometrue, it seems natural enough to let sum treat a Bool as an integer. Products work naturally, of course. > I would probably maintain > the easy conversion of a Bool array to an Int array, for when you really > do need to do math with them. I would rephrase this. Boolean arrays have a naturally different math, which it would be nice to have supported. It would also be nice to easily convert to Int, when that representation captures the math needed. > We'd want a compete set, many of which already exist. A few off the top > of my head: > sometrue > alltrue > numtrue I'd just let sum handle numtrue. > Maybe mirrors for false: > somefalse, allfalse, numfalse I'd just rely on alltrue, sometrue, and (size less sum) for these. 
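As a rough sketch of how small that set could stay, the helpers below build everything on Python's built-in sum(), any() and all(), with a plain list of booleans standing in for a Bool array. The names numtrue, numfalse, somefalse and allfalse are the ones floated in this thread; these definitions are hypothetical illustrations, not numarray functions, and any()/all() merely play the role of sometrue/alltrue.

```python
def numtrue(flags):
    """Count the true entries by letting sum() treat booleans as integers."""
    return sum(1 for flag in flags if flag)


def numfalse(flags):
    """Size less sum: everything that is not true is false."""
    return len(flags) - numtrue(flags)


def somefalse(flags):
    """At least one entry is false exactly when not all are true."""
    return not all(flags)


def allfalse(flags):
    """Every entry is false exactly when none is true."""
    return not any(flags)


flags = [True, False, True, True]
print(numtrue(flags), numfalse(flags), somefalse(flags), allfalse(flags))
```

Only the counting step needs integer arithmetic; the "false" mirrors are one-line negations of the existing tests, which is the argument for not adding them as separate functions.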
fwiw, Alan From stephen.walton at csun.edu Fri Oct 22 22:23:02 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Fri Oct 22 22:23:02 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098480955.11372.19.camel@freyer.sfo.csun.edu> Message-ID: <1098508579.3403.6.camel@localhost.localdomain> I had no idea my innocent question would generate so much discussion. Mindful that Perry and Todd are at ADASS in Pasadena next week: On Fri, 2004-10-22 at 15:18 -0700, Russell E Owen wrote: > At 2:35 PM -0700 2004-10-22, Stephen Walton wrote: > > > >Why do you say 'numarray is confusing'? What in the docs would help > >un-confuse it, in your view? > > - I'll expose my ignorance, but I find the take stuff and fancy > indexing nearly incomprehensible. I agree. It took me much experimentation to figure out exactly how it worked. I'd appreciate it very much if you would look at my suggested rewrite of this section of the documentation at http://sourceforge.net/tracker/index.php?func=detail&aid=1047889&group_id=1369&atid=101369 and give me any further thoughts for clarification (post them as comments to the bug report itself). > - I'd like to write C/C++ code that would work on multiple array > types. I can't help much here, other than to say that C and C++ are pretty low level languages, not well suited for this level of abstraction. > - Important functions are sometimes buried in a non-obvious (to me) > sub-package. > For example: try to find that location at which an array has a > minimum value The current index to the documentation seems to include only the function names but not concepts, which is a problem. 
I myself was trying to remember how to do type conversion; there is no
entry in the index for 'conversion' or 'coercion' and I finally grepped
my local copy of the HTML files to re-find astype().

> - Masked arrays are not integrated.

I haven't tried these yet personally, but I agree that such a feature is
a very important one. IRAF got partway along on this but didn't finish
it either. Having said that, my workaround/technique for both MATLAB and
numarray is to simply put NaN's in the places where there is no valid
data and do something like

    sum(sum(A(~isnan(A))))

This is MATLAB syntax of course. Something similar in numarray would go
a long way to helping me. For example, I have full disk solar images and
I'd like to be able to operate on just the sunspot pixels, or just the
sky pixels, in a straightforward way.

> - For 2-d images x and y are reversed.

Are you referring to the fact that C and numarray are row major and
Fortran is column major? Or to how images get displayed in the various
plot packages?

> - I keep wanting more support for dealing with arrays of indices,
> e.g. "give me all the indices for which this is true", then use that
> to process the data in an array. Numarray seems to do that kind of
> operation in an entirely different way, suggesting I'm not "with it"
> on the underlying philosophy.

There are two ways to do this, both of which work. For example:

    A=arange(25)
    sum(A[A<=7])

will work just as you expect. A bool array used as an index picks out
those values for which the bool is True. Essentially identical syntax
now works in MATLAB too. If you want an index array instead:

    >>> index=where(A<7)
    >>> A[index]

will do the trick.
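The two approaches just described amount to the following, sketched here with plain Python lists instead of numarray arrays. The list comprehensions are stand-ins chosen for illustration: `mask` plays the role of the bool array A<=7 and `index` plays the role of the result of where(A<7).

```python
A = list(range(25))

# Approach 1: a boolean mask with one flag per element (compare A[A<=7]).
mask = [value <= 7 for value in A]
selected_by_mask = [value for value, keep in zip(A, mask) if keep]

# Approach 2: a list of positions where the test holds (compare where(A<7)).
index = [i for i, value in enumerate(A) if value < 7]
selected_by_index = [A[i] for i in index]

mask_total = sum(selected_by_mask)

print(selected_by_mask, mask_total)
print(index, selected_by_index)
```

The mask keeps the shape of the data and answers "is this element wanted?", while the index list answers "where are the wanted elements?". Either one can be used to pull out or assign values.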
For arrays of rank greater than 1:

    >>> A=arange(25,shape=(5,5))
    >>> where(A<7)
    (array([0, 0, 0, 0, 0, 1, 1]), array([0, 1, 2, 3, 4, 0, 1]))

which is a tuple of two arrays that can be used to index A:

    >>> ind1,ind2=where(A<7)
    >>> A[ind1,ind2]
    array([0, 1, 2, 3, 4, 5, 6])
    >>> A[ind1,ind2]=[6,5,4,3,2,1,0]   # assignment works too
    >>> A
    array([[ 6,  5,  4,  3,  2],
           [ 1,  0,  7,  8,  9],
           [10, 11, 12, 13, 14],
           [15, 16, 17, 18, 19],
           [20, 21, 22, 23, 24]])

Does this help?

--
Stephen Walton
Physics & Astronomy CSUN

From verveer at embl-heidelberg.de Sat Oct 23 04:14:04 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Sat Oct 23 04:14:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To:
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098480955.11372.19.camel@freyer.sfo.csun.edu>
Message-ID: <9633F2FA-24E4-11D9-B9D4-000D932805AC@embl-heidelberg.de>

I thought I'd just give my point of view on this, since I do believe we
should give these some thought.

On Oct 23, 2004, at 12:18 AM, Russell E Owen wrote:
> OK, since I seem to be in a grumpy mood today, here are some examples
> (probably nothing new here):
> - I'll expose my ignorance, but I find the take stuff and fancy
> indexing nearly incomprehensible. I've tried to follow the examples
> (several times--i.e. every time I need to do something fancy), but
> generally I either flail around until I find something that works, or
> give up and write a C extension.

I agree, it is very complicated, I always have trouble understanding
what is going on when I use take and indexing. More documentation may
help.

> - I'd like to write C/C++ code that would work on multiple array
> types. This seems a natural use of C++ templates, but that doesn't
> seem to be "how it's done".
I hate to think how the internal code is > managing this without being a horrible spaghetti of code repeated for > each array type. This is a good point. If you look at examples for implementing something in C, you always see that the code only handles a single data type, usually converting all input to double type. That is not always a good way to write an extension if you want it to be of generic use (e.g. the FFT module does not handle 32-bit floating point well, which is a problem for big arrays). Some support in writing functions that handle multiple data types would be good. > The nd_image package is the closest I've come to finding source code > that makes any sense to me in this area. But it uses so many > custom-defined specialized functions that I figured it was just too > much work to figure out w/out a manual (and risky to rely on these > functions since they are internal to the package). The internal nd_image C functions are indeed not exported and should not be used to implement extensions. That is going to stay that way since I do not plan to document these, and in any case, exposing such functions is not the purpose of the module. On the other hand, some of the techniques used may be generally useful. I could try to factor some of the functions and macros out and write something up on the use of these to write extensions that handle multiple data types. > So I gave up and just support the one data type I really need now. > Very disappointing. Yes, it should be easier to do this, I agree. Using C macros as a 'poor man's' templating system is in fact not too complicated (although pretty ugly). Another approach that I have tried to use in nd_image is to provide generic functions that take a Python or a C function to implement functionality. For instance, to implement an arbitrary filter function in nd_image you only need to implement a function that calculates the filter at one point.
You then call a generic filter function that does the heavy lifting of dealing with multiple array types, iterating over the array, dealing with borders and such, applying the function at each array element. The filter function can be in python, but can also be a C function, communicated by a CObject. Maybe some of these type functions could be provided with the numarray package. This could simplify writing extensions a lot. Would there be interest for a package of such functions? If there is I could think about it a bit more, and propose (and implement) something in the form of an extension. > - Important functions are sometimes buried in a non-obvious (to me) > sub-package. > > For example: try to find that location at which an array has a minimum > value (if there's more than one such point, pick any). You'd think > it'd be a standard numarray function, wouldn't you? After all, you can > ask for the minimum value. Now try to find it. Agreed, this bothered me too. > Well, I started out by trying to figure out how to get argmin to do > the job. Horrible. > > Fortunately I finally found minimum_position buried in nd_image. It is there because numarray did not provide it... But it is also there because it offers much functionality that would not be appropriate for the main package. It is part of the object measurement functions. A simpler, possibly more efficient routine should maybe be part of the main package. > - Masked arrays are not integrated. Thus a lot of important filtering > and stuff simply cannot be done on masked data without writing custom > extensions. For instance I'd like to do a median-filter that ignores > masked data (taking the median of non-masked data only). I agree very much! To be honest, I do not like the ma package much. I don't like the idea of having to use a separate package with a different array type that duplicates the functionality in the main package. 
I think it would be much better if all functions (where it makes sense) in numarray would accept an optional mask argument. To me it makes more sense to provide the mask with the operation, not as part of the array like in ma (a package like ma could still be layered on top.) I realize it would be a lot of work to make all numarray functions mask aware, but it is something to think about maybe. > - For 2-d images x and y are reversed. I know this isn't going to > change, but it is a headache every time I have to write new image > processing code. This is not really a problem I think, but you have to get used to it. If you treat the last dimension always as X and the first as Y, you have the same layout in memory as is usual in most image processing software. So X corresponds to axis=1 and Y to axis=0. Or use axis=-1 and axis=-2. Cheers, Peter From aisaac at american.edu Sat Oct 23 12:01:04 2004 From: aisaac at american.edu (Alan G Isaac) Date: Sat Oct 23 12:01:04 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: References: <1098381332.8249.12.camel@freyer.sfo.csun.edu><417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu><41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> Message-ID: On Fri, 22 Oct 2004 Alan G Isaac apparently wrote: > Just two thoughts from a new user. > i. I agree that .sumAll is better than the current name > confusion. > ii. even better, I propose, would be for .sum to take > an axis argument, with default matching the sum function, > and possible value axis="all". > For the transition, the axis argument can be required. 
That should have been: axis=None fwiw, Alan Isaac From stephen.walton at csun.edu Sun Oct 24 19:22:03 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Sun Oct 24 19:22:03 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <41797FFE.8090802@colorado.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> Message-ID: <1098670236.1907.21.camel@localhost.localdomain> On Fri, 2004-10-22 at 14:47, Fernando Perez wrote: > silly, minor nit: can we avoid mixed case names? Either sum_all or SumAll? I'm > not too fond of CamelCase, but camelCase looks even worse to me :) I agree with Fernando about CamelCase (which among other things seriously bites one when moving from case-sensitive to case-insensitive OS's). But I want to make a broader point: I don't think we need sumall. The methods and the functions should simply work the same way. If one wants sumall, use A.flat.sum() or, if you can't use the methods or attributes on your old version of Python, sum(ravel(A)). If you start writing sumall, then you'll need meanall, stdall, prodall, etc, etc. The flat attribute and ravel function/method already provide all the needed functionality. Just trying to save Todd some work. Steve -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From verveer at embl-heidelberg.de Mon Oct 25 01:37:05 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Mon Oct 25 01:37:05 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098670236.1907.21.camel@localhost.localdomain> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> Message-ID: <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> On 25 Oct 2004, at 04:17, Stephen Walton wrote: > On Fri, 2004-10-22 at 14:47, Fernando Perez wrote: > >> silly, minor nit: can we avoid mixed case names? Either sum_all or >> SumAll? I'm >> not too fond of CamelCase, but camelCase looks even worse to me :) > > I agree with Fernando about CamelCase (which among other things > seriously bites one when moving from case-sensitive to case-insensitive > OS's). But I want to make a broader point: > > I don't think we need sumall. The methods and the functions should > simply work the same way. If one wants sumall, use A.flat.sum() or, if > you can't use the methods or attributes on your old version of Python, > sum(ravel(A)). If you start writing sumall, then you'll need meanall, > stdall, prodall, etc, etc. The flat attribute and ravel > function/method > already provide all the needed functionality. I think this may be inefficient, because ravel and flat may make a copy of the data. Also I think using flat/ravel in such a way is plain ugly and a complex way to do it. But I do agree that it is not a good idea to introduce another set of names. In my opinion functions that calculate a statistic like sum should return the total in the first place, rather then over a single axis. 
But I guess it is too late to change that for sum, because of backward compatibility. Cheers, Peter From stephen.walton at csun.edu Mon Oct 25 09:20:02 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Mon Oct 25 09:20:02 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> Message-ID: <1098721171.19183.12.camel@sunspot.csun.edu> On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote: > On 25 Oct 2004, at 04:17, Stephen Walton wrote: > > > > I don't think we need sumall. The methods and the functions should > > simply work the same way. If one wants sumall, use A.flat.sum() or, if > > you can't use the methods or attributes on your old version of Python, > > sum(ravel(A)). > > I think this may be inefficient, because ravel and flat may make a copy > of the data. Also I think using flat/ravel in such a way is plain ugly > and a complex way to do it. You may be right about the copying, I couldn't say. I don't think sum(ravel(A)) looks any worse than sum(sum(sum(A))) for a rank 3 array, but ugly is in the eye of the beholder. > In my opinion functions that calculate a statistic like sum > should return the total in the first place, rather then over a single > axis. It depends on the data. I use rank-2 arrays which are images and are therefore homogeneous. Even there, though, I often want the sum of all rows or all columns. For heterogeneous data (e.g., columns of different Y's as a function of X), the present sum() makes sense. In other words, we will always need ways to sum over just one dimension and over all dimensions. 
By analogy with MATLAB (I'm guessing), sum() in Numeric and numarray does a one-D sum. -- Stephen Walton, Professor of Physics and Astronomy, California State University, Northridge stephen.walton at csun.edu From tim.hochberg at cox.net Mon Oct 25 09:32:01 2004 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Oct 25 09:32:01 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098721171.19183.12.camel@sunspot.csun.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> Message-ID: <417D2A3C.7010108@cox.net> Stephen Walton wrote: >On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote: > > >>On 25 Oct 2004, at 04:17, Stephen Walton wrote: >> >> >>>I don't think we need sumall. The methods and the functions should >>>simply work the same way. If one wants sumall, use A.flat.sum() or, if >>>you can't use the methods or attributes on your old version of Python, >>>sum(ravel(A)). >>> >>> >>I think this may be inefficient, because ravel and flat may make a copy >>of the data. Also I think using flat/ravel in such a way is plain ugly >>and a complex way to do it. >> >> > >You may be right about the copying, I couldn't say. I don't think >sum(ravel(A)) looks any worse than sum(sum(sum(A))) for a rank 3 array, >but ugly is in the eye of the beholder. > > I'm not sure how feasible it is, but I'd much rather an efficient, non-copying, 1-D view of an noncontiguous array (from an enhanced version of flat or ravel or whatever) than a bunch of extra methods. The former allows all of the standard methods to just work efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min, etc]. 
Making special whole array methods for everything just leads to method eplosion. -tim > > >>In my opinion functions that calculate a statistic like sum >>should return the total in the first place, rather then over a single >>axis. >> >> > >It depends on the data. I use rank-2 arrays which are images and are >therefore homogeneous. Even there, though, I often want the sum of all >rows or all columns. For heterogeneous data (e.g., columns of different >Y's as a function of X), the present sum() makes sense. In other words, >we will always need ways to sum over just one dimension and over all >dimensions. By analogy with MATLAB (I'm guessing), sum() in Numeric and >numarray does a one-D sum. > > > From stephen.walton at csun.edu Mon Oct 25 09:35:06 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Mon Oct 25 09:35:06 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098721171.19183.12.camel@sunspot.csun.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> Message-ID: <1098722079.19183.22.camel@sunspot.csun.edu> On Mon, 2004-10-25 at 09:19 -0700, Stephen Walton wrote: > On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote: > > > I think this may be inefficient, because ravel and flat may make a copy > > of the data. Also I think using flat/ravel in such a way is plain ugly > > and a complex way to do it. > > You may be right about the copying, I couldn't say. I just looked at the source (numeric-1.1/Lib/generic.py). The comment to the ravel() function states that it returns a view, not a copy; but it calls reshape() which does make a copy if the input array is not contiguous. 
I just tested this: A=arange(25,shape=(5,5)) A.transpose() # now A is not contiguous v=ravel(A) A[2,2]=-17 v # verifies that v did not change. So, in the above, it does look like ravel() made a copy, and your fears about inefficiency are warranted. Another test shows that changing ravel(A) to A.flat above also results in a copy. Mayhaps we need sumall() after all. -- Stephen Walton, Professor of Physics and Astronomy, California State University, Northridge stephen.walton at csun.edu From verveer at embl-heidelberg.de Mon Oct 25 09:44:04 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Mon Oct 25 09:44:04 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098721171.19183.12.camel@sunspot.csun.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> Message-ID: <0BC8D972-26A5-11D9-9F77-000A95C92C8E@embl-heidelberg.de> On 25 Oct 2004, at 18:19, Stephen Walton wrote: > On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote: >> On 25 Oct 2004, at 04:17, Stephen Walton wrote: >>> >>> I don't think we need sumall. The methods and the functions should >>> simply work the same way. If one wants sumall, use A.flat.sum() or, >>> if >>> you can't use the methods or attributes on your old version of >>> Python, >>> sum(ravel(A)). >> >> I think this may be inefficient, because ravel and flat may make a >> copy >> of the data. Also I think using flat/ravel in such a way is plain ugly >> and a complex way to do it. > > You may be right about the copying, I couldn't say. I don't think > sum(ravel(A)) looks any worse than sum(sum(sum(A))) for a rank 3 array, > but ugly is in the eye of the beholder. 
It does not look worse, I agree with that! But I would argue it should have been sum(A) in the first place to sum over all axes... The sumall would not have been needed, and summing over one axis (or a subset of axes) could have been implemented as an optional argument to sum(). > >> In my opinion functions that calculate a statistic like sum >> should return the total in the first place, rather than over a single >> axis. > > It depends on the data. I use rank-2 arrays which are images and are > therefore homogeneous. Even there, though, I often want the sum of all > rows or all columns. For heterogeneous data (e.g., columns of > different > Y's as a function of X), the present sum() makes sense. In other > words, > we will always need ways to sum over just one dimension and over all > dimensions. By analogy with MATLAB (I'm guessing), sum() in Numeric > and > numarray does a one-D sum. I agree it is a useful feature, and it should still be possible to do that using an optional axis argument. Even better, I would love to be able to sum over several axes in one go; I find the one-dimensional character of reduce limiting, but I digress. In any case, I suppose we will stick with the current behaviour for backwards compatibility.
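To make the proposal concrete, here is a minimal pure-Python sketch of a sum that reduces over everything by default and over one dimension when an axis is given. The function name `total` and its signature are hypothetical, it handles only rank-2 nested lists, and it is not how sum() in numarray behaves.

```python
def total(rows, axis=None):
    """Sum a rank-2 nested list.

    axis=None -> a single number (sum over all elements)
    axis=0    -> one sum per column (reduce over the first dimension)
    axis=1    -> one sum per row (reduce over the last dimension)
    """
    if axis is None:
        return sum(sum(row) for row in rows)
    if axis == 0:
        return [sum(col) for col in zip(*rows)]
    if axis == 1:
        return [sum(row) for row in rows]
    raise ValueError("axis must be None, 0 or 1 for a rank-2 input")


# Stands in for arange(25, shape=(5,5))
A = [list(range(r * 5, r * 5 + 5)) for r in range(5)]

everything = total(A)           # reduce over all axes at once
per_column = total(A, axis=0)   # reduce over the first dimension only
per_row = total(A, axis=1)      # reduce over the last dimension only
```

With a default of "all axes", no separate sumall name is needed, and the single-axis case stays one keyword away.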
Cheers, Peter From verveer at embl-heidelberg.de Mon Oct 25 09:47:01 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Mon Oct 25 09:47:01 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098722079.19183.22.camel@sunspot.csun.edu> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <1098722079.19183.22.camel@sunspot.csun.edu> Message-ID: <60595242-26A5-11D9-9F77-000A95C92C8E@embl-heidelberg.de> On 25 Oct 2004, at 18:34, Stephen Walton wrote: > On Mon, 2004-10-25 at 09:19 -0700, Stephen Walton wrote: >> On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote: >> >>> I think this may be inefficient, because ravel and flat may make a >>> copy >>> of the data. Also I think using flat/ravel in such a way is plain >>> ugly >>> and a complex way to do it. >> >> You may be right about the copying, I couldn't say. > > I just looked at the source (numeric-1.1/Lib/generic.py). The comment > to the ravel() function states that it returns a view, not a copy; but > it calls reshape() which does make a copy if the input array is not > contiguous. I just tested this: > > A=arange(25,shape=(5,5)) > A.transpose() # now A is not contiguous > v=ravel(A) > A[2,2]=-17 > v # verifies that v did not change. > > So, in the above, it does look like ravel() made a copy, and your fears > about inefficiency are warranted. Another test shows that changing > ravel(A) to A.flat above also results in a copy. Mayhaps we need > sumall() after all. Yes, we do I guess, but I do not like such things creeping into an otherwise elegant package if I may be frank... 
Peter From strang at nmr.mgh.harvard.edu Mon Oct 25 09:53:00 2004 From: strang at nmr.mgh.harvard.edu (Gary Strangman) Date: Mon Oct 25 09:53:00 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <417D2A3C.7010108@cox.net> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> Message-ID: > I'm not sure how feasible it is, but I'd much rather an efficient, > non-copying, 1-D view of an noncontiguous array (from an enhanced version of > flat or ravel or whatever) than a bunch of extra methods. The former allows > all of the standard methods to just work efficiently using sum(ravel(A)) or > sum(A.flat) [ and max and min, etc]. Making special whole array methods for > everything just leads to method eplosion. I completely agree with this ... an efficient flat/ravel would seem to solve many of the issues being raised. Forgive the potentially naive question here, but is there any reason such an efficient, enhanced view can't be implemented for the .flat method? I like the concept of .flat, but I regularly call functions with arguments that may-or-may-not be contiguous. For robustness, such functions _must_ be coded with ravel() because .flat fails for non-contiguous arrays. I never fully understood why there were two ways of "flattening" in the first place. 
Gary -------------------------------------------------------------- Gary Strangman, PhD | Director, Neural Systems Group Office: 617-724-0662 | Massachusetts General Hospital Fax: 617-726-4078 | 149 13th Street, Ste 10018 | Charlestown, MA 02129 From verveer at embl-heidelberg.de Mon Oct 25 10:09:05 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Mon Oct 25 10:09:05 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> Message-ID: <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> On 25 Oct 2004, at 18:51, Gary Strangman wrote: > >> I'm not sure how feasible it is, but I'd much rather an efficient, >> non-copying, 1-D view of an noncontiguous array (from an enhanced >> version of flat or ravel or whatever) than a bunch of extra methods. >> The former allows all of the standard methods to just work >> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min, >> etc]. Making special whole array methods for everything just leads to >> method eplosion. > > I completely agree with this ... an efficient flat/ravel would seem to > solve many of the issues being raised. Forgive the potentially naive > question here, but is there any reason such an efficient, enhanced > view can't be implemented for the .flat method? I believe it is not possible without copying data. The strides between elements of a noncontiguous array are not always the same, so you cannot efficiently view it as a 1D array. > I like the concept of .flat, but I regularly call functions with > arguments that may-or-may-not be contiguous. 
For robustness, such > functions _must_ be coded with ravel() because .flat fails for > non-contiguous arrays. Functions should be coded in the first place to take multi-dimensional nature into account in my opinion. One of the points of numarray is that it is multi-dimensional. If a function can work over multiple dimensions, but it only works for 1D arrays, it is broken in my opinion. In my opinion sum() _is_ broken, and introducing a separate sum_all() is an ugly hack. > I never fully understood why there were two ways of "flattening" in > the first place. I suppose it is for efficiency reasons, flat may not always works, but if it does, it is efficient since it would not need to copy any data. Peter From Chris.Barker at noaa.gov Mon Oct 25 10:10:20 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Mon Oct 25 10:10:20 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <1098508579.3403.6.camel@localhost.localdomain> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098480955.11372.19.camel@freyer.sfo.csun.edu> <1098508579.3403.6.camel@localhost.localdomain> Message-ID: <417D3309.9070302@noaa.gov> A few comments on a number of posts in this thread: Stephen Walton wrote: >>- I'd like to write C/C++ code that would work on multiple array >>types. > > I can't help much here, other than to say that C and C++ are pretty low > level languages, not well suited for this level of abstraction. Well, this is certainly true for C, but not so much for C++. I'm not expert, but C++ templates could be very handy here. When the numarray projects was just getting started, there was some discussion about using a template-based array package as the base, perhaps Blitz++. 
I still think this was a great idea, but I think the biggest issue at the time was that templates were still not consistently well supported by the wide variety of compilers that numarray should work with. Personally I think that anything supported by gcc should be fine, as anyone can use gcc on virtually any platform, if they want. Anyway, it's too late to re-write numarray, but maybe a numarray <--> blitz++ conversion package would make it easy to write numarray extensions with blitz++. Perhaps even integrate it with Boost.Python. Another option would be to write a template-based wrapper around the existing Numarray objects. By the way, my other issue with extensions is the difficulty of writing extensions that support discontinuous arrays, in addition to multiple data types. It seems someone smarter than me could use C++ classes to solve this one as well. Peter Verveer wrote: > But I do agree that it is not a good idea to introduce another set of > names. In my opinion functions that calculate a statistic like sum > should return the total in the first place, rather than over a single > axis. Absolutely not! I'm far more likely to want it over a single axis; it's the core of "vectorizing" your code. If the data all mean the same thing, why aren't you storing it in a 1-d array? That being said, it should be easy to do various reductions over all axes, which I think .flat() does nicely. I thought .flat() never made a copy: am I wrong? Stephen Walton wrote: > It depends on the data. I use rank-2 arrays which are images and are > therefore homogeneous. OK, good example.... I take back some of what I said above! > By analogy with MATLAB (I'm guessing), sum() in Numeric and > numarray does a one-D sum. Except MATLAB does it worse. If your 2-d array happens to have only one row, you get the sum over that..yecch!
Tim Hochberg wrote: > I'm not sure how feasible it is, but I'd much rather an efficient, > non-copying, 1-D view of an noncontiguous array (from an enhanced > version of flat or ravel or whatever) than a bunch of extra methods. The > former allows all of the standard methods to just work efficiently using > sum(ravel(A)) or sum(A.flat) [ and max and min, etc]. Making special > whole array methods for everything just leads to method eplosion. here! here! I thought that was exactly what .flat() was for. Shows what I know! -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rowen at u.washington.edu Mon Oct 25 10:33:02 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Mon Oct 25 10:33:02 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> Message-ID: At 7:08 PM +0200 2004-10-25, Peter Verveer wrote: >On 25 Oct 2004, at 18:51, Gary Strangman wrote: > >> >>> I'm not sure how feasible it is, but I'd much rather an >>>efficient, non-copying, 1-D view of an noncontiguous array (from >>>an enhanced version of flat or ravel or whatever) than a bunch of >>>extra methods. The former allows all of the standard methods to >>>just work efficiently using sum(ravel(A)) or sum(A.flat) [ and max >>>and min, etc]. 
Making special whole array methods for everything >>>just leads to method eplosion. >> >> I completely agree with this ... an efficient flat/ravel would >>seem to solve many of the issues being raised. Forgive the >>potentially naive question here, but is there any reason such an >>efficient, enhanced view can't be implemented for the .flat method? > >I believe it is not possible without copying data. The strides >between elements of a noncontiguous array are not always the same, >so you cannot efficiently view it as a 1D array. How about providing an iterator that counts through all the elements of an array (e.g. arr.itervalues()). So long as C extensions could efficiently make use of such an iterator, I think it'd do the job. One could also imagine: - arr.iteritems(), which returned (index, value) for each item - a mask argument: a boolean array the same shape as the data array; True means elide the corresponding value from the data array - general support for indexing More generally, I agree that sum should work the same as a function and a method, and that an extra axis argument could be a good thing (it is so common elsewhere, e.g. size). I'd be tempted to break backwards compatibility to fix this, since numarray is still new and the current situation is very confusing. 
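A rough pure-Python sketch of the iterator idea: it walks a strided, possibly noncontiguous view of a flat buffer element by element, so a whole-array reduction needs no flattened copy. The name `itervalues` and the (shape, strides, offset) description of the view are assumptions made for illustration; strides here count elements rather than bytes, and numarray offers no such method.

```python
from itertools import product


def itervalues(buffer, shape, strides, offset=0):
    """Yield every element of a strided N-d view of `buffer`, last axis fastest.

    Only index arithmetic is done; the buffer itself is never copied.
    """
    for index in product(*(range(n) for n in shape)):
        yield buffer[offset + sum(i * s for i, s in zip(index, strides))]


buffer = list(range(25))            # storage of a contiguous 5x5 array
shape, strides = (5, 5), (1, 5)     # its transpose: same buffer, swapped strides

first_row = list(itervalues(buffer, shape, strides))[:5]
grand_total = sum(itervalues(buffer, shape, strides))

buffer[12] = -17                    # element [2,2]; the view sees the change
changed_total = sum(itervalues(buffer, shape, strides))
```

The loop runs at Python speed, so this says nothing about performance; it only shows that visiting all elements of a noncontiguous view does not require a contiguous copy.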
-- Russell From strang at nmr.mgh.harvard.edu Mon Oct 25 10:38:01 2004 From: strang at nmr.mgh.harvard.edu (Gary Strangman) Date: Mon Oct 25 10:38:01 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> Message-ID: >> I completely agree with this ... an efficient flat/ravel would seem to >> solve many of the issues being raised. Forgive the potentially naive >> question here, but is there any reason such an efficient, enhanced view >> can't be implemented for the .flat method? > > I believe it is not possible without copying data. The strides between > elements of a noncontiguous array are not always the same, so you cannot > efficiently view it as a 1D array. And it gets even worse for different-stride slices of N-D arrays (though I'm not yet ready to say it's impossible to do without copying). Maybe it's just me, but it does seem somewhat non-pythonic for a function/method to break for an inefficient case, instead of dropping back to less efficient (i.e., copying) behavior. > Functions should be coded in the first place to take multi-dimensional nature > into account in my opinion. One of the points of numarray is that it is > multi-dimensional. If a function can work over multiple dimensions, but it > only works for 1D arrays, it is broken in my opinion. In my opinion sum() > _is_ broken, and introducing a separate sum_all() is an ugly hack. +1. 
;-) Hence the thought to make flattening a single "enhanced" method/fcn ... to essentially eliminate the need for such ugly hacks. Typically, my functions accept N-D arguments, and can operate over a user-selected subset of these dimensions. I may pass a whole array, or every other column, or whatever. Judging from the history of this thread, I think a .flat that is as-efficient-as-possible and also robust to all forms of non-contiguity would benefit many, while also reducing the learning-curve issues associated with .flat vs ravel(). As for where/when/how to introduce .newandimprovedflat, welllllll, that's for another thread. ;-) Gary -------------------------------------------------------------- Gary Strangman, PhD | Director, Neural Systems Group Office: 617-724-0662 | Massachusetts General Hospital Fax: 617-726-4078 | 149 13th Street, Ste 10018 | Charlestown, MA 02129 From verveer at embl-heidelberg.de Mon Oct 25 10:42:03 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Mon Oct 25 10:42:03 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <417D3309.9070302@noaa.gov> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098480955.11372.19.camel@freyer.sfo.csun.edu> <1098508579.3403.6.camel@localhost.localdomain> <417D3309.9070302@noaa.gov> Message-ID: <1A9085AC-26AD-11D9-9F77-000A95C92C8E@embl-heidelberg.de> > Stephen Walton wrote: >>> - I'd like to write C/C++ code that would work on multiple array >>> types. >> I can't help much here, other than to say that C and C++ are pretty >> low >> level languages, not well suited for this level of abstraction. > > Well, this is certainly true for C, but not so much for C++. I'm not > expert, but C++ templates could be very handy here. 
When the numarray > projects was just getting started, there was some discussion about > using a template-based array package as the base, perhaps Blitz++. I > still this this was a great idea, but I think the biggest issue at the > time was that templates were still not constantly well supported by > the wide variety of compilers that numarray should work with. > Personally I think that anything supported by gcc should be fine, as > anyone can use gcc on virtually any platform, if they want. I think having the option of using C++ would be cool. But as soon as we would 'require' it, I would not develop for numarray anymore. C++ is a big pain in my opinion, although I do agree that a well written templating system like Blitz++ is nice if you actually use C++. > Anyway, it's too late to re-write numarray, but maybe a numarray <--> > blitz++ conversion package would make it easy to write numarray > extensions with blitz++. Perhaps even integrate it with Boost.Python. > Another option would be to write a template-based wrapper around the > existing Numarray objects. yes, it would be nice to have the option. There is no reason why there could not be a C++ API which would include the use of templates layered on top of the current C API for those people that would like to use it. > By the way, my other issue with extensions is the difficulty of > writing extensions that support discontinuous arrays, in addition to > multiple data types. It seems someone smarter than me could use C++ > classes to solve this one as well. I had to deal with that problem too in nd_image. It is doable, albeit ugly if you depend on plain C. Probably C++ could do it differently and more nicely, Blitz++ possible does. Again, not for me. > Peter Verveer wrote: > >> But I do agree that it is not a good idea to introduce another set of >> names. In my opinion functions that calculate a statistic like sum >> should return the total in the first place, rather then over a single >> axis. 
> > Absolutely not! I'm far more likely to want it over a single axis, > it's the core of "vectorizing" your code. If the data are mean the > same thing, why aren't you storing it in a 1-d array? I agree that it is important, I am just saying that both are very common operations. Why not support operations over an axis by a optional argument, you will often have to specify which axis you want anyway. > That being said, it should be easy to do various reductions over all > axis, which I think .flat() does nicely. I thought .flat() never made > a copy: am I wrong? Unfortunately, flattening an array is not always possible without copying, due to the fact that arrays may be not contiguous in memory. > Tim Hochberg wrote: >> I'm not sure how feasible it is, but I'd much rather an efficient, >> non-copying, 1-D view of an noncontiguous array (from an enhanced >> version of flat or ravel or whatever) than a bunch of extra methods. >> The former allows all of the standard methods to just work >> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min, >> etc]. Making special whole array methods for everything just leads to >> method eplosion. > > here! here! I thought that was exactly what .flat() was for. Shows > what I know! It is however not feasible I think to do it efficiently. It seems to me that a set of functions is necessary to do things like sum, minimum and so on, that work on the whole array. I would also prefer they are not methods. Introducing a whole array of sum_all() like functions is also not great. 
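[Editorial sketch] The stride argument can be made concrete with a small check, written here in pure Python as a toy model rather than anything taken from numarray. Assumptions: strides are counted in elements, elements are visited in row-major order, and the function name single_stride is hypothetical. A layout can be presented as a 1-D view without copying only if one constant step reaches every element in turn.

```python
def single_stride(shape, strides):
    """Return the one stride that visits all elements in row-major order.

    Returns None when no single stride exists, i.e. when a flat 1-D view
    would require copying. Strides are counted in elements, not bytes.
    """
    # Axes of length 1 never advance, so their strides are irrelevant.
    dims = [(n, s) for n, s in zip(shape, strides) if n != 1]
    if not dims:
        return 1  # at most one element: any stride works
    for (_, s_outer), (n_inner, s_inner) in zip(dims, dims[1:]):
        # Stepping the outer axis once must equal walking the whole inner axis.
        if s_outer != n_inner * s_inner:
            return None
    return dims[-1][1]


# Views of a 4x6 row-major array, given as (shape, strides).
layouts = {
    "contiguous 4x6": ((4, 6), (6, 1)),
    "every other column": ((4, 3), (6, 2)),
    "first two columns": ((4, 2), (6, 1)),
    "transpose": ((6, 4), (1, 6)),
}
results = {name: single_stride(*layout) for name, layout in layouts.items()}
```

In this model the first two layouts collapse to a single stride, while the column slice and the transpose do not, which is the case where a flattened view cannot avoid a copy.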
Cheers, Peter From verveer at embl-heidelberg.de Mon Oct 25 11:04:01 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Mon Oct 25 11:04:01 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> Message-ID: <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> On 25 Oct 2004, at 19:32, Russell E Owen wrote: > At 7:08 PM +0200 2004-10-25, Peter Verveer wrote: >> On 25 Oct 2004, at 18:51, Gary Strangman wrote: >> >>> >>>> I'm not sure how feasible it is, but I'd much rather an efficient, >>>> non-copying, 1-D view of an noncontiguous array (from an enhanced >>>> version of flat or ravel or whatever) than a bunch of extra >>>> methods. The former allows all of the standard methods to just work >>>> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min, >>>> etc]. Making special whole array methods for everything just leads >>>> to method eplosion. >>> >>> I completely agree with this ... an efficient flat/ravel would seem >>> to solve many of the issues being raised. Forgive the potentially >>> naive question here, but is there any reason such an efficient, >>> enhanced view can't be implemented for the .flat method? >> >> I believe it is not possible without copying data. The strides >> between elements of a noncontiguous array are not always the same, so >> you cannot efficiently view it as a 1D array. > > How about providing an iterator that counts through all the elements > of an array (e.g. arr.itervalues()). 
So long as C extensions could > efficiently make use of such an iterator, I think it'd do the job. It would still be slower, because you would need a function call at each element that returns a value. Not a problem if you do a lot of work at each element, but if you are just adding values you want a custom written C function. You can do it a the C level with macros or so, (I do that in nd_image) but that would not help at the python level. > One could also imagine: > - arr.iteritems(), which returned (index, value) for each item > - a mask argument: a boolean array the same shape as the data array; > True means elide the corresponding value from the data array > - general support for indexing Essentially you are suggesting to expose iterators at the python level that iterate over an array in some predefined way. That is possible, but I doubt it will be efficient. At the C level however, it might be worth thinking about as a way of easing writing functions in C. I proposed to do it the other way around in an earlier mail: providing a set of generic functions that take a python or a C function to be applied at each element. I most likely will implement something in that direction, but I should give your idea also some thought. > More generally, I agree that sum should work the same as a function > and a method, and that an extra axis argument could be a good thing > (it is so common elsewhere, e.g. size). I'd be tempted to break > backwards compatibility to fix this, since numarray is still new and > the current situation is very confusing. I would absolutely vote for such a change. Simply because we would like a range of such functions, e.g. minimum, maximum, and so on. Even if we have to leave sum() as it is, I think we should have the alternatives, we would just have to come up with an alternative name for sum(). In fact I would consider volunteering implementing these functions. 
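[Editorial sketch] The "other way around" design, a generic driver that takes a function to apply at each element, might look roughly like this. It is a pure-Python illustration under assumptions: the name fold_elements is hypothetical, the array is modelled as a flat list with a shape and element-counted strides, and a real implementation would run the loop in C to avoid the per-element call cost discussed above.

```python
from itertools import product


def fold_elements(func, initial, buffer, shape, strides, offset=0):
    """Fold func(accumulator, value) over every element of a strided view.

    The driver owns the traversal, so the caller never needs a flat copy.
    Strides and offset are counted in elements, not bytes.
    """
    accumulator = initial
    for index in product(*(range(n) for n in shape)):
        position = offset + sum(i * s for i, s in zip(index, strides))
        accumulator = func(accumulator, buffer[position])
    return accumulator


data = list(range(12))          # a 3x4 row-major array
transposed = ((4, 3), (1, 4))   # shape and strides of its (non-contiguous) transpose

total = fold_elements(lambda acc, x: acc + x, 0, data, *transposed)
smallest = fold_elements(min, float("inf"), data, *transposed)
```

The same driver serves sum, minimum, maximum and so on; only the callable and the initial value change.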
Peter From tim.hochberg at cox.net Mon Oct 25 14:03:03 2004 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Oct 25 14:03:03 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> Message-ID: <417D69CD.7070604@cox.net> Peter Verveer wrote: > > On 25 Oct 2004, at 19:32, Russell E Owen wrote: > >> At 7:08 PM +0200 2004-10-25, Peter Verveer wrote: >> >>> On 25 Oct 2004, at 18:51, Gary Strangman wrote: >>> >>>> >>>>> I'm not sure how feasible it is, but I'd much rather an >>>>> efficient, non-copying, 1-D view of an noncontiguous array (from >>>>> an enhanced version of flat or ravel or whatever) than a bunch of >>>>> extra methods. The former allows all of the standard methods to >>>>> just work efficiently using sum(ravel(A)) or sum(A.flat) [ and max >>>>> and min, etc]. Making special whole array methods for everything >>>>> just leads to method eplosion. >>>> >>>> >>>> I completely agree with this ... an efficient flat/ravel would >>>> seem to solve many of the issues being raised. Forgive the >>>> potentially naive question here, but is there any reason such an >>>> efficient, enhanced view can't be implemented for the .flat method? >>> >>> >>> I believe it is not possible without copying data. The strides >>> between elements of a noncontiguous array are not always the same, >>> so you cannot efficiently view it as a 1D array. 
>> >> >> How about providing an iterator that counts through all the elements >> of an array (e.g. arr.itervalues()). So long as C extensions could >> efficiently make use of such an iterator, I think it'd do the job. > > > It would still be slower, because you would need a function call at > each element that returns a value. Not a problem if you do a lot of > work at each element, but if you are just adding values you want a > custom written C function. You can do it a the C level with macros or > so, (I do that in nd_image) but that would not help at the python level. > >> One could also imagine: >> - arr.iteritems(), which returned (index, value) for each item >> - a mask argument: a boolean array the same shape as the data array; >> True means elide the corresponding value from the data array >> - general support for indexing > > > Essentially you are suggesting to expose iterators at the python level > that iterate over an array in some predefined way. That is possible, > but I doubt it will be efficient. > > At the C level however, it might be worth thinking about as a way of > easing writing functions in C. I proposed to do it the other way > around in an earlier mail: providing a set of generic functions that > take a python or a C function to be applied at each element. I most > likely will implement something in that direction, but I should give > your idea also some thought. > >> More generally, I agree that sum should work the same as a function >> and a method, and that an extra axis argument could be a good thing >> (it is so common elsewhere, e.g. size). I'd be tempted to break >> backwards compatibility to fix this, since numarray is still new and >> the current situation is very confusing. > > > I would absolutely vote for such a change. Simply because we would > like a range of such functions, e.g. minimum, maximum, and so on. 
Even > if we have to leave sum() as it is, I think we should have the > alternatives, we would just have to come up with an alternative name > for sum(). In fact I would consider volunteering implementing these > functions. Why the need to break backwards compatability? If one is going to reimplement sum, et al so as to operate on an arbitrary set of axes there's no reason one couldn't maintain the current behaviour as the default. All that is required is to allow axis to be a number (current behaviour), a tuple (reduce across the designated axes) or some special value to sum over all (None?, "all"?). Having two sum functions with different names is not particularly better than the current proposal of a method and a function. -tim From verveer at embl-heidelberg.de Mon Oct 25 15:48:03 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Mon Oct 25 15:48:03 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <417D69CD.7070604@cox.net> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <417D69CD.7070604@cox.net> Message-ID: On Oct 25, 2004, at 11:02 PM, Tim Hochberg wrote: > Peter Verveer wrote: > >> >> On 25 Oct 2004, at 19:32, Russell E Owen wrote: >> >>> At 7:08 PM +0200 2004-10-25, Peter Verveer wrote: >>> >>>> On 25 Oct 2004, at 18:51, Gary Strangman wrote: >>>> >>>>> >>>>>> I'm not sure how feasible it is, but I'd much rather an >>>>>> efficient, non-copying, 1-D view of an noncontiguous array (from >>>>>> an enhanced version of flat or ravel or whatever) 
than a bunch of >>>>>> extra methods. The former allows all of the standard methods to >>>>>> just work efficiently using sum(ravel(A)) or sum(A.flat) [ and >>>>>> max and min, etc]. Making special whole array methods for >>>>>> everything just leads to method eplosion. >>>>> >>>>> >>>>> I completely agree with this ... an efficient flat/ravel would >>>>> seem to solve many of the issues being raised. Forgive the >>>>> potentially naive question here, but is there any reason such an >>>>> efficient, enhanced view can't be implemented for the .flat >>>>> method? >>>> >>>> >>>> I believe it is not possible without copying data. The strides >>>> between elements of a noncontiguous array are not always the same, >>>> so you cannot efficiently view it as a 1D array. >>> >>> >>> How about providing an iterator that counts through all the elements >>> of an array (e.g. arr.itervalues()). So long as C extensions could >>> efficiently make use of such an iterator, I think it'd do the job. >> >> >> It would still be slower, because you would need a function call at >> each element that returns a value. Not a problem if you do a lot of >> work at each element, but if you are just adding values you want a >> custom written C function. You can do it a the C level with macros or >> so, (I do that in nd_image) but that would not help at the python >> level. >> >>> One could also imagine: >>> - arr.iteritems(), which returned (index, value) for each item >>> - a mask argument: a boolean array the same shape as the data array; >>> True means elide the corresponding value from the data array >>> - general support for indexing >> >> >> Essentially you are suggesting to expose iterators at the python >> level that iterate over an array in some predefined way. That is >> possible, but I doubt it will be efficient. >> >> At the C level however, it might be worth thinking about as a way of >> easing writing functions in C. 
I proposed to do it the other way >> around in an earlier mail: providing a set of generic functions that >> take a python or a C function to be applied at each element. I most >> likely will implement something in that direction, but I should give >> your idea also some thought. >> >>> More generally, I agree that sum should work the same as a function >>> and a method, and that an extra axis argument could be a good thing >>> (it is so common elsewhere, e.g. size). I'd be tempted to break >>> backwards compatibility to fix this, since numarray is still new and >>> the current situation is very confusing. >> >> >> I would absolutely vote for such a change. Simply because we would >> like a range of such functions, e.g. minimum, maximum, and so on. >> Even if we have to leave sum() as it is, I think we should have the >> alternatives, we would just have to come up with an alternative name >> for sum(). In fact I would consider volunteering implementing these >> functions. > > Why the need to break backwards compatability? If one is going to > reimplement sum, et al so as to operate on an arbitrary set of axes > there's no reason one couldn't maintain the current behaviour as the > default. It seems to me that the behavior one would expect for a function like that, would be to apply the operation to the whole array. Not along an axis. What would you expect as a new user if you call a minimum() function? A single value that is the minimum. So that is the logical choice for the default behavior, I would think. > All that is required is to allow axis to be a number (current > behaviour), a tuple (reduce across the designated axes) or some > special value to sum over all (None?, "all"?). Yes, that would be the idea anyway. The question is what should be the default behavior for this type of functions, something I think we should not decide based on the current behavior of a single existing function, but based on what makes the most sense. 
That is obviously something that can be discussed... > > Having two sum functions with different names is not particularly > better than the current proposal of a method and a function. This is certainly true. I would prefer breaking compability... Peter
From Chris.Barker at noaa.gov Tue Oct 26 09:21:08 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Tue Oct 26 09:21:08 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <417D69CD.7070604@cox.net> Message-ID: <417E7907.9060107@noaa.gov> Peter Verveer wrote: > On Oct 25, 2004, at 11:02 PM, Tim Hochberg wrote: >> Why the need to break backwards compatability? If one is going to >> reimplement sum, et al so as to operate on an arbitrary set of axes >> there's no reason one couldn't maintain the current behaviour as the >> default. Great idea! > It seems to me that the behavior one would expect for a function like > that, would be to apply the operation to the whole array. Not along an > axis. What would you expect as a new user if you call a minimum() > function? A single value that is the minimum. So that is the logical > choice for the default behavior, I would think. nope. I'd expect it to be along an axis, by default the last one. To me, that's what vectorization is all about. Maybe this is because of my MATLAB (and now Numeric) background, but it makes the most sense to me that a method either returns an array of the same rank, or "reducing" methods return an array of rank reduced by one. Having a method return the same rank answer, no matter the rank of the input, is weird to me. This all depends on how you use arrays.
I can see that if you tend to use a 2-d array to store an image, that the single minimum would seem logical, but for many other uses, each dimension has an independent meaning. > Yes, that would be the idea anyway. The question is what should be the > default behavior for this type of functions, something I think we should > not decide based on the current behavior of a single existing function, > but based on what makes the most sense. That is obviously something that > can be discussed... yup, but frankly, this isn't about just one function, it's really about all the reductions: min, max, sum, etc, etc. I think the rule of thumb is not to break backward compatibility unless there is a compelling reason, and given that it's not clear what is most "natural" in this case, keeping the default the same makes the most sense. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From verveer at embl-heidelberg.de Tue Oct 26 11:20:02 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Tue Oct 26 11:20:02 2004 Subject: [Numpy-discussion] Counting array elements In-Reply-To: <417E7907.9060107@noaa.gov> References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <417D69CD.7070604@cox.net> <417E7907.9060107@noaa.gov> Message-ID: <8629C0DC-277B-11D9-8DC3-000D932805AC@embl-heidelberg.de> On Oct 26, 2004, at 6:19 PM, Chris Barker wrote: > Peter Verveer 
wrote: >> It seems to me that the behavior one would expect for a function like >> that, would be to apply the operation to the whole array. Not along >> an axis. What would you expect as a new user if you call a minimum() >> function? A single value that is the minimum. So that is the logical >> choice for the default behavior, I would think. > > nope. I'd expect it to be along an axis, by default the last one. I still do not agree completely with that, I will elaborate more below, because I also do not agree anymore with my own earlier writings :-). But I see your point that this type of operation can be natural depending on what you are doing. Sometimes a single value does make sense, sometimes not, I think we can agree on that. >> Yes, that would be the idea anyway. The question is what should be >> the default behavior for this type of functions, something I think we >> should not decide based on the current behavior of a single existing >> function, but based on what makes the most sense. That is obviously >> something that can be discussed... > > yup, but frankly, this isn't about just one function, it's really > about all the reductions: min, max, sum, etc, etc. Actually no. It seems that sum() is a special case, along with a few others. Again: I elaborate on the general case below. > I think the rule of thumb is not to break backward compatibility > unless there is a compelling reason, and given that it's not clear > what is most "natural" in this case, keeping the default the same > makes the most sense. I agree. In contrast what I have said before I think we should keep it as it is, for compatibility. Now to elaborate on the general problem, please correct me if I get something wrong. I will use the minimum function as an example and come back to sum() later. If you look at a minimum operation then there are three different things you might like to do: 1) An element by element minimum: minimum(a1, a2). This is the current behaviour. 
Like all binary ufuncs of this type, it operates on pairs of arrays. So by default it does not do reduction or calculate a single minimum. For most ufuncs that is the natural behavior anyway. 2) A reduction: minimum.reduce(a1). The reduce method of ufuncs is generally used for reductions. Having to use .reduce makes clear what you are doing. Although a bit odd at first sight, I think it is a clever way to overload ufuncs names with different functionality. 3) The minimum of the array: In numarray you do a1.min(). I think in Numeric, you have to do something like minimum.reduce(a1.flat), correct me if I am wrong. Not nice in both cases... Note that calling a binary ufunc with a single argument will give an error: minimum(a1) raises a TypeError. That seems to be a good decision, because people seem to have different ideas of what should happen: I would expect the minimum of the array, others expect a reduction. Generally I guess it was a wise decision not to change the meaning of a function depending on wether it has one or two arguments. The sum() function is an alias to add.reduce. there are a few more of these aliases (i.e. product). I would still say that it is a bit unfortunate, since not everybody may immediately realize that these functions are in fact reductions. I wonder if one would not be better of without these functions at all, after all you can access the functionality through .reduce(). If you mind the extra typing, just define your own alias. Can't we shift them into numarray.numeric? Just a thought... In any case, clearly these functions need to stay around as they are for compatibility reasons. It is far more productive to add the functionality that a few people already proposed: allow reductions over multiple axes. I would welcome that, I always found 1D reductions a bit limited anyway. Obviously you can do sequential 1D reductions, but that can be quite inefficient. 
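[Editorial sketch] A reduction whose axis argument accepts a single number, a tuple of axes, or None for "all axes" can be prototyped in a few lines. This is a pure-Python toy under assumptions, not the numarray or Numeric implementation: data is a flat row-major list with an explicit shape, and the name reduce_over is hypothetical. It reduces over the chosen axes directly rather than by sequential 1-D reductions.

```python
from functools import reduce
from itertools import product


def reduce_over(func, flat, shape, axis=0):
    """Reduce row-major data over `axis`: an int, a tuple of ints, or None.

    None means "reduce over every axis". Returns (flat_result, result_shape).
    The default of a single axis mirrors the existing behaviour in the thread.
    """
    ndim = len(shape)
    if axis is None:
        axes = tuple(range(ndim))
    elif isinstance(axis, int):
        axes = (axis,)
    else:
        axes = tuple(axis)
    axes = tuple(a % ndim for a in axes)  # allow negative axis numbers
    kept = [i for i in range(ndim) if i not in axes]

    # Row-major strides, counted in elements.
    strides = [1] * ndim
    for i in range(ndim - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]

    result = []
    for kept_index in product(*(range(shape[i]) for i in kept)):
        base = sum(k * strides[i] for k, i in zip(kept_index, kept))
        values = (
            flat[base + sum(r * strides[a] for r, a in zip(reduced_index, axes))]
            for reduced_index in product(*(range(shape[a]) for a in axes))
        )
        result.append(reduce(func, values))
    return result, tuple(shape[i] for i in kept)


def add(x, y):
    return x + y


a1 = [0, 1, 2, 3, 4, 5]   # the 2x3 array [[0, 1, 2], [3, 4, 5]]
shape = (2, 3)

per_column = reduce_over(add, a1, shape, axis=0)
per_row = reduce_over(add, a1, shape, axis=1)
everything = reduce_over(add, a1, shape, axis=None)
global_min = reduce_over(min, a1, shape, axis=(0, 1))
```

Keeping a single axis as the default preserves existing behaviour, while None or a tuple covers the whole-array case that would otherwise need a separate sum_all()-style function.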
As proposed, the axis argument would take maybe a list of dimensions, and 'all' or None. I would like to propose an additional possibility: like minimum.reduce(), we could have a minimum.all() function that reduces over all dimensions (with a potentially much more efficient implementation). We don't need a sum_all(a1) then; you would use add.all(a1). I guess this would be easily prototyped using sequential reductions; one can worry about efficiency later.

Sorry for the long story...

Cheers, Peter

From haase at msg.ucsf.edu Wed Oct 27 09:59:02 2004
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Wed Oct 27 09:59:02 2004
Subject: [Numpy-discussion] bug? in len(arr.flat)
Message-ID: <200410270958.20025.haase@msg.ucsf.edu>

Hi,
I have a (UInt16) 3d data stack and want to get to its underlying buffer (to (later) feed it into memmap) ...
I noticed that len(pr2._flat) is half of len(pr2._data) - like it doesn't multiply itemsize in.

>>> pr2.shape
(40, 512, 512)
>>> pr2.flat.shape
(10485760)
>>> 512*512*40
10485760
>>> len(pr2.flat)
10485760
>>> pr2.flat._itemsize
2
>>> len(pr2._data)
20971520
>>> pr2._byteoffset
0

Is this a bug or am I misunderstanding?

Thanks,
Sebastian Haase

From strawman at astraw.com Thu Oct 28 19:21:02 2004
From: strawman at astraw.com (Andrew Straw)
Date: Thu Oct 28 19:21:02 2004
Subject: [Numpy-discussion] floating point exception weirdness
In-Reply-To: <41795006.1040807@astraw.com>
References: <4119BBFC.6020304@astraw.com> <1092221365.3752.32.camel@localhost.localdomain> <411A08FA.7000601@astraw.com> <41795006.1040807@astraw.com>
Message-ID: <4181A8CC.2040807@astraw.com>

Just a small addendum (which I hope will spur on bug-fixing once Todd et al. are back from the conference -- let me know if I should file a sourceforge bug report): Numeric is not necessary to trigger the bug in the below code -- numarray is sufficient on its own.
Furthermore, I can confirm that merely removing the "atlas3-sse2" Debian package from my system causes the code, whether or not numarray.ieeespecial is imported, to run without being killed by an FPE.

Andrew Straw wrote:

> I've isolated a bug I first reported on this mailing list in August.
> I've now confined it to a small code snippet using entirely
> open-source software (previously I saw it while using Intel's IPP).
> In a nutshell, importing numarray.ieeespecial triggers a floating
> point exception (which kills my program) when I call Numeric's
> singular_value_decomposition() function:
>
> import Numeric
> from LinearAlgebra import singular_value_decomposition
>
> if want_FPE:
>     import numarray.ieeespecial
>
> A= [[-5.7, 2.2, -0.53, 46.0],
>     [-2.3, -5.5, -1.0, 1091.0],
>     [5.9, 1.4, -0.1, -142.0],
>     [-1.3, 5.7, -1.5, 2673.0]]
> A=Numeric.array(A)
> u,s,v = singular_value_decomposition(A) # FPE triggered here
>
> Here's my setup:
>
> $ python
> Python 2.3.4 (#2, Sep 24 2004, 08:39:09)
> [GCC 3.3.4 (Debian 1:3.3.4-12)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import Numeric
> >>> Numeric.__version__
> '23.6'
> >>> import numarray
> >>> numarray.__version__
> '1.2a'
>
> $ gcc -v
> Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.4/specs
> Configured with: ../src/configure -v
> --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang
> --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info
> --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared
> --with-system-zlib --enable-nls --without-included-gettext
> --enable-__cxa_atexit --enable-clocale=gnu --enable-debug
> --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
> Thread model: posix
> gcc version 3.3.4 (Debian 1:3.3.4-13)
>
> Now, for the clue: the above error is ONLY triggered when I compile
> Numeric to use system blas and friends, not when I use lapack_lite
> included with Numeric.
> This leads me to suspect it is related to the
> SSE2 unit -- I have Debian sarge's atlas3-base, atlas3-see,
> atlas3-sse2, blas, lapack, lapack3, and refblas3 packages installed on
> my P4 machine.
>
> So, to propose a hypothesis: numarray.ieeespecial sets the FPE bit in
> the SSE2 hardware, but for some reason this does not raise SIGFPE.
> However, when the next call that touches SSE2 happens, the kernel sees
> that error bit and throws the signal. Does this explanation make
> sense? Is it easy to fix?
>
> Cheers!
> Andrew

From stevech1097 at yahoo.com.au Thu Oct 28 21:56:30 2004
From: stevech1097 at yahoo.com.au (Steve Chaplin)
Date: Thu Oct 28 21:56:30 2004
Subject: [Numpy-discussion] Re: floating point exception weirdness (Andrew Straw)
In-Reply-To:
References:
Message-ID: <1099025806.2742.23.camel@f1>

> Just a small addendum, (which I hope will spur on bug-fixing once Todd
> et al. are back from the conference -- let me know if I should file a
> sourceforge bug report):

I've not read all this thread so I don't know the full background. But I had a floating point / SSE problem using numarray. It turned out to be a glibc problem, not a numarray problem, and was solved by upgrading glibc.

http://sources.redhat.com/bugzilla/show_bug.cgi?id=10

There was also a SourceForge bug report but I can't locate it.
Regards
Steve

From jmiller at stsci.edu Fri Oct 29 06:27:11 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct 29 06:27:11 2004
Subject: [Numpy-discussion] bug? in len(arr.flat)
In-Reply-To: <200410270958.20025.haase@msg.ucsf.edu>
References: <200410270958.20025.haase@msg.ucsf.edu>
Message-ID: <1099056380.4904.12.camel@localhost.localdomain>

On Wed, 2004-10-27 at 12:58, Sebastian Haase wrote:
> Hi,
> I have a (UInt16) 3d data stack and want to get to its underlying buffer (to
> (later) feed it into memmap) ...
> I noticed that len(pr2._flat) is half of len(pr2._data) - like it doesn't
> multiply itemsize in.
> >>> pr2.shape
> (40, 512, 512)
> >>> pr2.flat.shape
> (10485760)
> >>> 512*512*40
> 10485760
> >>> len(pr2.flat)
> 10485760
> >>> pr2.flat._itemsize
> 2
> >>> len(pr2._data)
> 20971520
> >>> pr2._byteoffset
> 0
>
> Is this a bug

No.

> or am I misunderstanding ?

Yes. _data is "an object which supports the buffer protocol". In this context, it is effectively a string, and thus its length is the product of the total number of elements and the itemsize. (We'll ignore for now the fact that not every array uses the entire buffer.) In contrast, shape(.flat) is only the total number of elements and is independent of itemsize.

Regards,
Todd

From haase at msg.ucsf.edu Fri Oct 29 09:03:25 2004
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Fri Oct 29 09:03:25 2004
Subject: [Numpy-discussion] bug? in len(arr.flat)
In-Reply-To: <1099056380.4904.12.camel@localhost.localdomain>
References: <200410270958.20025.haase@msg.ucsf.edu> <1099056380.4904.12.camel@localhost.localdomain>
Message-ID: <200410290902.25410.haase@msg.ucsf.edu>

Of course! Sorry, I forgot.

Thanks,
Sebastian

On Friday 29 October 2004 06:26 am, Todd Miller wrote:
> On Wed, 2004-10-27 at 12:58, Sebastian Haase wrote:
> > Hi,
> > I have a (UInt16) 3d data stack and want to get to its underlying buffer
> > (to (later) feed it into memmap) ...
> > I noticed that len(pr2._flat) is half of len(pr2._data) - like it doesn't
> > multiply itemsize in.
> >
> > >>> pr2.shape
> > (40, 512, 512)
> > >>> pr2.flat.shape
> > (10485760)
> > >>> 512*512*40
> > 10485760
> > >>> len(pr2.flat)
> > 10485760
> > >>> pr2.flat._itemsize
> > 2
> > >>> len(pr2._data)
> > 20971520
> > >>> pr2._byteoffset
> > 0
> >
> > Is this a bug
>
> No.
>
> > or am I misunderstanding ?
>
> Yes. _data is "an object which supports the buffer protocol". In this
> context, it is effectively a string and thus the product of the total
> number of elements and the itemsize. (We'll ignore for now the fact
> that not every array uses the entire buffer.) In contrast, shape(.flat)
> is only the total number of elements and is independent of itemsize.
>
> Regards,
> Todd

From jmiller at stsci.edu Fri Oct 29 11:19:14 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct 29 11:19:14 2004
Subject: [Numpy-discussion] Counting array elements
Message-ID: <1099073854.4904.321.camel@localhost.localdomain>

I have returned from our astronomical data systems conference and I am going to take a short cut and summarize what I saw as the key developments of this thread. I apologize for not responding sooner and individually but the web-mail system I use isn't effective for conducting any kind of discussion. You guys did a great job sorting this out this week. I marked my key points with **.
The rest is probably only for people with a lot of patience.

** I've finally come to terms with the fact that functions are the right way to do numarray rather than methods. The arguments in the Numeric manual are no more persuasive now than they ever were, but Stephen Walton's remarks about method explosion finally convinced me that the "real" reason for using functions is that methods put every new feature under the umbrella of a single namespace, the NumArray class. Using functions lets us partition things into modules which can be used selectively and makes a more extensible and understandable system. Thanks Stephen.

A couple of people remarked that using .flat might solve everything with something like a.flat.sum() or sum(ravel(a)). This gets to the original motivation for the sum() method, which was the codification of a simple and storage-efficient technique for reducing noncontiguous arrays. The first point is that a non-contiguous array cannot generally be reshaped without making a copy. The basic idea of the sum() method is to do *two* reductions: the first, along a single axis, results in a smaller contiguous array. In the case of astronomical images, which are generally square or at least non-degenerate, the reduction result is a *much* smaller array. The second reduction handles all the remaining dimensions, since .flat is guaranteed to work because the array is contiguous. The end result is a complete sum() without writing additional ufuncs or making an array copy.

There was understandable confusion about why .flat is sometimes allowed to fail. Since it is an attribute, we thought it inappropriate to make it return a copy of the source array and chose instead to raise an exception. In contrast, it is reasonable for the ravel() function to return a completely different array, so it always works. (I just noticed that ravel() is not named flat()).
Some of our more contemporary thinkers suggested using iterators to produce a .flat which always works. If anyone has an idea how to make this work with good performance, please let me know; I don't.

** Tim Hochberg pointed out that we can overload the reduction (and not accumulation?) axis parameter with an "all" or a tuple describing a sequence of axes to reduce along. My perception was that there was a consensus behind this and in any case I'm in agreement with Tim. Alan Isaac pointed out that None might be better here than "all" and I agree. At this point, I think sumAll() is dead, the sum() method will be deprecated, and the reductions should be expanded as Tim suggested.

** Peter Verveer made some comments about the expectations of a naive user regarding reductions, namely that "all" should be the default. My own experience bears this out, and I am torn about what to do here. Chris Barker pointed out the need for backward compatibility with Numeric, and given the current numarray goal of supporting SciPy, this need is growing stronger and more complex. SciPy uses yet another axis convention. If anyone has any ideas how to handle these multiple conventions with elegance, let me know.

A number of people commented on our naming conventions, an issue which we have sidestepped for the moment with sumAll(). My impression is that, for better or worse, numarray uses the lowerUpper() version of camel case. I think this is very much a matter of personal taste and don't claim to have any. My guess is that numarray is probably inconsistent at the moment, in part because lowerUpper() often degenerates into merely lower(), which degenerates into confusion.
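
To make the axis proposal above concrete, here is a minimal sketch of a reduction whose axis argument may be a single axis, a tuple of axes, or None for "all", prototyped with sequential one-axis reductions as suggested earlier in the thread. It assumes plain nested Python lists in place of numarray arrays, and the names reduce_axes, reduce_one_axis, combine and ndim are made up for illustration; none of them is numarray or Numeric API, and no attempt is made at efficiency.

```python
from functools import reduce as fold
import operator


def combine(op, x, y):
    """Apply binary `op` elementwise to two equally shaped nested lists (or scalars)."""
    if isinstance(x, list):
        return [combine(op, xi, yi) for xi, yi in zip(x, y)]
    return op(x, y)


def reduce_one_axis(op, a, axis):
    """Reduce nested list `a` along a single non-negative axis."""
    if axis == 0:
        return fold(lambda x, y: combine(op, x, y), a)
    return [reduce_one_axis(op, sub, axis - 1) for sub in a]


def ndim(a):
    """Number of nesting levels, assuming a regular (rectangular) nested list."""
    n = 0
    while isinstance(a, list):
        a = a[0]
        n += 1
    return n


def reduce_axes(op, a, axis=0):
    """Hypothetical overloaded reduce: axis is an int, a tuple of ints, or None (all axes)."""
    n = ndim(a)
    if axis is None:
        axes = range(n)
    elif isinstance(axis, int):
        axes = [axis]
    else:
        axes = axis
    # Normalize negative axes, then reduce the highest axis first so that
    # the remaining axis numbers stay valid after each reduction.
    for ax in sorted({ax % n for ax in axes}, reverse=True):
        a = reduce_one_axis(op, a, ax)
    return a


a = [[1, 2, 3], [4, 5, 6]]
print(reduce_axes(operator.add, a, axis=0))     # [5, 7, 9]
print(reduce_axes(operator.add, a, axis=None))  # 21
```

With this spelling, a separate sumAll() is not needed: passing axis=None (or a tuple naming every axis) gives the full reduction, while the one-axis behaviour is unchanged.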
Regards,
Todd

From verveer at embl-heidelberg.de Sat Oct 30 08:39:28 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Sat Oct 30 08:39:28 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1099073854.4904.321.camel@localhost.localdomain>
References: <1099073854.4904.321.camel@localhost.localdomain>
Message-ID:

> ** Peter Verveer made some comments about the expectations of a naive
> user regarding reductions, namely that "all" should be the default. My
> own experience bears this out, and I am torn about what to do here.
> Chris Barker pointed out the need for backward compatibility with
> Numeric, and given the current numarray goal of supporting SciPy, this
> need is growing stronger and more complex. SciPy uses yet another axis
> convention. If anyone has any ideas how to handle these multiple
> conventions with elegance, let me know.

Numarray should probably be either completely compatible in every small detail, or we could take the opportunity to change what we believe was the wrong choice. I am not sure what is really best, although I personally feel breaking compatibility is fine if the result is better. Is there not already a sub-package numeric within numarray that provides Numeric compatibility? Such a package could at least provide wrappers with compatible behavior for people who need that.

Peter

From tim.hochberg at cox.net Sat Oct 30 11:49:36 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sat Oct 30 11:49:36 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1099073854.4904.321.camel@localhost.localdomain>
References: <1099073854.4904.321.camel@localhost.localdomain>
Message-ID: <4183E208.6050001@cox.net>

Todd Miller wrote:

[SNIP]

>** Tim Hochberg pointed out that we can overload the reduction (and not
>accumulation?)
>

It seems possible. It's probably marginally useful at best. However, it might be worth doing if not too painful, just so that the accumulate and reduce signatures match.
>axis parameter with an "all" or a tuple describing a
>sequence of axes to reduce along. My perception was that there was a
>consensus behind this and in any case I'm in agreement with Tim. Alan
>Isaac pointed out that None might be better here than "all" and I
>agree.
>

Using None to mean ALL seems a little perverse to me, but I'll grant that using an existing singleton makes things simpler. I'll just point out that it would also be possible to define an ALL singleton and use that.

Very tangential: it's too bad that '...' can't be typed in more places: the natural spelling for ALL is [...] as in:

    add.reduce(a, axis=[...])

Sadly, that won't work.

>At this point, I think sumAll() is dead, the sum() method will
>be deprecated, and the reductions should be expanded as Tim suggested.
>
>** Peter Verveer made some comments about the expectations of a naive
>user regarding reductions, namely that "all" should be the default. My
>own experience bears this out, and I am torn about what to do here.
>

I suspect that one's experience here depends on one's typical problem domain. If one does a lot of 2D work, ALL would seem to be the natural choice. If you use a lot of arrays of vectors, as I do, -1 is the natural choice. At this point I can't recall a case where ALL would have been the natural choice for me.

In addition to backwards compatibility, one argument for not using ALL as the default is that it makes little or no sense for accumulate. Having the default for reduce be ALL, but that for accumulate be -1 (for instance), would be confusing.

>Chris Barker pointed out the need for backward compatibility with
>Numeric,
>

I'd think that the importance of backward compatibility with not just Numeric, but with Numarray itself, has been underrated. Changing the default for reduce / sum is particularly insidious, since many uses will fail silently, producing the wrong answer but continuing to run.
This means that all instances of sum, product and reduce will need to be inspected and corrected. Having 10k LOC that use Numarray, I'll be a bit irked if this gets changed without a better justification than what I've seen thus far.

>and given the current numarray goal of supporting SciPy, this
>need is growing stronger and more complex. SciPy uses yet another axis
>convention. If anyone has any ideas how to handle these multiple
>conventions with elegance, let me know.
>

Could you describe the SciPy axis convention: I'm not familiar with it.

[SNIP]

-tim

From gazzar at email.com Sun Oct 31 04:22:01 2004
From: gazzar at email.com (Gary Ruben)
Date: Sun Oct 31 04:22:01 2004
Subject: [Numpy-discussion] vector cross product
Message-ID: <20041031121856.E2DDC1CE304@ws1-6.us4.outblaze.com>

Not that I have a really urgent need, but is there a reason that nice, fast C-based vector operations aren't implemented in Numeric or numarray? I notice Fernando Perez has a cross product as a useful SciPy weave example on his site. I've also seen comments elsewhere about Numpy's lack of a cross product. E.g. I'm using Konrad Hinsen's Scientific Python for the convenience value of his Vector class, which also provides a nice angle() method, but it bothers me that it's implemented in native Python. The Vector type in vpython probably does it 'properly', but I don't use it just for the convenience since it adds an extra dependency to my code.

Comments?

Gary R.

From perry at stsci.edu Sun Oct 31 09:22:28 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Sun Oct 31 09:22:28 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1099073854.4904.321.camel@localhost.localdomain>
Message-ID:

Todd Miller wrote:
>
> There was understandable confusion about why .flat is sometimes allowed
> to fail.
> Since it is an attribute, we thought it inappropriate to make
> it return a copy of the source array and chose instead to raise an
> exception. In contrast, it is reasonable for the ravel() function to
> return a completely different array, so it always works. (I just
> noticed that ravel() is not named flat()). Some of our more
> contemporary thinkers suggested using iterators to produce a .flat which
> always works. If anyone has an idea how to make this work with good
> performance, please let me know; I don't.
>

This aspect of flat can be considered a wart. There are three different desired behaviors depending on who you talk to. For efficiency reasons, some only want flat (and even ravel) to work if the array is already contiguous; that is, they don't want copies unless they ask for them. Others want it to always work, producing a copy if necessary but otherwise for it to return a view. Yet others always want a copy. So, are three different versions needed? Or options to a function? The drawback of .flat (as an attribute) is there is only one choice for behavior. For a function (or a method) we could modify the behavior with a keyword argument. Personally, I would rather .flat always work, even if it means returning a copy.

Is there any consensus on how this problem should be handled?

> ** Peter Verveer made some comments about the expectations of a naive
> user regarding reductions, namely that "all" should be the default. My
> own experience bears this out, and I am torn about what to do here.
> Chris Barker pointed out the need for backward compatibility with
> Numeric, and given the current numarray goal of supporting SciPy, this
> need is growing stronger and more complex. SciPy uses yet another axis
> convention. If anyone has any ideas how to handle these multiple
> conventions with elegance, let me know.
>

I find this issue particularly vexing as well. Let's be clear about this, scipy changes the behavior of Numeric to produce a new flavor.
What should numarray do? Follow the scipy behavior or the Numeric behavior? Or should there be a scipy/numarray flavor vs the more Numeric compatible numarray? Note, we never intended numarray to be 100% compatible with Numeric since there were aspects we thought should be changed (e.g., scalar/array type coercions). Yet there appear to be two camps of the Numeric community. Some sort of survey may be in order here. Is scipy where all the new growth is now? Should we just adopt the axis convention used there? I'd very much prefer not to proliferate any more flavors of behavior and just settle on one.

> A number of people commented on our naming conventions, an issue which
> we have side stepped for the moment with sumAll(). My impression is
> that, for better or worse, numarray uses the lowerUpper() version of
> Camel case. I think this is very much a matter of personal taste and
> don't claim to have any. My guess is that numarray is probably
> inconsistent at the moment, in part because lowerUpper() often
> degenerates into merely lower() which degenerates into confusion.
>

How much of the public interface uses camelCase? I don't think all that much, if any. It seems to me the inclination of scipy is to avoid it and I'm happy with that. The internal implementation is a different issue, and there I think Todd is right that it probably is somewhat inconsistent on that front.

Perry

From perry at stsci.edu Sun Oct 31 09:30:28 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Sun Oct 31 09:30:28 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To:
Message-ID:

Peter Verveer wrote:

> Numarray should probably be either completely compatible in every small
> detail, or we could take the opportunity to change what we believe was

Well, as I mentioned before, having numarray match Numeric in every small detail is not going to happen (and even there, which flavor? the original Numeric or the scipy version?).
We've been pretty clear about where incompatibilities were deliberate. But on the other hand, that leaves many other choices that could be revisited if enough people support them. The problem is that no matter what is done, I suspect some people are going to be inconvenienced since there is already (without numarray) a split in the community because of scipy.

> the wrong choice. Not sure what is really best, although personally
> feel breaking compatibility is fine if the result is better. Is there
> not already a sub-package numeric within numarray that provides Numeric
> compatibility? Such a package could at least provide wrappers with
> compatible behavior for people who need that.
>

At the moment the numeric module provides more Numeric compatibility (but not complete). In matplotlib we use a module called numerix to provide a uniform interface to both Numeric and numarray (along with prohibitions on use of certain features that don't exist in the other). We are looking at scipy_base now, which will undoubtedly highlight similar cases where we will suggest internal reorganization to do the same sort of thing that was done for matplotlib.

For those that intend to use numarray only, now and forever, one is free to use all the features they desire. But there still is the behavior issue of those things that are currently incompatible, like the axis issue.

Perry

From tim.hochberg at cox.net Sun Oct 31 14:24:01 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sun Oct 31 14:24:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <4183F168.3060205@ucsd.edu>
References: <1099073854.4904.321.camel@localhost.localdomain> <4183E208.6050001@cox.net> <4183F168.3060205@ucsd.edu>
Message-ID: <418564AE.6050206@cox.net>

Robert Kern wrote:

> Tim Hochberg wrote:
>
>> Could you describe the SciPy axis convention: I'm not familiar with it.
>
>
> axis=-1

OK, so Numarray (currently) and Numeric use axis=0, SciPy uses axis=-1 and there is some desire to use axis=ALL instead. One advantage of ALL is that it breaks everyone's code equally, so there wouldn't be any charges of favoritism <0.8 wink>.

I can't come up with any way to reconcile the three, but I can suggest a transition strategy whatever the decision. Supply an option so that one can require axis arguments to all calls to reduce. Then it's relatively easy to track down all the reduce calls and fix the ones that are broken. Something like numarray.setRequireReduceAxisArg(True).

FWIW, it wouldn't bother me much to use SciPy's default here: supporting SciPy is a worthwhile goal and I think SciPy's choice here is a reasonable one.

Another alternative that wouldn't bother me much is "In the face of ambiguity, refuse the temptation to guess". That is, always require axis arguments for multidimensional arrays. While not backwards compatible, this would make the transition relatively easy, since uses that might fail would raise exceptions.

-tim

From rkern at ucsd.edu Sun Oct 31 16:01:04 2004
From: rkern at ucsd.edu (Robert Kern)
Date: Sun Oct 31 16:01:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <418564AE.6050206@cox.net>
References: <1099073854.4904.321.camel@localhost.localdomain> <4183E208.6050001@cox.net> <4183F168.3060205@ucsd.edu> <418564AE.6050206@cox.net>
Message-ID: <41857B53.5010308@ucsd.edu>

Tim Hochberg wrote:
> Robert Kern wrote:
>
>> Tim Hochberg wrote:
>>
>>> Could you describe the SciPy axis convention: I'm not familiar with it.
>>
>> axis=-1
>
> OK, so Numarray (currently) and Numeric use axis=0,

Well, sometimes. :-)

> SciPy uses axis=-1

I should note that this convention is for Scipy-defined functions. With one unfortunate exception (cumsum), Scipy does not overwrite Numeric's axis default for Numeric-defined functions.
-- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter
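
As a rough illustration of the three defaults discussed in this thread (axis=0, axis=-1, and "all"), and of the transition switch Tim suggests, here is a small pure-Python sketch for 2-D lists of lists. The names sum2d and set_require_axis are hypothetical and stand in for something like the proposed numarray.setRequireReduceAxisArg(True), which is only a suggestion in the thread; the fallback to axis=0 when no axis is given is an assumption made here to mimic the default described above.

```python
_MISSING = object()     # sentinel meaning "the caller did not pass axis"
_require_axis = False   # hypothetical global switch, off by default


def set_require_axis(flag):
    """Turn the 'axis must be given explicitly' check on or off."""
    global _require_axis
    _require_axis = bool(flag)


def sum2d(rows, axis=_MISSING):
    """Sum a 2-D list of lists.

    axis=0 sums down the columns, axis=-1 (or 1) sums along each row,
    and axis=None sums every element.
    """
    if axis is _MISSING:
        if _require_axis:
            raise TypeError("axis must be given explicitly for a 2-D array")
        axis = 0  # assumed legacy default for this sketch
    if axis is None:
        return sum(sum(row) for row in rows)
    if axis == 0:
        return [sum(col) for col in zip(*rows)]
    if axis in (1, -1):
        return [sum(row) for row in rows]
    raise ValueError("axis out of range for a 2-D array")


a = [[1, 2, 3], [4, 5, 6]]
print(sum2d(a, axis=0))     # [5, 7, 9]
print(sum2d(a, axis=-1))    # [6, 15]
print(sum2d(a, axis=None))  # 21
```

Calling set_require_axis(True) makes every sum2d(a) call without an axis raise TypeError, so code that silently relied on the old default is found by an exception rather than by a wrong answer.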