From faheem at email.unc.edu  Fri Oct  1 10:19:02 2004
From: faheem at email.unc.edu (Faheem Mitha)
Date: Fri Oct  1 10:19:02 2004
Subject: [Numpy-discussion] random number facilities in numarray and main
 Python libs
In-Reply-To: <982cfc7f.8876956d.8220100@expms6.cites.uiuc.edu>
References: <982cfc7f.8876956d.8220100@expms6.cites.uiuc.edu>
Message-ID: <Pine.LNX.4.61.0410011312170.22431@Chrestomanci>


On Fri, 1 Oct 2004, Bruce Southey wrote:

> Hi,
>
> I presume that you have R and can build the standalone library. I have
> attached my SWIG Smath.i , the SWIG Smath_wrap.c and the
> Smath.py files.  With these last two files, you shouldn't need SWIG.
>
> Note that I have not touched the void functions here as I have yet to check
> how these work in SWIG. Also, there are a few functions in the R header that
> are only headers.  Eventually someone has to fix these and add suitable
> documentation in some package.

I'm not sure what you mean by void functions.

> If you have SWIG you can directly use the Smath.i file - while SWIG can take
> a .h file directly it would not work in Python. So I just edited the header
> file into a .i file.
>
> The following is my process using Linux (I don't know about other platforms):
>
> 0) Have swig installed and built the R math library
> 1) $ swig -python Smath.i
> 2) $ gcc -c Smath_wrap.c -I/usr/local/include/python2.3
> -I/home/bsouthey/Rproject/R-1.9.1/src/nmath
> -I/home/bsouthey/Rproject/R-1.9.1/include
> 3) $ ld -shared Smath_wrap.o -o _Smath.so -lm -lRmath
> -L/home/bsouthey/Rproject/R-1.9.1/src/nmath/standalone
>
> Of course you must change the include (-I) and library (-L) paths to where
> python lives and the standalone Rmath library lives.

Thanks. I'm particularly interested in knowing how you interface with the 
random number generator at the top (Python) level. Can you supply an 
example?

Specifically, I'm looking for the following method.

1) When C/C++ code called, reads seed from python random state.

2) Does its stuff.

3) Writes seed back to python level when it exits.

R has this built in, but here one needs to build one's own mechanism. This 
is complicated by the fact that Numarray and the base Python random 
library use different RNG mechanisms, so one has to choose which one to 
use. Which one did you use?
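
For illustration only, the three steps above could be sketched with the standard library's random module, whose getstate() and setstate() methods exist for exactly this kind of hand-over. The name external_routine is made up and merely stands in for the compiled code; nothing here is specific to numarray or R.

```python
import random

def external_routine(state, n):
    """Hypothetical stand-in for compiled code that needs random numbers."""
    local = random.Random()
    local.setstate(state)                        # 1) read the state handed over by the caller
    draws = [local.random() for _ in range(n)]   # 2) do its stuff
    return draws, local.getstate()               # 3) hand the advanced state back

py_rng = random.Random(2004)                     # the "Python level" generator
draws, new_state = external_routine(py_rng.getstate(), 3)
py_rng.setstate(new_state)                       # Python continues where the routine stopped
next_draw = py_rng.random()

# The combined stream matches what a single uninterrupted generator produces.
reference = random.Random(2004)
expected = [reference.random() for _ in range(4)]
print(draws + [next_draw] == expected)
```

The point of the round trip is that neither side ever reuses or skips part of the stream, which is what makes results reproducible from a single seed.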

                                                                 Faheem.


From jmiller at stsci.edu  Fri Oct  1 10:21:04 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct  1 10:21:04 2004
Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric and amd64]
Message-ID: <1096651226.9400.25.camel@halloween.stsci.edu>



From fccoelho at fiocruz.br  Fri Oct  1 13:06:10 2004
From: fccoelho at fiocruz.br (Flávio Codeço Coelho)
Date: Fri, 1 Oct 2004 17:06:10 +0000
Subject: [Matplotlib-users] warning: Numeric and amd64
Message-ID: <200410011706.10524.fccoelho@fiocruz.br>

Hi,

look at this:

>>> from RandomArray import *

>>> normal(2,2,10)
 array([ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.])

This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a 32bit 
P4 and it ran fine.
Has anyone else seen this before?

For those that didn't understand, the normal function as called above is 
supposed to give me ten samples from a normal distribution with mean = 2 and 
standard deviation = 2.
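
As a point of comparison, here is a rough sketch of the expected behaviour using only the standard library's random.gauss (not RandomArray or numarray); the seed value is arbitrary.

```python
import random

rng = random.Random(0)        # fixed seed so the run is repeatable
mean, stddev, n = 2.0, 2.0, 10
samples = [rng.gauss(mean, stddev) for _ in range(n)]
print(samples)

# A healthy generator gives ten values that are not all identical.
print(len(set(samples)) > 1)
```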

luckily:

>>> from numarray.random_array import *

>>> normal(2,2,10)
array([-0.04525638,  4.31467819, -0.17468357,  5.29377031,  0.84202135,
        5.29593539,  4.69651532,  1.61354655,  1.10839236,  1.7743317 ])

If anybody still needed a reason for switching to numarray, there you go!

If anybody here subscribes to the Numeric or numarray mailing lists (i.e. if they 
even exist), could you please forward this message to them?

Flavio




From jdhunter at ace.bsd.uchicago.edu  Fri Oct  1 10:33:02 2004
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Fri Oct  1 10:33:02 2004
Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric
 and amd64]
In-Reply-To: <1096651226.9400.25.camel@halloween.stsci.edu> (Todd Miller's
 message of "01 Oct 2004 13:20:26 -0400")
References: <1096651226.9400.25.camel@halloween.stsci.edu>
Message-ID: <m2vfdu1j96.fsf@mother.paradise.lost>

>>>>> "Todd" == Todd Miller <jmiller at stsci.edu> writes:

    >>>> from RandomArray import *

    >>>> normal(2,2,10)
    Todd>  array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.])

I get this too on a 64bit Opteron 250.

The root of the problem appears to be

  >>> from RandomArray import standard_normal
  >>> standard_normal(10)
  array([  5.31046164e-315,   1.57997427e-314,   5.16421382e-315,   5.22924144e-315,
              1.59247813e-314,   1.58920141e-314,   5.23691141e-315,
              5.24305935e-315,   5.20686204e-315,   1.58739568e-314])


But MLab.randn, which uses a different approach, works fine.

I have this gnawing feeling I've seen this before, but I can't
remember ....

JDH


From a.schmolck at gmx.net  Fri Oct  1 11:34:01 2004
From: a.schmolck at gmx.net (Alexander Schmolck)
Date: Fri Oct  1 11:34:01 2004
Subject: [Numpy-discussion] dot!=matrixmultiply bug when dotblas is
 present
In-Reply-To: <4159BCA5.6090101@colorado.edu> (Fernando Perez's message of
 "Tue, 28 Sep 2004 13:33:57 -0600")
References: <4159BCA5.6090101@colorado.edu>
Message-ID: <yfs1xgil25h.fsf@black4.ex.ac.uk>

Fernando Perez <Fernando.Perez at colorado.edu> writes:

> Hi all,
>
> I found something today a bit unpleasant: if you install numeric without
> any BLAS support, 'matrixmultiply is dot==True', so they are fully
> interchangeable.  However, to my surprise, if you build numeric with the blas
> optimizations, they are NOT identical.  

Oops, my bad (I submitted the patch and while pretty much all the real coding
was done by Richard Everson this is my oversight).

> The reason is a bug in Numeric.py. After defining dot, the code reads:
>
> #This is obsolete, don't use in new code
> matrixmultiply = dot

On the other hand, it gently nudges people to no longer use the obsoleted
matrixmultiply ;)
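
A minimal sketch of how such an alias can go stale, assuming the optimized routine is bound to the name dot only after the alias line has run (the thread shows just the alias line, so that ordering is an assumption). The function _blas_dot is a made-up stand-in, not the real dotblas code.

```python
def dot(a, b):
    """Plain-Python reference implementation (stands in for the unoptimized dot)."""
    return sum(x * y for x, y in zip(a, b))

# The alias is bound to whatever object `dot` names right now.
matrixmultiply = dot

def _blas_dot(a, b):
    """Hypothetical stand-in for the optimized routine."""
    return sum(x * y for x, y in zip(a, b))

# Rebinding the name `dot` afterwards does not touch the alias.
dot = _blas_dot
stale = matrixmultiply is dot
print(stale)                   # the alias still points at the old function

# Re-creating the alias after the rebinding makes the two names agree again.
matrixmultiply = dot
print(matrixmultiply is dot)
```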


> In [4]: timing 1,dot,a,b
> ------> timing(1,dot,a,b)
> Out[4]: 0.55591500000000005
>
> In [5]: timing 1,matrixmultiply,a,b
> ------> timing(1,matrixmultiply,a,b)
> Out[5]: 68.142640999999998
>
> In [6]: _/__
> Out[6]: 122.57744619231356
>
> Pretty significant difference...

Yup, someone should incorporate optional atlas dot support into numarray if it
hasn't happened already (won't be me, IIRC it took some convincing to get this
into Numeric and I won't be using numarray for anything real in the near
future).

cheers,

alex


From stephen.walton at csun.edu  Fri Oct  1 11:37:01 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Fri Oct  1 11:37:01 2004
Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric
	and amd64]
In-Reply-To: <m2vfdu1j96.fsf@mother.paradise.lost>
References: <1096651226.9400.25.camel@halloween.stsci.edu>
	 <m2vfdu1j96.fsf@mother.paradise.lost>
Message-ID: <1096655567.2678.2.camel@localhost.localdomain>

On Fri, 2004-10-01 at 09:43, John Hunter wrote:

> The root of the problem appears to be
> 
>   >>> from RandomArray import standard_normal
>   >>> standard_normal(10)
>   array([  5.31046164e-315,   1.57997427e-314,
> I have this gnawing feeling I've seen this before, but I can't
> remember ....

Those values look suspiciously like what one sees if one reads a
big-endian Float as little-endian or vice versa.  I saw similar numbers
recently when using pytables on a big-endian HDF5 (which generated a bug
report for numarray if you recall).

Is the Opteron big-endian?
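
For what it's worth, x86-64 processors such as the Opteron are little-endian. A small sketch with the standard struct module shows what a genuinely byte-swapped double looks like, so it can be compared with the values quoted above.

```python
import struct
import sys

def read_byteswapped(x):
    """Write x as a big-endian double, then read the same bytes as little-endian."""
    return struct.unpack("<d", struct.pack(">d", x))[0]

print(sys.byteorder)            # byte order of the machine running this
swapped_one = read_byteswapped(1.0)
print(swapped_one)              # a denormal around 3e-319
print(read_byteswapped(swapped_one) == 1.0)   # swapping twice restores the value
```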


From Fernando.Perez at colorado.edu  Fri Oct  1 11:51:00 2004
From: Fernando.Perez at colorado.edu (Fernando Perez)
Date: Fri Oct  1 11:51:00 2004
Subject: [Numpy-discussion] dot!=matrixmultiply bug when dotblas is present
In-Reply-To: <yfs1xgil25h.fsf@black4.ex.ac.uk>
References: <4159BCA5.6090101@colorado.edu> <yfs1xgil25h.fsf@black4.ex.ac.uk>
Message-ID: <415DA6D7.4070407@colorado.edu>

Alexander Schmolck schrieb:
> Fernando Perez <Fernando.Perez at colorado.edu> writes:
> 
> 
>>Hi all,
>>
>>I found something today a bit unpleasant: if you install numeric without
>>any BLAS support, 'matrixmultiply is dot==True', so they are fully
>>interchangeable.  However, to my surprise, if you build numeric with the blas
>>optimizations, they are NOT identical.  
> 
> 
> Oops, my bad (I submitted the patch and while pretty much all the real coding
> was done by Richard Everson this is my oversight).

No prob.  It's been fixed in Numeric 23.5, so no more worries.

>>Pretty significant difference...
> 
> 
> Yup, someone should incorporate optional atlas dot support into numarray if it
> hasn't happened already (won't be me, IIRC it took some convincing to get this
> into Numeric and I won't be using numarray for anything real in the near
> future).

I'll leave that question to the numarray guys, I have no idea where it stands 
in terms of blas/atlas support.  I certainly hope it has it or that this 
optimization can be brought in, as it makes a huge difference for the large 
array case.

Best,

f


From perry at stsci.edu  Fri Oct  1 11:57:02 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Oct  1 11:57:02 2004
Subject: [Numpy-discussion] dot!=matrixmultiply bug when dotblas is present
In-Reply-To: <415DA6D7.4070407@colorado.edu>
References: <4159BCA5.6090101@colorado.edu> <yfs1xgil25h.fsf@black4.ex.ac.uk> <415DA6D7.4070407@colorado.edu>
Message-ID: <52083A9C-13DB-11D9-B931-000A95B68E50@stsci.edu>

On Oct 1, 2004, at 2:49 PM, Fernando Perez wrote:

> Alexander Schmolck schrieb:
>>> Pretty significant difference...
>> Yup, someone should incorporate optional atlas dot support into numarray
>> if it hasn't happened already (won't be me, IIRC it took some convincing
>> to get this into Numeric and I won't be using numarray for anything real
>> in the near future).
>
> I'll leave that question to the numarray guys, I have no idea where it 
> stands in terms of blas/atlas support.  I certainly hope it has it or 
> that this optimization can be brought in, as it makes a huge 
> difference for the large array case.
>
> Best,
>
> f

I'm not sure when it will get done, but we are working on the early stages
of getting scipy working with numarray. You should see visible signs of
that within a month (i.e., at least some parts of scipy working with
numarray). It will probably take months to finish though.

Perry


From pearu at cens.ioc.ee  Fri Oct  1 12:44:58 2004
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Fri Oct  1 12:44:58 2004
Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric
 and amd64]
In-Reply-To: <1096651226.9400.25.camel@halloween.stsci.edu>
Message-ID: <Pine.LNX.4.21.0410012234160.9973-100000@cens.kybi>

On 1 Oct 2004, Todd Miller wrote:

> look at this:
>
> >>> from RandomArray import *
>
> >>> normal(2,2,10)
>  array([ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.])
>
> This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a
> 32bit P4 and it ran fine.
> Has anyone else seen this before?

Yes. I just fixed a similar issue in scipy.stats.rand module. Below is the
corresponding patch for Numeric Src/ranlibmodule.c that fixes the issue
for Opteron.

Regards,
Pearu

*** ranlibmodule.c      Fri Oct  1 22:29:57 2004
--- ranlibmodule.c.orig Fri Oct  1 22:12:13 2004
***************
*** 47,49 ****
      case 0:
!       *out_ptr = (double) ((float (*)(void)) fun)();
        break;
--- 47,49 ----
      case 0:
!       *out_ptr = (double) ((double (*)()) fun)();
        break;
***************
*** 81,83 ****
    case 1:
!     if( !PyArg_ParseTuple(args, "lf|i", &int_arg, &float_arg, &n) ) {
        return NULL;
--- 81,83 ----
    case 1:
!     if( !PyArg_ParseTuple(args, "if|i", &int_arg, &float_arg, &n) ) {
        return NULL;
***************
*** 213,215 ****
  
!   if( !PyArg_ParseTuple(args, "lO|i", &num_trials, &priors_object, &n) ) {
      return NULL;
--- 213,215 ----
  
!   if( !PyArg_ParseTuple(args, "iO|i", &num_trials, &priors_object, &n) ) {
      return NULL;
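
The two casts in the first hunk differ only in the declared return type (float versus double), which suggests a float result was being read as if it were a double. The sketch below mimics that with the standard struct module, under the assumption that the upper four bytes of the misread value are zero; it is an illustration of the effect, not a trace of what the compiled code actually does.

```python
import math
import struct

def float_bits_read_as_double(x):
    """Pack x as a 32-bit float, pad with four zero bytes, read all eight as a double."""
    raw = struct.pack("<f", x) + b"\x00" * 4   # assumption: the upper half is zero
    return struct.unpack("<d", raw)[0]

for value in (2.0, 0.5, -0.3):
    print(value, "->", float_bits_read_as_double(value))

# 2.0 as a float has the bit pattern 0x40000000 = 2**30, so the misread
# double is the denormal 2**30 * 2**-1074 = 2**-1044, about 5.3e-315.
print(float_bits_read_as_double(2.0) == math.ldexp(1.0, -1044))
```

Positive floats of ordinary size land near 5e-315 and negative ones near 1.6e-314 under this reading, which is the same range as the standard_normal output reported earlier in the thread.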


From jmiller at stsci.edu  Fri Oct  1 13:35:07 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct  1 13:35:07 2004
Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric
	and amd64]
In-Reply-To: <Pine.LNX.4.21.0410012234160.9973-100000@cens.kybi>
References: <Pine.LNX.4.21.0410012234160.9973-100000@cens.kybi>
Message-ID: <1096662489.15037.1.camel@halloween.stsci.edu>

Thanks Pearu.

For some unknown reason, numarray.random_array already had the fixes, 
but I applied the patch to Numeric CVS.

Regards,
Todd

On Fri, 2004-10-01 at 15:38, Pearu Peterson wrote:
> On 1 Oct 2004, Todd Miller wrote:
> 
> > look at this:
> >
> > >>> from RandomArray import *
> >
> > >>> normal(2,2,10)
> >  array([ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.])
> >
> > This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a
> > 32bit P4 and it ran fine.
> > Has anyone else seen this before?
> 
> Yes. I just fixed a similar issue in scipy.stats.rand module. Below is the
> corresponding patch for Numeric Src/ranlibmodule.c that fixes the issue
> for Opteron.
> 
> Regards,
> Pearu
> 
> *** ranlibmodule.c      Fri Oct  1 22:29:57 2004
> --- ranlibmodule.c.orig Fri Oct  1 22:12:13 2004
> ***************
> *** 47,49 ****
>       case 0:
> !       *out_ptr = (double) ((float (*)(void)) fun)();
>         break;
> --- 47,49 ----
>       case 0:
> !       *out_ptr = (double) ((double (*)()) fun)();
>         break;
> ***************
> *** 81,83 ****
>     case 1:
> !     if( !PyArg_ParseTuple(args, "lf|i", &int_arg, &float_arg, &n) ) {
>         return NULL;
> --- 81,83 ----
>     case 1:
> !     if( !PyArg_ParseTuple(args, "if|i", &int_arg, &float_arg, &n) ) {
>         return NULL;
> ***************
> *** 213,215 ****
>   
> !   if( !PyArg_ParseTuple(args, "lO|i", &num_trials, &priors_object, &n) ) {
>       return NULL;
> --- 213,215 ----
>   
> !   if( !PyArg_ParseTuple(args, "iO|i", &num_trials, &priors_object, &n) ) {
>       return NULL;
> 
> 
> 
> 


From faheem at email.unc.edu  Fri Oct  1 22:28:41 2004
From: faheem at email.unc.edu (Faheem Mitha)
Date: Fri Oct  1 22:28:41 2004
Subject: [Numpy-discussion] numarray.random_array number generation in C code
Message-ID: <Pine.LNX.4.61.0410020003050.30139@Chrestomanci>

Dear People,

I want to write some C++ code to link with Python, using the 
Boost.Python interface. I need to generate random numbers in the C++ 
code, and I was wondering as to the best way of doing this.

Note that it is important that the random number generation interoperate 
seamlessly with Python, in the sense that the behavior of the calls to 
the RNG is the same whether calls are made at the C level or the Python 
level. I hope the reasons why this is important are obvious.

I was thinking that the method should go like this.

1) When C/C++ code called, reads seed from python random state.

2) Does its stuff.

3) Writes seed back to python level when it exits.

After doing a little investigation of the numarray.random_array python 
library and associated extension modules, it seems possible that the 
answer is simpler than I had supposed. However, I would appreciate it if 
someone would tell me if my understanding is incorrect in some places.

Summary: It seems that I can just call all the C entry point routines 
defined in ranlib.h, without worrying about getting or setting seeds.

Rationale:

The structure of this random number facility has three parts, all files in 
Packages/RandomArray2/Src.

1) low-level C routines: Packages/RandomArray2/Src/com.c and 
Packages/RandomArray2/Src/ranlib.c.

com.c: basic RNG stuff; getting and setting seeds etc.
ranlib.c: Random number generator algorithms for different distributions 
etc.

2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c.

This interfaces the stuff in com.c and ranlib.c.

3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py.

This wraps the C interface. In most cases it does not do much else besides 
some basic argument error checking.

From my perspective, the important thing is that the random number seed is 
only defined at C level as a static object, all the RNG stuff happens at C 
level, and the Python code just calls the C code as necessary. (I'm 
sketchy about the details of what is defined as the seed etc.)

This is in contrast with the R RNG facility (the only other RNG facility I 
am familiar with), which uses macros SetRNGstate() and GetRNGstate() to 
read and write the seed, which is defined at R level.

Therefore, the upshot is that the C routines in ranlib.h read and write 
the same seed as the python level functions do, so no special action is 
necessary with regard to the seed.
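
A toy model of that claim, written with the standard library only: one generator object stands in for the static state in com.c, and two made-up functions stand in for a Python-level call and a direct C-level call. The names are hypothetical and the real ranlib code is not involved.

```python
import random

# Stand-in for the single static generator state kept at C level.
_shared = random.Random(12345)

def python_level_draw():
    """Hypothetical Python wrapper: just forwards to the shared generator."""
    return _shared.random()

def c_level_draw():
    """Hypothetical direct C call: uses the very same shared generator."""
    return _shared.random()

# Interleaving the two entry points consumes one common stream.
got = [python_level_draw(), c_level_draw(), python_level_draw(), c_level_draw()]

reference = random.Random(12345)
expected = [reference.random() for _ in range(4)]
print(got == expected)
```

If the state really is a single static object, no explicit get/set step is needed, because there is only one place where it lives.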

Is this correct?

In any case, it would be nice if something like the above was documented, 
so lost souls like myself don't have to go trawling through the source 
code to figure out what is going on. Of course it is nice that the source 
code is available, otherwise even that would be impossible.

R documents this stuff in the "Writing R Extensions" manual, online at 
http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray 
manual could have a small section about this too.

                                                         Regards, Faheem.


From fccoelho at gmail.com  Mon Oct  4 07:59:12 2004
From: fccoelho at gmail.com (Flavio Coelho)
Date: Mon Oct  4 07:59:12 2004
Subject: [Numpy-discussion] Bug Compiling Numeric on amd64
Message-ID: <d9af7a8804100407482601cae4@mail.gmail.com>

Hi,

look at this:

>>> from RandomArray import *

>>> normal(2,2,10)
 array([ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.])

This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a 32bit 
P4 and it ran fine.
Has anyone else seen this before?


luckily:

>>> from numarray.random_array import *

>>> normal(2,2,10)
array([-0.04525638,  4.31467819, -0.17468357,  5.29377031,  0.84202135,
        5.29593539,  4.69651532,  1.61354655,  1.10839236,  1.7743317 ])

Both modules were compiled on my gentoo box with:

gcc version 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6)

any comments?

Flavio
-- 
I use Linux daily to UP my productivity -- Microsoft, UP yours!


From jmiller at stsci.edu  Mon Oct  4 09:21:32 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Oct  4 09:21:32 2004
Subject: [Numpy-discussion] Bug Compiling Numeric on amd64
In-Reply-To: <d9af7a8804100407482601cae4@mail.gmail.com>
References: <d9af7a8804100407482601cae4@mail.gmail.com>
Message-ID: <1096906220.7641.55.camel@localhost.localdomain>

On Mon, 2004-10-04 at 10:48, Flavio Coelho wrote:
> Hi,
> 
> look at this:
> 
> >>> from RandomArray import *
> 
> >>> normal(2,2,10)
>  array([ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.])
> 
> This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a 32bit 
> P4 and it ran fine.
> Has anyone else seen this before?
> 

This was discussed here briefly last week after I forwarded your post
from matplotlib-users.  Pearu Peterson posted a patch, which he had
already applied to SciPy, and I applied it to Numeric on SourceForge.
Thanks for raising the issue.

Regards,
Todd

> 
> luckily:
> 
> >>> from numarray.random_array import *
> 
> >>> normal(2,2,10)
> array([-0.04525638,  4.31467819, -0.17468357,  5.29377031,  0.84202135,
>         5.29593539,  4.69651532,  1.61354655,  1.10839236,  1.7743317 ])
> 
> Both modules were compiled on my gentoo box with:
> 
> gcc version 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6)
> 
> any comments?
> 
> Flavio
-- 


From Fernando.Perez at colorado.edu  Mon Oct  4 10:59:49 2004
From: Fernando.Perez at colorado.edu (Fernando Perez)
Date: Mon Oct  4 10:59:49 2004
Subject: [Numpy-discussion] Small bug in MA with arrays of rank > 1
Message-ID: <41618DFD.7030106@colorado.edu>

Hi all,

a while back I noticed a small problem with MA for rank 2 (and larger) arrays. 
  Here's a simple example:

In [1]: a=RA.random((3,3))

In [2]: a
Out[2]:
array([[ 0.002542,  0.70301 ,  0.705466],
        [ 0.467305,  0.381492,  0.655857],
        [ 0.103372,  0.776988,  0.466528]])

In [3]: import MA

In [4]: a
Out[4]:
[[ 0.002542, 0.70301 , 0.705466,]
  [ 0.467305, 0.381492, 0.655857,]
  [ 0.103372, 0.776988, 0.466528,]]

The bug is that the commas at the end of each line are coming _before_ the 
closing bracket, instead of after.  This seemingly trivial problem turns out 
to be pretty serious for me, because I use this string representation to 
export python arrays into Mathematica files, by simply replacing [] with {} 
(and playing some other tricks).
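
One possible workaround, sketched here under the assumption that the data can first be turned into nested Python lists, is to build the Mathematica literal directly instead of editing the printed representation. The function name to_mathematica is made up for this example.

```python
def to_mathematica(obj):
    """Render nested lists/tuples of numbers as a Mathematica list literal."""
    if isinstance(obj, (list, tuple)):
        return "{" + ", ".join(to_mathematica(item) for item in obj) + "}"
    text = repr(obj)
    if isinstance(obj, float) and "e" in text:
        # Mathematica writes scientific notation as mantissa*^exponent.
        mantissa, exponent = text.split("e")
        return "%s*^%d" % (mantissa, int(exponent))
    return text   # note: inf and nan are not handled in this sketch

rows = [[0.002542, 0.70301, 0.705466],
        [0.467305, 0.381492, 0.655857]]
print(to_mathematica(rows))
```

This sidesteps the placement of commas and brackets in the printed form altogether, at the cost of an explicit conversion step.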

Unfortunately, this bug means I can't use MA, which is otherwise great because 
of the way it gracefully handles the case where you accidentally say

A

when A is some monster array.  With MA, instead of your CPU getting killed for 
10 minutes, you get a nice summary of A's dimensions and typecode.

Anyway, it would be great if one of the gurus had a chance to fix this one.

Best,

f


From graik at web.de  Tue Oct  5 10:44:13 2004
From: graik at web.de (Raik Grünberg)
Date: Tue Oct  5 10:44:13 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
Message-ID: <200410051941.29807.graik@web.de>

Hi there,

I've just translated a package for molecular modelling, which makes extensive 
use of Numeric, from Numeric to numarray. The outcome is somewhat negative - 
for now we are basically going to postpone the transition - the reasons might 
be interesting for the list and the numarray developers out there (who are 
doing a brave job!).

Speed:
A typical task in our package is the least-square fitting of a large array of 
coordinate frames ( N1 x N2 x 3) onto a set of reference or average 
coordinates (using a sub-set of coordinates for the matching). The example I 
looked at (500 x 876 x 3 items) took 1.3 s with Numeric and 4.7 s with 
numarray. The main culprits for the slow-down were:
* compress() - factor 10
* average() - factor 7 (average() is missing from numarray and I hence had to 
write a little function myself)
* LinearAlgebra.singular_value_decomposition() - factor 10
but a lot of extra time is also spent in ufunc.py and various numarraycore.py 
routines.
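
To make the average() point concrete, a hand-written replacement might look roughly like the pure-Python sketch below. It works on nested lists rather than arrays and is not the function used in the package; it only shows why such a routine is slow compared with a C-coded one, since every element passes through the interpreter.

```python
def average(frames):
    """Element-wise mean over the first axis of a list of equally shaped 2-D frames."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(frame[i][j] for frame in frames) / n for j in range(cols)]
            for i in range(rows)]

# Two frames of two atoms with x, y, z coordinates each.
frames = [
    [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]],
    [[3.0, 4.0, 5.0], [2.0, 2.0, 2.0]],
]
mean_frame = average(frames)
print(mean_frame)
```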

Memory efficiency:
I hoped numarray would solve some of the Out-of-memory problems that I get 
with Numeric but it turns out that it is rather less memory efficient for my 
kind of applications. Slicing an array that takes up 800MB on disc just about 
runs through with Numeric (and heavy swapping) but gives an Out-of-memory 
with numarray.

Suggestions:
OK, it's easy to make clever comments without contributing any real work...
- compress(), take(), etc, really need some optimization
- a C-coded average() routine would be helpful
- faster LinearAlgebra routines are necessary

Our sysadmin noted that unlike Numeric, numarray is not using any external 
math libraries (like LAPACK) that have been speed-optimized for decades and 
are available in CPU-optimized variants (e.g. ATLAS). It's probably difficult 
to match this efficiency with any new code ...

Greetings
Raik

PS:
I didn't find any useful HowTo for the translation from Numeric to numarray. 
The practical issues were the different nonzero() return value, the more 
restrictive boolean comparison, that take doesn't support 'O' arrays any 
longer, and the missing average().

-- 
-----------------------------------------------------
Raik Grünberg		| Bioinformatique Structurale
				| Institut Pasteur
				| Paris, France
-----------------------------------------------------


From southey at uiuc.edu  Tue Oct  5 11:33:27 2004
From: southey at uiuc.edu (Bruce Southey)
Date: Tue Oct  5 11:33:27 2004
Subject: [Numpy-discussion] numarray.random_array
 number generation in C code
Message-ID: <6d0c2265.8aa2891d.81a0300@expms6.cites.uiuc.edu>

Hi,  
It is rather hard to suggest anything without more detail on what you want to 
actually do.  As you describe it, why do you need the 'seed' returned? It 
would only make sense if you were going in and out of Python multiple times - 
a somewhat undesirable situation due to the overhead costs.  
  
I see at least three options:  
1) Do everything in Python/numarray. 
 
2) Do parts in Python and the other in C/C++. 
   For example, pass a matrix of random numbers to your code from Python. The 
'seed' never needs to leave Python.   
 
3) Do it all in C/C++ - pass the 'seed' into your code that includes the 
random number generator(s) - there is C/C++ code around for this. Do your stuff 
and then return the 'seed' back with whatever else is required.  
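
Option 2 might be sketched as follows, using the standard library's random module in place of numarray and a made-up function simulate in place of the C/C++ routine. The routine only consumes numbers it is given, so the generator state never leaves Python.

```python
import random

def simulate(draws):
    """Hypothetical stand-in for the C/C++ routine: consumes numbers, never generates them."""
    return sum(d * d for d in draws) / len(draws)

rng = random.Random(42)                       # the only generator lives on the Python side
draws = [rng.gauss(0.0, 1.0) for _ in range(1000)]
result = simulate(draws)
print(result)
```

The obvious limitation is that the number of draws has to be known before the call, which may not hold for algorithms such as rejection sampling.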
  
You can email me privately if you want. 
 
Bruce 
  
   
---- Original message ----  
>Date: Sat, 2 Oct 2004 01:23:21 -0400 (EDT)  
>From: Faheem Mitha <faheem at email.unc.edu>    
>Subject: [Numpy-discussion] numarray.random_array number generation in C code    
>To: numpy-discussion <numpy-discussion at lists.sourceforge.net>  
>  
>  
>Dear People,  
>  
>I want to write some C++ code to link with Python, using the   
>Boost.Python interface. I need to generate random numbers in the C++   
>code, and I was wondering as to the best way of doing this.  
>  
>Note that it is important that the random number generation interoperate   
>seamlessly with Python, in the sense that the behavior of the calls to   
>the RNG is the same whether calls are made at the C level or the Python   
>level. I hope the reasons why this is important are obvious.  
>  
>I was thinking that the method should go like this.  
>  
>1) When C/C++ code called, reads seed from python random state.  
>  
>2) Does its stuff.  
>  
>3) Writes seed back to python level when it exits.  
>  
>After doing a little investigation of the numarray.random_array python   
>library and associated extension modules, it seems possible that the   
>answer is simpler than I had supposed. However, I would appreciate it if   
>someone would tell me if my understanding is incorrect in some places.  
>  
>Summary: It seems that I can just call all the C entry point routines   
>defined in ranlib.h, without worrying about getting or setting seeds.  
>  
>Rationale:  
>  
>The structure of this random number facility has three parts, all files in   
>Packages/RandomArray2/Src.  
>  
>1) low-level C routines: Packages/RandomArray2/Src/com.c and   
>Packages/RandomArray2/Src/ranlib.c.  
>  
>com.c: basic RNG stuff; getting and setting seeds etc.  
>ranlib.c: Random number generator algorithms for different distributions   
>etc.  
>  
>2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c.  
>  
>This interfaces the stuff in com.c and ranlib.c.  
>  
>3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py.  
>  
>This wraps the C interface. In most cases it does not do much else besides   
>some basic argument error checking.  
>  
>From my perspective, the important thing is that the random number seed is   
>only defined at C level as a static object, all the RNG stuff happens at C   
>level, and the Python code just calls the C code as necessary. (I'm   
>sketchy about the details of what is defined as the seed etc.)  
>  
>This is in contrast with the R RNG facility (the only other RNG facility I   
>am familiar with), which uses macros SetRNGstate() and GetRNGstate() to   
>read and write the seed, which is defined at R level.  
>  
>Therefore, the upshot is that the C routines in ranlib.h read and write   
>the same seed as the python level functions do, so no special action is   
>necessary with regard to the seed.  
>  
>Is this correct?  
>  
>In any case, it would be nice if something like the above was documented,   
>so lost souls like myself don't have to go trawling through the source   
>code to figure out what is going on. Of course it is nice that the source   
>code is available, otherwise even that would be impossible.  
>  
>R documents this stuff in the "Writing R Extensions" manual, online at   
>http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray   
>manual could have a small section about this too.  
>  
>                                                         Regards, Faheem.  
>  
>  
>  
  
 
From stephen.walton at csun.edu  Tue Oct  5 12:20:01 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Tue Oct  5 12:20:01 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <200410051941.29807.graik@web.de>
References: <200410051941.29807.graik@web.de>
Message-ID: <1097003873.13715.17.camel@freyer.sfo.csun.edu>

On Tue, 2004-10-05 at 10:41, Raik Grünberg wrote:

> Our sysadmin noted that unlike Numeric, numarray is not using any external 
> math libraries (like LAPACK) that have been speed-optimized for decades and 
> are available in CPU-optimized variants (e.g. ATLAS). It's probably difficult 
> to match this efficiency with any new code ...

This is a key point.  Have a look at addons.py in numarray, some
previous comments on this list, and build numarray with the line

env USE_LAPACK=1 python setup.py build

after editing addons.py appropriately.  You should see a major speed
improvement.

-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge

From dd55 at cornell.edu  Tue Oct  5 13:02:01 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Tue Oct  5 13:02:01 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <1097003873.13715.17.camel@freyer.sfo.csun.edu>
References: <200410051941.29807.graik@web.de> <1097003873.13715.17.camel@freyer.sfo.csun.edu>
Message-ID: <200410051600.38254.dd55@cornell.edu>

On Tuesday 05 October 2004 03:17 pm, Stephen Walton wrote:
> On Tue, 2004-10-05 at 10:41, Raik Grünberg wrote:
> > Our sysadmin noted that unlike Numeric, numarray is not using any
> > external math libraries (like LAPACK) that have been speed-optimized for
> > decades and are available in CPU-optimized variants (e.g. ATLAS). It's
> > probably difficult to match this efficiency with any new code ...
>
> This is a key point.  Have a look at addons.py in numarray, some
> previous comments on this list, and build numarray with the line
>
> env USE_LAPACK=1 python setup.py build
>
> after editing addons.py appropriately.  You should see a major speed
> improvement.

I would kindly suggest updating the numarray documentation. In the section on 
installation, it is easy to overlook the option to compile against existing 
libraries. That is explained in section 16, which appears to be out of date. 
The code listed in Packages/LinearAlgebra2/setup.py has been moved to 
addons.py, correct?

-- 

Darren


From jmiller at stsci.edu  Tue Oct  5 13:37:42 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Oct  5 13:37:42 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <200410051600.38254.dd55@cornell.edu>
References: <200410051941.29807.graik@web.de>
	 <1097003873.13715.17.camel@freyer.sfo.csun.edu>
	 <200410051600.38254.dd55@cornell.edu>
Message-ID: <1097008567.27149.140.camel@halloween.stsci.edu>

On Tue, 2004-10-05 at 16:00, Darren Dale wrote:
> On Tuesday 05 October 2004 03:17 pm, Stephen Walton wrote:
> > On Tue, 2004-10-05 at 10:41, Raik Grünberg wrote:
> > > Our sysadmin noted that unlike Numeric, numarray is not using any
> > > external math libraries (like LAPACK) that have been speed-optimized for
> > > decades and are available in CPU-optimized variants (e.g. ATLAS). It's
> > > probably difficult to match this efficiency with any new code ...
> >
> > This is a key point.  Have a look at addons.py in numarray, some
> > previous comments on this list, and build numarray with the line
> >
> > env USE_LAPACK=1 python setup.py build
> >
> > after editing addons.py appropriately.  You should see a major speed
> > improvement.
> 
> I would kindly suggest updating the numarray documentation. 

Thanks, will do.

> In the section on 
> installation, it is easy to overlook the option to compile against existing 
> libraries. That is explained in section 16, which appears to be out of date. 
> The code listed in Packages/LinearAlgebra2/setup.py has been moved to 
> addons.py, correct?

That's correct.

Regards,
Todd


From faheem at email.unc.edu  Tue Oct  5 15:44:36 2004
From: faheem at email.unc.edu (Faheem Mitha)
Date: Tue Oct  5 15:44:36 2004
Subject: [Numpy-discussion] numarray.random_array number generation in
 C code
In-Reply-To: <6d0c2265.8aa2891d.81a0300@expms6.cites.uiuc.edu>
References: <6d0c2265.8aa2891d.81a0300@expms6.cites.uiuc.edu>
Message-ID: <Pine.LNX.4.61.0410051448220.16160@Chrestomanci>


On Tue, 5 Oct 2004, Bruce Southey wrote:

> Hi,
> It is rather hard to suggest anything without more detail on what you want to
> actually do.

I could give you more details if you were interested.

> As you describe it, why do you need the 'seed' returned? It would only 
> make sense if you were going in and out of Python multiple times - a 
> somewhat undesirable situation due to the overhead costs.

Not really. One might (and I frequently do) want to run the same function 
(which in this case might be all in C++ code), interactively with 
different parameters. The kind of thing that I'm doing is akin to 
exploratory data analysis, and the specific code in question is a 
stochastic search algorithm. Doing all this in C++ would not be very 
interactive. Also, one often wants to postprocess data output using Python 
scripts. This involves multiple calls to C++ code, and would be impossible 
to do using C++, since one has to call other Python libraries.

> I see at least three options:

> 1) Do everything in Python/numarray.

That's my current situation.

> 2) Do parts in Python and the other in C/C++.
>   For example, pass a matrix of random numbers to your code from Python. The
> 'seed' never needs to leave Python.

This doesn't work very well unless you know in advance how many random 
numbers are needed (not the case, for example, for stochastic search 
algorithms), and in any case is a rather clumsy way to do things. No 
offense intended.

> 3) Do it all in C/C++ - pass the 'seed' into your code that includes the
> random number generator(s) - there is C/C++ code around for this. Do your stuff
> and then return the 'seed' back with whatever else is required.

Yes, but part of the point of mixed programming is that you have an 
interpreted front end which can easily hook into other routines. Also, in 
this case, you would not be passing the seed in, since there is nothing to 
pass it in from. One would simply call system time or something similar to 
obtain the seed.

> You can email me privately if you want.

I'll keep sending this to the list unless someone objects, since I think 
this is of some general interest.

Really, my main question was whether my understanding of how to use the 
Numarray random number facilities in C was correct.

                                                                Faheem.


From stephen.walton at csun.edu  Tue Oct  5 16:15:31 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Tue Oct  5 16:15:31 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <d9af7a88041005160040938eeb@mail.gmail.com>
References: <200410051941.29807.graik@web.de>
	 <1097003873.13715.17.camel@freyer.sfo.csun.edu>
	 <d9af7a88041005160040938eeb@mail.gmail.com>
Message-ID: <1097018077.22092.15.camel@freyer.sfo.csun.edu>

On Tue, 2004-10-05 at 16:00, Flavio Coelho wrote:
> I wrote 
> > env USE_LAPACK=1 python setup.py build
> > 
> > after editing addons.py appropriately.  You should see a major speed
> > improvement.
> > 
>  
> 
> If that is the case, why is it not the default, at least when LAPACK
> is installed?

Well, I won't pretend to speak for the developers on this one.  But I
strongly suspect it is just too hard to find all possible LAPACK
distributions;  the default numarray setup should be self contained even
if somewhat slower.  The current version of Numeric also defaults to its
own built-in BLAS and requires editing setup.py to use a different one.

-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge

From perry at stsci.edu  Tue Oct  5 17:30:58 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Oct  5 17:30:58 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <1097018077.22092.15.camel@freyer.sfo.csun.edu>
Message-ID: <NEBBIJKBMLDBLNCEEFOCEEOJFHAA.perry@stsci.edu>

Steve Walton wrote:
> On Tue, 2004-10-05 at 16:00, Flavio Coelho wrote:
> > I wrote 
> > > env USE_LAPACK=1 python setup.py build
> > > 
> > > after editing addons.py appropriately.  You should see a major speed
> > > improvement.
> > > 
> >  
> > 
> > If that is the case, why is it not the default, at least when LAPACK
> > is installed?
> 
> Well, I won't pretend to speak for the developers on this one.  But I
> strongly suspect it is just too hard to find all possible LAPACK
> distributions;  the default numarray setup should be self contained even
> if somewhat slower.  The current version of Numeric also defaults to its
> own built-in BLAS and requires editing setup.py to use a different one.
> 
Well, it's been a while, and Todd handled that aspect of porting those
from Numeric, but if I recall correctly, the situation was the same
there, and I think Steve is correct. It was to provide the basic 
functionality as part of the distribution without requiring other
installations. If you needed better performance, you jump through a
couple more hoops. But requiring it to use LAPACK makes life more difficult
for those who were looking for a self contained and easy to install
solution.

Perry


From perry at stsci.edu  Tue Oct  5 17:40:51 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Oct  5 17:40:51 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <200410051941.29807.graik@web.de>
Message-ID: <NEBBIJKBMLDBLNCEEFOCKEOJFHAA.perry@stsci.edu>

I hadn't seen this until now. It's hard for us to understand
exactly the reasons for the slower performance with such large
arrays. Could you send us the code and an indication of
what inputs and parameters were used, so we could try to figure
out why some of these problems exist? (We can check the specific
functions you mention, but I want to make sure you aren't
iterating over array slices or such.) It's not obvious to
me why you are having out of memory errors and this may help.

Perry Greenfield

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net
> [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Raik
> Grünberg
> Sent: Tuesday, October 05, 2004 1:41 PM
> To: numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] Numeric to numarray experiences
>
>
> Hi there,
>
> I've just translated a package for molecular modelling, which makes
> extensive use of Numeric, from Numeric to numarray. The outcome is
> somewhat negative - for now we are basically going to postpone the
> transition - the reasons might be interesting for the list and the
> numarray developers out there (who are doing a brave job!).
>
> Speed:
> A typical task in our package is the least-squares fitting of a large
> array of coordinate frames (N1 x N2 x 3) onto a set of reference or
> average coordinates (using a sub-set of coordinates for the matching).
> The example I looked at (500 x 876 x 3 items) took 1.3 s with Numeric
> and 4.7 s with numarray. The main culprits for the slow-down were:
> * compress() - factor 10
> * average() - factor 7 (average() is missing from numarray and I hence
>   had to write a little function myself)
> * LinearAlgebra.singular_value_decomposition() - factor 10
> but a lot of extra time is also spent in ufunc.py and various
> numarraycore.py routines.
>
> Memory efficiency:
> I hoped numarray would solve some of the Out-of-memory problems that I
> get with Numeric but it turns out that it is rather less memory
> efficient for my kind of applications. Slicing an array that takes up
> 800MB on disc just about runs through with Numeric (and heavy swapping)
> but gives an Out-of-memory with numarray.
>
> Suggestions:
> OK, it's easy to make clever comments without contributing any real
> work...
> - compress(), take(), etc, really need some optimization
> - a C-coded average() routine would be helpful
> - faster LinearAlgebra routines are necessary
>
> Our sysadmin noted that unlike Numeric, numarray is not using any
> external math libraries (like LAPACK) that have been speed-optimized for
> decades and are available in CPU-optimized variants (e.g. ATLAS). It's
> probably difficult to match this efficiency with any new code ...
>
> Greetings
> Raik
>
> PS:
> I didn't find any useful HowTo for the translation from Numeric to
> numarray. The practical issues were the different nonzero() return
> value, the more restrictive boolean comparison, that take doesn't
> support 'O' arrays any longer, and the missing average().
>
> --
> -----------------------------------------------------
> Raik Grünberg		| Bioinformatique Structurale
> 				| Institut Pasteur
> 				| Paris, France
> -----------------------------------------------------
>


From perry at stsci.edu  Tue Oct  5 18:14:00 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Oct  5 18:14:00 2004
Subject: [Numpy-discussion] numarray.random_array number generation in C code
In-Reply-To: <Pine.LNX.4.61.0410020003050.30139@Chrestomanci>
Message-ID: <NEBBIJKBMLDBLNCEEFOCCEOKFHAA.perry@stsci.edu>

Faheem Mitha wrote:

> Dear People,
>
> I want to write some C++ code to link with Python, using the
> Boost.Python interface. I need to generate random numbers in the C++
> code, and I was wondering as to the best way of doing this.
>
> Note that it is important that the random number generation interoperate
> seamlessly with Python, in the sense that the behavior of the calls to
> the RNG is the same whether calls are made at the C level or the Python
> level. I hope the reasons why this is important are obvious.
>
> I was thinking that the method should go like this.
>
> 1) When C/C++ code called, reads seed from python random state.
>
> 2) Does its stuff.
>
> 3) Writes seed back to python level when it exits.
>
> After doing a little investigation of the numarray.random_array python
> library and associated extension modules, it seems possible that the
> answer is simpler than I had supposed. However, I would appreciate it if
> someone would tell me if my understanding is incorrect in some places.
>
> Summary: It seems that I can just call all the C entry point routines
> defined in ranlib.h, without worrying about getting or setting seeds.
>
> Rationale:
>
> The structure of this random number facility has three parts, all
> files in Packages/RandomArray2/Src.
>
> 1) low-level C routines: Packages/RandomArray2/Src/com.c and
> Packages/RandomArray2/Src/ranlib.c.
>
> com.c: basic RNG stuff; getting and setting seeds etc.
> ranlib.c: Random number generator algorithms for different distributions
> etc.
>
> 2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c.
>
> This interfaces the stuff in com.c and ranlib.c.
>
> 3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py.
>
> This wraps the C interface. In most cases it does not do much else
> besides some basic argument error checking.
>
> From my perspective, the important thing is that the random number
> seed is only defined at C level as a static object, all the RNG stuff
> happens at C level, and the Python code just calls the C code as
> necessary. (I'm sketchy about the details of what is defined as the
> seed etc.)
>
> This is in contrast with the R RNG facility (the only other RNG
> facility I am familiar with), which uses macros GetRNGstate() and
> PutRNGstate() to read and write the seed, which is defined at R level.
>
> Therefore, the upshot is that the C routines in ranlib.h read and write
> the same seed as the python level functions do, so no special action is
> necessary with regard to the seed.
>
> Is this correct?
>
> In any case, it would be nice if something like the above was documented,
> so lost souls like myself don't have to go trawling through the source
> code to figure out what is going on. Of course it is nice that the source
> code is available, otherwise even that would be impossible.
>
> R documents this stuff in the "Writing R Extensions" manual, online at
> http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray
> manual could have a small section about this too.
>
>                                                          Regards, Faheem.
>
I'm not sure I understand what you want to do. Do you want to link
directly to the extension code from your C++ code? If so I'm wondering
why. It would make the most sense if the C++ code needed to obtain
small numbers of random numbers in some iterative loop, and you wish
to use the same random number library that numarray is using.
Otherwise, I would normally obtain the random number array
in python, then call the C++ extension. Perhaps I didn't read carefully
enough. Normally linking to an extension module involves some hacks
that I'm not sure were done for the randomarray module (the gory
details are in the python docs for extension modules), Todd can
check on that, I'm not sure I will have time (a superficial check
seems to indicate that it doesn't support direct linking, though
one could link to the underlying library I suppose).

As an aside, it is likely that a better module can be done as some
have suggested, we just took what Numeric had at the time. Doing that
is not a high priority with us at the moment (anyone else want to
tackle that?). Right now integration with scipy is our biggest
priority so things like this will have to take a back seat for
a while.

Furthermore, we did what we needed to to port these modules from
Numeric, but that didn't necessarily make us experts in how they
worked. I wish we were, but we've generally been directing our
energy elsewhere. I'd presume that the sensible way for the module
to work is to initialize its seed from a time-based seed in the
absence of any other seed initialization, and to keep the seed
state in the extension module, but I could be wrong.
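
As a rough illustration of the behavior presumed above (this is not the
RandomArray2 code, which I have not checked against it), here is a minimal
Python sketch of a module that keeps its generator state privately and falls
back to a time-based seed when none is supplied. The names `seed`, `ranf`
and `_generator` are made up for the sketch; only `random.Random` and
`time.time_ns` are real library calls.

```python
import random
import time

# Hypothetical module-level state; it stands in for the static state that a
# C extension would keep. None means "not seeded yet".
_generator = None


def seed(value=None):
    """Seed the shared generator; use the clock when no value is given."""
    global _generator
    if value is None:
        value = time.time_ns()
    _generator = random.Random(value)


def ranf():
    """Return one uniform variate in [0, 1), seeding lazily if needed."""
    if _generator is None:
        seed()
    return _generator.random()
```

Every caller that goes through `ranf()` advances the same state, so an
explicit `seed(42)` followed by the same sequence of calls reproduces the
same numbers regardless of which part of the program makes the calls.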

Perry


From faheem at email.unc.edu  Tue Oct  5 18:41:02 2004
From: faheem at email.unc.edu (Faheem Mitha)
Date: Tue Oct  5 18:41:02 2004
Subject: [Numpy-discussion] numarray.random_array number generation in
 C code
In-Reply-To: <NEBBIJKBMLDBLNCEEFOCCEOKFHAA.perry@stsci.edu>
References: <NEBBIJKBMLDBLNCEEFOCCEOKFHAA.perry@stsci.edu>
Message-ID: <Pine.LNX.4.61.0410052108420.16761@Chrestomanci>


On Tue, 5 Oct 2004, Perry Greenfield wrote:

> I'm not sure I understand what you want to do. Do you want to link 
> directly to the extension code from your C++ code?

Yes.

> If so I'm wondering why. It would make the most sense if the C++ code 
> needed to obtain small numbers of random numbers in some iterative loop, 
> and you wish to use the same random number library that numarray is 
> using.

I need to obtain an arbitrary (not known in advance) number of random 
numbers in the C++ code.

I'm thinking of using the same random number library mostly because I 
assumed that using the same seed across the python/C interface would be 
supported. This is how it works in R (the only other place I have used 
this). Also, I had been using the same routines in the Python code I'm 
trying to convert to C++, so it would be a relatively smooth transfer.

If I was to use a pure C/C++ library, I'd have to worry about copying the 
seed back and forth between Python and C. Is this what I'll have to do 
then?
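
For concreteness, the copy-in/copy-out pattern could look like the sketch
below. It uses Python's standard `random` module rather than numarray, and
`run_simulation` is a hypothetical stand-in for the compiled routine; only
`random.seed()`, `random.getstate()`, `random.setstate()` and
`random.Random` are real library calls.

```python
import random


def run_simulation(state, n_draws):
    """Hypothetical stand-in for compiled code that owns its own generator.

    Takes a generator state, draws n_draws uniforms, and returns the draws
    together with the advanced state so the caller can write it back.
    """
    local = random.Random()
    local.setstate(state)                             # 1) read the state
    draws = [local.random() for _ in range(n_draws)]  # 2) do its stuff
    return draws, local.getstate()                    # 3) hand the state back


random.seed(12345)
draws, new_state = run_simulation(random.getstate(), 5)
random.setstate(new_state)    # the Python-level generator continues the stream
next_value = random.random()
```

With the state written back, the five draws plus `next_value` are the same
six numbers the Python-level generator would have produced on its own, which
is the interoperability I am after. The cost is the explicit bookkeeping at
every call.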

> Otherwise, I would normally obtain the random number array in python, 
> then call the C++ extension.

Yes, this is what everyone suggests. But in my case, the number of random 
variates required is not known in advance. I get the feeling this 
situation does not arise very often for most people, but I work with 
stochastic processes which terminate according to some stopping criterion, 
and that is the standard situation in this case.
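
A toy version of this situation, assuming a simple symmetric random walk
rather than my actual search algorithm, might look as follows. The function
name `first_passage` and its parameters are invented for the example.

```python
import random


def first_passage(rng, threshold, max_steps=1_000_000):
    """Random walk with +/-1 steps; stop when |position| reaches threshold.

    Returns the number of variates consumed. That number depends on the
    path taken, so it cannot be known before the run.
    """
    position = 0
    steps = 0
    while abs(position) < threshold and steps < max_steps:
        position += 1 if rng.random() < 0.5 else -1
        steps += 1
    return steps


rng = random.Random(2004)
counts = [first_passage(rng, threshold=10) for _ in range(5)]
```

Each entry of `counts` is the number of variates one run consumed, and the
entries generally differ from run to run, so there is no fixed array size
that could be generated in advance and passed in.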

Also generating these numbers in Python would give rise to serious 
performance issues.

> Perhaps I didn't read carefully enough. Normally linking to an extension 
> module involves some hacks that I'm not sure were done for the 
> randomarray module (the gory details are in the python docs for 
> extension modules), Todd can check on that, I'm not sure I will have 
> time (a superficial check seems to indicate that it doesn't support 
> direct linking, though one could link to the underlying library I 
> suppose).

Hmm. Well, this is unwelcome news. You mean I cannot link to ranlib.so? I 
assumed that including the ranlib.h header and linking my C++ module 
against ranlib.so would be enough. I suppose that was too optimistic.

> As an aside, it is likely that a better module can be done as some
> have suggested, we just took what Numeric had at the time. Doing that
> is not a high priority with us at the moment (anyone else want to
> tackle that?). Right now integration with scipy is our biggest
> priority so things like this will have to take a back seat for
> a while.

> Furthermore, we did what we needed to to port these modules from
> Numeric, but that didn't necessarily make us experts in how they
> worked. I wish we were, but we've generally been directing our
> energy elsewhere. I'd presume that the sensible way for the module
> to work is to initialize its seed from a time-based seed in the
> absence of any other seed initialization, and to keep the seed
> state in the extension module, but I could be wrong.

Yes. That is how R does it, anyway. Specifically, you declare the seed 
static, and then it persists across the Python/C interface. That is what I 
thought you had in the numarray code. Would it be hard to make it work 
like this?

I'm no expert either.

                                                             Faheem.


From southey at uiuc.edu  Wed Oct  6 07:01:38 2004
From: southey at uiuc.edu (Bruce Southey)
Date: Wed Oct  6 07:01:38 2004
Subject: [Numpy-discussion] numarray.random_array
 number generation in C code
Message-ID: <ba8a1339.8b0d6083.832bd00@expms6.cites.uiuc.edu>

Hi,  
My understanding is that you can use the Ranlib, R math, and GNU Scientific
libraries in the manner you suggest or directly include the random number
generator in your code. Usually you define the seed, which should provide the
same pseudo-random number stream every time it is used. If you don't use a
seed then it is usually impossible to get the same stream of pseudo-random
numbers. So I do not understand why you need to keep the same random number
state. Not to mention that the common generators do repeat, some sooner than
others.
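
A small sketch of the point about seeds, using Python's standard `random`
module as a stand-in for whichever library you pick (the helper name
`stream` is only for illustration):

```python
import random


def stream(seed, n):
    """Return the first n uniforms from a generator started at `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


same = stream(7, 4) == stream(7, 4)      # identical seed, identical stream
differ = stream(7, 4) != stream(8, 4)    # different seed, different stream
```

Both `same` and `differ` come out true: the seed alone fixes the whole
stream, as long as nothing else draws from that generator in between.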
  
In your response to Perry, you indicate that you do not need an array of
random numbers but rather a stream of random numbers. This is very different
and I think you need to refine your algorithm to identify what parts need to
be in C/C++ and what need to be in Python/numarray. Since you currently have
Python code, I would profile it to see what parts actually need extending -
sometimes Python is rather surprising in how quickly some things can be done
(like using dictionaries). Providing those parts may be more fruitful to you
than my vague responses.
 
Regards 
Bruce 
  
---- Original message ----  
>Date: Tue, 5 Oct 2004 18:43:48 -0400 (EDT)  
>From: Faheem Mitha <faheem at email.unc.edu>    
>Subject: Re: [Numpy-discussion] numarray.random_array number generation in C code
>To: Bruce Southey <southey at uiuc.edu>  
>Cc: numpy-discussion <numpy-discussion at lists.sourceforge.net>  
>  
>  
>  
>On Tue, 5 Oct 2004, Bruce Southey wrote:  
>  
>> Hi,  
>> It is rather hard to suggest anything without more detail on what you want to
>> actually do.
>  
>I could give you more details if you were interested.  
>  
>> As you describe it, why do you need the 'seed' returned? It would only   
>> make sense if you were going in and out of Python multiple times - a   
>> somewhat undesirable situation due to the overhead costs.  
>  
>Not really. One might (and I frequently do) want to run the same function   
>(which in this case might be all in C++ code), interactively with   
>different parameters. The kind of thing that I'm doing is akin to   
>exploratory data analysis, and the specific code in question is a   
>stochastic search algorithm. Doing all this in C++ would not be very   
>interactive. Also, one often wants to postprocess data output using Python   
>scripts. This involves multiple calls to C++ code, and would be impossible   
>to do using C++, since one has to call other Python libraries.  
>  
>  > I see at least three options:  
>  
>> 1) Do everything in Python/numarray.  
>  
>That's my current situation.  
>  
>> 2) Do parts in Python and the other in C/C++.  
>>   For example, pass a matrix of random numbers to your code from Python. The
>> 'seed' never needs to leave Python.
>  
>This doesn't work very well unless you know in advance how many random   
>numbers are needed (not the case, for example, for stochastic search   
>algorithms), and in any case is a rather clumsy way to do things. No   
>offense intended.  
>  
>> 3) Do it all in C/C++ - pass the 'seed' into your code that includes the
>> random number generator(s) - there is C/C++ code around for this. Do your stuff
>> and then return the 'seed' back with whatever else is required.
>  
>Yes, but part of the point of mixed programming is that you have an   
>interpreted front end which can easily hook into other routines. Also, in   
>this case, you would not be passing the seed in, since there is nothing to   
>pass it in from. One would simply call system time or something similar to   
>obtain the seed.  
>  
>> You can email me privately if you want.  
>  
>I'll keep sending this to the list unless someone objects, since I think   
>this is of some general interest.  
>  
>Really, my main question was whether my understanding of how to use the   
>Numarray random number facilities in C was correct.  
>  
>                                                                Faheem.  
  
 
From jmiller at stsci.edu  Wed Oct  6 23:47:31 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Oct  6 23:47:31 2004
Subject: [Numpy-discussion] numarray.random_array number generation in
	C code
In-Reply-To: <NEBBIJKBMLDBLNCEEFOCCEOKFHAA.perry@stsci.edu>
References: <NEBBIJKBMLDBLNCEEFOCCEOKFHAA.perry@stsci.edu>
Message-ID: <1097073394.31512.76.camel@halloween.stsci.edu>

On Tue, 2004-10-05 at 21:10, Perry Greenfield wrote:
> Faheem Mitha wrote:
> 
> > Dear People,
> >
> > I want to write some C++ code to link with Python, using the
> > Boost.Python interface. I need to generate random numbers in the C++
> > code, and I was wondering as to the best way of doing this.
> >
> > Note that it is important that the random number generation interoperate
> > seamlessly with Python, in the sense that the behavior of the calls to
> > the RNG is the same whether calls are made at the C level or the Python
> > level. I hope the reasons why this is important are obvious.
> >
> > I was thinking that the method should go like this.
> >
> > 1) When C/C++ code called, reads seed from python random state.
> >
> > 2) Does its stuff.
> >
> > 3) Writes seed back to python level when it exits.
> >
> > After doing a little investigation of the numarray.random_array python
> > library and associated extension modules, it seems possible that the
> > answer is simpler than I had supposed. However, I would appreciate it if
> > someone would tell me if my understanding is incorrect in some places.
> >
> > Summary: It seems that I can just call all the C entry point routines
> > defined in ranlib.h, without worrying about getting or setting seeds.
> >
> > Rationale:
> >
> > The structure of this random number facility has three parts, all
> > files in Packages/RandomArray2/Src.
> >
> > 1) low-level C routines: Packages/RandomArray2/Src/com.c and
> > Packages/RandomArray2/Src/ranlib.c.
> >
> > com.c: basic RNG stuff; getting and setting seeds etc.
> > ranlib.c: Random number generator algorithms for different distributions
> > etc.
> >
> > 2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c.
> >
> > This interfaces the stuff in com.c and ranlib.c.
> >
> > 3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py.
> >
> > This wraps the C interface. In most cases it does not do much else
> > besides some basic argument error checking.
> >
> > From my perspective, the important thing is that the random number
> > seed is only defined at C level as a static object, all the RNG stuff
> > happens at C level, and the Python code just calls the C code as
> > necessary. (I'm sketchy about the details of what is defined as the
> > seed etc.)
> >
> > This is in contrast with the R RNG facility (the only other RNG
> > facility I am familiar with), which uses macros GetRNGstate() and
> > PutRNGstate() to read and write the seed, which is defined at R level.
> >
> > Therefore, the upshot is that the C routines in ranlib.h read and write
> > the same seed as the python level functions do, so no special action is
> > necessary with regard to the seed.
> >
> > Is this correct?
> >
> > In any case, it would be nice if something like the above was documented,
> > so lost souls like myself don't have to go trawling through the source
> > code to figure out what is going on. Of course it is nice that the source
> > code is available, otherwise even that would be impossible.
> >
> > R documents this stuff in the "Writing R Extensions" manual, online at
> > http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray
> > manual could have a small section about this too.
> >
> >                                                          Regards, Faheem.
> >
> I'm not sure I understand what you want to do. Do you want to link
> directly to the extension code from your C++ code? If so I'm wondering
> why. It would make the most sense if the C++ code needed to obtain
> small numbers of random numbers in some iterative loop, and you wish
> to use the same random number library that numarray is using.
> Otherwise, I would normally obtain the random number array
> in python, then call the C++ extension. Perhaps I didn't read carefully
> enough. Normally linking to an extension module involves some hacks
> that I'm not sure were done for the randomarray module (the gory
> details are in the python docs for extension modules), Todd can
> check on that, 

I checked and there's no C level export of the ranlib interface, at
least not in the "hacked" sense of an extension module C-API where the
linkage is made indirect via an API pointer and bizarre macros.

> I'm not sure I will have time (a superficial check
> seems to indicate that it doesn't support direct linking, though
> one could link to the underlying library I suppose).

Ordinary C linkage to numarray.random_array.ranlib2 may be supported
since as an extension it is also a shared library, but I've never tried
it myself and I wonder if it would actually work. If anyone has tried
something like that I'd be interested in hearing how it turned out. 
Without a really compelling reason,  I'd avoid it myself.

Regards,
Todd


From dd55 at cornell.edu  Sun Oct 10 12:51:58 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Sun Oct 10 12:51:58 2004
Subject: [Numpy-discussion] ieeespecial
Message-ID: <200410101547.18413.dd55@cornell.edu>

Hello,

I am getting invalid numeric result exceptions when dividing a complex array 
by zero. Is this the desired behavior?

Also, while trying to find a way around the above problem, I ran 
ieeespecial.test and got the following output. I am running numarray 1.1 on 
python 2.3.3. Todd, this might be correlated with the numerix package in 
matplotlib. I tried importing numarray and ieeespecial without matplotlib and 
the ieeespecial.test was successful.

Thanks,

Darren


In [31]: ieeespecial.test()
Out[31]: inf
*****************************************************************
Failure in example:
inf    # the repr() of inf may vary from platform to platform
from line #6 of numarray.ieeespecial
Expected: inf
Got:
Out[31]: nan
*****************************************************************
Failure in example:
nan    # the repr() of nan may vary from platform to platform
from line #8 of numarray.ieeespecial
Expected: nan
Got:
Out[31]: (array([0, 2]), array([0, 3]))
*****************************************************************
Failure in example: getinf(b)
from line #20 of numarray.ieeespecial
Expected: (array([0, 2]), array([0, 3]))
Got:
Out[31]:
array([[ 999.,    1.,    2.,    3.],
       [   4.,    5.,    6.,    7.],
       [   8.,    9.,   10.,  999.],
       [  12.,   13.,   14.,   15.]])
*****************************************************************
Failure in example: a
from line #26 of numarray.ieeespecial
Expected:
array([[ 999.,    1.,    2.,    3.],
       [   4.,    5.,    6.,    7.],
       [   8.,    9.,   10.,  999.],
       [  12.,   13.,   14.,   15.]])
Got:
Out[31]: (array([0, 1, 2]), array([1, 2, 3]))
*****************************************************************
Failure in example: getnan(a)
from line #35 of numarray.ieeespecial
Expected: (array([0, 1, 2]), array([1, 2, 3]))
Got:
*****************************************************************
1 items had failures:
   5 of  11 in numarray.ieeespecial
***Test Failed*** 5 failures.
Out[31]: (5, 11)


-- 

Darren


From dd55 at cornell.edu  Sun Oct 10 13:57:43 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Sun Oct 10 13:57:43 2004
Subject: [Numpy-discussion] ieeespecial
In-Reply-To: <200410101547.18413.dd55@cornell.edu>
References: <200410101547.18413.dd55@cornell.edu>
Message-ID: <200410101653.51172.dd55@cornell.edu>


On Sunday 10 October 2004 03:47 pm, Darren Dale wrote:
> Hello,
>
> I am getting invalid numeric result exceptions when dividing a complex
> array by zero. Is this the desired behavior?
>
> Also, while trying to find a way around the above problem, I ran
> ieeespecial.test and got the following output. I am running numarray 1.1 on
> python 2.3.3. Todd, this might be correlated with the numerix package in
> matplotlib. I tried importing numarray and ieeespecial without matplotlib
> and the ieeespecial.test was successful.
>

On a related note, ieeespecial.getnan appears to be incompatible with complex 
arrays, see below. I didn't mention in my last email that I built numarray for 
my existing blas/lapack libraries, will this change the behavior on my system 
from the default?

Thanks,
Darren

>>> from numarray import *
>>> from numarray.ieeespecial import *
>>> b=arange(10,typecode=Complex64)
>>> a=b/0
Warning: Encountered invalid numeric result(s)  in divide
>>> a
array([              nan             +nanj,
                     nan             +nanj,
                     nan             +nanj,
                     nan             +nanj,
                     nan             +nanj,
                     nan             +nanj,
                     nan             +nanj,
                     nan             +nanj,
                     nan             +nanj,
                     nan             +nanj])
>>> getnan(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 117, in getnan
    return _spec.index(a, _spec.NAN)
  File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 95, in index
    return _na.nonzero(mask(a, msk))
  File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 87, in mask
    f = _na.ieeemask(a, m)
  File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 883, in _cache_miss2
    mode, win1, win2, wout, cfunc, ufargs = \
  File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 929, in _setup
    convtype1, convtype2, outtype, ucfunc \
  File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 471, in _typematch
    newInputSignature = (self._typePromoter(intype, atypelist),)*2
  File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 498, in _typePromoter
    raise TypeError("unable to find type to promote to")
TypeError: unable to find type to promote to

>>> getnan(a.real)
(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),)
>>>  
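
The getnan(a.real) workaround above only inspects the real component. A minimal pure-Python sketch of a check that covers both components follows; it is not numarray code, and the helper name nan_indices is hypothetical.

```python
import cmath

def nan_indices(values):
    """Return indices of entries whose real or imaginary part is NaN."""
    # cmath.isnan is true when either component of the complex number is NaN
    return [i for i, z in enumerate(values) if cmath.isnan(complex(z))]

nan = float("nan")
a = [complex(nan, nan), 1 + 2j, complex(0.0, nan), 3.0]
print(nan_indices(a))  # [0, 2]
```

With a real array package, the same idea would presumably be to combine the masks of the real and imaginary parts.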


From aisaac at american.edu  Sun Oct 10 15:57:18 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Sun Oct 10 15:57:18 2004
Subject: [Numpy-discussion] documentation error
Message-ID: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu>

In the Numeric manual, there are two different definitions of the
'diagonal' function.  The second definition appears to be incorrect.

p.39:
diagonal(a, k=0, axis1=0, axis2 = 1)
returns the entries along the k th diagonal of a (k is an
offset from the main diagonal). This is designed for 2d
arrays. For larger arrays, it will return the diagonal of
each 2d sub-array.

p.44
diagonal(a, offset=0, axis1=0, axis2=1)
The diagonal function takes an array a, and returns an array
of rank 1 containing all of the elements of a such that the
difference between their indices along the specified axes is
equal to the specified offset. With the default values, this
corresponds to all of the elements of the diagonal of a
along the last two axes.

fwiw,
Alan Isaac


From jmiller at stsci.edu  Sun Oct 10 17:43:34 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Sun Oct 10 17:43:34 2004
Subject: [Numpy-discussion] ieeespecial
In-Reply-To: <200410101653.51172.dd55@cornell.edu>
References: <200410101547.18413.dd55@cornell.edu>
	 <200410101653.51172.dd55@cornell.edu>
Message-ID: <1097454870.3741.48.camel@localhost.localdomain>

On Sun, 2004-10-10 at 16:53, Darren Dale wrote:
> On Sunday 10 October 2004 03:47 pm, Darren Dale wrote:
> > Hello,
> >
> > I am getting invalid numeric result exceptions when dividing a complex
> > array by zero. Is this the desired behavior?
> >
> > Also, while trying to find a way around the above problem, I ran
> > ieeespecial.test and got the following output. I am running numarray 1.1 on
> > python 2.3.3. Todd, this might be correlated with the numerix package in
> > matplotlib. I tried importing numarray and ieeespecial without matplotlib
> > and the ieeespecial.test was successful.
> >
> 
> On a related note, ieeespecial.getnan appears to be incompatible with complex 
> arrays, see below. 

Thanks for pointing this out.  It's an oversight in the implementation
of ieeespecial and I'll fix it.

> I didn't mention in my last email that I built numarray for 
> my existing blas/lapack libraries, will this change the behavior on my system 
> from the default?

Regarding ieeespecial and complex division by zero, I am pretty sure
blas/lapack linkage is irrelevant.  But... I very rarely link with an
external blas/lapack,  so if there is an issue, I'm unlikely to have
come across it myself.  Still, off the top of my head,  blas/lapack is
unrelated.

Regards,
Todd

> Thanks,
> Darren
> 
> >>> from numarray import *
> >>> from numarray.ieeespecial import *
> >>> b=arange(10,typecode=Complex64)
> >>> a=b/0
> Warning: Encountered invalid numeric result(s)  in divide
> >>> a
> array([              nan             +nanj,
>                      nan             +nanj,
>                      nan             +nanj,
>                      nan             +nanj,
>                      nan             +nanj,
>                      nan             +nanj,
>                      nan             +nanj,
>                      nan             +nanj,
>                      nan             +nanj,
>                      nan             +nanj])
> >>> getnan(a)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 117, in getnan
>     return _spec.index(a, _spec.NAN)
>   File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 95, in index
>     return _na.nonzero(mask(a, msk))
>   File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 87, in mask
>     f = _na.ieeemask(a, m)
>   File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 883, in _cache_miss2
>     mode, win1, win2, wout, cfunc, ufargs = \
>   File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 929, in _setup
>     convtype1, convtype2, outtype, ucfunc \
>   File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 471, in _typematch
>     newInputSignature = (self._typePromoter(intype, atypelist),)*2
>   File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 498, in _typePromoter
>     raise TypeError("unable to find type to promote to")
> TypeError: unable to find type to promote to
> 
> >>> getnan(a.real)
> (array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),)
> >>>  


From dd55 at cornell.edu  Sun Oct 10 18:08:10 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Sun Oct 10 18:08:10 2004
Subject: [Numpy-discussion] ieeespecial
In-Reply-To: <1097454560.3741.41.camel@localhost.localdomain>
References: <200410101547.18413.dd55@cornell.edu> <1097454560.3741.41.camel@localhost.localdomain>
Message-ID: <200410102103.42221.dd55@cornell.edu>

On Sunday 10 October 2004 08:29 pm, you wrote:
> On Sun, 2004-10-10 at 15:47, Darren Dale wrote:
> > Hello,
> >
> > I am getting invalid numeric result exceptions when dividing a complex
> > array by zero. Is this the desired behavior?
>
> This is what I would have expected,  and examining the definition I have
> for complex division in numarray/Include/numarray/numcomplex.h,  I don't
> see a problem.   The definition should probably be checked by an extra
> set of eyes.  Looks OK to me.

Hi Todd,

Sorry, I wasn't clear. I was wondering if it should raise a divide by zero 
exception and return an inf, as the real datatypes do, instead of an invalid 
numeric result and a nan.  As it stands now, we have to handle divide by zero 
differently for different data types, if we need to filter/replace such 
values.

Thanks,
Darren


From jmiller at stsci.edu  Sun Oct 10 18:44:38 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Sun Oct 10 18:44:38 2004
Subject: [Numpy-discussion] ieeespecial
In-Reply-To: <200410101547.18413.dd55@cornell.edu>
References: <200410101547.18413.dd55@cornell.edu>
Message-ID: <1097454560.3741.41.camel@localhost.localdomain>

On Sun, 2004-10-10 at 15:47, Darren Dale wrote:
> Hello,
> 
> I am getting invalid numeric result exceptions when dividing a complex array 
> by zero. Is this the desired behavior?

This is what I would have expected,  and examining the definition I have
for complex division in numarray/Include/numarray/numcomplex.h,  I don't
see a problem.   The definition should probably be checked by an extra
set of eyes.  Looks OK to me.

> Also, while trying to find a way around the above problem, I ran 
> ieeespecial.test and got the following output. I am running numarray 1.1 on 
> python 2.3.3. Todd, this might be correlated with the numerix package in 
> matplotlib. I tried importing numarray and ieeespecial without matplotlib and 
> the ieeespecial.test was successful.
> 

I tried this with an ordinary Python shell and ieeespecial.test()
completed without errors.  Looking at your test output,  I noticed it
was skewed, and guessed there was an I/O synchronization issue messing
up doctest.  I tried the same test under IPython w/o matplotlib and
duplicated your results,  so I think the problem is an IPython/doctest
issue.  

Regards,
Todd

> Thanks,
> 
> Darren
> 
> 
> In [31]: ieeespecial.test()
> Out[31]: inf
> *****************************************************************
> Failure in example:
> inf    # the repr() of inf may vary from platform to platform
> from line #6 of numarray.ieeespecial
> Expected: inf
> Got:
> Out[31]: nan
> *****************************************************************
> Failure in example:
> nan    # the repr() of nan may vary from platform to platform
> from line #8 of numarray.ieeespecial
> Expected: nan
> Got:
> Out[31]: (array([0, 2]), array([0, 3]))
> *****************************************************************
> Failure in example: getinf(b)
> from line #20 of numarray.ieeespecial
> Expected: (array([0, 2]), array([0, 3]))
> Got:
> Out[31]:
> array([[ 999.,    1.,    2.,    3.],
>        [   4.,    5.,    6.,    7.],
>        [   8.,    9.,   10.,  999.],
>        [  12.,   13.,   14.,   15.]])
> *****************************************************************
> Failure in example: a
> from line #26 of numarray.ieeespecial
> Expected:
> array([[ 999.,    1.,    2.,    3.],
>        [   4.,    5.,    6.,    7.],
>        [   8.,    9.,   10.,  999.],
>        [  12.,   13.,   14.,   15.]])
> Got:
> Out[31]: (array([0, 1, 2]), array([1, 2, 3]))
> *****************************************************************
> Failure in example: getnan(a)
> from line #35 of numarray.ieeespecial
> Expected: (array([0, 1, 2]), array([1, 2, 3]))
> Got:
> *****************************************************************
> 1 items had failures:
>    5 of  11 in numarray.ieeespecial
> ***Test Failed*** 5 failures.
> Out[31]: (5, 11)
-- 


From aisaac at american.edu  Sun Oct 10 18:59:17 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Sun Oct 10 18:59:17 2004
Subject: [Numpy-discussion] location of tutorial
Message-ID: <Mahogany-0.66.0-1780-20041010-215449.00@american.edu>

p.29 of the Numeric manual refers to 
http://www.python.org/doc/tut/functional.html
which no longer exists.  I suggest substituting
http://docs.python.org/tut/tut.html

fwiw,
Alan Isaac


From jmiller at stsci.edu  Mon Oct 11 04:28:51 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Oct 11 04:28:51 2004
Subject: [Numpy-discussion] ieeespecial
In-Reply-To: <200410102103.42221.dd55@cornell.edu>
References: <200410101547.18413.dd55@cornell.edu>
	 <1097454560.3741.41.camel@localhost.localdomain>
	 <200410102103.42221.dd55@cornell.edu>
Message-ID: <1097493501.2619.26.camel@localhost.localdomain>

On Sun, 2004-10-10 at 21:03, Darren Dale wrote:
> On Sunday 10 October 2004 08:29 pm, you wrote:
> > On Sun, 2004-10-10 at 15:47, Darren Dale wrote:
> > > Hello,
> > >
> > > I am getting invalid numeric result exceptions when dividing a complex
> > > array by zero. Is this the desired behavior?
> >
> >
> > This is what I would have expected,  and examining the definition I have
> > for complex division in numarray/Include/numarray/numcomplex.h,  I don't
> > see a problem.   The definition should probably be checked by an extra
> > set of eyes.  Looks OK to me.
> 
> Hi Todd,
> 
> Sorry, I wasn't clear. I was wondering if it should raise a divide by zero 
> exception and return an inf, as the real data types do, instead of an invalid 
> numeric result and a nan.  As it stands now, we have to handle divide by zero 
> differently for different data types, if we need to filter/replace such 
> values.

Numarray's error handling system is pretty flexible, and can raise
exceptions on divide by zero if configured properly, or can ignore them
altogether.  See section 4.9 in the numarray-1.1 manual here:

http://prdownloads.sourceforge.net/numpy/numarray-1.1.pdf?download

It's an interesting question regarding the inf vs. nan.   Looking at the
complex division macro (NUM_CDIV) in numcomplex.h,  I don't understand
why we're getting nans now and not infs;  it might be a bug in the
macro,  but I don't see it.
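
One possible explanation, offered as a guess and not as a reading of NUM_CDIV: if the macro uses the textbook formula (a+bi)/(c+di) = ((ac+bd) + (bc-ad)i)/(c*c+d*d), a zero divisor makes both numerators and the denominator zero, and IEEE 0/0 is nan, not inf. The sketch below emulates IEEE float division in plain Python, which would otherwise raise ZeroDivisionError. The names ieee_div and textbook_cdiv are hypothetical.

```python
import math

def ieee_div(x, y):
    """Divide floats with IEEE 754 semantics instead of raising when y == 0."""
    if y != 0.0:
        return x / y
    if x == 0.0 or math.isnan(x):
        return math.nan                      # 0/0 is an invalid operation: nan
    # nonzero / zero is a divide-by-zero: signed infinity
    return math.copysign(math.inf, x) * math.copysign(1.0, y)

def textbook_cdiv(p, q):
    """(a+bi)/(c+di) by the textbook formula; an assumption, not NUM_CDIV itself."""
    a, b, c, d = p.real, p.imag, q.real, q.imag
    denom = c * c + d * d
    return complex(ieee_div(a * c + b * d, denom),
                   ieee_div(b * c - a * d, denom))

print(ieee_div(3.0, 0.0))           # inf
print(textbook_cdiv(3 + 0j, 0j))    # (nan+nanj)
```

Under this assumption, real division by zero sees a nonzero numerator and gives inf, while complex division multiplies the numerator by the zero divisor first, so only the invalid 0/0 case remains.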

Regards,
Todd


From stephen.walton at csun.edu  Mon Oct 11 20:16:55 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Mon Oct 11 20:16:55 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu>
References: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu>
Message-ID: <1097550159.2568.5.camel@localhost.localdomain>

On Sun, 2004-10-10 at 11:33, Alan G Isaac wrote:
> In the Numeric manual, there are two different definitions of the
> 'diagonal' function.  The second definition appears to be incorrect.
> 
> p.39:
> diagonal(a, k=0, axis1=0, axis2 = 1)

> p.44
> diagonal(a, offset=0, axis1=0, axis2=1)

Are you sure?  On my system, it appears that the second definition is
correct in both Numeric 23.3 and numarray 1.1.


From a.schmolck at gmx.net  Tue Oct 12 02:40:55 2004
From: a.schmolck at gmx.net (Alexander Schmolck)
Date: Tue Oct 12 02:40:55 2004
Subject: [Numpy-discussion] A disconnected numarray rant
Message-ID: <yfsu0t0b7lk.fsf@black4.ex.ac.uk>

Hi,

I'm taking a 1 month break from computers (i.e. I will be completely
off-line), and I have to catch a train in an hour; but I've recently bitten
the bullet and made a matrix class I've been using for some time work with
numarray; I've written down a number of things that occurred to me while I was
doing it, including some things which I think are bugs in numarray, so I
thought at least posting the bugs would be a useful service; the rest is very
raw and essentially unedited cut-and-paste of these notes -- sorry about that
and I hope it doesn't contain anything particularly offensive.

P.S. just dumped the code for the matrix class (nummat) at
http://www.dcs.ex.ac.uk/~aschmolc/Stuff/

'as

The following are my notes:


Things that fairly clearly seem to be bugs:
    - numarray.Int32 etc. can't be pickled
    - ``a = array(1+0j); a.imag = a.real * 10`` => IndexError
    - array(0, type=Float64) + 1e3000  => `inf` with right error modes
      but  array(0, type=Float32) + 1e3000 => `OverflowError`
    - numarray.array(10)/numarray.array(0) => 0 
    - numarray.array(10000000000000L) => array(1316134912)
    - numarray.where(0,1,0) => array([0])
    - l = [1,2,3]; numarray.put(l,numarray.array([1,2,0]),[0,0,0]); l => [1, 2, 3]
      a = array([1,2,3]); numarray.put(a,numarray.array([1,2,0]),[0,0,0]); a => array([0, 0, 0])
    - repr(numarray.array([],typecode='i')) (etc. etc.) => "numarray.array([])"
    - getattr(array([1,2,3]), '_aligned') => SystemError
    - obscure: numarray.where(0, matrix(568, convert_scalars=True),2) =>
      ValueError (tries __len__ which fails, as len(array(568)) also fails)
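
The array(1316134912) result in the list above is consistent with the long being silently truncated to its low 32 bits, presumably because the default integer type is Int32. A minimal sketch of that arithmetic follows; wrap_int32 is a hypothetical helper, not a numarray function.

```python
def wrap_int32(n):
    """Reduce an arbitrary Python int to a signed 32-bit two's-complement value."""
    n &= 0xFFFFFFFF                          # keep only the low 32 bits
    return n - 0x100000000 if n >= 0x80000000 else n

print(wrap_int32(10000000000000))            # 1316134912
```

An OverflowError, or promotion to a wider type, would arguably be less surprising than this wraparound.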

Numeric incompatibilities (that are either undocumented or bug-like)

- numarray.array('a', typecode='O') => TypeError (object arrays)
- for extra fun try: numarray.array(1, type=numarray.Object) => RuntimeError
  something entirely different
- nonzero is completely incompatible
- shape(None) etc. no longer works (IMHO a bug)
- cross_correlate & average missing
- left_shift et al missing
- numarray.sqrt(a,a) is None (*not* the result, as it used to be)
- num.put(a, [0,1,2,3], [10,20]) style behavior seems unavailable (without numarray.numeric)
  put(array([[ 0.,  1.,  2.], [ 3.,  4.,  5.]]), [1, 4], [10,40]) fails
- boolean testing (not even bool(array(0)) works; I'm not sure this is good)

- Generally different handling of rank0-arrays; e.g. ``type(num.array(1.0) +
  0) is float``; one potentially very nasty gotcha are inplace operations
  (e.g. a**=2) which have totally different semantics for python scalars and
  rank0 arrays, which, unlike Attribute errors on ``a.shape``, can lead to
  nasty bugs in corner cases (e.g. when a reduction just infrequently yields
  scalar ``a``) -- I think this should be mentioned in a gotchas section
  (another possible entry would be the need to use .copy() to **save** memory
  on slicing and 1xN, Nx1 matrices versus vectors (people are not used to
  thinking properly about rank from mathematical training or matlab
  exposure)).

- asarray downcasts arrays (e.g.: asarray(array([1.,2.,3.]),'i'))

- numarray.ones(-5) => MemoryError (ValueError would be nicer)
- numarray.ones(2.0), numarray.ones([2]) fail (cf. numarray.range(2.0))
      b=num.array([[1,2,3,4],[5,6,7,8]]*2)
      assert eq(num.diagonal(b), [1,6,3,8])
      assert eq(num.diagonal(b, -1), [5,2,7])
      c = num.array([b,b])
      assert eq(num.diagonal(c,1), [[2,7,4], [2,7,4]])
- no a.toscalar() !!!
- matrixmultiply in the docs
- what's the point of swapaxes (i.e. why not have a generalized in-place
  transpose?)
- what's the point of innerproduct?


- indexing by a list is different from indexing by tuple (I haven't had time
  to look closely at the docs whether that's intentional)

- doesn't know about Numeric's bizarre '\x0b' typecode
- numarray.sqrt.reduce([]) raises (sensibly) TypeError, not ValueError

- len(array(1)) or array(1)[0] won't work anymore (understandable, but
  should be documented)
- (should maximum, minimum reduce to -inf and inf?)
- <built-in method reduce of _BinaryUFunc object at 0x82dfc9c> is not
  a very helpful repr; should be possible to get to the ufunc itself
- as in Numeric numarray.maximum.reduce(numarray.array([0,-0.])) => -0.0
- __array__ protocol no longer supported (how can a non-derived class convert
  itself efficiently to an array?)
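
The in-place gotcha noted in the list above for rank-0 arrays comes down to rebinding versus mutation. A minimal sketch follows, using a hypothetical stand-in class Rank0 (not numarray's implementation) that mutates in place the way an array would.

```python
class Rank0:
    """Toy mutable scalar container standing in for a rank-0 array."""
    def __init__(self, value):
        self.value = value

    def __ipow__(self, exponent):
        self.value **= exponent       # mutate in place, as arrays do
        return self

def square_in_place(a):
    a **= 2                           # rebinds a float, mutates a Rank0
    return a

x = 3.0
square_in_place(x)
print(x)                              # 3.0: the caller's float is unchanged

y = Rank0(3.0)
square_in_place(y)
print(y.value)                        # 9.0: the caller's object was modified
```

A reduction that only occasionally yields a Python scalar instead of a rank-0 array would flip a function between these two behaviours without raising any error.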


Documentation Gotchas
- p. 34 IMO row vector is used incorrectly; row and column vectors are really
     matrices (i.e. have rank 2) so ``array([[1,2,3]])`` would be a row vector

- No proper explanation of differences between Numeric and numarray, or
  numarray.numeric module differences to proper (e.g. argmin)

- No migration and best-practice advice (e.g. there should be a standard way
  for packages which work with both numarray and numeric as backends to let
  the user choose his preference; how about setting an environment var NumPy
  or something?)
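
The environment-variable idea in the last item could look roughly like the sketch below. The variable name NumPy is taken from the suggestion above; the function name choose_backend and the accepted values are assumptions, not an existing convention.

```python
import os

def choose_backend(environ=os.environ, default="numarray"):
    """Pick the array package name from a (hypothetical) NumPy env variable."""
    choice = environ.get("NumPy", default).strip().lower()
    modules = {"numarray": "numarray", "numeric": "Numeric"}
    if choice not in modules:
        raise ValueError("unknown array backend: %r" % choice)
    return modules[choice]

# A package supporting both backends could then do something like:
#   num = __import__(choose_backend())
print(choose_backend({"NumPy": "Numeric"}))   # Numeric
print(choose_backend({}))                     # numarray
```

Passing the environment as an argument keeps the selection logic testable without touching the real process environment.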


Waffle
------

- there *really* ought to be an array equality function (with optional
  tolerance); it's quite difficult to get right for a normal user (nans;
  zero-size arrays etc.) and it's often required, especially for testing

- rank-preserving reduction would be nice as an option -- e.g. to
  subtract out or divide by the reduced portion (which currently won't e.g.
  work for columns without adding a unit-dimension by hand). 
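
To illustrate why the equality function in the first item is easy to get wrong, here is a minimal pure-Python sketch for flat sequences of floats. The name arrays_equal and its parameters are hypothetical; it is not an existing numarray or Numeric function.

```python
import math

def arrays_equal(a, b, tol=0.0, nan_equal=False):
    """Compare two flat sequences of floats element-wise within tol."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if math.isnan(x) or math.isnan(y):
            # nan never equals anything unless the caller opts in
            if not (nan_equal and math.isnan(x) and math.isnan(y)):
                return False
        elif x == y:
            continue                  # covers equal infinities, where x - y is nan
        elif not abs(x - y) <= tol:
            return False
    return True                       # zero-size inputs compare equal

nan = float("nan")
print(arrays_equal([], []))                                   # True
print(arrays_equal([1.0, 2.0], [1.0, 2.0000001], tol=1e-6))   # True
print(arrays_equal([nan], [nan]))                             # False
print(arrays_equal([nan], [nan], nan_equal=True))             # True
```

Whether nans should compare equal and whether two empty arrays are equal are exactly the policy decisions a library function would have to settle once for everyone.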

Design

  The (AFAICS) benefit-free but downside-rich introduction of `type`
  ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

  Is there any reason that Typecode objects that compare as desired to the
  relevant strings ("i", "d") wouldn't have done? Now there is an explosion
  and confusion of interfaces -- some numpy code will now only accept
  type(code)s as "typecode" keyword parameter (even in numarray! see
  numarray.mlab!) and other stuff

  Never mind that type already is a highly overused word in the python world.


  The big method bloat.
  '''''''''''''''''''''

  As it says in the Numeric manual introductions there were "good reasons" for
  "very few array methods" -- now there are **56** public methods and 8 public
  attributes (public == not starting with '_'); of those 56 methods about 11
  are accessors and of the rest about half are redundant or worse (i.e. they
  either also exist as numarray functions (argmin, argmax, diagonal, ...) or
  they really ought to be functions (mean, stddev) or they are quite confusing
  (``a.min``, ``a.max`` which behave quite differently from ``a.argmin`` and
  ``a.argmax``, never mind ``numarray.minimum``) or simply utterly pointless
  (``a.nelements`` == ``a.size``)).

  - argmin, argmax : what's wrong with numarray.argmin, numarray.argmax??? Why
    do argmin/argmax and max/min have completely different interfaces??? If
    there really is a need for these (there isn't), if anything a.min and a.max
    should be called a.flatmin, a.flatmax

  - diagonal, mean, nelements, nonzero, ...

  - perversely the **only** function that I can think of that could have
    sensibly become a method hasn't: ``put`` (it used to work only on arrays
    under Numeric and not without reason, so making it a method would have
    been sensible; numarray.put of course also "works" on non-arrays, it just
    doesn't do anything with them)


  Test Code
  '''''''''
  numtest.py doesn't inspire full confidence (it's about 1000 lines of actual
  code but it doesn't seem that clearly structured and AFAICT contains no
  single loop (and that despite the diversity of shapes, types etc. that exist
  in numarray -- why not try something slightly more systematic?)).


From aisaac at american.edu  Tue Oct 12 07:03:18 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Tue Oct 12 07:03:18 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <1097550159.2568.5.camel@localhost.localdomain>
References: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu>
 <1097550159.2568.5.camel@localhost.localdomain>
Message-ID: <Mahogany-0.66.0-1780-20041012-100006.00@american.edu>

> On Sun, 2004-10-10 at 11:33, Alan G Isaac wrote:
>> In the Numeric manual, there are two different definitions of the
>> 'diagonal' function.  The second definition appears to be incorrect.


On Mon, 11 Oct 2004, Stephen Walton apparently wrote:
> Are you sure?  On my system, it appears that the second definition is
> correct in both Numeric 23.3 and numarray 1.1.


You did not quote the problematic portion:
        The diagonal function takes an array a, and returns
        an array of rank 1 ... With the default values, this
        corresponds to all of the elements of the diagonal
        of a along the last two axes.

Contrast:
>>> import Numeric
>>> Numeric.__version__
'23.1'
>>> x=[[[1,2],[3,4]],[[5,6],[7,8]]]
>>> Numeric.diagonal(x)
array([[1, 4],
       [5, 8]])
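
The output above matches the p.39 description (the diagonal of each 2d sub-array) and not a rank-1 result. A minimal pure-Python sketch of that behaviour on nested lists follows; diagonal_last2 is a hypothetical helper, not Numeric's implementation.

```python
def diagonal_last2(a, offset=0):
    """Diagonal of each 2-D sub-array of nested lists, over the last two axes."""
    if isinstance(a[0][0], list):             # more than 2 dimensions: recurse
        return [diagonal_last2(sub, offset) for sub in a]
    rows, cols = len(a), len(a[0])
    # element (i, j) lies on the requested diagonal when j - i == offset
    return [a[i][i + offset] for i in range(rows) if 0 <= i + offset < cols]

x = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
print(diagonal_last2(x))                      # [[1, 4], [5, 8]]
```

For a rank-3 input the result has rank 2, which is why the "array of rank 1" wording on p.44 cannot be right in general.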

fwiw,
Alan Isaac


From stephen.walton at csun.edu  Tue Oct 12 08:42:04 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Tue Oct 12 08:42:04 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <Mahogany-0.66.0-1780-20041012-100006.00@american.edu>
References: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu>
	 <1097550159.2568.5.camel@localhost.localdomain>
	 <Mahogany-0.66.0-1780-20041012-100006.00@american.edu>
Message-ID: <1097595580.24491.4.camel@freyer.sfo.csun.edu>

On Tue, 2004-10-12 at 07:00, Alan G Isaac wrote:

> On Mon, 11 Oct 2004, Stephen Walton apparently wrote:
> > Are you sure?  On my system, it appears that the second definition is
> > correct in both Numeric 23.3 and numarray 1.1.
> 
> 
> You did not quote the problematic portion:
>         The diagonal function takes an array a, and returns
>         an array of rank 1 ...

Ah, I thought you were referring to the fact that, in the first version
in the documentation, the second, named argument is given as "k" but in
the second version it is "offset". A look at the source reveals the
second keyword name is the correct one.

-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge

From aisaac at american.edu  Tue Oct 12 12:25:01 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Tue Oct 12 12:25:01 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <1097595580.24491.4.camel@freyer.sfo.csun.edu>
References: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu><1097550159.2568.5.camel@localhost.localdomain><Mahogany-0.66.0-1780-20041012-100006.00@american.edu><1097595580.24491.4.camel@freyer.sfo.csun.edu>
Message-ID: <Mahogany-0.66.0-1780-20041012-152445.00@american.edu>

> On Tue, 2004-10-12 at 07:00, Alan G Isaac wrote:
>> You did not quote the problematic portion:
>>         The diagonal function takes an array a, and returns
>>         an array of rank 1 ...


On Tue, 12 Oct 2004, Stephen Walton apparently wrote:
> A look at the source reveals the
> second keyword name is the correct one.


OK then, we have a double problem.
The first version gives the correct description
but uses the wrong keyword.
The second version gives the wrong description
but uses the correct keyword.

So, how do we file a documentation bug?

Cheers,
Alan Isaac


From perry at stsci.edu  Tue Oct 12 12:31:17 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Oct 12 12:31:17 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <Mahogany-0.66.0-1780-20041012-152445.00@american.edu>
Message-ID: <NEBBIJKBMLDBLNCEEFOCOEPOFHAA.perry@stsci.edu>

> So, how do we file a documentation bug?
> 
> Cheers,
> Alan Isaac
>
I'd say just like any other kind of bug.

Perry


From jmiller at stsci.edu  Tue Oct 12 12:40:19 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Oct 12 12:40:19 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <Mahogany-0.66.0-1780-20041012-152445.00@american.edu>
References: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu>
	 <1097550159.2568.5.camel@localhost.localdomain>
	 <Mahogany-0.66.0-1780-20041012-100006.00@american.edu>
	 <1097595580.24491.4.camel@freyer.sfo.csun.edu>
	 <Mahogany-0.66.0-1780-20041012-152445.00@american.edu>
Message-ID: <1097609991.30171.556.camel@halloween.stsci.edu>

On Tue, 2004-10-12 at 12:40, Alan G Isaac wrote:
> > On Tue, 2004-10-12 at 07:00, Alan G Isaac wrote:
> >> You did not quote the problematic portion:
> >>         The diagonal function takes an array a, and returns
> >>         an array of rank 1 ...
> 
> 
> 
> On Tue, 12 Oct 2004, Stephen Walton apparently wrote:
> > A look at the source reveals the
> > second keyword name is the correct one.
> 
> 
> OK then, we have a double problem.
> The first version gives the correct description
> but uses the wrong keyword.
> The second version gives the wrong description
> but uses the correct keyword.
> 
> So, how do we file a documentation bug?
> 
Go here:

http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse

then "Submit New", and set the "category" to "documentation".

Regards,
Todd

> Cheers,
> Alan Isaac


From pearu at scipy.org  Wed Oct 13 06:02:48 2004
From: pearu at scipy.org (Pearu Peterson)
Date: Wed Oct 13 06:02:48 2004
Subject: [Numpy-discussion] ANN: SciPy 0.3.2 Released
Message-ID: <Pine.LNX.4.61.0410130804100.11778@scipy.org>

Hi,

Scipy 0.3.2 has been released and binaries are available from the 
scipy.org site:

   http://www.scipy.org

Scipy 0.3.2 is a bug fix release of Scipy 0.3 including the following new 
features:

- wxPython 2.5 support
- reading/writing dense/sparse matrices in Matrix Market format
- iterative solvers, new functions sqrtm, hessenberg
- Constrained Optimization BY Linear Approximation
- discrete Boltzmann, Planck, Levy distributions
- Scipy tests pass now also on 64-bit systems and Mac OSX
etc.

The complete release notes can be found here:

   http://www.scipy.org/download/scipy_release_notes_0.3.2.html

Best regards,

Pearu

BTW Scipy is:
-------------
Scipy is an open source library of scientific tools for Python. Scipy 
supplements the popular Numeric module, gathering a variety of high level 
science and engineering modules together as a single package.

Scipy includes modules for graphics and plotting, optimization, 
integration, special functions, signal and image processing, genetic 
algorithms, ODE solvers, and others.


From jmiller at stsci.edu  Wed Oct 13 14:35:08 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Oct 13 14:35:08 2004
Subject: [Numpy-discussion] A disconnected numarray rant
In-Reply-To: <yfsu0t0b7lk.fsf@black4.ex.ac.uk>
References: <yfsu0t0b7lk.fsf@black4.ex.ac.uk>
Message-ID: <1097703239.631.923.camel@halloween.stsci.edu>

Hi Alexander,

Thanks for taking the time to provide us with feedback.  I've responded
to many of your points below.  [and in the interest of keeping the text
bloat down, I've interjected my own comments in brackets--Perry]

On Tue, 2004-10-12 at 05:37, Alexander Schmolck wrote: 
> Hi,
> 
> I'm taking a 1 month break from computers (i.e. I will be completely
> off-line), and I have to catch a train in an hour; but I've recently bitten
> the bullet and made a matrix class I've been using for some time work with
> numarray; I've written down a number of things that occurred to me while I was
> doing it, including some things which I think are bugs in numarray, so I
> thought at least posting the bugs would be a useful service; the rest is very
> raw and essentially unedited cut-and-paste of these notes -- sorry about that
> and I hope it doesn't contain anything particularly offensive.
> 
> P.S. just dumped the code for the matrix class (nummat) at
> http://www.dcs.ex.ac.uk/~aschmolc/Stuff/
> 
> 'as
> 
> The following are my notes:
> 
> 
> Things that fairly clearly seem to be bugs:
>     - numarray.Int32 etc. can't be pickled

Known limitation,  but OK.   Arrays can be pickled, as can Numeric
typecodes so I'm not sure how critical this omission is. 

>     - ``a = array(1+0j); a.imag = a.real * 10`` => IndexError
>     - array(0, type=Float64) + 1e3000  => `inf` with right error modes
>       but  array(0, type=Float32) + 1e3000 => `OverflowError`
>     - numarray.array(10)/numarray.array(0) => 0
>     - numarray.array(10000000000000L) => array(1316134912)
>     - numarray.where(0,1,0) => array([0])

There seems to be an infinity of rank-0 issues and so little
justification for having them that at one point we considered ripping
them out altogether.  Noted,  but low priority.

[Amen. If I had known the problems that rank-0 arrays would cause
I think I would have excluded them. I'm not sure I see the need for
them now that coercion rules have changed and there are helper functions
to change scalars into rank-1 len-1 arrays, which serve almost all other
purposes. I'm interested in seeing what real purpose they serve now (I
understand the backward compatibility issue, but backward compatibility is
not the be all and end all for numarray; more on that later)]
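
For what it's worth, the 1316134912 reported above is exactly what remains
when 10000000000000 is cut down to its low 32 bits, which suggests a silent
cast to Int32 rather than random garbage. A minimal sketch of that arithmetic
in plain Python, assuming two's-complement wraparound; wrap_int32 is a name
made up for this illustration and is not a numarray function:

```python
def wrap_int32(value):
    """Reduce an arbitrary Python int to a signed 32-bit two's-complement value."""
    low = value & 0xFFFFFFFF      # keep only the low 32 bits
    if low >= 0x80000000:         # top bit set: reinterpret as a negative number
        low -= 0x100000000
    return low


wrapped = wrap_int32(10000000000000)
print(wrapped)
```
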
>     - l = [1,2,3]; numarray.put(l,numarray.array([1,2,0]),[0,0,0]); l => [1, 2, 3]

Should raise a TypeError I guess. 
>       a = array([1,2,3]); numarray.put(a,numarray.array([1,2,0]),[0,0,0]); a => array([0, 0, 0])

I don't see what's wrong here. 
>     - repr(numarray.array([],typecode='i')) (etc. etc.) => "numarray.array([])"

Zero length arrays are rather like rank-0 arrays: low priority. 
Agreed... this is a small wart. 
>     - getattr(array([1,2,3]), '_aligned') => SystemError

Interesting.  I've been thinking about ripping out the _align and
_contiguous self-test hacks for a long time.  You've made up my mind. 
>     - obscure: numarray.where(0, matrix(568, convert_scalars=True),2) =>
>       ValueError (tries __len__ which fails, as len(array(568)) also fails)

I think this may boil down to "no where() for object arrays".
numarray.where() can't handle object arrays and there is no
numarray.objects.where().  Not implemented yet. 
> Numeric incompatibilities (that are either undocumented or bug-like)

The best Numeric compatibility in numarray comes from:

import numarray.numeric as Numeric

It's still not perfect,  but it is more compatible than ordinary
numarray. 
> - numarray.array('a', typecode='O') => TypeError (object arrays)
> - for extra fun try: numarray.array(1, type=numarray.Object) => RuntimeError,
>   something entirely different

Object arrays in numarray do not have the synergy they have in
Numeric.   In particular,  numarray.array() can't create them, only
numarray.objects.array(). [At the time we added object arrays, we 
noticed that they were not safe in Numeric; that is, Numeric was not
properly handling reference counts of objects in arrays for at least
some operations and it was possible to segfault object arrays. This may
have changed since then; we haven't had a chance to check the current
status. But the point is that handling object arrays safely is a lot
more than just loading them with object pointers. Any function that can
set values in arrays needs to handle their refcounts, and that isn't all
that trivial. We took a short cut of using a Python implementation for
object arrays that doesn't have all the old functionality, but also
didn't have the problems that they did at the time.] 
> - nonzero is completely incompatible

numarray.numeric covers this.

numarray's nonzero() is more powerful, capable of handling
multidimensional arrays,  so it returns a tuple of values rather than a
single value.  It's unfortunate that we chose to use the name nonzero()
for the "new" function;  it has the right interface and the wrong name.
Keep in mind though,  our compatibility goals have grown immensely since
we started. 
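
To illustrate the difference in return shape, here is a rough sketch using
plain nested lists instead of arrays. Both helper names are invented for this
example, and the 1-D version only approximates what Numeric's nonzero does:

```python
def nonzero_flat(seq):
    """Numeric-style result: one list of indices for a 1-D sequence."""
    return [i for i, x in enumerate(seq) if x != 0]


def nonzero_2d(rows):
    """numarray-style result for 2-D input: a tuple with one index list per axis."""
    row_idx, col_idx = [], []
    for r, row in enumerate(rows):
        for c, x in enumerate(row):
            if x != 0:
                row_idx.append(r)
                col_idx.append(c)
    return (row_idx, col_idx)


flat_result = nonzero_flat([0, 3, 0, 5])
grid_result = nonzero_2d([[0, 1], [2, 0]])
print(flat_result, grid_result)
```

The tuple form is what allows the result to be fed straight back in as a
multi-dimensional index.
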
> - shape(None) etc. no longer works (IMHO a bug)

This may be related to the object array synergy.   I think
numarray.asarray() is the problem here, since it doesn't know how to
create object arrays.
> - cross_correlate & average missing

I think cross_correlate is in numarray.convolve.correlate.  It was a
conscious choice not to put it in core numarray.  Average has never been
implemented and should be, especially since it has different semantics
than the mean() method. 
> - left_shift et al missing

These were renamed lshift and rshift.   Note that << works fine.
Synonyms should probably be added. 
> - numarray.sqrt(a,a) is None (*not* the result, as it used to be)

What do you want here?  What we have now is, IMO, correct. [Amen. This 
was intentionally changed from Numeric.] 
> - num.put(a, [0,1,2,3], [10,20]) style behavior seems unavailable
>   (without numarray.numeric)

I wasn't exactly sure what the expected behavior was for this,  but
guessed it was some kind of repeat.  If that's what the behavior was,
Perry and I don't really like it.  Besides,  numarray.numeric.put *is*
Numeric.put, modulo numarray underpinnings.

>   put(array([[ 0.,  1.,  2.], [ 3.,  4.,  5.]]), [1, 4], [10,40]) fails

numarray.put() does have different semantics for multi-dimensional
destinations... you need multi-dimensional indexes (i.e. a tuple of
index arrays).  Again,  there's now numarray.numeric.put().

> - boolean testing (not even bool(array(0)) works; I'm not sure this is good)

[I am. This was a clear and explicit decision to not replicate Numeric 
behavior. I'm convinced that it is the right decision. There is just too
much confusion about what the truth value of an array should be. Helper
functions should be used to make it unambiguous.] 
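
A small illustration of the ambiguity, using a plain list as a stand-in for
an array. The built-ins any() and all() play the role of the helper functions
here; which numarray helpers one would actually use is not spelled out in
this message:

```python
values = [0, 1, 0]

# Two equally defensible readings of "is this array true?"
any_true = any(v != 0 for v in values)   # at least one element is nonzero
all_true = all(v != 0 for v in values)   # every element is nonzero

print(any_true, all_true)
```

The two readings disagree for the same data, so the caller has to say which
one is meant.
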

> - Generally different handling of rank0-arrays; e.g. ``type(num.array(1.0) +
>   0) is float``; one potentially very nasty gotcha are inplace operations
>   (e.g. a**=2) which have totally different semantics for python scalars and
>   rank0 arrays, which, unlike Attribute errors on ``a.shape``, can lead to
>   nasty bugs in corner cases (e.g. when a reduction just infrequently yields
>   scalar ``a``) -- I think this should be mentioned in a gotchas section

We have areduce() for this case, which always returns an array. 
>   (another possible entry would be the need to use .copy() to **save** memory
>   on slicing and 1xN, Nx1 matrices versus vectors (people are not used to
>   thinking properly about rank from mathematical training or matlab
>   exposure)).

[You will need to elaborate about what you mean here. E.g., as to the 
first: I'm guessing you mean when a slice is taken and then the original
array is deleted. But it isn't clear.] 
> - asarray downcasts arrays (e.g.: asarray(array([1.,2.,3.]),'i'))

True enough.  Is there some reason why the method should silently
succeed (I know we wanted that) and the function should not? 
> - numarray.ones(-5) => MemoryError (ValueError would be nicer)

Easy to change. 
> - numarray.ones(2.0),

This fails, and that's fine by me.  The idea of floating point shapes
seems bogus. 
> numarray.ones([2])

AFAIK, this works, and should work. 
> fail (cf. numarray.range(2.0))

IMHO, arange() is a special case and not really equivalent to
numarray.ones(). 
>       b=num.array([[1,2,3,4],[5,6,7,8]]*2)
>       assert eq(num.diagonal(b), [1,6,3,8])
>       assert eq(num.diagonal(b, -1), [5,2,7])
>       c = num.array([b,b])
>       assert eq(num.diagonal(c,1), [[2,7,4], [2,7,4]])
> - no a.toscalar() !!!

a.toscalar() is written a[()] in numarray.
[This is one method that shouldn't be there IMO. What would people 
expect it to do for arrays with  len>1 ?]   
> - matrixmultiply in the docs

OK. 
> - what's the point of swapaxes (i.e. why not have a generalized in-place
>   transpose?)

It's a very common function in implementation of numarray/Numeric.
[In many cases it is far easier to use than a generalized transpose
(which does exist, but requires all axes to be explicitly given)] 
> - what's the point of innerproduct?

Compatibility. [For a while the flavor is: "dammit, why aren't you
compatible?" Now it's: "dammit, why are you compatible?"] 
> - indexing by a list is different from indexing by tuple (I haven't had time
>   to look closely at the docs whether that's intentional)

It's intentional.  Indexing by a list is "array" indexing.  Indexing by
a tuple is not.  Thus, a 3D array by [1,2,3] is pulling out 2D blocks,
while (1,2,3) is pulling out a single scalar. [In particular, tuples
have a special meaning for indexing; this distinction is unavoidable 
since it is a Python language issue.] 
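
The language-level distinction can be seen without numarray at all. A minimal
sketch, using a made-up IndexProbe class that simply reports what Python
hands to __getitem__:

```python
class IndexProbe:
    """Reports the key object that the indexing syntax passes to __getitem__."""

    def __getitem__(self, key):
        return type(key).__name__, key


probe = IndexProbe()
comma_form = probe[1, 2, 3]      # written as a[1, 2, 3]
tuple_form = probe[(1, 2, 3)]    # written as a[(1, 2, 3)]
list_form = probe[[1, 2, 3]]     # written as a[[1, 2, 3]]

print(comma_form, tuple_form, list_form)
```

The comma form and the explicit tuple arrive as the same object type, so an
array class cannot tell them apart; only the list is distinguishable.
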
> - doesn't know about Numeric's bizarre '\x0b' typecode

Me either.  Should we add this? [Not unless there is a good reason.
What's it for? Why are you using it (particularly since you called it
bizarre)?] 
> - numarray.sqrt.reduce([]) raises (sensibly) TypeError, not ValueError

Got lucky I guess. 
> - len(array(1)) or array(1)[0] won't work anymore (understandable, but
>   should be documented)

OK. 
> - (should maximum, minimum reduce to -inf and inf?)

Don't they? 
> - <built-in method reduce of _BinaryUFunc object at 0x82dfc9c> is not
>   a very helpful repr; should be possible to get to the ufunc itself

Doesn't this comment fly in the face of Python itself?
[I imagine it is possible, but why? repr(dir) doesn't give you a usable
function creator, nor does it work in Numeric.] 
> - as in Numeric numarray.maximum.reduce(numarray.array([0,-0.])) => -0.0

Talk about fine points...  noted.  I think the problem is that 0.0 ==
-0.0,  so there's no way for the reduction to get it right without
adding special code to look for this case, and that isn't gonna happen
without a strong case being made. [Again, a very good case needs to be
made for handling this. I doubt that it is important to many, and as
Todd mentions, not easy to handle.] 
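
A quick sketch of why a comparison-based reduction cannot settle this, in
plain Python floats rather than numarray. keep_left_max is an invented
stand-in for the inner step of a maximum reduction, not numarray's actual
code:

```python
import math
from functools import reduce


def sign_of_zero(x):
    """Return +1.0 or -1.0 depending on the sign bit, even for zeros."""
    return math.copysign(1.0, x)


def keep_left_max(a, b):
    """Keep the running value unless the new one compares strictly greater."""
    return b if b > a else a


zeros_equal = (0.0 == -0.0)                         # the comparison sees no difference
left_first = reduce(keep_left_max, [0.0, -0.0])     # running value starts at +0.0
right_first = reduce(keep_left_max, [-0.0, 0.0])    # running value starts at -0.0

print(zeros_equal, sign_of_zero(left_first), sign_of_zero(right_first))
```

Because the two zeros compare equal, the sign of the result depends only on
which one the reduction happened to see first.
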
> - __array__ protocol no longer supported (how can a non-derived class convert
>   itself efficiently to an array?)

Maybe an old-timer can explain how this worked for Numeric.  I think
this is only partially implemented in numarray and that maybe we need to
add a check for an __array__() method to numarray.array(). 
> Documentation Gotchas
> - p. 34 IMO row vector is used incorrectly; row and column vectors are really
>      matrices (i.e. have rank 2) so ``array([[1,2,3]])`` would be a row vector

Sounds reasonable. 
> - No proper explanation of differences between Numeric and numarray, or
>   numarray.numeric module differences to proper (e.g. argmin)

If there is,  I don't know where it is.  Noted,  but I'm not really an
encyclopedia of these facts myself. 
> - No migration and best-practice advice (e.g. there should be a standard way
>   for packages which work with both numarray and numeric as backends to let
>   the user choose his preference; how about setting an environment var NumPy
>   or something?)

We're just working this out ourselves. [Let me elaborate more. We 
haven't really had much experience yet porting tons of Numeric code (MA
is about the only example). We are working on scipy now so I expect that
in a few months we will know much better what the most important porting
issues are. At the moment, this is better documented by others.] 
> Waffle
[meaning?] 
> ------
> 
> - there *really* ought to be an array equality function (with optional
>   tolerance); it's quite difficult to get right for a normal user (nans;
>   zero-size arrays etc.) and it's often required, especially for testing

You're right.  Want to submit one? [Make sure it isn't dependent on the
underlying C compiler's libraries for testing floating point special 
values!] 
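
As a starting point for discussion, a rough sketch of such a function over
flat Python sequences. The name allclose_flat, the absolute tolerance, and
the nan_equal switch are all assumptions of this example; a real version
would work on arrays and compare shapes as well. NaN is detected with x != x
so nothing depends on the C library:

```python
def _is_nan(x):
    """NaN is the only float that does not compare equal to itself."""
    return x != x


def allclose_flat(a, b, tol=0.0, nan_equal=True):
    """Compare two flat sequences of floats element by element.

    Elements match if they are equal or differ by at most `tol`.
    Two NaNs match only when `nan_equal` is true.
    """
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if _is_nan(x) or _is_nan(y):
            if not (nan_equal and _is_nan(x) and _is_nan(y)):
                return False
        elif x != y and abs(x - y) > tol:
            # x != y is checked first so that inf == inf passes
            # without evaluating inf - inf (which is NaN).
            return False
    return True


print(allclose_flat([], []))
print(allclose_flat([1.0, float("nan")], [1.0, float("nan")]))
```

Zero-size inputs fall out naturally: with nothing to compare, two empty
sequences are equal.
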
> - rank preserving reduction as an option would be nice -- e.g. to
>   subtract out or divide by the reduced portion (which currently won't
>   work e.g. for columns without adding a unit-dimension by hand).

Sounds like an interesting idea,  but also method bloat. 
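
For concreteness, a minimal sketch of what a rank-preserving reduction buys,
using nested lists in place of 2-D arrays. Both function names are
hypothetical:

```python
def sum_columns_keep_rank(rows):
    """Sum over axis 0 of a 2-D nested list, returning a 1xN nested list."""
    return [[sum(col) for col in zip(*rows)]]


def sum_rows_keep_rank(rows):
    """Sum over axis 1 of a 2-D nested list, returning an Nx1 nested list."""
    return [[sum(row)] for row in rows]


data = [[1.0, 2.0], [3.0, 4.0]]
column_totals = sum_columns_keep_rank(data)   # shape 1x2
row_totals = sum_rows_keep_rank(data)         # shape 2x1

# Because row_totals kept its unit dimension, it lines up with data row by row.
normalized = [[x / total[0] for x in row] for row, total in zip(data, row_totals)]
print(column_totals, row_totals, normalized)
```
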
> Design
> 
>   The (AFAICS) benefit-free but downside-rich introduction of `type`
>   ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
> 
>   Is there any reason that Typecode objects that compare as desired to the
>   relevant strings ("i", "d") wouldn't have done? Now there is an explosion
>   and confusion of interfaces -- some numpy code will now only accept
>   type(code)s as "typecode" keyword parameter (even in numarray! see
>   numarray.mlab!) and other stuff
> 
>   Never mind that type already is a highly overused word in the python world.

Personally,  I like type because it's succinct and we have type objects,
not single character codes.  More importantly,  Perry likes type, and
the bottom line is that it's his shot to call and he's called it.
[We wrestled with this a while. Given that the representation of the
type had changed from a character code, typecode is clearly misleading
and inappropriate. It is there only for backward compatibility; for new
code to be used under numarray only, people shouldn't use it. Type
certainly seemed by far the most descriptive and accurate term. It does
have the drawback of overloading the type function. Other considerations
were things like atype, but type is what we went with.] 
>   The big method bloat.
>   '''''''''''''''''''''
> 
>   As it says in the Numeric manual introductions there were "good reasons" for

I actually don't buy the reasons myself.  Some methods are natural,
convenient, and good so I need to hear more voices arguing this point
before I'll budge.  Clearly there is *some* bloat,  but identifying what
to ax is more difficult.  I suppose we could do a vote to clean this up.
>   "very few array methods" -- now there are **56** public methods and 8 public
>   attributes (public == not starting with '_'); of those 56 methods about 11
>   are accessors and of the rest about half are redundant or worse (i.e. they
>   either also exist as numarray functions (argmin, argmax, diagonal, ...) or

Which of the public attributes do you have a problem with?

Which accessors? 
>   they really ought to be functions (mean, stddev) or they are quite confusing

The need for these is common so I thought it would be good to add them.
Functions could be added as well. 
>   (``a.min``, ``a.max``

These require tricks to get right so we added them.  The doc-strings
explain what they do. 
> which behave quite differently from ``a.argmin`` and
>   ``a.argmax``,

Good point.  These are inconsistent with min and max, which were added
independently at a later date.  I'm thinking we should deprecate the
argmin and argmax methods,  which I added hoping to do polymorphism for
strings and records and if I recall correctly never did anyway.

IMHO,  min(), max(), mean(), and stddev() are simple, useful, and should
remain. 
> never mind ``numarray.minimum``) or

min != minimum, and because it is a little tricky to get right, we
codified it as a method. 
> simply utterly pointless
>   (``a.nelements`` == ``a.size``)).

I added nelements() because I needed it and didn't know about
a.size()... simple as that.  a.size() came later for compatibility
only. [I'll argue that nelements is far clearer in meaning. What
does size mean? Total bytes? Total number of elements? Sorry,
I disagree on this one.] 
> If there really is a need for these (there isn't) if anything a.min and a.max
>     should be called a.flatmin, a.flatmax

flatmin is certainly clear,  but the min/max docstrings also explain it
with no fuzz. 
>   - diagonal, mean, nelements, nonzero, ...

nonzero() and diagonal() I couldn't care less about, so they
can probably be deprecated and removed.  I like mean(). 
>   - perversely the **only** function that I can think of that could have
>     sensibly become a method hasn't: ``put`` (it used to work only on arrays
>     under Numeric and not without reason, so making it a method would have
>     been sensible; numarray.put of course also "works" on non-arrays, it just
>     doesn't do anything with them)

Well,  we need the numarray.put() function for compatibility, and
there's already a more succinct syntax for put(), which is array based
indexing so I don't see any point in adding a put() method. 
>   Test Code
>   '''''''''
>   numtest.py doesn't inspire full confidence (it's about 1000 lines of actual
>   code but it doesn't seem that clearly structured and AFAICT contains no
>   single loop (and that despite the diversity of shapes, types etc. that exist
>   in numarray -- why not try something slightly more systematic?))

Testing could certainly be better.  unittest might work better for this
kind of thing than doctest.  I agree that we should test for a wider
variety of shapes, types, sizes, and behaviors but it takes time and
effort to do it so it hasn't been done yet.  There's little doubt we'd
find bugs and the system would be better for it. [On the other hand,
is it the most important thing to do next? Any volunteers to improve the
test suite? It may not be the most complete and systematic one out 
there, but it's at least as good as the one for Numeric ;-)]

There's a lot of input here.  We'll see what we can do.  Thanks again.

Regards,
Todd

[A few more editorial comments. When we started numarray, compatibility 
was not high on the list of priorities, so the initial implementation
didn't focus on it. A number of the problems you point out reflect that
origin. While it is more important now, it isn't the only guide. We seek
compatibility when there is no strong reason to be incompatible. But
there are a number of issues where we definitely wanted different
behavior (if it were to be completely compatible, we wouldn't have
bothered in the first place; we needed some changes).

Given the odd corners you've run into, it makes me curious to see the
code that generated this; particularly with regard to rank-0 arrays.
If I get a chance I'll take a look at the link you provided.
I wonder if it is typical of what other users will encounter or not.
I guess our experience in porting scipy will give us a better indication.

To summarize what we see as work that should be done to address the
points made:

rank-0 issues:
1) a.imag doesn't work
2) array(0, type=Float64) + 1e3000  => `inf` with right error modes
      but  array(0, type=Float32) + 1e3000 => `OverflowError`
3) numarray.array(10)/numarray.array(0) => 0
4) numarray.array(10000000000000L) => array(1316134912)
5) numarray.where(0,1,0) => array([0])
6) documentation of behavior (how to turn into scalar, that len and [0]
      indexing doesn't work, etc.)

Others

1) puts into lists should raise TypeError
   l = [1,2,3]; numarray.put(l,numarray.array([1,2,0]),[0,0,0]); l => [1, 2, 3]
2) repr for zero length arrays needs to show type and other info.
3) rip out _align and _contiguous self-test hacks
4) improved object array handling (e.g., where and the like)
5) average function
6) change MemoryError to ValueError for ones(-5)
7) document matrixmultiply
8) support for __array__ protocol?
9) Documentation fix for p34 row vector usage.
10) Numeric to numarray conversion guide
11) Better tests

Most of these are not likely to get immediate attention as our focus 
now is on integrating scipy. To the extent they make it easier to do,
their priority may be raised. There are a lot of "should"s but we have
limited resources just like anyone else; we can't do it all at once.]


From jmiller at stsci.edu  Thu Oct 14 06:11:22 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Oct 14 06:11:22 2004
Subject: [Numpy-discussion] character arrays supported by C API?
In-Reply-To: <Pine.LNX.4.61.0410140405380.2667@Chrestomanci>
References: <Pine.LNX.4.61.0409150021150.20224@Chrestomanci>
	 <1095253587.4624.380.camel@halloween.stsci.edu>
	 <Pine.LNX.4.61.0410140405380.2667@Chrestomanci>
Message-ID: <1097759076.4219.39.camel@halloween.stsci.edu>

On Thu, 2004-10-14 at 04:20, Faheem Mitha wrote:
> On Wed, 15 Sep 2004, Todd Miller wrote:
> 
> > On Wed, 2004-09-15 at 00:52, Faheem Mitha wrote:
> >> Dear People,
> >>
> >> Are character arrays supported by the Numarray C API? My impression from
> >> the documentation is no, but I would appreciate a confirmation. Thanks.
> >>
> >>                                                                 Faheem.
> >
> > Yes and no.  CharArray is not as well supported from C as NumArray;
> > there are no easy to call functions which will convert a nested sequence
> > of strings into a CharArray.
> >
> > However,  it is possible to call the Python functions in the CharArray
> > module from C,  and a pre-existing CharArray is a PyArrayObject so it
> > can be manipulated in C as a struct;  its shape and strides are
> > visible,  its itemsize is the length of the string, etc.
> >
> > What is it you want to do?   What functions do you think would help?
> 
> Hi. Sorry about the slow reply.
> 
> What I want to do is extremely simple. I want to convert (in C++) a C++ 
> character array to a CharArray. The simplest way of doing this would be to 
> create an array of the appropriate size, and write character strings into 
> it element by element.
> 
> So, a utility function which creates a character array of appropriate 
> dimensions would be useful. A utility function which converts a list 
> of strings into a Character Array would also be desirable.
> 
> Currently I am having to work around this limitation by returning lists of 
> strings back to Python. I'd prefer to not have to do that.

That's a sensible addition,  but right now,  such a function does not
exist, and I don't have time to add it myself.  The way to achieve this
without C-API support by CharArray is to do a Python callback.  The
steps in C would be roughly:

0. Import the numarray.strings module.  PyImport_ImportModule().

1. Get the module's dictionary object.  PyModule_GetDict().

2. Get a pointer to CharArray by looking it up in the dictionary.  
PyDict_GetItemString().

3. Construct an argument tuple which contains the constructor
parameters.  Py_BuildValue().

4. Call the constructor using the arg tuple.  The return value is the
CharArray.  PyObject_CallFunction().

Similar steps are done for NumArray in the current C-API in newarray.ch
in NA_NewAllFromBuffer().  
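
The same sequence is easier to read at the Python level first. The sketch
below mirrors the steps one for one; since numarray may not be installed, the
standard library array module and its array type stand in for
numarray.strings and CharArray, and the constructor arguments are
placeholders rather than CharArray's real signature:

```python
import importlib

# Stand-ins (assumptions of this sketch): the stdlib "array" module and its
# "array" type play the roles of numarray.strings and CharArray.
MODULE_NAME = "array"
CTOR_NAME = "array"

module = importlib.import_module(MODULE_NAME)   # step 0: PyImport_ImportModule()
namespace = vars(module)                        # step 1: PyModule_GetDict()
constructor = namespace[CTOR_NAME]              # step 2: PyDict_GetItemString()
args = ("b", b"abc")                            # step 3: Py_BuildValue()
result = constructor(*args)                     # step 4: call with the arg tuple

print(result)
```

In C, each of these calls returns a pointer that has to be checked for NULL,
and the reference counting rules differ per call, so the real code is
somewhat longer than the five lines suggest.
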

Regards,
Todd


From akulla at comcast.net  Thu Oct 14 06:44:20 2004
From: akulla at comcast.net (akulla at comcast.net)
Date: Thu Oct 14 06:44:20 2004
Subject: [Numpy-discussion] Slow operation of nd_image.generic_filter
Message-ID: <101420041338.1510.416E8157000325EE000005E622007456720E04049A050E@comcast.net>

Hi all,

Is it expected that the following function takes more than 25 seconds to run on an array of shape (256, 480)? 

...
def myFunc(anArray, winSize=5):
    return numarray.nd_image.generic_filter(\
                input=anArray,
                function=lambda win: win.mean(),
                size=winSize,
                mode='constant')
...

Python 2.3, numarray 1.0 (XP, P4)

Regards,
Alban


From falted at pytables.org  Fri Oct 15 04:27:55 2004
From: falted at pytables.org (Francesc Alted)
Date: Fri Oct 15 04:27:55 2004
Subject: [Numpy-discussion] numarray and ATLAS
Message-ID: <200410151318.40035.falted@pytables.org>

Hi,

Perhaps this subject comes up too often, but I'm having problems
making numarray use ATLAS instead of the included mini-lapack.

I've installed ATLAS 3.6.0 on my pentium IV machine. I've made it a
completely featured LAPACK by following the instructions in:

http://math-atlas.sourceforge.net/errata.html#completelp

and I'm pretty sure that the resulting library works. Now, after exporting
USE_LAPACK and setting the appropriate directory for lapack_dirs in addons.py,
the compilation went well (however, I can see that lapack_litemodule.c is
still being compiled, and I don't know if that's normal or not). The command
I've used to install is:

$ python setup.py install --gencode --home=/users/exp/alted/bin-i686

And the error that happens during the test phase follows:

$ python
Python 2.3.4 (#1, Jul 22 2004, 20:47:54)
[GCC 3.3.2 20031022 (Red Hat Linux 3.3.2-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numarray.testall as testall
>>> testall.test()
numarray:                               ((0, 1199), (0, 1199))
numarray.records:                       (0, 48)
numarray.strings:                       (0, 176)
numarray.memmap:                        (0, 82)
numarray.objects:                       (0, 105)
numarray.memorytest:                    (0, 16)
numarray.examples.convolve:             ((0, 20), (0, 20), (0, 20), (0, 20))
numarray.convolve:                      (0, 52)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/users/exp/alted/bin-i686/lib/python/numarray/testall.py", line 24, in test
    result = eval(p+".test()")
  File "<string>", line 0, in ?
  File "/users/exp/alted/bin-i686/lib/python/numarray/fft/FFT.py", line 326, in test
    import dtest
  File "/users/exp/alted/bin-i686/lib/python/numarray/fft/dtest.py", line 238, in ?
    import numarray.random_array as random_array
  File "/users/exp/alted/bin-i686/lib/python/numarray/random_array/__init__.py", line 7, in ?
    from RandomArray2 import *
  File "/users/exp/alted/bin-i686/lib/python/numarray/random_array/RandomArray2.py", line 3, in ?
    import numarray.linear_algebra as linalg
  File "/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/__init__.py", line 1, in ?
    from LinearAlgebra2 import *
  File "/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/LinearAlgebra2.py", line 23, in ?
    import lapack_lite2
ImportError:
/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/lapack_lite2.so:
undefined symbol: dgesdd_

I've checked that a dgesdd symbol exists in my liblapack.a:

$ strings ~/bin-i686/lib/atlas/liblapack.a | grep dgesdd
dgesdd.o/       1097832195  2514  515   100644  13788     `

but not a dgesdd_, as you can see. 

Am I missing something?

-- 
Francesc Alted


From falted at pytables.org  Fri Oct 15 10:07:40 2004
From: falted at pytables.org (Francesc Alted)
Date: Fri Oct 15 10:07:40 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <200410151318.40035.falted@pytables.org>
References: <200410151318.40035.falted@pytables.org>
Message-ID: <200410151903.41288.falted@pytables.org>

Hi,

Despite the fact that some errors arise, I've benchmarked the numarray build
linked against ATLAS, and it doesn't seem to get the expected ATLAS
boost:

>>> import timeit
>>> t1 = timeit.Timer("m3=numarray.dot(m1,m2)", "import numarray;dim1=500;m1=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32);m2=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32)")
>>> t1.repeat(3,10)
[3.7274820804595947, 3.8542821407318115, 3.7117569446563721]

However, Numeric seems to get it:

>>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');Numeric.reshape(m2,(dim1,dim1))")
>>> t3.repeat(3,10)
[0.0093162059783935547, 0.0096318721771240234, 0.0092968940734863281]

i.e. roughly 400 times faster than numarray

Is anyone getting the acceleration boost with numarray & ATLAS?

Cheers,

On Friday 15 October 2004 13:18, Francesc Alted wrote:
> Hi,
> 
> Perhaps this subject comes up too often, but I'm having problems
> making numarray use ATLAS instead of the included mini-lapack.
> 
> I've installed ATLAS 3.6.0 on my pentium IV machine. I've made it a
> completely featured LAPACK by following the instructions in:
> 
> http://math-atlas.sourceforge.net/errata.html#completelp
> 
> and I'm pretty sure that the resulting library works. Now, after exporting
> USE_LAPACK and setting the appropriate directory for lapack_dirs in addons.py,
> the compilation went well (however, I can see that lapack_litemodule.c is
> still being compiled, and I don't know if that's normal or not). The command
> I've used to install is:
> 
> $ python setup.py install --gencode --home=/users/exp/alted/bin-i686
> 
> And the error that happens during the test phase follows:
> 
> $ python
> Python 2.3.4 (#1, Jul 22 2004, 20:47:54)
> [GCC 3.3.2 20031022 (Red Hat Linux 3.3.2-1)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numarray.testall as testall
> >>> testall.test()
> numarray:                               ((0, 1199), (0, 1199))
> numarray.records:                       (0, 48)
> numarray.strings:                       (0, 176)
> numarray.memmap:                        (0, 82)
> numarray.objects:                       (0, 105)
> numarray.memorytest:                    (0, 16)
> numarray.examples.convolve:             ((0, 20), (0, 20), (0, 20), (0, 20))
> numarray.convolve:                      (0, 52)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "/users/exp/alted/bin-i686/lib/python/numarray/testall.py", line 24, in test
>     result = eval(p+".test()")
>   File "<string>", line 0, in ?
>   File "/users/exp/alted/bin-i686/lib/python/numarray/fft/FFT.py", line 326, in test
>     import dtest
>   File "/users/exp/alted/bin-i686/lib/python/numarray/fft/dtest.py", line 238, in ?
>     import numarray.random_array as random_array
>   File "/users/exp/alted/bin-i686/lib/python/numarray/random_array/__init__.py", line 7, in ?
>     from RandomArray2 import *
>   File "/users/exp/alted/bin-i686/lib/python/numarray/random_array/RandomArray2.py", line 3, in ?
>     import numarray.linear_algebra as linalg
>   File "/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/__init__.py", line 1, in ?
>     from LinearAlgebra2 import *
>   File "/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/LinearAlgebra2.py", line 23, in ?
>     import lapack_lite2
> ImportError:
> /users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/lapack_lite2.so:
> undefined symbol: dgesdd_
> 
> I've checked that dgesdd symbol exists on my liblapack.a:
> 
> $ strings ~/bin-i686/lib/atlas/liblapack.a | grep dgesdd
> dgesdd.o/       1097832195  2514  515   100644  13788     `
> 
> but not a dgesdd_, as you can see. 
> 
> Am I missing something?
> 

-- 
Francesc Alted


From dd55 at cornell.edu  Fri Oct 15 14:18:41 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Fri Oct 15 14:18:41 2004
Subject: [Numpy-discussion] how to deal with large arrays
Message-ID: <200410151714.38492.dd55@cornell.edu>

Hello,

I have two 2D arrays, Q is 3-by-q and R is r-by-3. At each q, I need to sum 
q(r) over R, so I take a dot product RQ and then sum along one axis to get a 
1-by-q result.

I'm doing this with dot products because it is much faster than the equivalent 
for or while loop. The intermediate r-by-q array can get very large though 
(200MB in my case), so I was wondering if there is a better way to go about 
it?

If not, I can slice up R and deal with it one chunk at a time, then the 
intermediate arrays fit within the available system resources. Would somebody 
offer a suggestion of how to do this intelligently? Should the intermediate 
array be about the size of the processor cache, some fraction of the 
available memory, or is there something else I need to consider?
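
As a rough sketch of the slicing idea, the row ranges for each chunk of R can
be derived from a memory budget. The helper name chunk_bounds and the 16 MB
budget are illustrative assumptions, not part of numarray, and the example
does not settle which budget is best for a given machine. The sizes are the
r=2500, q=10000 case mentioned later in this thread:

```python
def chunk_bounds(n_rows, n_cols, itemsize, budget_bytes):
    """Split n_rows into (start, stop) ranges whose n_cols-wide blocks fit the budget."""
    # Number of full rows of the intermediate that fit in the budget (at least one).
    rows_per_chunk = max(1, budget_bytes // (n_cols * itemsize))
    return [(start, min(start + rows_per_chunk, n_rows))
            for start in range(0, n_rows, rows_per_chunk)]


# r = 2500 rows of R, q = 10000 columns of Q, 8-byte floats:
# the full r-by-q intermediate is 2500 * 10000 * 8 bytes = 200 MB.
bounds = chunk_bounds(2500, 10000, 8, 16 * 2**20)  # hypothetical 16 MB budget
print(len(bounds), bounds[0], bounds[-1])
```

Each (start, stop) pair would then select the rows R[start:stop] for one
partial dot product, with the partial sums accumulated into the 1-by-q result.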

Thank you,
Darren


From tim.hochberg at cox.net  Fri Oct 15 15:11:05 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Fri Oct 15 15:11:05 2004
Subject: [Numpy-discussion] how to deal with large arrays
In-Reply-To: <200410151714.38492.dd55@cornell.edu>
References: <200410151714.38492.dd55@cornell.edu>
Message-ID: <41704A3C.5080802@cox.net>

Darren Dale wrote:

>Hello,
>
>I have two 2D arrays, Q is 3-by-q and R is r-by-3. At each q, I need to sum 
>q(r) over R, so I take a dot product RQ and then sum along one axis to get a 
>1-by-q result.
>
>I'm doing this with dot products because it is much faster than the equivalent 
>for or while loop. The intermediate r-by-q array can get very large though 
>(200MB in my case), so I was wondering if there is a better way to go about 
>it?
>  
>
I think so. I believe you are doing something like this:

   result_1 = na.sum(na.dot(R,Q), 0)

I'm fairly certain (but I urge you to double check), that this reduces to:

    result_2 = na.dot(na.sum(R, 0), Q)

which will take up much less intermediate storage and be faster to boot. 
In more quasi-mathematical notation:

   result_1 => sum_i sum_j R_ij Q_jk = sum_j sum_i R_ij Q_jk = sum_j Q_jk (sum_i R_ij) => result_2

A quick test seems to confirm this:

import numarray as na
from numarray import random_array

q = 10
r = 12

R = random_array.random((r,3))
Q = random_array.random((3,q))

x1 = na.sum(na.dot(R,Q), 0)
x2 = na.dot(na.sum(R, 0), Q)

print na.allclose(x1, x2)


-tim


>If not, I can slice up R and deal with it one chunk at a time, then the 
>intermediate arrays fit within the available system resources. Would somebody 
>offer a suggestion of how to do this intelligently? Should the intermediate 
>array be about the size of the processor cache, some fraction of the 
>available memory, or is there something else I need to consider?
>
>Thank you,
>Darren


From dd55 at cornell.edu  Fri Oct 15 16:29:03 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Fri Oct 15 16:29:03 2004
Subject: [Numpy-discussion] how to deal with large arrays
In-Reply-To: <41704A3C.5080802@cox.net>
References: <200410151714.38492.dd55@cornell.edu> <41704A3C.5080802@cox.net>
Message-ID: <200410151927.54005.dd55@cornell.edu>

Thank you for your response, Tim,

On Friday 15 October 2004 06:07 pm, Tim Hochberg wrote:
> Darren Dale wrote:
> >Hello,
> >
> >I have two 2D arrays, Q is 3-by-q and R is r-by-3. At each q, I need to
> > sum q(r) over R, so I take a dot product RQ and then sum along one axis
> > to get a 1-by-q result.
> >
> >I'm doing this with dot products because it is much faster than the
> > equivalent for or while loop. The intermediate r-by-q array can get very
> > large though (200MB in my case), so I was wondering if there is a better
> > way to go about it?
>
> I'm fairly certain (but I urge you to double check), that this reduces to:
>
>     result_2 = na.dot(na.sum(R, 0), Q)
>

Yes. As usual, I left out a bit of information that turned out to be 
important. See below.

A modified test:

from numarray import *
from numarray import random_array

q = 10
r = 12

R = random_array.random((r,3))
Q = random_array.random((3,q))

x1 = sum( exp(1j*dot(R,Q)), 0) #note complex argument to exp()
x2 = exp(1j*dot(sum(R, 0), Q))

print allclose(x1, x2)
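
A minimal plain-Python sketch of why the two expressions differ once exp() is
involved: exp() is not linear, so the sum of the exponentials is not the
exponential of the sum. The three phase values are made-up stand-ins for the
entries of one column of dot(R,Q):

```python
import cmath

phases = [0.3, 1.1, 2.5]  # made-up stand-ins for one column of dot(R, Q)

sum_of_exps = sum(cmath.exp(1j * p) for p in phases)  # what x1 computes per column
exp_of_sum = cmath.exp(1j * sum(phases))              # what x2 computes per column

print(abs(exp_of_sum))   # a single unit phasor, so the magnitude is 1 up to rounding
print(abs(sum_of_exps))  # magnitude can be anywhere between 0 and len(phases)
print(abs(sum_of_exps - exp_of_sum) > 1e-6)
```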

The complex exponential changes things: exp() is nonlinear, so the sum can no 
longer be moved inside it. I am still learning how to keep my code efficient. 
The following loop over the columns of Q is actually almost as fast as using 
the large dot product:

phase = zeros(len(Q[0]), type=Complex64) # the sums are complex, so 'd' is too narrow
for i in range(len(Q[0])):
    phase[i] = sum(exp(1j*dot(R,Q[:,i])), 0) # one column of Q at a time

If q=1000 and r=2500, the for loop takes about 13% longer than the dot product 
method. Incredibly, if q=10,000 and r=2500, the for loop is 17% faster. So I 
am going to use it instead. Apparently I had some other time sink in my 
original test.

from numarray import *
from numarray import random_array
from time import clock

q = 10000
r = 2500

R = random_array.random((r,3))
Q = random_array.random((3,q))

t0 = clock()
x1 = sum(exp(1j*dot(R,Q)), 0) #note complex argument to exp()
t1 = clock()
dt1 = t1-t0

phase = zeros(len(Q[0]), type=Complex64) # the sums are complex, so 'd' is too narrow
for i in range(len(Q[0])):
    phase[i] = sum(exp(1j*dot(R,Q[:,i])), 0) # one column of Q at a time
    
t2 = clock()
dt2 = t2-t1

print (dt2-dt1)/dt1
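
For readers without numarray installed, the two strategies being timed can be
sketched in plain Python. This only shows that both give the same answer and
how much intermediate data each one holds; the function names are
hypothetical, and pure-Python timings would say nothing about numarray's
performance:

```python
import cmath
import random


def dot_row(row, col):
    """Inner product of one row of R with one column of Q."""
    return sum(a * b for a, b in zip(row, col))


def full_intermediate(R, Q_cols):
    # Builds the whole r-by-q table of phasors first, then sums over rows.
    table = [[cmath.exp(1j * dot_row(row, col)) for col in Q_cols] for row in R]
    return [sum(table[i][k] for i in range(len(R))) for k in range(len(Q_cols))]


def column_at_a_time(R, Q_cols):
    # Consumes the phasors for one column of Q as they are produced.
    return [sum(cmath.exp(1j * dot_row(row, col)) for row in R) for col in Q_cols]


rng = random.Random(0)
r, q = 12, 10
R = [[rng.random() for _ in range(3)] for _ in range(r)]
Q_cols = [[rng.random() for _ in range(3)] for _ in range(q)]  # columns of Q

x_full = full_intermediate(R, Q_cols)
x_cols = column_at_a_time(R, Q_cols)
print(all(abs(a - b) < 1e-9 for a, b in zip(x_full, x_cols)))
```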

-- 

Darren


From falted at pytables.org  Sat Oct 16 04:29:02 2004
From: falted at pytables.org (Francesc Alted)
Date: Sat Oct 16 04:29:02 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <200410151903.41288.falted@pytables.org>
References: <200410151318.40035.falted@pytables.org> <200410151903.41288.falted@pytables.org>
Message-ID: <200410161327.47485.falted@pytables.org>

On Friday 15 October 2004 19:03, Francesc Alted wrote:
> >>> import timeit
> >>> t1 = timeit.Timer("m3=numarray.dot(m1,m2)", "import numarray;dim1=500;m1=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32);m2=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32)")
> >>> t1.repeat(3,10)
> [3.7274820804595947, 3.8542821407318115, 3.7117569446563721]
> 
> However, Numeric seems to get it:
> 
> >>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');Numeric.reshape(m2,(dim1,dim1))")
> >>> t3.repeat(3,10)
> [0.0093162059783935547, 0.0096318721771240234, 0.0092968940734863281]
> 
> i.e. almost 300 faster than numarray

Oops! The Numeric test had a bug in it: the results of Numeric.reshape() were
discarded, so dot() was applied to 1-D arrays. The correct test would be:

>>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');m1=Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');m2=Numeric.reshape(m2,(dim1,dim1))")
>>> t3.repeat(3,10)
[0.47363090515136719, 0.47403502464294434, 0.47770595550537109]

which is 8 times faster, more or less, than numarray (or Numeric) without
ATLAS.

Just to clarify things ;)
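
As a quick check of the "8 times" figure, the ratio can be computed from the
timings posted above. Taking the minimum of each repeat() list is a common
convention with timeit, used here as an assumption about how best to compare
the runs:

```python
# Timings copied from the t1.repeat(3,10) and corrected t3.repeat(3,10) output above.
numarray_times = [3.7274820804595947, 3.8542821407318115, 3.7117569446563721]
numeric_atlas_times = [0.47363090515136719, 0.47403502464294434, 0.47770595550537109]

# The fastest run of each is the one least disturbed by other activity on the machine.
speedup = min(numarray_times) / min(numeric_atlas_times)
print(round(speedup, 1))  # 7.8
```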

-- 
Francesc Alted


From aisaac at american.edu  Sat Oct 16 15:53:01 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Sat Oct 16 15:53:01 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <1097609991.30171.556.camel@halloween.stsci.edu>
References: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu><1097550159.2568.5.camel@localhost.localdomain><Mahogany-0.66.0-1780-20041012-100006.00@american.edu><1097595580.24491.4.camel@freyer.sfo.csun.edu><Mahogany-0.66.0-1780-20041012-152445.00@american.edu><1097609991.30171.556.camel@halloween.stsci.edu>
Message-ID: <Mahogany-0.66.0-1432-20041016-185312.00@american.edu>

On 12 Oct 2004, Todd Miller apparently wrote:
> Go here:
> http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse
> then "Submit New", and set the "category" to "documentation".


Done.

Thanks,
Alan Isaac


From aisaac at american.edu  Sat Oct 16 15:53:02 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Sat Oct 16 15:53:02 2004
Subject: [Numpy-discussion] matrixmultiply: return type
Message-ID: <Mahogany-0.66.0-1432-20041016-185314.00@american.edu>

Being new to numerical Python applications,
I was a little puzzled/concerned when I read
http://sourceforge.net/tracker/index.php?func=detail&aid=984368&group_id=1369&atid=450446
I *think* the answer is:
matrixmultiply will always return an array.

Is there a stable view about what type of object will be
returned by matrixmultiply?  Currently, to my initial
surprise, it returns an array when the arguments are
matrices.  Is this stable?

Might an optional argument to specify the return type
be desirable?

Thank you,
Alan Isaac


From jmiller at stsci.edu  Sat Oct 16 18:27:04 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Sat Oct 16 18:27:04 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <Mahogany-0.66.0-1432-20041016-185312.00@american.edu>
References: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu>
	 <1097550159.2568.5.camel@localhost.localdomain>
	 <Mahogany-0.66.0-1780-20041012-100006.00@american.edu>
	 <1097595580.24491.4.camel@freyer.sfo.csun.edu>
	 <Mahogany-0.66.0-1780-20041012-152445.00@american.edu>
	 <1097609991.30171.556.camel@halloween.stsci.edu>
	 <Mahogany-0.66.0-1432-20041016-185312.00@american.edu>
Message-ID: <1097976412.3744.159.camel@localhost.localdomain>

On Sat, 2004-10-16 at 17:17, Alan G Isaac wrote:
> On 12 Oct 2004, Todd Miller apparently wrote:
> > Go here:
> > http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse
> > then "Submit New", and set the "category" to "documentation".
> 
> 
> Done.
> 
> Thanks,
> Alan Isaac

As it turns out,  I misdirected you.  The above link is for numarray
bugs.  This link is for Numeric bugs:

http://sourceforge.net/tracker/?group_id=1369&atid=101369

I moved the diagonal doc bug report to the Numeric bugs tracker.

Regards,
Todd


From jmiller at stsci.edu  Sat Oct 16 18:50:04 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Sat Oct 16 18:50:04 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <200410161327.47485.falted@pytables.org>
References: <200410151318.40035.falted@pytables.org>
	 <200410151903.41288.falted@pytables.org>
	 <200410161327.47485.falted@pytables.org>
Message-ID: <1097977801.3744.184.camel@localhost.localdomain>

On Sat, 2004-10-16 at 07:27, Francesc Alted wrote:
> On Friday 15 October 2004 19:03, Francesc Alted wrote:
> > >>> import timeit
> > >>> t1 = timeit.Timer("m3=numarray.dot(m1,m2)", "import numarray;dim1=500;m1=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32);m2=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32)")
> > >>> t1.repeat(3,10)
> > [3.7274820804595947, 3.8542821407318115, 3.7117569446563721]
> > 
> > However, Numeric seems to get it:
> > 
> > >>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');Numeric.reshape(m2,(dim1,dim1))")
> > >>> t3.repeat(3,10)
> > [0.0093162059783935547, 0.0096318721771240234, 0.0092968940734863281]
> > 
> > i.e. almost 300 faster than numarray
> 
> Ooops! The Numeric test had a bug on it. The correct test would be:
> 
> >>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');m1=Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');m2=Numeric.reshape(m2,(dim1,dim1))")
> >>> t3.repeat(3,10)
> [0.47363090515136719, 0.47403502464294434, 0.47770595550537109]
> 
> which is 8 times faster, more or less, than numarray (or Numeric) without
> ATLAS.
> 
> Just to clarify things ;)

Hi Francesc,

I don't think numarray dot() will pick up any boost at all from ATLAS
because it's not written to do it.   Besides that,  there are two
performance problems I know of with numarray's dot() which may dominate
or dilute any ATLAS benefits:

1. dot() requires array creation.

2. dot() requires array copies.

Because it has a class hierarchy and a memory buffer object,  numarray
is at a disadvantage for (1).  (2) just hasn't been optimized yet for
noncontiguous arrays which (I think) are always present when dot()
starts with two contiguous array parameters.

Regards,
Todd


From stephen.walton at csun.edu  Sun Oct 17 17:35:03 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Sun Oct 17 17:35:03 2004
Subject: [Numpy-discussion] New LAPACK and ScaLAPACK planned
Message-ID: <1098059497.5110.5.camel@localhost.localdomain>

From volume 4 #37 of the NA-Digest mailing list.  I hope this is of
enough interest to this list to justify the cross post.

From dongarra at cs.utk.edu  Fri Oct 15 04:10:44 2004
From: dongarra at cs.utk.edu (Jack Dongarra)
Date: Fri, 15 Oct 2004 04:10:44 -0400
Subject: New Release of LAPACK and ScaLAPACK Planned
Message-ID: <mailman.17.1490332312.18468.numpy-discussion@python.org>

New Release of LAPACK and ScaLAPACK planned.

We are pleased to announce that we recently received NSF funding for new
releases of the LAPACK and ScaLAPACK linear algebra libraries.
The proposal pointed out the new and better algorithms that have been
developed by many people in the community since the first  releases of
these libraries, as well as more obvious gaps and possible improvements.

The proposal listed a large number of activities, which we now need to
prioritize. There are a number of design decisions that still need to be
made, for which we are interested in your input. For this purpose, we
would like to remind you of a web page to collect your input that we
originally announced on NA-Digest while we were preparing the proposal:

    http://icl.cs.utk.edu/lapack-survey.html

In addition to the questions on that form, we are interested in your
opinion on all aspects of the proposal, a copy of which you may find at:

    http://www.cs.berkeley.edu/~demmel/Sca-LAPACK-Proposal.pdf

Thanks,
Jim Demmel and Jack Dongarra
-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, CSU Northridge



From falted at pytables.org  Mon Oct 18 01:30:01 2004
From: falted at pytables.org (Francesc Alted)
Date: Mon Oct 18 01:30:01 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <1097977801.3744.184.camel@localhost.localdomain>
References: <200410151318.40035.falted@pytables.org> <200410161327.47485.falted@pytables.org> <1097977801.3744.184.camel@localhost.localdomain>
Message-ID: <200410181029.14879.falted@pytables.org>

Hi Todd,

On Sunday 17 October 2004 03:50, Todd Miller wrote:
> I don't think numarray dot() will pick up any boost at all from ATLAS
> because it's not written to do it.   Besides that,  there are two
> performance problems I know of with numarray's dot() which may dominate
> or dilute any ATLAS benefits:
> 
> 1. dot() requires array creation.

Yes, but my guess is that for large arrays, this time should be negligible
compared with the multiplication time.

> 2. dot() requires array copies.

Mmm, you mean even for well-behaved arrays? Sorry, but I don't understand
why.

May I ask if there is any plan to complete a better integration of external
LAPACK libraries in numarray, or is this considered low priority?

Never mind, I don't need this functionality right now. It's just that I'm
preparing a series of 'hands-on' sessions about Python and Scientific
Computing, and I was trying to understand the current advantages and
limitations of numarray compared with NumPy.

Cheers,

-- 
Francesc Alted


From jmiller at stsci.edu  Mon Oct 18 04:53:01 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Oct 18 04:53:01 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <200410181029.14879.falted@pytables.org>
References: <200410151318.40035.falted@pytables.org>
	 <200410161327.47485.falted@pytables.org>
	 <1097977801.3744.184.camel@localhost.localdomain>
	 <200410181029.14879.falted@pytables.org>
Message-ID: <1098100329.3741.96.camel@localhost.localdomain>

On Mon, 2004-10-18 at 04:29, Francesc Alted wrote:
> Hi Todd,
> 
> On Sunday 17 October 2004 03:50, Todd Miller wrote:
> > I don't think numarray dot() will pick up any boost at all from ATLAS
> > because it's not written to do it.   Besides that,  there are two
> > performance problems I know of with numarray's dot() which may dominate
> > or dilute any ATLAS benefits:
> > 
> > 1. dot() requires array creation.
> 
> Yes, but my guess is that for large arrays, this time should be negligible
> compared with the multiplication time.
> 

Probably true.  I should measure this.  For small computations,  it's an
issue.

> > 2. dot() requires array copies.
> 
> Mmm, you mean even for well-behaved arrays? Sorry, but I don't understand
> why.

I looked at this some this morning,  trying to figure out why this is a
problem only for numarray.  It turns out that Numeric strides its arrays
to get around the copy.   When I implemented numarray,  I chose not to
stride because I thought it would be too slow...  Recently I realized
that one input array to dot() is *always* transposed and therefore
likely noncontiguous and therefore copied.  I think it's now possible to
simply port the Numeric code so I'll look into that.
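
To illustrate why a transposed operand is noncontiguous, here is a small
sketch in plain Python of row-major stride arithmetic. The helper names are
hypothetical, and this models the general strided-array idea rather than the
actual internals of numarray or Numeric:

```python
def c_strides(shape, itemsize):
    """Byte strides of a C-contiguous (row-major) array with the given shape."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)  # bytes to skip to advance one index along this axis
        step *= dim
    return tuple(reversed(strides))


def is_c_contiguous(shape, strides, itemsize):
    """True if the strides are exactly those of a freshly allocated row-major array."""
    return strides == c_strides(shape, itemsize)


shape = (500, 300)
strides = c_strides(shape, 4)  # 4-byte Float32 elements
print(strides)                 # (1200, 4)

# Transposing swaps the shape and the strides; no data is moved.
t_shape, t_strides = shape[::-1], strides[::-1]
print(t_strides)                                # (4, 1200)
print(is_c_contiguous(t_shape, t_strides, 4))   # False
```

Code that can walk the data using the swapped strides avoids a copy, while
code that insists on contiguous input has to copy the transposed array first.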

> May I ask if there is any plan to complete a better integration of external
> LAPACK libraries in numarray or this is considered low priority?

Perry may answer this.  I have no immediate plans for it...  it does
sound like enough people need this that it should be done.

Regards,
Todd


From perry at stsci.edu  Mon Oct 18 05:21:02 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Mon Oct 18 05:21:02 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <1098100329.3741.96.camel@localhost.localdomain>
References: <200410151318.40035.falted@pytables.org> <200410161327.47485.falted@pytables.org> <1097977801.3744.184.camel@localhost.localdomain> <200410181029.14879.falted@pytables.org> <1098100329.3741.96.camel@localhost.localdomain>
Message-ID: <C98A8957-20FF-11D9-8F26-000A95B68E50@stsci.edu>

On Oct 18, 2004, at 7:52 AM, Todd Miller wrote:

> On Mon, 2004-10-18 at 04:29, Francesc Alted wrote:
>
>> May I ask if there is any plan to complete a better integration of 
>> external
>> LAPACK libraries in numarray or this is considered low priority?
>
> Perry may answer this.  I have no immediate plans for it...  it does
> sound like enough people need this that it should be done.
>
Like Todd says, it does sound like this needs to be done. I think it takes
a back seat to doing the scipy integration in general, but will need to
be addressed soon thereafter.

Perry


From frank.horowitz at csiro.au  Mon Oct 18 23:33:03 2004
From: frank.horowitz at csiro.au (Frank Horowitz)
Date: Mon Oct 18 23:33:03 2004
Subject: [Numpy-discussion] Numeric Underflow Exceptions: Recommendations?
Message-ID: <1098167541.8538.48.camel@localhost>

Hi all,

Using Numeric 23.5 I've been bitten by the dreaded 'floating point
underflow throws an "OverflowError: math range error" instead of
silently returning zero' bug.

My setup is Debian unstable (Sid) on an i386, and I am using Debian's
binary package "python-numeric".

I understand from googling past discussions that this is (used to be?)
phase-of-the-moon stuff, depending mostly upon architecture, options at
libm compilation time of libc6. Several references to a trick of adding
"-lieee" to the link list succeeding in taming the bug were mentioned
around the era of Python2.0. 

My questions are these: Is there some higher level way of dealing with
underflow now in Numeric? Or am I going to have to track down wherever
"-lieee" has disappeared to in Debian, and recompile Numeric in the
hopes that that still cures the problem?

Any other tricks up people's sleeves for dealing with this? (I already
know about exp_safe in Fernando Perez' IPython/numutils.py, BTW. I'm
kind of hoping for a library level fix though, since my code is littered
with "Numeric.exp()" calls.)

TIA for any help you might be able to provide!

Cheers,

	Frank Horowitz


From falted at pytables.org  Tue Oct 19 01:35:05 2004
From: falted at pytables.org (Francesc Alted)
Date: Tue Oct 19 01:35:05 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <1098100329.3741.96.camel@localhost.localdomain>
References: <200410151318.40035.falted@pytables.org> <200410181029.14879.falted@pytables.org> <1098100329.3741.96.camel@localhost.localdomain>
Message-ID: <200410191034.08018.falted@pytables.org>

On Monday 18 October 2004 13:52, Todd Miller wrote:
> > > 1. dot() requires array creation.
> > 
> > Yes, but my guess is that for large arrays, this time should be negligible
> > compared with the multiplication time.
> > 
> 
> Probably true.  I should measure this.  For small computations,  it's an
> issue.

Well, for small arrays ATLAS (or any other optimized LAPACK library) probably
can't do much better than lapack lite, so I think you should not worry
about this anyway.

> Perry may answer this.  I have no immediate plans for it...  it does
> sound like enough people need this that it should be done.

Ok. Thanks for the information,

-- 
Francesc Alted


From flin at broadpark.no  Wed Oct 20 02:23:30 2004
From: flin at broadpark.no (Frank Lindseth)
Date: Wed Oct 20 02:23:30 2004
Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003
Message-ID: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC>

Hi,
 
I need numeric in a python2.4 / win32 project.
 
Is there a binary installer somewhere?
 
I tried to compile it from source but ran into the following problem (see
below).
Where are the libs supposed to come from?
 
- Frank
 
C:\users\frankl\download\Numeric-23.5>c:\Python24\python.exe setup.py install
running install
running build
running build_py
running build_ext
building 'lapack_lite' extension
C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe /DLL /nologo /INCREMENTAL:NO /LIBPATH:/usr/lib/atlas /LIBPATH:c:\Python24\libs /LIBPATH:c:\Python24\PCBuild lapack.lib cblas.lib f77blas.lib atlas.lib g2c.lib /EXPORT:initlapack_lite build\temp.win32-2.4\Release\Src\lapack_litemodule.obj /OUT:build\lib.win32-2.4\lapack_lite.pyd /IMPLIB:build\temp.win32-2.4\Release\Src\lapack_lite.lib
LINK : fatal error LNK1181: cannot open input file 'lapack.lib'
error: command '"C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe"' failed with exit status 1181
 
C:\users\frankl\download\Numeric-23.5>
 
 

From stephen.walton at csun.edu  Wed Oct 20 09:22:37 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Wed Oct 20 09:22:37 2004
Subject: [Numpy-discussion] Problems compiling numeric using python2.4
	and VS.Net 2003
In-Reply-To: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC>
References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC>
Message-ID: <1098288859.7182.11.camel@sunspot.csun.edu>

On Wed, 2004-10-20 at 11:17 +0200, Frank Lindseth wrote:

> LINK : fatal error LNK1181: cannot open input file 'lapack.lib'

Edit setup.py, setting the variables library_dirs_list and
libraries_list to empty lists, and try again.

List:  shouldn't this be the default?  Right now Numeric looks for ATLAS
by default.

-- 
Stephen Walton, Professor of Physics and Astronomy,
California State University, Northridge
stephen.walton at csun.edu


From flin at broadpark.no  Wed Oct 20 11:46:27 2004
From: flin at broadpark.no (flin at broadpark.no)
Date: Wed Oct 20 11:46:27 2004
Subject: [Numpy-discussion] Problems compiling numeric using python2.4	and VS.Net 2003
In-Reply-To: <1098288859.7182.11.camel@sunspot.csun.edu>
References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> <1098288859.7182.11.camel@sunspot.csun.edu>
Message-ID: <1098297610.4176b10a3b4d2@webmail.broadpark.no>

Thank you for the reply Stephen,
I did as you suggested:
library_dirs_list = []
libraries_list = [] 
#library_dirs_list = ['/usr/lib/atlas']
#libraries_list = ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c'] 

but it still won't install (see below).
Any suggestions?


C:\users\frankl\download\Numeric-23.5>c:\Python24\python.exe setup.py install
running install
running build
running build_py
running build_ext
building 'lapack_lite' extension
C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe /DLL /nologo /INCREMENTAL:NO /LIBPATH:c:\Python24\libs /LIBPATH:c:\Python24\PCBuild /EXPORT:initlapack_lite build\temp.win32-2.4\Release\Src\lapack_litemodule.obj /OUT:build\lib.win32-2.4\lapack_lite.pyd /IMPLIB:build\temp.win32-2.4\Release\Src\lapack_lite.lib
   Creating library build\temp.win32-2.4\Release\Src\lapack_lite.lib and object build\temp.win32-2.4\Release\Src\lapack_lite.exp
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgeev_ referenced in function _lapack_lite_dgeev
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dsyevd_ referenced in function _lapack_lite_dsyevd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zheevd_ referenced in function _lapack_lite_zheevd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgelsd_ referenced in function _lapack_lite_dgelsd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgesv_ referenced in function _lapack_lite_dgesv
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgesdd_ referenced in function _lapack_lite_dgesdd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgetrf_ referenced in function _lapack_lite_dgetrf
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dpotrf_ referenced in function _lapack_lite_dpotrf
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgeev_ referenced in function _lapack_lite_zgeev
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgelsd_ referenced in function _lapack_lite_zgelsd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgesv_ referenced in function _lapack_lite_zgesv
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgesdd_ referenced in function _lapack_lite_zgesdd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgetrf_ referenced in function _lapack_lite_zgetrf
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zpotrf_ referenced in function _lapack_lite_zpotrf
build\lib.win32-2.4\lapack_lite.pyd : fatal error LNK1120: 14 unresolved externals
error: command '"C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe"' failed with exit status 1120

C:\users\frankl\download\Numeric-23.5>




From stephen.walton at csun.edu  Wed Oct 20 12:09:00 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Wed Oct 20 12:09:00 2004
Subject: [Numpy-discussion] Problems compiling numeric using
	python2.4	and VS.Net 2003
In-Reply-To: <1098297610.4176b10a3b4d2@webmail.broadpark.no>
References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC>
	 <1098288859.7182.11.camel@sunspot.csun.edu>
	 <1098297610.4176b10a3b4d2@webmail.broadpark.no>
Message-ID: <1098299055.7182.33.camel@sunspot.csun.edu>

On Wed, 2004-10-20 at 20:40 +0200, flin at broadpark.no wrote:
> Thank you for the reply Stephen,
> I did as you suggested:
> library_dirs_list = []
> libraries_list = [] 
> #library_dirs_list = ['/usr/lib/atlas']
> #libraries_list = ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c'] 
> 
> but it still won't install (see below).
> Any suggestions?

I'm guessing you still have files left over from last time.  On Unix,
you can run the 'makeclean.sh' script.  On Windows, manually deleting
the directories listed in that script (they are all called build) should
do the trick.  Then try the 'setup.py build' again.

-- 
Stephen Walton, Professor of Physics and Astronomy,
California State University, Northridge
stephen.walton at csun.edu


From flin at broadpark.no  Wed Oct 20 15:59:45 2004
From: flin at broadpark.no (flin at broadpark.no)
Date: Wed Oct 20 15:59:45 2004
Subject: [Numpy-discussion] Problems compiling numeric using	python2.4	and VS.Net 2003
In-Reply-To: <1098299055.7182.33.camel@sunspot.csun.edu>
References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC>  <1098288859.7182.11.camel@sunspot.csun.edu>  <1098297610.4176b10a3b4d2@webmail.broadpark.no> <1098299055.7182.33.camel@sunspot.csun.edu>
Message-ID: <1098312989.4176ed1d83de1@webmail.broadpark.no>

Thanks again Stephen.
Still no success.
I deleted the whole Numeric directory tree,
unzipped a newly downloaded source file,
edited setup.py as you suggested,
and tried to run the installer:
same error.

I'm not sure what to do next.
(Why can't somebody just make a binary installer for Python 2.4?
After all, it's in beta now...)

- Frank



From stephen.walton at csun.edu  Wed Oct 20 16:47:52 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Wed Oct 20 16:47:52 2004
Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003
In-Reply-To: <1098312989.4176ed1d83de1@webmail.broadpark.no>
References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC>
	 <1098288859.7182.11.camel@sunspot.csun.edu>
	 <1098297610.4176b10a3b4d2@webmail.broadpark.no>
	 <1098299055.7182.33.camel@sunspot.csun.edu>
	 <1098312989.4176ed1d83de1@webmail.broadpark.no>
Message-ID: <1098315982.7159.2.camel@freyer.sfo.csun.edu>

On Wed, 2004-10-20 at 15:56, flin at broadpark.no wrote:
> Thanks again Stephen.
> Still no success.

Sorry.  Being a Linux user I'm afraid I can't help much.

> I'm not sure what to do next.

Download SciPy from http://www.scipy.org/?  It is much more than you
actually need, being all of Scientific Python as well as Numeric, but at
least it's an all-in-one installer.

-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge

From mdehoon at ims.u-tokyo.ac.jp  Wed Oct 20 21:40:04 2004
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Wed Oct 20 21:40:04 2004
Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003
In-Reply-To: <1098312989.4176ed1d83de1@webmail.broadpark.no>
References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC>  <1098288859.7182.11.camel@sunspot.csun.edu>  <1098297610.4176b10a3b4d2@webmail.broadpark.no> <1098299055.7182.33.camel@sunspot.csun.edu> <1098312989.4176ed1d83de1@webmail.broadpark.no>
Message-ID: <41772BE1.5020403@ims.u-tokyo.ac.jp>

flin at broadpark.no wrote:
> Thanks again Stephen.
> Still no success.
> I deleted the whole Numeric-directory-tree,
> unzipped a newly downloaded src-file,
> edited the setup.py as you suggested,
> tried to run the installer,
> same error.

Previously I managed to compile Numeric for Python 2.4 on Windows, using the 
MinGW compiler and Atlas. If you still need it, I can send you the binaries.
> 
> I'm not sure what to do next.
> (Why can't somebody just make a binary installer for python2.4?
> After all, it's in beta now...)

There is a bug in Python 2.4 that prevents users from running bdist_wininst to 
create a binary installer. python setup.py install fails too. See bug 1021756 on 
sourceforge.

--Michiel, U Tokyo.

-- 
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon


From stark at tuebingen.mpg.de  Wed Oct 20 23:49:23 2004
From: stark at tuebingen.mpg.de (Sebastian Stark)
Date: Wed Oct 20 23:49:23 2004
Subject: [Numpy-discussion] Re: numarray and ATLAS
Message-ID: <200410210846.09275.stark@tuebingen.mpg.de>

> Perhaps this subject comes up too often, but I'm having problems
> getting numarray to use ATLAS instead of the included mini-lapack.

I had to change lapack_libs and lapack_dirs in addons.py to read: 

  lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm']
  lapack_dirs = ['/usr/local/lib/ATLAS']

I have all my .a files in /usr/local/lib/ATLAS so I can control which ones I'm 
actually linking against.

mosel ~ % ls -l /usr/local/lib/ATLAS
total 14608
-rw-r--r--    1 root     staff     7952316 Oct 20 10:03 libatlas.a
-rw-r--r--    1 root     staff      277592 Oct 20 10:03 libcblas.a
-rw-r--r--    1 root     staff      261060 Oct 20 10:45 libf2c.a
-rw-r--r--    1 root     staff      353278 Oct 20 10:03 libf77blas.a
-rw-r--r--    1 root     staff     5734736 Oct 20 10:42 liblapack.a
-rw-r--r--    1 root     staff      324968 Oct 20 10:03 libtstatlas.a


-Sebastian

(and yes, I get a significant speed boost from ATLAS)

-- 
Sebastian Stark -- http://www.kyb.tuebingen.mpg.de/~stark
Max Planck Institute for Biological Cybernetics
Spemannstr. 38, 72076 Tuebingen
Phone: +49 7071 601 555 -- Fax: +49 7071 601 552


From stark at tuebingen.mpg.de  Wed Oct 20 23:56:14 2004
From: stark at tuebingen.mpg.de (Sebastian Stark)
Date: Wed Oct 20 23:56:14 2004
Subject: [Numpy-discussion] indexing on uninitialized arrays
Message-ID: <200410210852.02285.stark@tuebingen.mpg.de>

In matlab I can do:

>> x = []

x =

     []

>> x(2) = 1.4

x =

         0    1.4000

>> x(2,4) = 2.9

x =

         0    1.4000         0         0
         0         0         0    2.9000


which means x expands as necessary depending on "how far" my indexing goes. 

Now I'm thinking about how to implement this with numarray. I could imagine 
defining a derived array type "SelfInflatingArray" that catches the IndexError 
exception and then does the right thing. Any better ideas?
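
A minimal sketch of the idea, in plain Python rather than numarray, so the
class below is a hypothetical stand-in and not a real numarray subclass.
It assumes two-dimensional, zero-based, non-negative indices, and it grows
eagerly on assignment instead of catching IndexError:

```python
class SelfInflatingArray:
    """2-D array of floats that grows, zero-padded, on out-of-range assignment."""

    def __init__(self):
        self.rows = []      # list of equal-length row lists
        self.ncols = 0

    def _grow(self, nrows, ncols):
        # Widen every existing row first, then append any missing rows.
        ncols = max(ncols, self.ncols)
        for row in self.rows:
            row.extend([0.0] * (ncols - len(row)))
        while len(self.rows) < nrows:
            self.rows.append([0.0] * ncols)
        self.ncols = ncols

    def __setitem__(self, index, value):
        i, j = index
        self._grow(i + 1, j + 1)
        self.rows[i][j] = value

    def __getitem__(self, index):
        # Reads do not grow the array; out-of-range reads raise IndexError.
        i, j = index
        return self.rows[i][j]


x = SelfInflatingArray()
x[0, 1] = 1.4   # MATLAB: x(2) = 1.4
x[1, 3] = 2.9   # MATLAB: x(2,4) = 2.9
print(x.rows)   # [[0.0, 1.4, 0.0, 0.0], [0.0, 0.0, 0.0, 2.9]]
```

Every growth step copies or extends existing rows, so filling a large array
one element at a time this way is slow; preallocating is usually preferable
when the final shape is known.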


-Sebastian

-- 
Sebastian Stark -- http://www.kyb.tuebingen.mpg.de/~stark
Max Planck Institute for Biological Cybernetics
Spemannstr. 38, 72076 Tuebingen
Phone: +49 7071 601 555 -- Fax: +49 7071 601 552


From falted at pytables.org  Thu Oct 21 00:32:31 2004
From: falted at pytables.org (Francesc Alted)
Date: Thu Oct 21 00:32:31 2004
Subject: [Numpy-discussion] Re: numarray and ATLAS
In-Reply-To: <200410210846.09275.stark@tuebingen.mpg.de>
References: <200410210846.09275.stark@tuebingen.mpg.de>
Message-ID: <200410210929.18477.falted@pytables.org>

On Thursday 21 October 2004 08:46, Sebastian Stark wrote:
> 
> > Perhaps this subject comes up too often, but I'm having problems
> > getting numarray to use ATLAS instead of the included mini-lapack.
> 
> I had to change lapack_libs and lapack_dirs in addons.py to read: 
> 
>   lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm']
>   lapack_dirs = ['/usr/local/lib/ATLAS']

I've done something similar:

    lapack_libs = ['lapack', 'cblas', 'f77blas', 'atlas']
    lapack_dirs = ['/usr/local/atlas/Linux_P4SSE2_full/lib']

Mmm, I can see that you have added 'f2c'. However, I don't have it
installed. Could that be why the tests do not pass in my case?

> (and yes, I get a significant speed boost from ATLAS)

Great, it's good to know that.

Thank you very much for your feedback,

-- 
Francesc Alted


From rkern at ucsd.edu  Thu Oct 21 02:05:51 2004
From: rkern at ucsd.edu (Robert Kern)
Date: Thu Oct 21 02:05:51 2004
Subject: [Numpy-discussion] Re: numarray and ATLAS
In-Reply-To: <200410210929.18477.falted@pytables.org>
References: <200410210846.09275.stark@tuebingen.mpg.de> <200410210929.18477.falted@pytables.org>
Message-ID: <41777422.8040205@ucsd.edu>

Francesc Alted wrote:
> On Thursday 21 October 2004 08:46, Sebastian Stark wrote:
> 
>>>Perhaps this subject comes up too often, but I'm having problems
>>>getting numarray to use ATLAS instead of the included mini-lapack.
>>
>>I had to change lapack_libs and lapack_dirs in addons.py to read: 
>>
>>  lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm']
>>  lapack_dirs = ['/usr/local/lib/ATLAS']
> 
> 
> I've done something similar:
> 
>     lapack_libs = ['lapack', 'cblas', 'f77blas', 'atlas']
>     lapack_dirs = ['/usr/local/atlas/Linux_P4SSE2_full/lib']
> 
> Mmm, I can see that you have added 'f2c'. However, I don't have it
> installed. Could that be the cause that tests would not pass in my case?

If you are compiling with gcc, add 'g2c' after 'f77blas'. It's g77's 
FORTRAN runtime library.

-- 
Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
  Are the graves of dreams allowed to die."
   -- Richard Harter


From falted at pytables.org  Thu Oct 21 02:33:28 2004
From: falted at pytables.org (Francesc Alted)
Date: Thu Oct 21 02:33:28 2004
Subject: [Numpy-discussion] Re: numarray and ATLAS
In-Reply-To: <41777422.8040205@ucsd.edu>
References: <200410210846.09275.stark@tuebingen.mpg.de> <200410210929.18477.falted@pytables.org> <41777422.8040205@ucsd.edu>
Message-ID: <200410211126.42729.falted@pytables.org>

On Thursday 21 October 2004 10:32, Robert Kern wrote:
> > Mmm, I can see that you have added 'f2c'. However, I don't have it
> > installed. Could that be the cause that tests would not pass in my case?
> 
> If you are compiling with gcc, add 'g2c' after 'f77blas'. It's g77's 
> FORTRAN runtime library.

Yes, that did the trick! So with gcc, this works just fine:

lapack_libs = ['lapack', 'f77blas', 'g2c', 'cblas', 'atlas', 'm']

Many thanks!

-- 
Francesc Alted


From stephen.walton at csun.edu  Thu Oct 21 10:59:05 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Thu Oct 21 10:59:05 2004
Subject: [Numpy-discussion] Counting array elements
Message-ID: <1098381332.8249.12.camel@freyer.sfo.csun.edu>

Is there some simple way of counting the number of array elements which
satisfy a certain condition?  It is easy to do

A[A<=1].sum()

to sum all the values of A which are less than or equal to 1, but there
doesn't seem to be a count() method.  I tried

(A<=1).sum()

but this throws an exception in numarray 1.1.  If I try

sum(A<=value)

I have to nest multiple sums if A has rank greater than 1, plus the sum
overflows if A is large, apparently because boolean gets treated as
Int8.  (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)).  You get
zero.)  The following works:

array(A<=1024,type=Int32).sum()

but is awkward.  Am I missing an obvious better alternative?  If not,
I'm going to file an RFE :-) .
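
The zero result is what one would expect if the count really is accumulated
in eight bits, since 1024 is a multiple of 256.  The sketch below simulates
that in plain Python; the helper names are made up for illustration, and the
signed 8-bit accumulator is only the assumption stated above, not a confirmed
description of numarray's internals.

```python
def wrap_int8(n):
    """Reduce an integer to the signed 8-bit range [-128, 127]."""
    return (n + 128) % 256 - 128


def sum_int8(values):
    """Sum while keeping the running total in a simulated Int8 accumulator."""
    total = 0
    for v in values:
        total = wrap_int8(total + v)
    return total


# 32x32 grid holding 0..1023, mirroring arange(1024, shape=(32, 32)).
A = [[32 * r + c for c in range(32)] for r in range(32)]
mask = [[1 if v <= 1024 else 0 for v in row] for row in A]  # all true

# First reduction: along axis 0, giving one count per column.
column_counts = [sum_int8(mask[r][c] for r in range(32)) for c in range(32)]
# Second reduction: add up the 32 column counts, still in 8 bits.
narrow_total = sum_int8(column_counts)

# Counting in an ordinary Python int (no fixed width) gives the true answer.
wide_total = sum(v for row in mask for v in row)

print(column_counts[0], narrow_total, wide_total)  # 32 0 1024
```

The first reduction is harmless because each column count is only 32; the
wrap-around happens in the second reduction, which is why a wider
accumulator type fixes the result.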

-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge

From Chris.Barker at noaa.gov  Thu Oct 21 11:33:03 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Thu Oct 21 11:33:03 2004
Subject: [Numpy-discussion] Re: numarray and ATLAS
In-Reply-To: <41777422.8040205@ucsd.edu>
References: <200410210846.09275.stark@tuebingen.mpg.de> <200410210929.18477.falted@pytables.org> <41777422.8040205@ucsd.edu>
Message-ID: <4177FFF6.40006@noaa.gov>

Robert Kern wrote:
> Francesc Alted wrote:

>>> I had to change lapack_libs and lapack_dirs in addons.py to read:
>>>  lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm']
>>>  lapack_dirs = ['/usr/local/lib/ATLAS']

>> I've done something similar:
>>
>>     lapack_libs = ['lapack', 'cblas', 'f77blas', 'atlas']
>>     lapack_dirs = ['/usr/local/atlas/Linux_P4SSE2_full/lib']

For what it's worth, this is what worked for me on Gentoo Linux:
     lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'm']

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                     		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From jmiller at stsci.edu  Thu Oct 21 11:33:46 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Oct 21 11:33:46 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
Message-ID: <1098383430.3644.4.camel@halloween.stsci.edu>

On Thu, 2004-10-21 at 13:55, Stephen Walton wrote:
> Is there some simple way of counting the number of array elements which
> satisfy a certain condition?  It is easy to do
> 
> A[A<=1].sum()
> 
> to sum all the values of A which are less than 1, but there doesn't seem
> to be a count() method.  I tried
> 
> (A<=1).sum()
>
> but this throws an exception at numarray 1.1.  If I try

This works now in CVS and will be part of numarray-1.2.  Another more
tedious approach which works for numarray-1.1 is:

(A <= 1).astype('Int32').sum()

> sum(A<=value)
> 
> I have to nest multiple sums if A has rank greater than 1, plus the sum
> overflows if A is large, apparently because boolean gets treated as
> Int8.  (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)).  You get
> zero.)  The following works:
> 
> array(A<=1024,type=Int32).sum()
> 
> but is awkward.  Am I missing an obvious better alternative?  If not,
> I'm going to file an RFE :-) .

I don't think there's any need for an RFE, provided you're satisfied
with (A<=1).sum().

Regards,
Todd


From rkern at ucsd.edu  Thu Oct 21 12:22:20 2004
From: rkern at ucsd.edu (Robert Kern)
Date: Thu Oct 21 12:22:20 2004
Subject: [Numpy-discussion] argmin and unsigned types
Message-ID: <41780BE5.4070009@ucsd.edu>

argmin locates the minimum by finding the maximum of the negative of the 
input. Unfortunately, for unsigned arrays, the negative has nothing to 
do with the actual numerical negative.

Example:

 >>> from numarray import *
 >>> a = arange(10).astype(UInt8)
 >>> print a
[0 1 2 3 4 5 6 7 8 9]
 >>> print -a
[  0 255 254 253 252 251 250 249 248 247]
 >>> argmin(a)
1

We need a separate argmin to handle these arrays properly.
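
The effect can be reproduced without numarray.  The sketch below models
UInt8 negation as arithmetic modulo 256 and compares the negate-then-argmax
approach with a direct comparison; all function names are illustrative and
are not numarray's.

```python
def negate_uint8(values):
    """Negate each value modulo 256, as unary minus does on a UInt8 array."""
    return [(-v) % 256 for v in values]


def argmax(values):
    """Index of the first largest element."""
    best = 0
    for i, v in enumerate(values):
        if v > values[best]:
            best = i
    return best


def argmin_via_negation(values):
    """The approach described above: argmax of the negated input."""
    return argmax(negate_uint8(values))


def argmin_direct(values):
    """Compare elements directly, so no negation is involved."""
    best = 0
    for i, v in enumerate(values):
        if v < values[best]:
            best = i
    return best


a = list(range(10))
print(negate_uint8(a))   # [0, 255, 254, 253, 252, 251, 250, 249, 248, 247]
print(argmin_via_negation(a), argmin_direct(a))   # 1 0
```

Zero is the only value that negation leaves in place, so any unsigned array
that contains a zero alongside nonzero values gets the wrong answer from the
negation approach.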

-- 
Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
  Are the graves of dreams allowed to die."
   -- Richard Harter


From jmiller at stsci.edu  Thu Oct 21 15:04:04 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Oct 21 15:04:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098383430.3644.4.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <1098383430.3644.4.camel@halloween.stsci.edu>
Message-ID: <1098396116.3644.129.camel@halloween.stsci.edu>

On Thu, 2004-10-21 at 14:30, Todd Miller wrote:
> On Thu, 2004-10-21 at 13:55, Stephen Walton wrote:
> > Is there some simple way of counting the number of array elements which
> > satisfy a certain condition?  It is easy to do
> > 
> > A[A<=1].sum()
> > 
> > to sum all the values of A which are less than 1, but there doesn't seem
> > to be a count() method.  I tried
> > 
> > (A<=1).sum()
> >
> > but this throws an exception at numarray 1.1.  If I try
> 
> This works now in CVS and will be part of numarray-1.2.  

Stephen tried this and it turns out my earlier statement was untrue,
(A<=1).sum() doesn't do anything reasonable, even in CVS.  The problem
is that sum() is written (without direct C support) to conserve
storage.  As a result,  it doesn't do implicit 
> Another more
> tedious approach which works for numarray-1.1 is:
> 
> (A <= 1).astype('Int32').sum()
> 

There's also a prettier approach that works for 1.1 that I forgot about:

(A <= 1).sum('Int32')

> > sum(A<=value)
> > 
> > I have to nest multiple sums if A has rank greater than 1, plus the sum
> > overflows if A is large, apparently because boolean gets treated as
> > Int8.  (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)).  You get
> > zero.)  The following works:
> > 
> > array(A<=1024,type=Int32).sum()
> > 
> > but is awkward.  Am I missing an obvious better alternative?  If not,
> > I'm going to file an RFE :-) .
> 
> I don't think there's any need for an RFE, provided you're satisfied
> with (A<=1).sum().
> 
> Regards,
> Todd


From jmiller at stsci.edu  Thu Oct 21 16:41:29 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Oct 21 16:41:29 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098396569.28351.0.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <1098383430.3644.4.camel@halloween.stsci.edu>
	 <1098396116.3644.129.camel@halloween.stsci.edu>
	 <1098396569.28351.0.camel@halloween.stsci.edu>
Message-ID: <1098401959.3744.34.camel@localhost.localdomain>

On Thu, 2004-10-21 at 18:09, Todd Miller wrote:
> On Thu, 2004-10-21 at 18:01, Todd Miller wrote:
> > On Thu, 2004-10-21 at 14:30, Todd Miller wrote:
> > > On Thu, 2004-10-21 at 13:55, Stephen Walton wrote:
> > > > Is there some simple way of counting the number of array elements which
> > > > satisfy a certain condition?  It is easy to do
> > > > 
> > > > A[A<=1].sum()
> > > > 
> > > > to sum all the values of A which are less than 1, but there doesn't seem
> > > > to be a count() method.  I tried
> > > > 
> > > > (A<=1).sum()
> > > >
> > > > but this throws an exception at numarray 1.1.  If I try
> > > 
> > > This works now in CVS and will be part of numarray-1.2.  
> > 
> > Stephen tried this and it turns out my earlier statement was untrue,
> > (A<=1).sum() doesn't do anything reasonable, even in CVS.  The problem
> > is that sum() is written (without direct C support) to conserve
> > storage.  As a result,  it doesn't do implicit 

<drum roll> up-casting. </drum roll>

I'm pretty sure this was a conscious and discussed choice (this is
actually the 2nd time sum() has been wrong).  IMHO, the typing for sum()
should be revised because it is too dangerous the way it is now.  

Regards,
Todd


From nwagner at mecha.uni-stuttgart.de  Fri Oct 22 02:17:16 2004
From: nwagner at mecha.uni-stuttgart.de (Nils Wagner)
Date: Fri Oct 22 02:17:16 2004
Subject: [Numpy-discussion] Problems with complex matrices
Message-ID: <4178CEFF.2050608@mecha.uni-stuttgart.de>

Hi all,

Another bug is revealed

Traceback (most recent call last):
  File "complex_it.py", line 6, in ?
    res=dot(A,x)-r
  File "/usr/lib/python2.3/site-packages/Numeric/dotblas/__init__.py", 
line 55, in dot
    if multiarray.array(a).shape == () or multiarray.array(b).shape == ():
TypeError: a float is required
 
Nils

-------------- next part --------------
A non-text attachment was scrubbed...
Name: complex_it.py
Type: text/x-python
Size: 139 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20041022/c13d8954/attachment.py>

From Sebastien.deMentendeHorne at electrabel.com  Fri Oct 22 02:44:46 2004
From: Sebastien.deMentendeHorne at electrabel.com (Sebastien.deMentendeHorne at electrabel.com)
Date: Fri Oct 22 02:44:46 2004
Subject: [Numpy-discussion] Problems with complex matrices
Message-ID: <035965348644D511A38C00508BF7EAEB145CB168@seacex03.eib.electrabel.be>

gmres returns a tuple so you should have used
res = dot(A, x[0]) - r

seb

> -----Original Message-----
> From: Nils Wagner [mailto:nwagner at mecha.uni-stuttgart.de]
> Sent: Friday 22 October 2004 11:13
> To: SciPy Users List; numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] Problems with complex matrices
> 
> 
> Hi all,
> 
> Another bug is revealed
> 
> Traceback (most recent call last):
>   File "complex_it.py", line 6, in ?
>     res=dot(A,x)-r
>   File 
> "/usr/lib/python2.3/site-packages/Numeric/dotblas/__init__.py", 
> line 55, in dot
>     if multiarray.array(a).shape == () or 
> multiarray.array(b).shape == ():
> TypeError: a float is required
>  
> Nils
> 
> 


From Chris.Barker at noaa.gov  Fri Oct 22 11:07:32 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri Oct 22 11:07:32 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098392607.8249.20.camel@freyer.sfo.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
Message-ID: <41794B47.4090909@noaa.gov>

Stephen Walton wrote:

 > There is a difference between the sum() Ufunc and the sum() method which
 > is not mentioned in the documentation:  the function works along an
 > axis, while the method works on the whole array.  That is, A.sum() and
 > A.flat.sum() are equivalent regardless of the rank of A.


Bummer. I was hoping this was a move to a more object-oriented style, 
rather than different functionality. Also, it's pretty confusing 
terminology, particularly if it's not documented! Why not .SumAll() or 
something?

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                     		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From rowen at u.washington.edu  Fri Oct 22 11:20:36 2004
From: rowen at u.washington.edu (Russell E Owen)
Date: Fri Oct 22 11:20:36 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <41794B47.4090909@noaa.gov>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	
 <417809B9.5000108@noaa.gov>
 <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov>
Message-ID: <p06200510bd9eff3962b0@[128.95.99.44]>

At 11:02 AM -0700 2004-10-22, Chris Barker wrote:
>Stephen Walton wrote:
>
>>  There is a difference between the sum() Ufunc and the sum() method which
>>  is not mentioned in the documentation:  the function works along an
>>  axis, while the method works on the whole array.  That is, A.sum() and
>>  A.flat.sum() are equivalent regardless of the rank of A.
>
>
>Bummer. I was hoping this was a move to a more object-oriented 
>style, rather than different functionality. Also, it's pretty 
>confusing terminology, particularly if it's not documented! Why not 
>.SumAll() or something?

I agree. Numarray is already confusing enough without identically 
named functions and methods that do different things. (nElements and 
size are another pet peeve, with size used in several places and 
nElements appearing exactly once. Though I am grateful to whoever 
added size as a workalike for nElements; formerly you had to know 
what kind of array you had before you knew how to find out how many 
elements it had.)

-- Russell


From strawman at astraw.com  Fri Oct 22 11:25:58 2004
From: strawman at astraw.com (Andrew Straw)
Date: Fri Oct 22 11:25:58 2004
Subject: [Numpy-discussion] floating point exception weirdness
In-Reply-To: <411A08FA.7000601@astraw.com>
References: <4119BBFC.6020304@astraw.com> <1092221365.3752.32.camel@localhost.localdomain> <411A08FA.7000601@astraw.com>
Message-ID: <41795006.1040807@astraw.com>

I've isolated a bug I first reported on this mailing list in August.  
I've now confined it to a small code snippet using entirely open-source 
software (previously I saw it while using Intel's IPP).  In a nutshell, 
importing numarray.ieeespecial triggers a floating point exception 
(which kills my program) when I call Numeric's 
singular_value_decomposition() function:

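import Numeric
from LinearAlgebra import singular_value_decomposition

want_FPE = True  # set to False and the floating point exception goes away

if want_FPE:
    import numarray.ieeespecial  # merely importing this sets up the later FPE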

A= [[-5.7, 2.2, -0.53, 46.0],
    [-2.3, -5.5, -1.0, 1091.0],
    [5.9, 1.4, -0.1, -142.0],
    [-1.3, 5.7, -1.5, 2673.0]]
A=Numeric.array(A)
u,s,v = singular_value_decomposition(A) # FPE triggered here

Here's my setup:

$ python
Python 2.3.4 (#2, Sep 24 2004, 08:39:09)
[GCC 3.3.4 (Debian 1:3.3.4-12)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> import Numeric
 >>> Numeric.__version__
'23.6'
 >>> import numarray
 >>> numarray.__version__
'1.2a'

$ gcc -v
Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.4/specs
Configured with: ../src/configure -v 
--enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr 
--mandir=/usr/share/man --infodir=/usr/share/info 
--with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared 
--with-system-zlib --enable-nls --without-included-gettext 
--enable-__cxa_atexit --enable-clocale=gnu --enable-debug 
--enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
Thread model: posix
gcc version 3.3.4 (Debian 1:3.3.4-13)

Now, for the clue:  the above error is ONLY triggered when I compile 
Numeric to use system blas and friends, not when I use lapack_lite 
included with Numeric.  This leads me to suspect it is related to the 
SSE2 unit -- I have Debian sarge's atlas3-base, atlas3-sse, atlas3-sse2, 
blas, lapack, lapack3, and refblas3 packages installed on my P4 machine.

So, to propose a hypothesis: numarray.ieeespecial sets the FPE bit in 
the SSE2 hardware, but for some reason this does not raise SIGFPE.  
However, when the next call that touches SSE2 happens, the kernel sees 
that error bit and throws the signal.  Does this explanation make 
sense?  Is it easy to fix?

Cheers!
Andrew


From jmiller at stsci.edu  Fri Oct 22 14:19:17 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct 22 14:19:17 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <p06200510bd9eff3962b0@[128.95.99.44]>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
Message-ID: <1098479844.29804.260.camel@halloween.stsci.edu>

On Fri, 2004-10-22 at 14:17, Russell E Owen wrote:
> At 11:02 AM -0700 2004-10-22, Chris Barker wrote:
> >Stephen Walton wrote:
> >
> >>  There is a difference between the sum() Ufunc and the sum() method which
> >>  is not mentioned in the documentation:  the function works along an
> >>  axis, while the method works on the whole array.  That is, A.sum() and
> >>  A.flat.sum() are equivalent regardless of the rank of A.
> >
> >
> >Bummer. I was hoping this was a move to a more object-oriented 
> >style, rather than different functionality. Also, it's pretty 
> >confusing terminology, particularly if it's not documented! Why not 
> >.SumAll() or something?

sumAll() would certainly be better.

Unless there are objections,  I'll rename the current sum() method to
sumAll() and re-write sum() to give a deprecation warning before calling
sumAll().  Eventually,  it'll go away altogether.
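
The rename-with-warning pattern might look roughly like this.  The Array
class is a toy stand-in written for this sketch, not numarray's array type,
and the warning text is invented.

```python
import warnings


class Array:
    """Toy stand-in for an array class; only the renaming pattern matters."""

    def __init__(self, rows):
        self.rows = rows

    def sumAll(self):
        """Sum over every element (here: a list of lists)."""
        return sum(v for row in self.rows for v in row)

    def sum(self):
        """Old name, kept for now: warn, then delegate to sumAll()."""
        warnings.warn("sum() is deprecated; use sumAll() instead",
                      DeprecationWarning, stacklevel=2)
        return self.sumAll()


a = Array([[1, 2], [3, 4]])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # make sure the warning is not filtered
    old_result = a.sum()

print(old_result, a.sumAll(), caught[0].category.__name__)
# 10 10 DeprecationWarning
```

Passing stacklevel=2 attributes the warning to the caller's line rather
than to the wrapper itself, which makes old call sites easier to find.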

I reviewed the discussion of the sum() result type from a year ago:
"[Numpy-discussion] sum and mean methods behaviour".  We discussed sum()
in depth and AFAIK I implemented the recommendations.  The results need
to be documented.

By default,  sum() now uses the maximum type of the type family of the
array, so families Bool, Integer, UnsignedInteger, Float, or Complex 
result in max types Bool, Int64, UInt64, Float64, Complex64.  I'm not
sure why we segregated Bool and it looks like a mistake to me now.  I'm
thinking the Bool "family" should just go away and be re-classified as
UnsignedInteger.  These ideas are captured by the
numerictypes.MaximumType() function which is also potentially useful for
any reduction.
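
One plausible motivation for reducing in the widest type of a family
is overflow. The sketch below uses NumPy as a stand-in (an assumption;
it is not numarray and does not call numerictypes.MaximumType) to
compare accumulating in a narrow type with accumulating in the widest
unsigned type.

```python
import numpy as np

a = np.array([200, 100], dtype=np.uint8)

narrow = a.sum(dtype=np.uint8)    # accumulate in the array's own 8-bit type
wide = a.sum(dtype=np.uint64)     # accumulate in the widest unsigned type

print(int(narrow), int(wide))     # 44 300
```

The 8-bit accumulator wraps around (300 mod 256 is 44), while the
64-bit one holds the true total.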

> I agree. Numarray is already confusing enough without identically 
> named functions and methods that do different things. 

True enough.  This'll be fixed.

> (nElements and 
> size are another pet peeve, with size used in several places and 
> nElements appearing exactly once. Though I am grateful to whoever 
> added size as a workalike for nElements; formerly you had to know 
> what kind of array you had before you knew how to find out how many 
> elements it had.)

I'm not sure what you mean here.  When I grepped,  I got 52 hits for
nelements() in the numarray source, let alone what users have done with
it.  Right now,  IMHO, it's not clearly broken and there are bigger fish
to fry.

Regards,
Todd


From stephen.walton at csun.edu  Fri Oct 22 14:37:05 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Fri Oct 22 14:37:05 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <p06200510bd9eff3962b0@[128.95.99.44]>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
Message-ID: <1098480955.11372.19.camel@freyer.sfo.csun.edu>

On Fri, 2004-10-22 at 11:17, Russell E Owen wrote about the sum() Ufunc
vs. the sum() method:

> Numarray is already confusing enough without identically 
> named functions and methods that do different things

When I went through the Numarray docs and made suggestions for
improvements (see the list I posted at Sourceforge), I didn't make any
comments about functional changes, only what the documentation said. 
Since the sum() method is documented using 1-D arrays, you can't tell
that it in fact behaves differently than the sum() Ufunc.  On
reflection, I also agree that the Ufuncs and methods should behave the
same way.
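
To make the difference concrete, here is a small sketch using NumPy as
a stand-in for numarray (an assumption; NumPy's own sum() defaults
differ from numarray's). An axis-wise reduction and a whole-array sum
give different results as soon as the array has rank 2.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)    # [[0, 1, 2], [3, 4, 5]]

along_axis = np.add.reduce(A)     # reduces along axis 0 by default
whole = A.ravel().sum()           # flatten first, then sum every element

print(along_axis.tolist())        # [3, 5, 7]
print(int(whole))                 # 15
```

For a 1-D array the two agree, which is why documentation that only
uses 1-D examples cannot show the distinction.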

Why do you say 'numarray is confusing'?  What in the docs would help
un-confuse it, in your view?

-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge

From rowen at u.washington.edu  Fri Oct 22 14:48:03 2004
From: rowen at u.washington.edu (Russell E Owen)
Date: Fri Oct 22 14:48:03 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	
 <417809B9.5000108@noaa.gov>
 <1098392607.8249.20.camel@freyer.sfo.csun.edu>	
 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
 <1098479844.29804.260.camel@halloween.stsci.edu>
Message-ID: <p06200515bd9f2b4fb808@[128.95.99.44]>

At 5:17 PM -0400 2004-10-22, Todd Miller wrote:
>On Fri, 2004-10-22 at 14:17, Russell E Owen wrote:
>>  I agree. Numarray is already confusing enough without identically
>>  named functions and methods that do different things.
>
>True enough.  This'll be fixed.

Great!

>>  (nElements and
>>  size are another pet peeve, with size used in several places and
>>  nElements appearing exactly once. Though I am grateful to whoever
>>  added size as a workalike for nElements; formerly you had to know
>>  what kind of array you had before you knew how to find out how many
>>  elements it had.)
>
>I'm not sure what you mean here.  When I grepped,  I got 52 hits for
>nelements() in the numarray source, let alone what users have done with
>it.  Right now,  IMHO, it's not clearly broken and there are bigger fish
>to fry.

Since you ask...

I'm counting the number of implementations in the public interface of 
the numarray package. There are four implementations of size 
(including the numarray array method, which is simply a synonym for 
nelements), but only one implementation of nelements.

When I started using numarray, the following was true:

* numarray had a function named size.
* numarray.ma had the same function
* numarray.ma arrays had method size
* All of these worked the same way:
   size(array, axis=None)
     size  returns the number of elements in an array or
     along the specified axis.

BUT numarray arrays had no method size. Instead there was a method 
nelements, which did the same thing as size, but had no "axis" 
argument.


This was very confusing, and I got tripped up badly because I was 
trying to count array elements and was using both "normal" numarray 
arrays and masked arrays. I filed PR 934514 and some kind soul 
patched the problem by making size a synonym for nelements.

There is a bit of residual mess because the new size does not have 
the axis argument. And then there's the historical clutter of two 
ways to do the same thing, but presumably one just lives with that. 
Though it seems a bit strange to me not to deprecate nelements and 
stop using it internally.

-- Russell


From Fernando.Perez at colorado.edu  Fri Oct 22 14:50:04 2004
From: Fernando.Perez at colorado.edu (Fernando Perez)
Date: Fri Oct 22 14:50:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]> <1098479844.29804.260.camel@halloween.stsci.edu>
Message-ID: <41797FFE.8090802@colorado.edu>

Todd Miller wrote:

> sumAll() would certainly be better.
> 
> Unless there are objections,  I'll rename the current sum() method to
> sumAll() and re-write sum() to give a deprecation warning before calling
> sumAll().  Eventually,  it'll go away altogether.

silly, minor nit: can we avoid mixed case names? Either sum_all or SumAll? I'm 
not too fond of CamelCase, but camelCase looks even worse to me :)

As I said, it's just a minor nit.  I don't know if there's an official naming 
policy for numarray, so please don't get angry at me if my comment is out of 
place.

Best,

f


From Chris.Barker at noaa.gov  Fri Oct 22 15:12:01 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri Oct 22 15:12:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]> <1098479844.29804.260.camel@halloween.stsci.edu>
Message-ID: <4179853F.8040800@noaa.gov>

Todd Miller wrote:

> By default,  sum() now uses the maximum type of the type family of the
> array, so families Bool, Integer, UnsignedInteger, Float, or Complex 
> result in max types Bool, Int64, UInt64, Float64, Complex64.  I'm not
> sure why we segregated Bool and it looks like a mistake to me now.  I'm
> thinking the Bool "family" should just go away and be re-classified as
> UnsignedInteger.

Well, I think that the idea of a bool being different than an int is 
often useful. In this case, we want Bool to behave like an integer, so 
that we can use some version of sum() to add up all the true values. 
This is handy, but maybe we need more complete support for boolean 
arrays, rather than getting rid of them. For instance, there could be a 
NumTrue() function or method, for this case. I would probably maintain 
the easy conversion of a Bool array to an Int array, for when you really 
do need to do math with them.

We'd want a complete set, many of which already exist. A few off the top 
of my head:

sometrue
alltrue
numtrue

Maybe mirrors for false:
somefalse
allfalse
numfalse

What else would be needed? My vote would be for all of these to be 
methods of a Bool array, but I'm partial to methods over functions anyway.

On the other hand, Python itself is subclassing Bool from integer, so 
maybe there's little point.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                     		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From aisaac at american.edu  Fri Oct 22 15:14:07 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Fri Oct 22 15:14:07 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu><417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu><41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
 <1098479844.29804.260.camel@halloween.stsci.edu>
Message-ID: <Mahogany-0.66.0-1388-20041022-181444.00@american.edu>

On 22 Oct 2004, Todd Miller apparently wrote:
> sumAll() would certainly be better.

> Unless there are objections,  I'll rename the current sum() method to
> sumAll() and re-write sum() to give a deprecation warning before calling
> sumAll().  Eventually,  it'll go away altogether.

Just two thoughts from a new user.
i. I agree that .sumAll is better than the current name
confusion.
ii. even better, I propose, would be for .sum to take
an axis argument, with default matching the sum function,
and possible value axis="all".

For the transition, the axis argument can be required.

fwiw,
Alan Isaac 


From rowen at u.washington.edu  Fri Oct 22 15:19:02 2004
From: rowen at u.washington.edu (Russell E Owen)
Date: Fri Oct 22 15:19:02 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098480955.11372.19.camel@freyer.sfo.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	
 <417809B9.5000108@noaa.gov>
 <1098392607.8249.20.camel@freyer.sfo.csun.edu>	
 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
 <1098480955.11372.19.camel@freyer.sfo.csun.edu>
Message-ID: <p06200517bd9f307eeee5@[128.95.99.44]>

At 2:35 PM -0700 2004-10-22, Stephen Walton wrote:
>On Fri, 2004-10-22 at 11:17, Russell E Owen wrote about the sum() Ufunc
>vs. the sum() method:
>
>>  Numarray is already confusing enough without identically
>>  named functions and methods that do different things
>
>When I went through the Numarray docs and made suggestions for
>improvements (see the list I posted at Sourceforge), I didn't make any
>comments about functional changes, only what the documentation said.
>Since the sum() method is documented using 1-D arrays, you can't tell
>that it in fact behaves differently than the sum() Ufunc.  On
>reflection, I also agree that the Ufuncs and methods should behave the
>same way.
>
>Why do you say 'numarray is confusing'?  What in the docs would help
>un-confuse it, in your view?

OK, since I seem to be in a grumpy mood today, here are some examples 
(probably nothing new here):
- I'll expose my ignorance, but I find the take stuff and fancy 
indexing nearly incomprehensible. I've tried to follow the examples 
(several times--i.e. every time I need to do something fancy), but 
generally I either flail around until I find something that works, or 
give up and write a C extension.

- I'd like to write C/C++ code that would work on multiple array 
types. This seems a natural use of C++ templates, but that doesn't 
seem to be "how it's done". I hate to think how the internal code is 
managing this without being a horrible spaghetti of code repeated 
for each array type.

The nd_image package is the closest I've come to finding source code 
that makes any sense to me in this area. But it uses so many 
custom-defined specialized functions that I figured it was just too 
much work to figure out w/out a manual (and risky to rely on these 
functions since they are internal to the package).

So I gave up and just support the one data type I really need now. 
Very disappointing.

- Important functions are sometimes buried in a non-obvious (to me) 
sub-package.

For example: try to find that location at which an array has a 
minimum value (if there's more than one such point, pick any). You'd 
think it'd be a standard numarray function, wouldn't you? After all, 
you can ask for the minimum value. Now try to find it.

Well, I started out by trying to figure out how to get argmin to do 
the job. Horrible.

Fortunately I finally found minimum_position buried in nd_image.

- Masked arrays are not integrated. Thus a lot of important filtering 
and stuff simply cannot be done on masked data without writing custom 
extensions. For instance I'd like to do a median-filter that ignores 
masked data (taking the median of non-masked data only).

- For 2-d images x and y are reversed. I know this isn't going to 
change, but it is a headache every time I have to write new image 
processing code.

- I keep wanting more support for dealing with arrays of indices, 
e.g. "give me all the indices for which this is true", then use that 
to process the data in an array. Numarray seems to do that kind of 
operation in an entirely different way, suggesting I'm not "with it" 
on the underlying philosophy. Unfortunately no really good examples 
come to mind at the moment (it's been awhile since I've created new 
code using numarray), though I was fairly well convinced that if I 
had enough support for this I could code an efficient radial profile 
function w/out using a C extension.

-- Russell


From perry at stsci.edu  Fri Oct 22 16:50:01 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Oct 22 16:50:01 2004
Subject: [Numpy-discussion] In case there are any questions about numarray...
In-Reply-To: <p06200517bd9f307eeee5@[128.95.99.44]>
Message-ID: <NEBBIJKBMLDBLNCEEFOCOECAFIAA.perry@stsci.edu>

Todd and I will be away most of next week at a conference
and will likely not have a chance to respond to questions about
numarray or continue the current discussions about the
proper numarray interface or improvements to the
documentation. 

Perry 


From aisaac at american.edu  Fri Oct 22 19:17:02 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Fri Oct 22 19:17:02 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <4179853F.8040800@noaa.gov>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]> <1098479844.29804.260.camel@halloween.stsci.edu><4179853F.8040800@noaa.gov>
Message-ID: <Mahogany-0.66.0-1768-20041022-221742.00@american.edu>

More new user feedback ...

On Fri, 22 Oct 2004, Chris Barker apparently wrote:
> Well, I think that the idea of a bool being different than
> an int is often useful.

Yes.  E.g., applications to directed graphs.


> we can use some version of sum() to add up all the
> true values.

Unclear, but given the existence of sometrue,
it seems natural enough to let sum treat a Bool as an
integer.  Products work naturally, of course.

> I would probably maintain
> the easy conversion of a Bool array to an Int array, for when you really
> do need to do math with them.

I would rephrase this.
Boolean arrays have a naturally different math,
which it would be nice to have supported.
It would also be nice to easily convert to Int,
when that representation captures the math needed.

> We'd want a complete set, many of which already exist. A few off the top
> of my head:
> sometrue
> alltrue
> numtrue       

I'd just let sum handle numtrue.

> Maybe mirrors for false:
> somefalse, allfalse, numfalse

I'd just rely on alltrue, sometrue, and (size less sum) for these.

fwiw,
Alan


From stephen.walton at csun.edu  Fri Oct 22 22:23:02 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Fri Oct 22 22:23:02 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <p06200517bd9f307eeee5@[128.95.99.44]>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
	 <1098480955.11372.19.camel@freyer.sfo.csun.edu>
	 <p06200517bd9f307eeee5@[128.95.99.44]>
Message-ID: <1098508579.3403.6.camel@localhost.localdomain>

I had no idea my innocent question would generate so much discussion.
Mindful that Perry and Todd are at ADASS in Pasadena next week:

On Fri, 2004-10-22 at 15:18 -0700, Russell E Owen wrote: 
> At 2:35 PM -0700 2004-10-22, Stephen Walton wrote:
> >
> >Why do you say 'numarray is confusing'?  What in the docs would help
> >un-confuse it, in your view?
> 
> - I'll expose my ignorance, but I find the take stuff and fancy 
> indexing nearly incomprehensible.

I agree.  It took me much experimentation to figure out exactly how it
worked.  I'd appreciate it very much if you would look at my suggested
rewrite of this section of the documentation at

http://sourceforge.net/tracker/index.php?func=detail&aid=1047889&group_id=1369&atid=101369

and give me any further thoughts for clarification (post them as
comments to the bug report itself).

> - I'd like to write C/C++ code that would work on multiple array 
> types.

I can't help much here, other than to say that C and C++ are pretty low
level languages, not well suited for this level of abstraction.

> - Important functions are sometimes buried in a non-obvious (to me) 
> sub-package.
> For example: try to find that location at which an array has a 
> minimum value

The current index to the documentation seems to include only the
function names but not concepts, which is a problem.  I myself was
trying to remember how to do type conversion;  there is no entry in the
index for 'conversion' or 'coercion' and I finally grepped my local copy
of the HTML files to re-find astype(). 

> - Masked arrays are not integrated.

I haven't tried these yet personally, but I agree that such a feature is
a very important one.  IRAF got partway along on this but didn't finish
it either.

Having said that, my workaround/technique for both MATLAB and numarray
is to simply put NaN's in the places where there is no valid data and do
something like

sum(sum(A(~isnan(A))))

This is MATLAB syntax of course.  Something similar in numarray would go
a long way to helping me.  For example, I have full disk solar images
and I'd like to be able to operate on just the sunspot pixels, or just
the sky pixels, in a straightforward way.
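
A rough Python counterpart of the MATLAB expression above, sketched
with NumPy as a stand-in for numarray (an assumption; the array values
are invented for illustration):

```python
import numpy as np

# A small "image" with one invalid pixel marked as NaN.
A = np.array([[1.0, np.nan],
              [3.0, 4.0]])

valid = ~np.isnan(A)       # boolean mask: True where the data is valid
total = A[valid].sum()     # sum over the valid pixels only

print(total)               # 8.0
print(np.nansum(A))        # 8.0, NumPy's shortcut that treats NaN as zero
```

The same boolean-mask indexing would select "just the sunspot pixels"
given any mask of the same shape as the image.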

> - For 2-d images x and y are reversed.

Are you referring to the fact that C and numarray are row major and
Fortran is column major?  Or to how images get displayed in the various
plot packages?

> - I keep wanting more support for dealing with arrays of indices, 
> e.g. "give me all the indices for which this is true", then use that 
> to process the data in an array. Numarray seems to do that kind of 
> operation in an entirely different way, suggesting I'm not "with it" 
> on the underlying philosophy.

There are two ways to do this, both of which work.  For example:

A=arange(25)
sum(A[A<=7])

will work just as you expect.  A bool array used as an index picks out
those values for which the bool is True.  Essentially identical syntax
now works in MATLAB too.  If you want an index array instead:

>>> index=where(A<7)
>>> A[index]

will do the trick.  For arrays of rank greater than 1:

>>> A=arange(25,shape=(5,5))
>>> where(A<7)
(array([0, 0, 0, 0, 0, 1, 1]), array([0, 1, 2, 3, 4, 0, 1]))

which is a tuple of two arrays that can be used to index A:

>>> ind1,ind2=where(A<7)
>>> A[ind1,ind2]
array([0, 1, 2, 3, 4, 5, 6])
>>> A[ind1,ind2]=[6,5,4,3,2,1,0]	# assignment works too
>>> A
array([[ 6,  5,  4,  3,  2],
       [ 1,  0,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

Does this help?

-- 
Stephen Walton <stephen.walton at csun.edu>
Physics & Astronomy CSUN


From verveer at embl-heidelberg.de  Sat Oct 23 04:14:04 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Sat Oct 23 04:14:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <p06200517bd9f307eeee5@[128.95.99.44]>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]> <1098480955.11372.19.camel@freyer.sfo.csun.edu> <p06200517bd9f307eeee5@[128.95.99.44]>
Message-ID: <9633F2FA-24E4-11D9-B9D4-000D932805AC@embl-heidelberg.de>

I thought I'd just give my point of view on this, since I do believe we 
should give these some thought.

On Oct 23, 2004, at 12:18 AM, Russell E Owen wrote:

> OK, since I seem to be in a grumpy mood today, here are some examples 
> (probably nothing new here):
> - I'll expose my ignorance, but I find the take stuff and fancy 
> indexing nearly incomprehensible. I've tried to follow the examples 
> (several times--i.e. every time I need to do something fancy), but 
> generally I either flail around until I find something that works, or 
> give up and write a C extension.

I agree, it is very complicated; I always have trouble 
understanding what is going on when I use take and indexing. More 
documentation may help.

> - I'd like to write C/C++ code that would work on multiple array 
> types. This seems a natural use of C++ templates, but that doesn't 
> seem to be "how it's done". I hate to think how the internal code is 
> managing this without being a horrible spaghetti of code repeated for 
> each array type.

This is a good point. If you look at examples for implementing 
something in C, you always see that the code only handles a single data 
type, usually converting all input to double type. That is not always a 
good way to write an extension if you want it to be of generic use 
(e.g. the FFT module does not handle 32 bits floating point well, which 
is a problem for big arrays). Some support in writing functions that 
handle multiple data types would be good.

> The nd_image package is the closest I've come to finding source code 
> that makes any sense to me in this area. But it uses so many 
> custom-defined specialized functions that I figured it was just too 
> much work to figure out w/out a manual (and risky to rely on these 
> functions since they are internal to the package).

The internal nd_image C functions are indeed not exported and should 
not be used to implement extensions. That is going to stay that way 
since I do not plan to document these, and in any case, exposing such 
functions is not the purpose of the module.

On the other hand, some of the techniques used may be generally useful. 
I could try to factor some of the functions and macros out and write 
something up on the use of these to write extensions that handle 
multiple data types.

> So I gave up and just support the one data type I really need now. 
> Very disappointing.

Yes, it should be easier to do this, I agree. Using C macros as a 'poor 
man's' templating system is in fact not too complicated (although pretty 
ugly).

Another approach that I have tried to use in nd_image is to provide 
generic functions that take a python or a C function to implement 
functionality. For instance to implement an arbitrary filter function 
in nd_image  you only need to implement a function that calculates the 
filter at one point. You then call a generic filter function that does 
the heavy lifting of dealing with multiple array types,  iterating over 
the array, dealing with borders and such, applying the function at each 
array element. The filter function can be in python, but can also be a 
C function, communicated by a CObject.
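
The division of labour described above can be sketched in a few lines.
This is an illustration only, written against NumPy (an assumption);
generic_filter_1d is a hypothetical name and is not the nd_image API.
The caller supplies a function that computes the result at one point,
and the generic routine handles iteration, borders, and input type.

```python
import numpy as np


def generic_filter_1d(a, func, size):
    """Hypothetical helper: apply func to each window of `size` elements.

    Assumes a 1-D input and an odd window size; edges are handled by
    repeating the border values.
    """
    a = np.asarray(a)
    half = size // 2
    padded = np.pad(a, half, mode="edge")
    out = np.empty(a.shape, dtype=float)
    for i in range(a.size):
        # The user-supplied function only ever sees one window.
        out[i] = func(padded[i:i + size])
    return out


# The same call works for integer and 32-bit float input.
ints = np.array([1, 9, 2, 8, 3])
floats = ints.astype(np.float32)
result = generic_filter_1d(ints, np.median, 3)
print(result.tolist())                                    # [1.0, 2.0, 8.0, 3.0, 3.0]
print(generic_filter_1d(floats, np.median, 3).tolist())   # same values
```

The per-point function here is np.median, but any callable that maps a
window to a scalar would do.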

Maybe some functions of this type could be provided with the numarray 
package. This could simplify writing extensions a lot. Would there be 
interest for a package of such functions? If there is I could think 
about it a bit more, and propose (and implement) something in the form 
of an extension.

> - Important functions are sometimes buried in a non-obvious (to me) 
> sub-package.
>
> For example: try to find that location at which an array has a minimum 
> value (if there's more than one such point, pick any). You'd think 
> it'd be a standard numarray function, wouldn't you? After all, you can 
> ask for the minimum value. Now try to find it.

Agreed, this bothered me too.

> Well, I started out by trying to figure out how to get argmin to do 
> the job. Horrible.
>
> Fortunately I finally found minimum_position buried in nd_image.

It is there because numarray did not provide it... But it is also there 
because it offers much functionality that would not be appropriate for 
the main package. It is part of the object measurement functions. A 
simpler, possibly more efficient routine should maybe be part of the 
main package.
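
For comparison, a minimal sketch of such a "simpler routine", written
with NumPy as a stand-in for numarray (an assumption; this is not
nd_image's minimum_position): argmin gives an index into the flattened
array, and unravel_index turns it back into coordinates.

```python
import numpy as np

A = np.array([[5, 2, 9],
              [7, 1, 8]])

flat_index = np.argmin(A)    # position in the flattened array
position = tuple(int(i) for i in np.unravel_index(flat_index, A.shape))

print(position)              # (1, 1)
print(int(A[position]))      # 1
```

If the minimum occurs more than once, argmin returns the first
occurrence, which matches the "pick any" requirement.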

> - Masked arrays are not integrated. Thus a lot of important filtering 
> and stuff simply cannot be done on masked data without writing custom 
> extensions. For instance I'd like to do a median-filter that ignores 
> masked data (taking the median of non-masked data only).

I agree very much! To be honest, I do not like the ma package much. I 
don't like the idea of having to use a separate package with a 
different array type that duplicates the functionality in the main 
package. I think it would be much better if all functions (where it 
makes sense) in numarray would accept an optional mask argument. To me 
it makes more sense to provide the mask with the operation, not as part 
of the array like in ma (a package like ma could still be layered on 
top.) I realize it would be a lot of work to make all numarray 
functions mask aware, but it is something to think about maybe.
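
A sketch of what "providing the mask with the operation" could look
like, using NumPy as a stand-in (an assumption). masked_median is a
hypothetical helper, and the convention that True marks a masked
element is chosen here for illustration.

```python
import numpy as np


def masked_median(a, mask=None):
    """Hypothetical helper: median of the elements where mask is False."""
    a = np.asarray(a)
    if mask is None:
        return float(np.median(a))
    keep = ~np.asarray(mask, dtype=bool)    # True marks a masked element
    return float(np.median(a[keep]))


data = np.array([1.0, 100.0, 3.0, 5.0])
bad = np.array([False, True, False, False])

print(masked_median(data))         # 4.0
print(masked_median(data, bad))    # 3.0
```

The array itself stays an ordinary array; only the call carries the
mask.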

> - For 2-d images x and y are reversed. I know this isn't going to 
> change, but it is a headache every time I have to write new image 
> processing code.

This is not really a problem I think, but you have to get used to it. 
If you treat the last dimension always as X and the first as Y, you 
have the same layout in memory as is usual in most image processing 
software. So X corresponds to axis=1 and Y to axis=0. Or use axis=-1 
and axis=-2.
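
The convention can be checked with a tiny example (NumPy used as a
stand-in for numarray, which is an assumption; the sizes are invented):

```python
import numpy as np

height, width = 3, 4
image = np.zeros((height, width))    # shape is (Y, X): rows first

x, y = 2, 1
image[y, x] = 7.0                    # index as [y, x], not [x, y]

row_sums = image.sum(axis=1)         # sum along X: one value per row
col_sums = image.sum(axis=0)         # sum along Y: one value per column

print(row_sums.tolist())             # [0.0, 7.0, 0.0]
print(col_sums.tolist())             # [0.0, 0.0, 7.0, 0.0]
```

Using axis=-1 for X and axis=-2 for Y gives the same results and keeps
working if leading dimensions are added later.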

Cheers, Peter


From aisaac at american.edu  Sat Oct 23 12:01:04 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Sat Oct 23 12:01:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <Mahogany-0.66.0-1388-20041022-181444.00@american.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu><417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu><41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]><1098479844.29804.260.camel@halloween.stsci.edu><Mahogany-0.66.0-1388-20041022-181444.00@american.edu>
Message-ID: <Mahogany-0.66.0-1768-20041023-150116.00@american.edu>

On Fri, 22 Oct 2004 Alan G Isaac apparently wrote:
> Just two thoughts from a new user.
> i. I agree that .sumAll is better than the current name
> confusion.
> ii. even better, I propose, would be for .sum to take
> an axis argument, with default matching the sum function,
> and possible value axis="all".
> For the transition, the axis argument can be required.


That should have been: axis=None
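
A sketch of the proposed behaviour, using NumPy as a stand-in (an
assumption). array_sum is a hypothetical helper whose default of
axis=0 mirrors the sum function under discussion, with axis=None
meaning "sum everything".

```python
import numpy as np


def array_sum(a, axis=0):
    """Hypothetical helper: sum along `axis`, or over all elements if None."""
    return np.sum(np.asarray(a), axis=axis)


A = np.arange(6).reshape(2, 3)       # [[0, 1, 2], [3, 4, 5]]

print(array_sum(A).tolist())             # [3, 5, 7]   (default, axis=0)
print(array_sum(A, axis=1).tolist())     # [3, 12]
print(int(array_sum(A, axis=None)))      # 15          (whole array)
```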

fwiw,
Alan Isaac


From stephen.walton at csun.edu  Sun Oct 24 19:22:03 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Sun Oct 24 19:22:03 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <41797FFE.8090802@colorado.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
	 <1098479844.29804.260.camel@halloween.stsci.edu>
	 <41797FFE.8090802@colorado.edu>
Message-ID: <1098670236.1907.21.camel@localhost.localdomain>

On Fri, 2004-10-22 at 14:47, Fernando Perez wrote:

> silly, minor nit: can we avoid mixed case names? Either sum_all or SumAll? I'm 
> not too fond of CamelCase, but camelCase looks even worse to me :)

I agree with Fernando about CamelCase (which among other things
seriously bites one when moving from case-sensitive to case-insensitive
OS's).  But I want to make a broader point:

I don't think we need sumall.  The methods and the functions should
simply work the same way.  If one wants sumall, use A.flat.sum() or, if
you can't use the methods or attributes on your old version of Python,
sum(ravel(A)).  If you start writing sumall, then you'll need meanall,
stdall, prodall, etc, etc.  The flat attribute and ravel function/method
already provide all the needed functionality.

Just trying to save Todd some work.

Steve


From verveer at embl-heidelberg.de  Mon Oct 25 01:37:05 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 01:37:05 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098670236.1907.21.camel@localhost.localdomain>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain>
Message-ID: <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de>

On 25 Oct 2004, at 04:17, Stephen Walton wrote:

> On Fri, 2004-10-22 at 14:47, Fernando Perez wrote:
>
>> silly, minor nit: can we avoid mixed case names? Either sum_all or 
>> SumAll? I'm
>> not too fond of CamelCase, but camelCase looks even worse to me :)
>
> I agree with Fernando about CamelCase (which among other things
> seriously bites one when moving from case-sensitive to case-insensitive
> OS's).  But I want to make a broader point:
>
> I don't think we need sumall.  The methods and the functions should
> simply work the same way.  If one wants sumall, use A.flat.sum() or, if
> you can't use the methods or attributes on your old version of Python,
> sum(ravel(A)).  If you start writing sumall, then you'll need meanall,
> stdall, prodall, etc, etc.  The flat attribute and ravel 
> function/method
> already provide all the needed functionality.

I think this may be inefficient, because ravel and flat may make a copy 
of the data. Also I think using flat/ravel in such a way is plain ugly 
and a complex way to do it.
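
Whether a copy is made depends on the memory layout. The sketch below
checks this with NumPy (an assumption; numarray's flat and ravel may
behave differently in detail):

```python
import numpy as np

A = np.arange(6).reshape(2, 3)    # contiguous in memory
At = A.T                          # transposed view, no longer C-contiguous

contiguous_shares = np.shares_memory(A, A.ravel())
transposed_shares = np.shares_memory(At, At.ravel())

print(contiguous_shares)          # True: ravel returned a view
print(transposed_shares)          # False: ravel had to make a copy
```

So flattening before summing is free for contiguous arrays but costs a
full copy for transposed or strided ones.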

But I do agree that it is not a good idea to introduce another set of 
names. In my opinion functions that calculate a statistic like sum 
should return the total in the first place, rather than over a single 
axis. But I guess it is too late to change that for sum, because of 
backward compatibility.

Cheers, Peter


From stephen.walton at csun.edu  Mon Oct 25 09:20:02 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Mon Oct 25 09:20:02 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
	 <1098479844.29804.260.camel@halloween.stsci.edu>
	 <41797FFE.8090802@colorado.edu>
	 <1098670236.1907.21.camel@localhost.localdomain>
	 <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
Message-ID: <1098721171.19183.12.camel@sunspot.csun.edu>

On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote:
> On 25 Oct 2004, at 04:17, Stephen Walton wrote:
> >
> > I don't think we need sumall.  The methods and the functions should
> > simply work the same way.  If one wants sumall, use A.flat.sum() or, if
> > you can't use the methods or attributes on your old version of Python,
> > sum(ravel(A)).
>
> I think this may be inefficient, because ravel and flat may make a copy 
> of the data. Also I think using flat/ravel in such a way is plain ugly 
> and a complex way to do it.

You may be right about the copying, I couldn't say.  I don't think
sum(ravel(A)) looks any worse than sum(sum(sum(A))) for a rank 3 array,
but ugly is in the eye of the beholder.
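
As a rough sketch of the two spellings, assuming modern NumPy as a
stand-in for Numeric/numarray (an assumption, not what the thread ran),
np.add.reduce plays the role of the one-axis sum() discussed here:

```python
import numpy as np

# Stand-in rank-3 array; NumPy is assumed here, not Numeric/numarray.
A = np.arange(24).reshape(2, 3, 4)

# add.reduce collapses only the first axis, like the sum() in this thread,
# so a rank-3 array needs three nested calls to reach a scalar.
nested = np.add.reduce(np.add.reduce(np.add.reduce(A)))

# Flattening first needs a single reduction, whatever the rank.
flattened = np.add.reduce(np.ravel(A))

print(nested, flattened)
```

Both spellings give the same total; the nested form depends on knowing
the rank in advance, while the ravel form does not.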

> In my opinion functions that calculate a statistic like sum 
> should return the total in the first place, rather than over a single 
> axis.

It depends on the data.  I use rank-2 arrays which are images and are
therefore homogeneous.  Even there, though, I often want the sum of all
rows or all columns.  For heterogeneous data (e.g., columns of different
Y's as a function of X), the present sum() makes sense.  In other words,
we will always need ways to sum over just one dimension and over all
dimensions.  By analogy with MATLAB (I'm guessing), sum() in Numeric and
numarray does a one-D sum.

-- 
Stephen Walton, Professor of Physics and Astronomy,
California State University, Northridge
stephen.walton at csun.edu


From tim.hochberg at cox.net  Mon Oct 25 09:32:01 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Mon Oct 25 09:32:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098721171.19183.12.camel@sunspot.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>	 <1098479844.29804.260.camel@halloween.stsci.edu>	 <41797FFE.8090802@colorado.edu>	 <1098670236.1907.21.camel@localhost.localdomain>	 <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu>
Message-ID: <417D2A3C.7010108@cox.net>

Stephen Walton wrote:

>On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote:
>>On 25 Oct 2004, at 04:17, Stephen Walton wrote:
>>>I don't think we need sumall.  The methods and the functions should
>>>simply work the same way.  If one wants sumall, use A.flat.sum() or, if
>>>you can't use the methods or attributes on your old version of Python,
>>>sum(ravel(A)).
>>
>>I think this may be inefficient, because ravel and flat may make a copy 
>>of the data. Also I think using flat/ravel in such a way is plain ugly 
>>and a complex way to do it.
>
>You may be right about the copying, I couldn't say.  I don't think
>sum(ravel(A)) looks any worse than sum(sum(sum(A))) for a rank 3 array,
>but ugly is in the eye of the beholder.

I'm not sure how feasible it is, but I'd much rather have an efficient, 
non-copying, 1-D view of a noncontiguous array (from an enhanced 
version of flat or ravel or whatever) than a bunch of extra methods. The 
former allows all of the standard methods to just work efficiently using 
sum(ravel(A)) or sum(A.flat) [and max and min, etc.]. Making special 
whole-array methods for everything just leads to method explosion.

-tim


>>In my opinion functions that calculate a statistic like sum 
>>should return the total in the first place, rather than over a single 
>>axis.
>
>It depends on the data.  I use rank-2 arrays which are images and are
>therefore homogeneous.  Even there, though, I often want the sum of all
>rows or all columns.  For heterogeneous data (e.g., columns of different
>Y's as a function of X), the present sum() makes sense.  In other words,
>we will always need ways to sum over just one dimension and over all
>dimensions.  By analogy with MATLAB (I'm guessing), sum() in Numeric and
>numarray does a one-D sum.


From stephen.walton at csun.edu  Mon Oct 25 09:35:06 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Mon Oct 25 09:35:06 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098721171.19183.12.camel@sunspot.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
	 <1098479844.29804.260.camel@halloween.stsci.edu>
	 <41797FFE.8090802@colorado.edu>
	 <1098670236.1907.21.camel@localhost.localdomain>
	 <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
	 <1098721171.19183.12.camel@sunspot.csun.edu>
Message-ID: <1098722079.19183.22.camel@sunspot.csun.edu>

On Mon, 2004-10-25 at 09:19 -0700, Stephen Walton wrote:
> On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote:
>
> > I think this may be inefficient, because ravel and flat may make a copy 
> > of the data. Also I think using flat/ravel in such a way is plain ugly 
> > and a complex way to do it.
> 
> You may be right about the copying, I couldn't say. 

I just looked at the source (numeric-1.1/Lib/generic.py).  The comment
to the ravel() function states that it returns a view, not a copy;  but
it calls reshape() which does make a copy if the input array is not
contiguous.  I just tested this:

A=arange(25,shape=(5,5))
A.transpose()		# now A is not contiguous
v=ravel(A)
A[2,2]=-17
v			# verifies that v did not change.

So, in the above, it does look like ravel() made a copy, and your fears
about inefficiency are warranted.  Another test shows that changing
ravel(A) to A.flat above also results in a copy.  Mayhaps we need
sumall() after all.
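
The same check can be sketched in modern NumPy (an assumption; the test
above ran on numarray).  In NumPy, transpose() returns a new view
instead of changing A in place, so the sketch keeps the result in At,
and np.shares_memory reports directly whether ravel() copied:

```python
import numpy as np

A = np.arange(25).reshape(5, 5)
At = A.T                      # transposed view of A: not C-contiguous

v_contig = np.ravel(A)        # contiguous input: ravel can return a view
v_noncontig = np.ravel(At)    # noncontiguous input: ravel has to copy

ravel_contig_shares = np.shares_memory(A, v_contig)
ravel_noncontig_shares = np.shares_memory(At, v_noncontig)

At[2, 2] = -17                # also changes A[2, 2], since At is a view of A

print(ravel_contig_shares, ravel_noncontig_shares)
print(v_contig[12], v_noncontig[12])
```

The raveled copy of the noncontiguous array keeps the old value at
position 12, while the raveled view of the contiguous array sees the
change.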

-- 
Stephen Walton, Professor of Physics and Astronomy,
California State University, Northridge
stephen.walton at csun.edu


From verveer at embl-heidelberg.de  Mon Oct 25 09:44:04 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 09:44:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098721171.19183.12.camel@sunspot.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu>
Message-ID: <0BC8D972-26A5-11D9-9F77-000A95C92C8E@embl-heidelberg.de>

On 25 Oct 2004, at 18:19, Stephen Walton wrote:

> On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote:
>> On 25 Oct 2004, at 04:17, Stephen Walton wrote:
>>>
>>> I don't think we need sumall.  The methods and the functions should
>>> simply work the same way.  If one wants sumall, use A.flat.sum() or, 
>>> if
>>> you can't use the methods or attributes on your old version of 
>>> Python,
>>> sum(ravel(A)).
>>
>> I think this may be inefficient, because ravel and flat may make a 
>> copy
>> of the data. Also I think using flat/ravel in such a way is plain ugly
>> and a complex way to do it.
>
> You may be right about the copying, I couldn't say.  I don't think
> sum(ravel(A)) looks any worse than sum(sum(sum(A))) for a rank 3 array,
> but ugly is in the eye of the beholder.

It does not look worse, I agree with that! But I would argue it should 
have been sum(A) in the first place to sum over all axes... The sumall 
would not have been needed, and summing over one axis (or a subset of 
axes) could have been implemented as an optional argument to sum().
>
>> In my opinion functions that calculate a statistic like sum
>> should return the total in the first place, rather than over a single
>> axis.
>
> It depends on the data.  I use rank-2 arrays which are images and are
> therefore homogeneous.  Even there, though, I often want the sum of all
> rows or all columns.  For heterogeneous data (e.g., columns of 
> different
> Y's as a function of X), the present sum() makes sense.  In other 
> words,
> we will always need ways to sum over just one dimension and over all
> dimensions.  By analogy with MATLAB (I'm guessing), sum() in Numeric 
> and
> numarray does a one-D sum.

I agree it is a useful feature, and it should still be possible to do 
that using an optional axis argument, even better I would love to be 
able to sum over several axes in one go, I find the one-dimensional 
character of reduce limiting, but I digress. In any case, I suppose we 
will stick with the current behaviour for backwards compatibility.
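
To make the idea concrete, here is a minimal sketch of such a function,
assuming modern NumPy as a stand-in.  The name total() and its axis
argument are hypothetical, not an existing numarray or Numeric API:

```python
import numpy as np

def total(a, axis=None):
    """Sum over all axes by default, or over one axis or a tuple of axes.

    Hypothetical helper for illustration only.
    """
    a = np.asarray(a)
    if axis is None:
        axes = tuple(range(a.ndim))
    elif isinstance(axis, int):
        axes = (axis,)
    else:
        axes = tuple(axis)
    # Reduce the highest axis first so the remaining axis numbers stay valid.
    for ax in sorted((ax % a.ndim for ax in axes), reverse=True):
        a = np.add.reduce(a, axis=ax)
    return a

A = np.arange(24).reshape(2, 3, 4)
print(total(A))                 # whole-array total
print(total(A, axis=0).shape)   # one axis removed
print(total(A, axis=(0, 2)))    # two axes removed in one call
```

With this default, the whole-array total is the short spelling and the
single-axis reduction is the one that needs an argument.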

Cheers, Peter


From verveer at embl-heidelberg.de  Mon Oct 25 09:47:01 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 09:47:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098722079.19183.22.camel@sunspot.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <1098722079.19183.22.camel@sunspot.csun.edu>
Message-ID: <60595242-26A5-11D9-9F77-000A95C92C8E@embl-heidelberg.de>

On 25 Oct 2004, at 18:34, Stephen Walton wrote:

> On Mon, 2004-10-25 at 09:19 -0700, Stephen Walton wrote:
>> On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote:
>>
>>> I think this may be inefficient, because ravel and flat may make a 
>>> copy
>>> of the data. Also I think using flat/ravel in such a way is plain 
>>> ugly
>>> and a complex way to do it.
>>
>> You may be right about the copying, I couldn't say.
>
> I just looked at the source (numeric-1.1/Lib/generic.py).  The comment
> to the ravel() function states that it returns a view, not a copy;  but
> it calls reshape() which does make a copy if the input array is not
> contiguous.  I just tested this:
>
> A=arange(25,shape=(5,5))
> A.transpose()		# now A is not contiguous
> v=ravel(A)
> A[2,2]=-17
> v			# verifies that v did not change.
>
> So, in the above, it does look like ravel() made a copy, and your fears
> about inefficiency are warranted.  Another test shows that changing
> ravel(A) to A.flat above also results in a copy.  Mayhaps we need
> sumall() after all.

Yes, we do I guess, but I do not like such things creeping into an 
otherwise elegant package if I may be frank...

Peter


From strang at nmr.mgh.harvard.edu  Mon Oct 25 09:53:00 2004
From: strang at nmr.mgh.harvard.edu (Gary Strangman)
Date: Mon Oct 25 09:53:00 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <417D2A3C.7010108@cox.net>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov>
 <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov>
  <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu>
  <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain>
  <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
 <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net>
Message-ID: <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu>

> I'm not sure how feasible it is, but I'd much rather an efficient, 
> non-copying, 1-D view of an noncontiguous array (from an enhanced version of 
> flat or ravel or whatever) than a bunch of extra methods. The former allows 
> all of the standard methods to just work efficiently using sum(ravel(A)) or 
> sum(A.flat) [ and max and min, etc]. Making special whole array methods for 
> everything just leads to method explosion.

I completely agree with this ... an efficient flat/ravel would seem to 
solve many of the issues being raised. Forgive the potentially naive 
question here, but is there any reason such an efficient, enhanced view 
can't be implemented for the .flat method? I like the concept of .flat, 
but I regularly call functions with arguments that may-or-may-not be 
contiguous. For robustness, such functions _must_ be coded with ravel() 
because .flat fails for non-contiguous arrays. I never fully understood 
why there were two ways of "flattening" in the first place.

Gary

--------------------------------------------------------------
Gary Strangman, PhD        |  Director, Neural Systems Group
Office: 617-724-0662       |  Massachusetts General Hospital
Fax:    617-726-4078       |  149 13th Street, Ste 10018
                            |  Charlestown, MA  02129


From verveer at embl-heidelberg.de  Mon Oct 25 10:09:05 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 10:09:05 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov> <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu>
Message-ID: <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de>

On 25 Oct 2004, at 18:51, Gary Strangman wrote:

>
>> I'm not sure how feasible it is, but I'd much rather an efficient, 
>> non-copying, 1-D view of an noncontiguous array (from an enhanced 
>> version of flat or ravel or whatever) than a bunch of extra methods. 
>> The former allows all of the standard methods to just work 
>> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min, 
>> etc]. Making special whole array methods for everything just leads to 
>> method explosion.
>
> I completely agree with this ... an efficient flat/ravel would seem to 
> solve many of the issues being raised. Forgive the potentially naive 
> question here, but is there any reason such an efficient, enhanced 
> view can't be implemented for the .flat method?

I believe it is not possible without copying data. The strides between 
elements of a noncontiguous array are not always the same, so you 
cannot efficiently view it as a 1D array.
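
A small sketch of what that means, assuming modern NumPy (not numarray)
and a hypothetical helper element_offsets().  It lists the byte offset
of each element in row-major logical order; a 1-D view needs the step
between consecutive offsets to be a single constant:

```python
import numpy as np

A = np.arange(25).reshape(5, 5)
At = A.T   # noncontiguous view of the same buffer

def element_offsets(arr):
    """Byte offset of every element from the first one, in row-major order."""
    offsets = []
    for index in np.ndindex(*arr.shape):
        offsets.append(sum(i * s for i, s in zip(index, arr.strides)))
    return offsets

def steps(offsets):
    """Distinct byte steps between consecutive elements."""
    return sorted({b - a for a, b in zip(offsets, offsets[1:])})

contig_steps = steps(element_offsets(A))      # one step: the item size
noncontig_steps = steps(element_offsets(At))  # two different steps

print(contig_steps, noncontig_steps)
```

For the transposed array the step jumps backwards at the end of every
row, so no single stride describes it in this order.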

>  I like the concept of .flat, but I regularly call functions with 
> arguments that may-or-may-not be contiguous. For robustness, such 
> functions _must_ be coded with ravel() because .flat fails for 
> non-contiguous arrays.

Functions should be coded in the first place to take multi-dimensional 
nature into account in my opinion. One of the points of numarray is 
that it is multi-dimensional. If a function can work over multiple 
dimensions, but it only works for 1D arrays, it is broken in my 
opinion. In my opinion sum() _is_ broken, and introducing a separate 
sum_all() is an ugly hack.

> I never fully understood why there were two ways of "flattening" in 
> the first place.

I suppose it is for efficiency reasons: flat may not always work, but 
when it does, it is efficient since it does not need to copy any data.

Peter


From Chris.Barker at noaa.gov  Mon Oct 25 10:10:20 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Mon Oct 25 10:10:20 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098508579.3403.6.camel@localhost.localdomain>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>	 <1098480955.11372.19.camel@freyer.sfo.csun.edu>	 <p06200517bd9f307eeee5@[128.95.99.44]> <1098508579.3403.6.camel@localhost.localdomain>
Message-ID: <417D3309.9070302@noaa.gov>

A few comments on a number of posts in this thread:

Stephen Walton wrote:
>>- I'd like to write C/C++ code that would work on multiple array 
>>types.
> 
> I can't help much here, other than to say that C and C++ are pretty low
> level languages, not well suited for this level of abstraction.

Well, this is certainly true for C, but not so much for C++. I'm no 
expert, but C++ templates could be very handy here. When the numarray 
project was just getting started, there was some discussion about using 
a template-based array package as the base, perhaps Blitz++. I still 
think this was a great idea, but I think the biggest issue at the time 
was that templates were still not consistently well supported by the wide 
variety of compilers that numarray should work with. Personally I think 
that anything supported by gcc should be fine, as anyone can use gcc on 
virtually any platform, if they want.

Anyway, it's too late to re-write numarray, but maybe a numarray <--> 
blitz++ conversion package would make it easy to write numarray 
extensions with blitz++. Perhaps even integrate it with Boost.Python. 
Another option would be to write a template-based wrapper around the 
existing Numarray objects.

By the way, my other issue with extensions is the difficulty of writing 
extensions that support discontinuous arrays, in addition to multiple 
data types. It seems someone smarter than me could use C++ classes to 
solve this one as well.

Peter Verveer wrote:

> But I do agree that it is not a good idea to introduce another set of 
> names. In my opinion functions that calculate a statistic like sum 
> should return the total in the first place, rather than over a single 
> axis.

Absolutely not! I'm far more likely to want it over a single axis, it's 
the core of "vectorizing" your code. If the data all mean the same 
thing, why aren't you storing them in a 1-d array? That being said, it 
should be easy to do various reductions over all axes, which I think 
.flat() does nicely. I thought .flat() never made a copy: am I wrong?

Stephen Walton wrote:
> It depends on the data.  I use rank-2 arrays which are images and are
> therefore homogeneous.

OK, good example.... I take back some of what I said above!

> By analogy with MATLAB (I'm guessing), sum() in Numeric and
> numarray does a one-D sum.

Except MATLAB does it worse. If your 2-d array happens to have only one 
row, you get the sum over that row... yecch!

Tim Hochberg wrote:
> I'm not sure how feasible it is, but I'd much rather an efficient, 
> non-copying, 1-D view of an noncontiguous array (from an enhanced 
> version of flat or ravel or whatever) than a bunch of extra methods. The 
> former allows all of the standard methods to just work efficiently using 
> sum(ravel(A)) or sum(A.flat) [ and max and min, etc]. Making special 
> whole array methods for everything just leads to method explosion.

Hear! Hear! I thought that was exactly what .flat() was for. Shows what 
I know!

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                     		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From rowen at u.washington.edu  Mon Oct 25 10:33:02 2004
From: rowen at u.washington.edu (Russell E Owen)
Date: Mon Oct 25 10:33:02 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> 
 <417809B9.5000108@noaa.gov>
 <1098392607.8249.20.camel@freyer.sfo.csun.edu> 
 <41794B47.4090909@noaa.gov> <p06200510bd9eff3962b0@[128.95.99.44]> 
 <1098479844.29804.260.camel@halloween.stsci.edu>
 <41797FFE.8090802@colorado.edu> 
 <1098670236.1907.21.camel@localhost.localdomain>
 <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
 <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net>
 <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu>
 <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
Message-ID: <p06200505bda2e56656f4@[128.95.99.44]>

At 7:08 PM +0200 2004-10-25, Peter Verveer wrote:
>On 25 Oct 2004, at 18:51, Gary Strangman wrote:
>
>>
>>>  I'm not sure how feasible it is, but I'd much rather an 
>>>efficient, non-copying, 1-D view of an noncontiguous array (from 
>>>an enhanced version of flat or ravel or whatever) than a bunch of 
>>>extra methods. The former allows all of the standard methods to 
>>>just work efficiently using sum(ravel(A)) or sum(A.flat) [ and max 
>>>and min, etc]. Making special whole array methods for everything 
>>>just leads to method explosion.
>>
>>  I completely agree with this ... an efficient flat/ravel would 
>>seem to solve many of the issues being raised. Forgive the 
>>potentially naive question here, but is there any reason such an 
>>efficient, enhanced view can't be implemented for the .flat method?
>
>I believe it is not possible without copying data. The strides 
>between elements of a noncontiguous array are not always the same, 
>so you cannot efficiently view it as a 1D array.

How about providing an iterator that counts through all the elements 
of an array (e.g. arr.itervalues()). So long as C extensions could 
efficiently make use of such an iterator, I think it'd do the job.

One could also imagine:
- arr.iteritems(), which returned (index, value) for each item
- a mask argument: a boolean array the same shape as the data array; 
True means elide the corresponding value from the data array
- general support for indexing

More generally, I agree that sum should work the same as a function 
and a method, and that an extra axis argument could be a good thing 
(it is so common elsewhere, e.g. size). I'd be tempted to break 
backwards compatibility to fix this, since numarray is still new and 
the current situation is very confusing.
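
A minimal pure-Python sketch of such iterators, assuming modern NumPy
as a stand-in.  The names itervalues() and iteritems() and the mask
argument are hypothetical and follow the wording above; this shows the
interface only and says nothing about how a C extension would use it:

```python
import numpy as np

def iteritems(arr, mask=None):
    """Yield (index, value) pairs in row-major order.

    A True entry in mask elides the corresponding element.
    """
    arr = np.asarray(arr)
    for index in np.ndindex(*arr.shape):
        if mask is not None and mask[index]:
            continue
        yield index, arr[index]

def itervalues(arr, mask=None):
    """Yield only the values, with the same ordering and masking."""
    for _, value in iteritems(arr, mask):
        yield value

A = np.arange(6).reshape(2, 3).T      # noncontiguous, shape (3, 2)
mask = A % 2 == 1                     # elide the odd values
total = sum(itervalues(A))
even_total = sum(itervalues(A, mask))

print(total, even_total)
```

The iterators never flatten the array, so contiguity does not matter,
at the price of one Python-level step per element.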

-- Russell


From strang at nmr.mgh.harvard.edu  Mon Oct 25 10:38:01 2004
From: strang at nmr.mgh.harvard.edu (Gary Strangman)
Date: Mon Oct 25 10:38:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov>
 <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov>
 <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu>
 <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain>
 <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
 <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net>
 <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu>
 <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
Message-ID: <Pine.LNX.4.60.0410251322280.27302@gate.nmr.mgh.harvard.edu>

>> I completely agree with this ... an efficient flat/ravel would seem to 
>> solve many of the issues being raised. Forgive the potentially naive 
>> question here, but is there any reason such an efficient, enhanced view 
>> can't be implemented for the .flat method?
>
> I believe it is not possible without copying data. The strides between 
> elements of a noncontiguous array are not always the same, so you cannot 
> efficiently view it as a 1D array.

And it gets even worse for different-stride slices of N-D arrays (though 
I'm not yet ready to say it's impossible to do without copying). Maybe 
it's just me, but it does seem somewhat non-pythonic for a function/method 
to break for an inefficient case, instead of dropping back to less 
efficient (i.e., copying) behavior.

> Functions should be coded in the first place to take multi-dimensional nature 
> into account in my opinion. One of the points of numarray is that it is 
> multi-dimensional. If a function can work over multiple dimensions, but it 
> only works for 1D arrays, it is broken in my opinion. In my opinion sum() 
> _is_ broken, and introducing a separate sum_all() is an ugly hack.

+1. ;-) Hence the thought to make flattening a single "enhanced" 
method/fcn ... to essentially eliminate the need for such ugly hacks. 
Typically, my functions accept N-D arguments, and can operate over a 
user-selected subset of these dimensions. I may pass a whole array, or 
every other column, or whatever. Judging from the history of this thread, 
I think a .flat that is as-efficient-as-possible and also robust to all 
forms of non-contiguity would benefit many, while also reducing the 
learning-curve issues associated with .flat vs ravel().

As for where/when/how to introduce .newandimprovedflat, welllllll, that's 
for another thread. ;-)

Gary

--------------------------------------------------------------
Gary Strangman, PhD        |  Director, Neural Systems Group
Office: 617-724-0662       |  Massachusetts General Hospital
Fax:    617-726-4078       |  149 13th Street, Ste 10018
                            |  Charlestown, MA  02129


From verveer at embl-heidelberg.de  Mon Oct 25 10:42:03 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 10:42:03 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <417D3309.9070302@noaa.gov>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>	 <1098480955.11372.19.camel@freyer.sfo.csun.edu>	 <p06200517bd9f307eeee5@[128.95.99.44]> <1098508579.3403.6.camel@localhost.localdomain> <417D3309.9070302@noaa.gov>
Message-ID: <1A9085AC-26AD-11D9-9F77-000A95C92C8E@embl-heidelberg.de>

> Stephen Walton wrote:
>>> - I'd like to write C/C++ code that would work on multiple array 
>>> types.
>> I can't help much here, other than to say that C and C++ are pretty 
>> low
>> level languages, not well suited for this level of abstraction.
>
> Well, this is certainly true for C, but not so much for C++. I'm no 
> expert, but C++ templates could be very handy here. When the numarray 
> project was just getting started, there was some discussion about 
> using a template-based array package as the base, perhaps Blitz++. I 
> still think this was a great idea, but I think the biggest issue at the 
> time was that templates were still not consistently well supported by 
> the wide variety of compilers that numarray should work with. 
> Personally I think that anything supported by gcc should be fine, as 
> anyone can use gcc on virtually any platform, if they want.

I think having the option of using C++ would be cool. But as soon as we 
would 'require' it, I would not develop for numarray anymore. C++ is a 
big pain in my opinion, although I do agree that a well written 
templating system like Blitz++ is nice if you actually use C++.

> Anyway, it's too late to re-write numarray, but maybe a numarray <--> 
> blitz++ conversion package would make it easy to write numarray 
> extensions with blitz++. Perhaps even integrate it with Boost.Python. 
> Another option would be to write a template-based wrapper around the 
> existing Numarray objects.

yes, it would be nice to have the option. There is no reason why there 
could not be a C++ API which would include the use of templates layered 
on top of the current C API for those people that would like to use it.

> By the way, my other issue with extensions is the difficulty of 
> writing extensions that support discontinuous arrays, in addition to 
> multiple data types. It seems someone smarter than me could use C++ 
> classes to solve this one as well.

I had to deal with that problem too in nd_image. It is doable, albeit 
ugly if you depend on plain C. Probably C++ could do it differently and 
more nicely; Blitz++ possibly does. Again, not for me.

> Peter Verveer wrote:
>
>> But I do agree that it is not a good idea to introduce another set of 
>> names. In my opinion functions that calculate a statistic like sum 
>> should return the total in the first place, rather than over a single 
>> axis.
>
> Absolutely not! I'm far more likely to want it over a single axis, 
> it's the core of "vectorizing" your code. If the data all mean the 
> same thing, why aren't you storing them in a 1-d array?

I agree that it is important, I am just saying that both are very 
common operations. Why not support operations over an axis by an 
optional argument? You will often have to specify which axis you want 
anyway.

> That being said, it should be easy to do various reductions over all 
> axes, which I think .flat() does nicely. I thought .flat() never made 
> a copy: am I wrong?

Unfortunately, flattening an array is not always possible without 
copying, due to the fact that arrays may be not contiguous in memory.

> Tim Hochberg wrote:
>> I'm not sure how feasible it is, but I'd much rather an efficient, 
>> non-copying, 1-D view of an noncontiguous array (from an enhanced 
>> version of flat or ravel or whatever) than a bunch of extra methods. 
>> The former allows all of the standard methods to just work 
>> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min, 
>> etc]. Making special whole array methods for everything just leads to 
>> method explosion.
>
> here! here! I thought that was exactly what .flat() was for. Shows 
> what I know!

It is however not feasible I think to do it efficiently. It seems to me 
that a set of functions is necessary to do things like sum, minimum and 
so on, that work on the whole array. I would also prefer they are not 
methods. Introducing a whole array of sum_all() like functions is also 
not great.

Cheers, Peter


From verveer at embl-heidelberg.de  Mon Oct 25 11:04:01 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 11:04:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <p06200505bda2e56656f4@[128.95.99.44]>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov> <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <p06200505bda2e56656f4@[128.95.99.44]>
Message-ID: <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de>

On 25 Oct 2004, at 19:32, Russell E Owen wrote:

> At 7:08 PM +0200 2004-10-25, Peter Verveer wrote:
>> On 25 Oct 2004, at 18:51, Gary Strangman wrote:
>>
>>>
>>>>  I'm not sure how feasible it is, but I'd much rather an efficient, 
>>>> non-copying, 1-D view of an noncontiguous array (from an enhanced 
>>>> version of flat or ravel or whatever) than a bunch of extra 
>>>> methods. The former allows all of the standard methods to just work 
>>>> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min, 
>>>> etc]. Making special whole array methods for everything just leads 
>>>> to method explosion.
>>>
>>>  I completely agree with this ... an efficient flat/ravel would seem 
>>> to solve many of the issues being raised. Forgive the potentially 
>>> naive question here, but is there any reason such an efficient, 
>>> enhanced view can't be implemented for the .flat method?
>>
>> I believe it is not possible without copying data. The strides 
>> between elements of a noncontiguous array are not always the same, so 
>> you cannot efficiently view it as a 1D array.
>
> How about providing an iterator that counts through all the elements 
> of an array (e.g. arr.itervalues()). So long as C extensions could 
> efficiently make use of such an iterator, I think it'd do the job.

It would still be slower, because you would need a function call at 
each element that returns a value. Not a problem if you do a lot of 
work at each element, but if you are just adding values you want a 
custom written C function. You can do it at the C level with macros or
so (I do that in nd_image), but that would not help at the Python
level.
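
To make the stride argument concrete, here is a minimal pure-Python sketch (not numarray code). The names element_offsets and itervalues are hypothetical, strides are counted in elements rather than bytes, and the 3x4 buffer is made up for illustration. It suggests why a noncontiguous view cannot simply be treated as a 1D array (its elements are not evenly spaced in the buffer) and why an iterator costs something per element (index arithmetic plus one returned value per step).

```python
from itertools import product


def element_offsets(shape, strides, offset=0):
    """Yield the buffer position of each element in row-major order."""
    for index in product(*(range(n) for n in shape)):
        yield offset + sum(i * s for i, s in zip(index, strides))


def itervalues(buffer, shape, strides, offset=0):
    """Yield every element of a strided view, one value per step."""
    for position in element_offsets(shape, strides, offset):
        yield buffer[position]


# A contiguous 3x4 array stored row by row in a flat buffer.
buffer = list(range(12))

# The view a[:, :2] keeps the row stride of 4 but has only 2 columns.
view_shape, view_strides = (3, 2), (4, 1)
view_offsets = list(element_offsets(view_shape, view_strides))
steps = [b - a for a, b in zip(view_offsets, view_offsets[1:])]
total = sum(itervalues(buffer, view_shape, view_strides))

print(view_offsets)  # positions visited in the flat buffer
print(steps)         # gaps between consecutive elements are not constant
print(total)
```

A C-level loop can do the same offset arithmetic inline, which is where the speed difference would come from.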

> One could also imagine:
> - arr.iteritems(), which returned (index, value) for each item
> - a mask argument: a boolean array the same shape as the data array; 
> True means elide the corresponding value from the data array
> - general support for indexing

Essentially you are suggesting to expose iterators at the python level 
that iterate over an array in some predefined way. That is possible, 
but I doubt it will be efficient.

At the C level however, it might be worth thinking about as a way of 
easing writing functions in C. I proposed to do it the other way around 
in an earlier mail: providing a set of generic functions that take a 
python or a C function to be applied at each element. I most likely 
will implement something in that direction, but I should give your idea 
also some thought.
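
The "other way around" approach could look roughly like the following pure-Python sketch. The name map_elements is hypothetical and nested lists stand in for arrays; in a real implementation the traversal would presumably live in C, with only the per-element function supplied by the caller.

```python
def map_elements(func, data):
    """Apply func to every element of a nested-list 'array'."""
    if isinstance(data, list):
        return [map_elements(func, item) for item in data]
    return func(data)


image = [[1, 2, 3], [4, 5, 6]]
doubled = map_elements(lambda value: 2 * value, image)
print(doubled)
```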

> More generally, I agree that sum should work the same as a function 
> and a method, and that an extra axis argument could be a good thing 
> (it is so common elsewhere, e.g. size). I'd be tempted to break 
> backwards compatibility to fix this, since numarray is still new and 
> the current situation is very confusing.

I would absolutely vote for such a change, simply because we would like
a range of such functions, e.g. minimum, maximum, and so on. Even if we
have to leave sum() as it is, I think we should have the alternatives;
we would just have to come up with an alternative name for sum(). In
fact, I would consider volunteering to implement these functions.

Peter


From tim.hochberg at cox.net  Mon Oct 25 14:03:03 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Mon Oct 25 14:03:03 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov> <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <p06200505bda2e56656f4@[128.95.99.44]> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
Message-ID: <417D69CD.7070604@cox.net>

Peter Verveer wrote:

>
> On 25 Oct 2004, at 19:32, Russell E Owen wrote:
>
>> At 7:08 PM +0200 2004-10-25, Peter Verveer wrote:
>>
>>> On 25 Oct 2004, at 18:51, Gary Strangman wrote:
>>>
>>>>
>>>>>  I'm not sure how feasible it is, but I'd much rather have an
>>>>> efficient, non-copying, 1-D view of a noncontiguous array (from
>>>>> an enhanced version of flat or ravel or whatever) than a bunch of
>>>>> extra methods. The former allows all of the standard methods to
>>>>> just work efficiently using sum(ravel(A)) or sum(A.flat) [ and max
>>>>> and min, etc]. Making special whole array methods for everything
>>>>> just leads to method explosion.
>>>>
>>>>
>>>>  I completely agree with this ... an efficient flat/ravel would 
>>>> seem to solve many of the issues being raised. Forgive the 
>>>> potentially naive question here, but is there any reason such an 
>>>> efficient, enhanced view can't be implemented for the .flat method?
>>>
>>>
>>> I believe it is not possible without copying data. The strides 
>>> between elements of a noncontiguous array are not always the same, 
>>> so you cannot efficiently view it as a 1D array.
>>
>>
>> How about providing an iterator that counts through all the elements 
>> of an array (e.g. arr.itervalues()). So long as C extensions could 
>> efficiently make use of such an iterator, I think it'd do the job.
>
>
> It would still be slower, because you would need a function call at 
> each element that returns a value. Not a problem if you do a lot of 
> work at each element, but if you are just adding values you want a 
> custom written C function. You can do it at the C level with macros or
> so, (I do that in nd_image) but that would not help at the python level.
>
>> One could also imagine:
>> - arr.iteritems(), which returned (index, value) for each item
>> - a mask argument: a boolean array the same shape as the data array; 
>> True means elide the corresponding value from the data array
>> - general support for indexing
>
>
> Essentially you are suggesting to expose iterators at the python level 
> that iterate over an array in some predefined way. That is possible, 
> but I doubt it will be efficient.
>
> At the C level however, it might be worth thinking about as a way of 
> easing writing functions in C. I proposed to do it the other way 
> around in an earlier mail: providing a set of generic functions that 
> take a python or a C function to be applied at each element. I most 
> likely will implement something in that direction, but I should give 
> your idea also some thought.
>
>> More generally, I agree that sum should work the same as a function 
>> and a method, and that an extra axis argument could be a good thing 
>> (it is so common elsewhere, e.g. size). I'd be tempted to break 
>> backwards compatibility to fix this, since numarray is still new and 
>> the current situation is very confusing.
>
>
> I would absolutely vote for such a change. Simply because we would 
> like a range of such functions, e.g. minimum, maximum, and so on. Even 
> if we have to leave sum() as it is, I think we should have the 
> alternatives, we would just have to come up with an alternative name 
> for sum(). In fact I would consider volunteering implementing these 
> functions.

Why the need to break backwards compatibility? If one is going to
reimplement sum et al. so as to operate on an arbitrary set of axes,
there's no reason one couldn't maintain the current behaviour as the 
default. All that is required is to allow axis to be a number (current 
behaviour), a tuple (reduce across the designated axes) or some special 
value to sum over all (None?, "all"?).
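
A minimal pure-Python sketch of such an axis argument, using nested lists in place of arrays. The function names are hypothetical, and the default of axis 0 is an assumption about the current behaviour. A tuple or None is handled here by sequential single-axis reductions, which is simple but not necessarily efficient.

```python
import operator
from functools import reduce


def ndim(data):
    """Number of nesting levels of a regular nested list."""
    n = 0
    while isinstance(data, list):
        data = data[0]
        n += 1
    return n


def combine(op, x, y):
    """Apply op element by element to two equally shaped operands."""
    if isinstance(x, list):
        return [combine(op, a, b) for a, b in zip(x, y)]
    return op(x, y)


def reduce_axis(op, data, axis):
    """Reduce along a single axis (0 is the outermost)."""
    if axis == 0:
        return reduce(lambda x, y: combine(op, x, y), data)
    return [reduce_axis(op, sub, axis - 1) for sub in data]


def reduce_axes(op, data, axis=0):
    """axis may be an int, a tuple of ints, or None for all axes."""
    if axis is None:
        axis = tuple(range(ndim(data)))
    elif isinstance(axis, int):
        axis = (axis,)
    # Reduce the highest axis first so lower axis numbers stay valid.
    for ax in sorted(axis, reverse=True):
        data = reduce_axis(op, data, ax)
    return data


a = [[1, 2, 3], [4, 5, 6]]
print(reduce_axes(operator.add, a))               # default: axis 0
print(reduce_axes(operator.add, a, axis=1))
print(reduce_axes(operator.add, a, axis=(0, 1)))
print(reduce_axes(operator.add, a, axis=None))
```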

Having two sum functions with different names is not particularly better 
than the current proposal of a method and a function.

-tim


From verveer at embl-heidelberg.de  Mon Oct 25 15:48:03 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 15:48:03 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <417D69CD.7070604@cox.net>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov> <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <p06200505bda2e56656f4@[128.95.99.44]> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <417D69CD.7070604@cox.net>
Message-ID: <E4AF47EA-26D7-11D9-8DC3-000D932805AC@embl-heidelberg.de>

On Oct 25, 2004, at 11:02 PM, Tim Hochberg wrote:

> Peter Verveer wrote:
>
>>
>> On 25 Oct 2004, at 19:32, Russell E Owen wrote:
>>
>>> At 7:08 PM +0200 2004-10-25, Peter Verveer wrote:
>>>
>>>> On 25 Oct 2004, at 18:51, Gary Strangman wrote:
>>>>
>>>>>
>>>>>>  I'm not sure how feasible it is, but I'd much rather have an
>>>>>> efficient, non-copying, 1-D view of a noncontiguous array (from
>>>>>> an enhanced version of flat or ravel or whatever) than a bunch of
>>>>>> extra methods. The former allows all of the standard methods to
>>>>>> just work efficiently using sum(ravel(A)) or sum(A.flat) [ and
>>>>>> max and min, etc]. Making special whole array methods for
>>>>>> everything just leads to method explosion.
>>>>>
>>>>>
>>>>>  I completely agree with this ... an efficient flat/ravel would 
>>>>> seem to solve many of the issues being raised. Forgive the 
>>>>> potentially naive question here, but is there any reason such an 
>>>>> efficient, enhanced view can't be implemented for the .flat 
>>>>> method?
>>>>
>>>>
>>>> I believe it is not possible without copying data. The strides 
>>>> between elements of a noncontiguous array are not always the same, 
>>>> so you cannot efficiently view it as a 1D array.
>>>
>>>
>>> How about providing an iterator that counts through all the elements 
>>> of an array (e.g. arr.itervalues()). So long as C extensions could 
>>> efficiently make use of such an iterator, I think it'd do the job.
>>
>>
>> It would still be slower, because you would need a function call at 
>> each element that returns a value. Not a problem if you do a lot of 
>> work at each element, but if you are just adding values you want a 
>> custom written C function. You can do it at the C level with macros or
>> so, (I do that in nd_image) but that would not help at the python 
>> level.
>>
>>> One could also imagine:
>>> - arr.iteritems(), which returned (index, value) for each item
>>> - a mask argument: a boolean array the same shape as the data array; 
>>> True means elide the corresponding value from the data array
>>> - general support for indexing
>>
>>
>> Essentially you are suggesting to expose iterators at the python 
>> level that iterate over an array in some predefined way. That is 
>> possible, but I doubt it will be efficient.
>>
>> At the C level however, it might be worth thinking about as a way of 
>> easing writing functions in C. I proposed to do it the other way 
>> around in an earlier mail: providing a set of generic functions that 
>> take a python or a C function to be applied at each element. I most 
>> likely will implement something in that direction, but I should give 
>> your idea also some thought.
>>
>>> More generally, I agree that sum should work the same as a function 
>>> and a method, and that an extra axis argument could be a good thing 
>>> (it is so common elsewhere, e.g. size). I'd be tempted to break 
>>> backwards compatibility to fix this, since numarray is still new and 
>>> the current situation is very confusing.
>>
>>
>> I would absolutely vote for such a change. Simply because we would 
>> like a range of such functions, e.g. minimum, maximum, and so on. 
>> Even if we have to leave sum() as it is, I think we should have the 
>> alternatives, we would just have to come up with an alternative name 
>> for sum(). In fact I would consider volunteering implementing these 
>> functions.
>
> Why the need to break backwards compatibility? If one is going to
> reimplement sum, et al so as to operate on an arbitrary set of axes 
> there's no reason one couldn't maintain the current behaviour as the 
> default.

It seems to me that the behavior one would expect from a function like
that is to apply the operation to the whole array, not along an axis.
What would you expect as a new user if you call a minimum() function?
A single value that is the minimum. So that is the logical choice for
the default behavior, I would think.

>  All that is required is to allow axis to be a number (current 
> behaviour), a tuple (reduce across the designated axes) or some 
> special value to sum over all (None?, "all"?).

Yes, that would be the idea anyway. The question is what the default
behavior should be for this type of function, something I think we
should decide not on the basis of the current behavior of a single
existing function, but on the basis of what makes the most sense. That
is obviously something that can be discussed...

>
> Having two sum functions with different names is not particularly 
> better than the current proposal of a method and a function.

This is certainly true. I would prefer breaking compatibility...

Peter


From Chris.Barker at noaa.gov  Tue Oct 26 09:21:08 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Tue Oct 26 09:21:08 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <E4AF47EA-26D7-11D9-8DC3-000D932805AC@embl-heidelberg.de>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov> <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <p06200505bda2e56656f4@[128.95.99.44]> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <417D69CD.7070604@cox.net> <E4AF47EA-26D7-11D9-8DC3-000D932805AC@embl-heidelberg.de>
Message-ID: <417E7907.9060107@noaa.gov>

Peter Verveer wrote:
> On Oct 25, 2004, at 11:02 PM, Tim Hochberg wrote:
>> Why the need to break backwards compatibility? If one is going to
>> reimplement sum, et al so as to operate on an arbitrary set of axes 
>> there's no reason one couldn't maintain the current behaviour as the 
>> default.

Great idea!

> It seems to me that the behavior one would expect for a function like 
> that, would be to apply the operation to the whole array. Not along an 
> axis. What would you expect as a new user if you call a minimum() 
> function?  A single value that is the minimum. So that is the logical 
> choice for the default behavior, I would think.

nope. I'd expect it to be along an axis, by default the last one. To me, 
that's what vectorization is all about. Maybe this is because of my 
MATLAB (and now Numeric) background, but it makes the most sense to me 
that a method either returns an array of the same rank, or "reducing" 
methods return an array of rank reduced by one. Having a method return 
the same rank answer, no matter the rank of the input, is weird to me.

This all depends on how you use arrays. I can see that if you tend to 
use a 2-d array to store an image, that the single minimum would seem 
logical, but for many other uses, each dimension has an independent meaning.
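
The rank rule described here can be written down as a small shape calculation. This is only a sketch: reduced_shape is a hypothetical helper, and treating axis=None as "reduce over everything" is an assumption borrowed from the proposals earlier in the thread.

```python
def reduced_shape(shape, axis=-1):
    """Shape left after reducing one axis; axis=None reduces all axes."""
    if axis is None:
        return ()
    axis = axis % len(shape)  # allow negative axis numbers
    return shape[:axis] + shape[axis + 1:]


print(reduced_shape((40, 512, 512)))             # last axis: rank 3 -> 2
print(reduced_shape((40, 512, 512), axis=0))     # first axis: rank 3 -> 2
print(reduced_shape((40, 512, 512), axis=None))  # whole array: rank 0
```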

> Yes, that would be the idea anyway. The question is what should be the 
> default behavior for this type of functions, something I think we should 
> not decide based on the current behavior of a single existing function, 
> but based on what makes the most sense. That is obviously something that 
> can be discussed...

yup, but frankly, this isn't about just one function, it's really about 
all the reductions: min, max, sum, etc, etc. I think the rule of thumb 
is not to break backward compatibility unless there is a compelling 
reason, and given that it's not clear what is most "natural" in this 
case, keeping the default the same makes the most sense.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                     		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From verveer at embl-heidelberg.de  Tue Oct 26 11:20:02 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Tue Oct 26 11:20:02 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <417E7907.9060107@noaa.gov>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov> <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <p06200505bda2e56656f4@[128.95.99.44]> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <417D69CD.7070604@cox.net> <E4AF47EA-26D7-11D9-8DC3-000D932805AC@embl-heidelberg.de> <417E7907.9060107@noaa.gov>
Message-ID: <8629C0DC-277B-11D9-8DC3-000D932805AC@embl-heidelberg.de>

On Oct 26, 2004, at 6:19 PM, Chris Barker wrote:

> Peter Verveer wrote:
>> It seems to me that the behavior one would expect for a function like 
>> that, would be to apply the operation to the whole array. Not along 
>> an axis. What would you expect as a new user if you call a minimum() 
>> function?  A single value that is the minimum. So that is the logical 
>> choice for the default behavior, I would think.
>
> nope. I'd expect it to be along an axis, by default the last one.

I still do not agree completely with that, I will elaborate more below, 
because I also do not agree anymore with my own earlier writings :-).

But I see your point that this type of operation can be natural 
depending on what you are doing. Sometimes a single value does make 
sense, sometimes not, I think we can agree on that.

>> Yes, that would be the idea anyway. The question is what should be 
>> the default behavior for this type of functions, something I think we 
>> should not decide based on the current behavior of a single existing 
>> function, but based on what makes the most sense. That is obviously 
>> something that can be discussed...
>
> yup, but frankly, this isn't about just one function, it's really 
> about all the reductions: min, max, sum, etc, etc.

Actually no. It seems that sum() is a special case, along with a few 
others. Again: I elaborate on the general case below.

> I think the rule of thumb is not to break backward compatibility 
> unless there is a compelling reason, and given that it's not clear 
> what is most "natural" in this case, keeping the default the same 
> makes the most sense.

I agree. In contrast to what I said before, I think we should keep it
as it is, for compatibility.

Now to elaborate on the general problem, please correct me if I get 
something wrong. I will use the minimum function as an example and come 
back to sum() later.

If you look at a minimum operation then there are three different 
things you might like to do:

1) An element by element minimum: minimum(a1, a2). This is the current 
behaviour. Like all binary ufuncs of this type, it operates on pairs of 
arrays. So by default it does not do reduction or calculate a single 
minimum. For most ufuncs that is the natural behavior anyway.

2) A reduction: minimum.reduce(a1). The reduce method of ufuncs is 
generally used for reductions. Having to use .reduce makes clear what 
you are doing. Although a bit odd at first sight, I think it is a 
clever way to overload ufuncs names with different functionality.

3) The minimum of the array:  In numarray you do a1.min(). I think in 
Numeric, you have to do something like minimum.reduce(a1.flat), correct 
me if I am wrong. Not nice in both cases...
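
The three cases can be spelled out in plain Python, with nested lists standing in for arrays. This is a sketch of the semantics only, not of numarray or Numeric code; it assumes the reduction runs along the first axis, and the sample values are made up.

```python
a1 = [[3, 8, 1], [7, 2, 9]]
a2 = [[5, 0, 4], [6, 6, 6]]

# 1) Element-by-element minimum of two arrays, like minimum(a1, a2).
elementwise = [[min(x, y) for x, y in zip(r1, r2)] for r1, r2 in zip(a1, a2)]

# 2) Reduction along the first axis, like minimum.reduce(a1).
reduced = [min(column) for column in zip(*a1)]

# 3) Minimum of the whole array, like a1.min() in numarray.
overall = min(value for row in a1 for value in row)

print(elementwise)
print(reduced)
print(overall)
```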

Note that calling a binary ufunc with a single argument will give an 
error: minimum(a1) raises a TypeError. That seems to be a good 
decision, because people seem to have different ideas of what should 
happen: I would expect the minimum of the array, others expect a 
reduction. Generally I guess it was a wise decision not to change the 
meaning of a function depending on whether it has one or two arguments.

The sum() function is an alias for add.reduce. There are a few more of
these aliases (e.g. product). I would still say that it is a bit
unfortunate, since not everybody may immediately realize that these
functions are in fact reductions.

I wonder if one would not be better off without these functions at all;
after all, you can access the functionality through .reduce(). If you
mind the extra typing, just define your own alias. Can't we shift them
into numarray.numeric? Just a thought...

In any case, clearly these functions need to stay around as they are 
for compatibility reasons. It is far more productive to add the 
functionality that a few people already proposed: allow reductions over 
multiple axes. I would welcome that, I always found 1D reductions a bit 
limited anyway. Obviously you can do sequential 1D reductions, but that 
can be quite inefficient. As proposed, the axis argument would take 
maybe a list of dimensions, and 'all' or None. I would like to propose 
an additional possibility: like minimum.reduce(), we could have a 
minimum.all() function that reduces over all dimensions (with a 
potentially much more efficient implementation.) We don't need a 
sum_all(a1) then; you would use add.all(a1). I guess this could be
easily prototyped using sequential reductions; one can worry about
efficiency later.
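
Such a prototype might look like the following pure-Python sketch. BinaryUfunc is a hypothetical toy class working on nested lists, not the numarray ufunc type; its reduce() handles only the first axis, and all() simply repeats that 1D reduction until a single value is left.

```python
import operator
from functools import reduce


class BinaryUfunc:
    """Toy stand-in for a binary ufunc operating on nested lists."""

    def __init__(self, op):
        self.op = op

    def __call__(self, a, b):
        # Element-by-element application to a pair of arrays.
        if isinstance(a, list):
            return [self(x, y) for x, y in zip(a, b)]
        return self.op(a, b)

    def reduce(self, a):
        # 1D reduction along the first axis only, to keep the sketch short.
        return reduce(self, a)

    def all(self, a):
        # Prototype: repeat the 1D reduction until a single value is left.
        while isinstance(a, list):
            a = self.reduce(a)
        return a


add = BinaryUfunc(operator.add)
minimum = BinaryUfunc(min)

a1 = [[3, 8, 1], [7, 2, 9]]
print(add.reduce(a1))   # column sums
print(add.all(a1))      # sum over all dimensions
print(minimum.all(a1))  # minimum over all dimensions
```

A dedicated implementation could walk the buffer once instead of building intermediate results, which is where the potential efficiency gain would be.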

Sorry for the long story...

Cheers, Peter


From haase at msg.ucsf.edu  Wed Oct 27 09:59:02 2004
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Wed Oct 27 09:59:02 2004
Subject: [Numpy-discussion] bug? in len(arr.flat)
Message-ID: <200410270958.20025.haase@msg.ucsf.edu>

Hi,
I have a (UInt16) 3d data stack and want to get to its underlying buffer (to
(later) feed it into memmap) ...
I noticed that len(pr2.flat) is half of len(pr2._data) - like it doesn't
multiply itemsize in.
>>> pr2.shape
(40, 512, 512)
>>> pr2.flat.shape
(10485760)
>>> 512*512*40
10485760
>>> len(pr2.flat)
10485760
>>> pr2.flat._itemsize
2
>>> len(pr2._data)
20971520
>>> pr2._byteoffset
0

Is this a bug or am I misunderstanding?

Thanks,
Sebastian Haase
  

From strawman at astraw.com  Thu Oct 28 19:21:02 2004
From: strawman at astraw.com (Andrew Straw)
Date: Thu Oct 28 19:21:02 2004
Subject: [Numpy-discussion] floating point exception weirdness
In-Reply-To: <41795006.1040807@astraw.com>
References: <4119BBFC.6020304@astraw.com> <1092221365.3752.32.camel@localhost.localdomain> <411A08FA.7000601@astraw.com> <41795006.1040807@astraw.com>
Message-ID: <4181A8CC.2040807@astraw.com>

Just a small addendum, (which I hope will spur on bug-fixing once Todd 
et al. are back from the conference -- let me know if I should file a 
sourceforge bug report):

Numeric is not necessary to trigger the bug in the below code -- 
numarray is sufficient on its own.  Furthermore, I can confirm that 
merely removing the "atlas3-sse2" Debian package from my system causes 
the code, whether or not numarray.ieeespecial is imported, to run 
without being killed by an FPE.

Andrew Straw wrote:

> I've isolated a bug I first reported on this mailing list in August.  
> I've now confined it to a small code snippet using entirely 
> open-source software (previously I saw it while using Intel's IPP).  
> In a nutshell, importing numarray.ieeespecial triggers a floating 
> point exception (which kills my program) when I call Numeric's 
> singular_value_decomposition() function:
>
> import Numeric
> from LinearAlgebra import singular_value_decomposition
>
> want_FPE = True  # set to False to skip the import that triggers the FPE
> if want_FPE:
>    import numarray.ieeespecial
>
> A= [[-5.7, 2.2, -0.53, 46.0],
>    [-2.3, -5.5, -1.0, 1091.0],
>    [5.9, 1.4, -0.1, -142.0],
>    [-1.3, 5.7, -1.5, 2673.0]]
> A=Numeric.array(A)
> u,s,v = singular_value_decomposition(A) # FPE triggered here
>
> Here's my setup:
>
> $ python
> Python 2.3.4 (#2, Sep 24 2004, 08:39:09)
> [GCC 3.3.4 (Debian 1:3.3.4-12)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import Numeric
> >>> Numeric.__version__
> '23.6'
> >>> import numarray
> >>> numarray.__version__
> '1.2a'
>
> $ gcc -v
> Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.4/specs
> Configured with: ../src/configure -v 
> --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang 
> --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info 
> --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared 
> --with-system-zlib --enable-nls --without-included-gettext 
> --enable-__cxa_atexit --enable-clocale=gnu --enable-debug 
> --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
> Thread model: posix
> gcc version 3.3.4 (Debian 1:3.3.4-13)
>
> Now, for the clue:  the above error is ONLY triggered when I compile 
> Numeric to use system blas and friends, not when I use lapack_lite 
> included with Numeric.  This leads me to suspect it is related to the 
> SSE2 unit -- I have Debian sarge's atlas3-base, atlas3-sse,
> atlas3-sse2, blas, lapack, lapack3, and refblas3 packages installed on 
> my P4 machine.
>
> So, to propose a hypothesis: numarray.ieeespecial sets the FPE bit in 
> the SSE2 hardware, but for some reason this does not raise SIGFPE.  
> However, when the next call that touches SSE2 happens, the kernel sees 
> that error bit and throws the signal.  Does this explanation make 
> sense?  Is it easy to fix?
>
> Cheers!
> Andrew


From stevech1097 at yahoo.com.au  Thu Oct 28 21:56:30 2004
From: stevech1097 at yahoo.com.au (Steve Chaplin)
Date: Thu Oct 28 21:56:30 2004
Subject: [Numpy-discussion] Re: floating point exception weirdness (Andrew Straw)
In-Reply-To: <E1CNNnd-0004TT-4x@sc8-sf-list2.sourceforge.net>
References: <E1CNNnd-0004TT-4x@sc8-sf-list2.sourceforge.net>
Message-ID: <1099025806.2742.23.camel@f1>

> Just a small addendum, (which I hope will spur on bug-fixing once Todd 
> et al. are back from the conference -- let me know if I should file a 
> sourceforge bug report):

I've not read all this thread so I don't know the full background. But I
had a floating point / SSE problem using numarray.

It turned out to be a glibc problem, not a numarray one, and was solved by
upgrading glibc.
http://sources.redhat.com/bugzilla/show_bug.cgi?id=10
There was also a SourceForge bug report but I can't locate it.

Regards
Steve


From jmiller at stsci.edu  Fri Oct 29 06:27:11 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct 29 06:27:11 2004
Subject: [Numpy-discussion] bug? in len(arr.flat)
In-Reply-To: <200410270958.20025.haase@msg.ucsf.edu>
References: <200410270958.20025.haase@msg.ucsf.edu>
Message-ID: <1099056380.4904.12.camel@localhost.localdomain>

On Wed, 2004-10-27 at 12:58, Sebastian Haase wrote:
> Hi,
> I have a (UInt16) 3d data stack and want to get to its underlying buffer (to 
> (later) feed it into memmap) ...  
> I noticed that len(pr2._flat) is half of len(pr2._data) - like it doesn't 
> multiply itemsize in.
> >>> pr2.shape
> (40, 512, 512)
> >>> pr2.flat.shape
> (10485760)
> >>> 512*512*40
> 10485760
> >>> len(pr2.flat)
> 10485760
> >>> pr2.flat._itemsize
> 2
> >>> len(pr2._data)
> 20971520
> >>> pr2._byteoffset
> 0
> 
> Is this a bug 

No.

> or am I misunderstanding ?

Yes.  _data is "an object which supports the buffer protocol".  In this
context,  it is effectively a string of bytes, so its length is the
product of the total number of elements and the itemsize.  (We'll ignore
for now the fact that not every array uses the entire buffer.)  In
contrast, shape(.flat) is only the total number of elements and is
independent of itemsize.
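
The relation between the two lengths can be sketched with the standard
library alone. This is only an illustration, not numarray code: the
stdlib array type stands in for the UInt16 stack, the stack is made
smaller than 40x512x512 to keep it cheap, and the names element_count
and byte_count are made up for the example.

```python
from array import array

# 'H' is an unsigned short (normally 2 bytes), standing in for UInt16.
n_elements = 40 * 8 * 8
stack = array("H", [0] * n_elements)

# Number of elements, analogous to len(arr.flat).
element_count = len(stack)

# Length of the raw byte buffer, analogous to len(arr._data).
byte_count = len(memoryview(stack).cast("B"))

print(element_count, stack.itemsize, byte_count)
```

The byte count is the element count multiplied by the itemsize, which is
why the buffer length in the session above is twice len(pr2.flat).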

Regards,
Todd


From haase at msg.ucsf.edu  Fri Oct 29 09:03:25 2004
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Fri Oct 29 09:03:25 2004
Subject: [Numpy-discussion] bug? in len(arr.flat)
In-Reply-To: <1099056380.4904.12.camel@localhost.localdomain>
References: <200410270958.20025.haase@msg.ucsf.edu> <1099056380.4904.12.camel@localhost.localdomain>
Message-ID: <200410290902.25410.haase@msg.ucsf.edu>

Of course !  sorry I forgot.

Thanks,
Sebastian


On Friday 29 October 2004 06:26 am, Todd Miller wrote:
> On Wed, 2004-10-27 at 12:58, Sebastian Haase wrote:
> > Hi,
> > I have a (UInt16) 3d data stack and want to get to its underlying buffer
> > (to (later) feed it into memmap) ...
> > I noticed that len(pr2._flat) is half of len(pr2._data) - like it doesn't
> > multiply itemsize in.
> >
> > >>> pr2.shape
> >
> > (40, 512, 512)
> >
> > >>> pr2.flat.shape
> >
> > (10485760)
> >
> > >>> 512*512*40
> >
> > 10485760
> >
> > >>> len(pr2.flat)
> >
> > 10485760
> >
> > >>> pr2.flat._itemsize
> >
> > 2
> >
> > >>> len(pr2._data)
> >
> > 20971520
> >
> > >>> pr2._byteoffset
> >
> > 0
> >
> > Is this a bug
>
> No.
>
> > or am I misunderstanding ?
>
> Yes.  _data is "an object which supports the buffer protocol".  In this
> context,  it is effectively a string and thus the product of the total
> number of elements and the itemsize.  (We'll ignore for now the fact
> that not every array uses the entire buffer.)  In contrast, shape(.flat)
> is only the total number of elements and is independent of itemsize.
>
> Regards,
> Todd


From jmiller at stsci.edu  Fri Oct 29 11:19:14 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct 29 11:19:14 2004
Subject: [Numpy-discussion] Counting array elements
Message-ID: <1099073854.4904.321.camel@localhost.localdomain>

I have returned from our astronomical data systems conference and I am
going to take a short cut and summarize what I saw as the key
developments of this thread.  I apologize for not responding sooner and
individually but the web-mail system I use isn't effective for
conducting any kind of discussion.  You guys did a great job sorting
this out this week.  I marked my key points with **.  The rest is
probably only for people with a lot of patience.

** I've finally come to terms with the fact that functions are the right
way to do numarray rather than methods.   The arguments in the Numeric
manual are no more persuasive now than they ever were,  but Stephen
Walton's remarks about method explosion finally convinced me that the
"real" reason for using functions is that methods put every new feature
under the umbrella of a single namespace, the NumArray class.  Using
functions lets us partition things into modules which can be used
selectively and makes a more extensible and understandable system.
Thanks Stephen.

A couple people remarked that using .flat might solve everything with
something like a.flat.sum() or sum(ravel(a)).  This gets to the original
motivation for the sum() method, which was the codification of a simple
and storage efficient technique for reducing noncontiguous arrays.  The
first point is that a non-contiguous array cannot generally be reshaped
without making a copy.   The basic idea of the sum() method is to do
*two* reductions,  the first, along a single axis,  results in a smaller
contiguous array.  In the case of astronomical images which are
generally square or at least non-degenerate,  the reduction result is a
*much* smaller array.  The second reduction handles all the remaining
dimensions since .flat is guaranteed to work because the array is
contiguous.  The end result is a complete sum() without writing
additional ufuncs or making an array copy.
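
The two-reduction idea can be sketched in plain Python. This is a
minimal illustration under assumptions, not the numarray implementation:
a nested list stands in for a 2-D array (so contiguity plays no role
here), and sum_all is a made-up name.

```python
from functools import reduce
from operator import add


def sum_all(rows):
    """Sum every element of a 2-D nested list using two reductions."""
    # Reduction 1: collapse axis 0, leaving one small row of column totals.
    column_totals = [reduce(add, column) for column in zip(*rows)]
    # Reduction 2: the intermediate result is a flat sequence, so a
    # single further reduction completes the sum.
    return reduce(add, column_totals)


image = [[1, 2, 3],
         [4, 5, 6]]
total = sum_all(image)
print(total)
```

The intermediate column_totals holds one value per column, so for a
roughly square image it is far smaller than the input.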

There was understandable confusion about why .flat is sometimes allowed
to fail.  Since it is an attribute,  we thought it inappropriate to make
it return a copy of the source array and chose instead to raise an
exception.  In contrast, it is reasonable for the ravel() function to
return a completely different array, so it always works.  (I just
noticed that ravel() is not named flat()).  Some of our more
contemporary thinkers suggested using iterators to produce a .flat which
always works.  If anyone has an idea how to make this work with good
performance,  please let me know;  I don't.

** Tim Hochberg pointed out that we can overload the reduction (and not
accumulation?) axis parameter with an "all" or a tuple describing a
sequence of axes to reduce along.  My perception was that there was a
consensus behind this and in any case I'm in agreement with Tim.  Alan
Isaac pointed out that None might be better here than "all" and I
agree.  At this point,  I think sumAll() is dead, the sum() method will
be deprecated, and the reductions should be expanded as Tim suggested.
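
A rough sketch of what an overloaded axis parameter could look like,
written against nested lists rather than numarray arrays. Everything
here is hypothetical: reduce_sum, sum_along and ndim_of are illustrative
names, and the default of axis=0 is only chosen to mirror the existing
behavior discussed in this thread.

```python
def ndim_of(a):
    """Nesting depth of a rectangular nested list (0 for a scalar)."""
    depth = 0
    while isinstance(a, list):
        depth += 1
        a = a[0]
    return depth


def sum_along(a, axis):
    """Sum a nested list along one non-negative axis."""
    if axis > 0:
        return [sum_along(sub, axis - 1) for sub in a]
    if isinstance(a[0], list):
        # Add the sub-arrays element by element.
        return [sum_along(list(group), 0) for group in zip(*a)]
    return sum(a)


def reduce_sum(a, axis=0):
    """Reduce along one axis, a tuple of axes, or all axes (axis=None)."""
    n = ndim_of(a)
    if axis is None:
        axes = range(n)
    elif isinstance(axis, int):
        axes = (axis,)
    else:
        axes = axis
    # Normalize negative axes and reduce the highest axis first so the
    # remaining axis numbers stay valid.
    for ax in sorted({ax % n for ax in axes}, reverse=True):
        a = sum_along(a, ax)
    return a


m = [[1, 2, 3],
     [4, 5, 6]]
print(reduce_sum(m, axis=0), reduce_sum(m, axis=-1), reduce_sum(m, axis=None))
```

With this shape of interface a separate sumAll() is unnecessary, since
axis=None (or a tuple naming every axis) covers the same case.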

** Peter Verveer made some comments about the expectations of a naive
user regarding reductions, namely that "all" should be the default.   My
own experience bears this out,  and I am torn about what to do here. 
Chris Barker pointed out the need for backward compatibility with
Numeric,  and given the current numarray goal of supporting SciPy,  this
need is growing stronger and more complex.  SciPy uses yet another axis
convention.  If anyone has any ideas how to handle these multiple
conventions with elegance,  let me know.

A number of people commented on our naming conventions, an issue which
we have side stepped for the moment with sumAll().  My impression is
that, for better or worse, numarray uses the lowerUpper() version of
Camel case.  I think this is very much a matter of personal taste and
don't claim to have any.   My guess is that numarray is probably
inconsistent at the moment, in part because lowerUpper() often
degenerates into merely lower() which degenerates into confusion. 

Regards,
Todd


From verveer at embl-heidelberg.de  Sat Oct 30 08:39:28 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Sat Oct 30 08:39:28 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1099073854.4904.321.camel@localhost.localdomain>
References: <1099073854.4904.321.camel@localhost.localdomain>
Message-ID: <BE9A7550-2A89-11D9-B5CD-000D932805AC@embl-heidelberg.de>

> ** Peter Verveer made some comments about the expectations of a naive
> user regarding reductions, namely that "all" should be the default.   
> My
> own experience bears this out,  and I am torn about what to do here.
> Chris Barker pointed out the need for backward compatibility with
> Numeric,  and given the current numarray goal of supporting SciPy,  
> this
> need is growing stronger and more complex.  SciPy uses yet another axis
> convention.  If anyone has any ideas how to handle these multiple
> conventions with elegance,  let me know.

Numarray should probably be either completely compatible in every small 
detail, or we could take the opportunity to change what we believe was 
the wrong choice. Not sure what is really best, although I personally 
feel breaking compatibility is fine if the result is better. Is there 
not already a sub-package numeric within numarray that provides Numeric 
compatibility? Such a package could at  least provide wrappers with 
compatible behavior for people who need that.

Peter


From tim.hochberg at cox.net  Sat Oct 30 11:49:36 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sat Oct 30 11:49:36 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1099073854.4904.321.camel@localhost.localdomain>
References: <1099073854.4904.321.camel@localhost.localdomain>
Message-ID: <4183E208.6050001@cox.net>

Todd Miller wrote:

[SNIP]

>** Tim Hochberg pointed out that we can overload the reduction (and not
>accumulation?) 
>
It seems possible. It's probably marginally useful at best. However, it 
might be worth doing if not too painful, just so that the accumulate and 
reduce signatures match.

>axis parameter with an "all" or a tuple describing a
>sequence of axes to reduce along.  My perception was that there was a
>consensus behind this and in any case I'm in agreement with Tim.  Alan
>Isaac pointed out that None might be better here than "all" and I
>agree.  
>
Using None to mean ALL seems a little perverse to me, but I'll grant 
that using an existing singleton makes things simpler. I'll just point 
out that it would also be possible to define an ALL singleton and use that.

Very tangential: it's too bad that '...' can't be typed more places: the 
natural spelling for ALL is [...] as in:
    add.reduce(a, axis=[...])
Sadly, that won't work.

>At this point,  I think sumAll() is dead, the sum() method will
>be deprecated, and the reductions should be expanded as Tim suggested.
>
>** Peter Verveer made some comments about the expectations of a naive
>user regarding reductions, namely that "all" should be the default.   My
>own experience bears this out,  and I am torn about what to do here. 
>  
>
I suspect that one's experience here depends on your typical problem 
domain. If one does a lot of 2D work ALL would seem to be the natural 
choice. If you use a lot of arrays of vectors, as I do, -1 is the 
natural choice. At this point I can't recall a case where ALL would have 
been the natural choice for me.

In addition to backwards compatibility, one argument for not using ALL 
as the default is that it makes little or no sense for accumulate. 
Having the default for reduce be ALL, but that for accumulate be -1 (for 
instance) would be confusing.
 

>Chris Barker pointed out the need for backward compatibility with
>Numeric,  
>
I'd think that the importance of backward compatibility with not just 
Numeric, but with Numarray itself has been underrated. Changing the 
default for reduce / sum is particularly insidious since many uses 
will fail silently, producing the wrong answer, but continuing to run.  
This means that all instances of sum, product and reduce will need to be 
inspected and corrected. Having 10k LOC that use Numarray, I'll be a bit 
irked if this gets changed without a better justification than what I've 
seen thus far.

>and given the current numarray goal of supporting SciPy,  this
>need is growing stronger and more complex.  SciPy uses yet another axis
>convention.  If anyone has any ideas how to handle these multiple
>conventions with elegance,  let me know.
>  
>
Could you describe the SciPy axis convention: I'm not familiar with it.

[SNIP]

-tim


From gazzar at email.com  Sun Oct 31 04:22:01 2004
From: gazzar at email.com (Gary Ruben)
Date: Sun Oct 31 04:22:01 2004
Subject: [Numpy-discussion] vector cross product
Message-ID: <20041031121856.E2DDC1CE304@ws1-6.us4.outblaze.com>

Not that I have a really urgent need, but is there a reason that nice, fast C-based vector operations aren't implemented in Numeric or numarray? I notice Fernando Perez has a cross product as a useful SciPy weave example on his site. I've also seen comments elsewhere about Numpy's lack of a cross product. eg. <http://mail.python.org/pipermail/python-list/2004-March/213878.html>
I'm using Konrad Hinsen's Scientific Python for the convenience value of his Vector class, which also provides a nice angle() method but it bothers me that it's implemented in native Python. The Vector type in vpython probably does it 'properly', but I don't use it just for the convenience since it adds an extra dependency to my code.

comments?
Gary R.
-- 
___________________________________________________________
Sign-up for Ads Free at Mail.com
http://promo.mail.com/adsfreejump.htm


From perry at stsci.edu  Sun Oct 31 09:22:28 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Sun Oct 31 09:22:28 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1099073854.4904.321.camel@localhost.localdomain>
Message-ID: <NEBBIJKBMLDBLNCEEFOCEEDDFIAA.perry@stsci.edu>

Todd Miller wrote:
> 
> There was understandable confusion about why .flat is sometimes allowed
> to fail.  Since it is an attribute,  we thought it inappropriate to make
> it return a copy of the source array and chose instead to raise an
> exception.  In contrast, it is reasonable for the ravel() function to
> return a completely different array, so it always works.  (I just
> noticed that ravel() is not named flat()).  Some of our more
> contemporary thinkers suggested using iterators to produce a .flat which
> always works.  If anyone has an idea how to make this work with good
> performance,  please let me know;  I don't.
> 
This aspect of flat can be considered a wart. There are three different
desired behaviors depending on who you talk to. For efficiency reasons,
some only want flat (and even ravel) to work if the array is already
contiguous; that is, they don't want copies unless they ask for them.
Others want it to always work, producing a copy if necessary but
otherwise for it to return a view. Yet others always want a copy.
So, are three different versions needed? Or options to a function?
The drawback of .flat (as an attribute) is there is only one choice
for behavior. For a function (or a method) we could modify the
behavior with a keyword argument. Personally, I would rather .flat
always work, even if it means returning a copy. Is there any 
consensus on how this problem should be handled?

> ** Peter Verveer made some comments about the expectations of a naive
> user regarding reductions, namely that "all" should be the default.   My
> own experience bears this out,  and I am torn about what to do here. 
> Chris Barker pointed out the need for backward compatibility with
> Numeric,  and given the current numarray goal of supporting SciPy,  this
> need is growing stronger and more complex.  SciPy uses yet another axis
> convention.  If anyone has any ideas how to handle these multiple
> conventions with elegance,  let me know.
> 
I find this issue particularly vexing as well. Let's be clear about 
this, scipy changes the behavior of Numeric to produce a new flavor.
What should numarray do? Follow the scipy behavior or the Numeric
behavior? Or should there be a scipy/numarray flavor vs the more
Numeric compatible numarray? Note, we never intended numarray to be
100% compatible with Numeric since there were aspects we thought
should be changed (e.g., scalar/array type coercions). Yet there
appear to be two camps of the Numeric community. Some sort of 
survey may be in order here. Is scipy where all the new growth is
now? Should we just adopt the axis convention used there? I'd
very much prefer not to proliferate any more flavors of behavior
and just settle on one.

> A number of people commented on our naming conventions, an issue which
> we have side stepped for the moment with sumAll().  My impression is
> that, for better or worse, numarray uses the lowerUpper() version of
> Camel case.  I think this is very much a matter of personal taste and
> don't claim to have any.   My guess is that numarray is probably
> inconsistent at the moment, in part because lowerUpper() often
> degenerates into merely lower() which degenerates into confusion. 
> 
How much of the public interface uses camelCase? I don't think
all that much if any. It seems to me the inclination of scipy
is to avoid it and I'm happy with that. The internal implementation
is a different issue, and there I think Todd is right that it 
probably is somewhat inconsistent on that front.

Perry


From perry at stsci.edu  Sun Oct 31 09:30:28 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Sun Oct 31 09:30:28 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <BE9A7550-2A89-11D9-B5CD-000D932805AC@embl-heidelberg.de>
Message-ID: <NEBBIJKBMLDBLNCEEFOCMEDDFIAA.perry@stsci.edu>

Peter Verveer wrote:

> Numarray should probably be either completely compatible in every small 
> detail, or we could take the opportunity to change what we believe was 

Well, as I mentioned before having numarray match Numeric in every
small detail is not going to happen (and even there, which flavor?
the original Numeric or the scipy version?). We've been pretty clear
about where incompatibilities were deliberate. But on the other hand,
that leaves many other choices that could be revisited if enough 
people support them. The problem is that no matter what is done,
I suspect some people are going to be inconvenienced since there
is already (without numarray) a split in the community because
of scipy. 

> the wrong choice. Not sure what is really best, although personally 
> feel breaking compatibility is fine if the result is better. Is there 
> not already a sub-package numeric within numarray that provides Numeric 
> compatibility? Such a package could at  least provide wrappers with 
> compatible behavior for people who need that.
> 
At the moment the numeric module provides more Numeric compatibility
(but not complete). In matplotlib we use a module called numerix to
provide a uniform interface to both Numeric and numarray (along with
prohibitions on use of certain features that don't exist in the other).
We are looking at scipy_base now that undoubtedly will highlight
similar cases where we will suggest internal reorganization to 
do the same sort of thing that was done for matplotlib.

For those that intend to use numarray only now and forever, one is
free to use all the features they desire. But there still is the
behavior issue of those things that are currently incompatible like
the axis issue.

Perry


From tim.hochberg at cox.net  Sun Oct 31 14:24:01 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sun Oct 31 14:24:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <4183F168.3060205@ucsd.edu>
References: <1099073854.4904.321.camel@localhost.localdomain> <4183E208.6050001@cox.net> <4183F168.3060205@ucsd.edu>
Message-ID: <418564AE.6050206@cox.net>

Robert Kern wrote:

> Tim Hochberg wrote:
>
>> Could you describe the SciPy axis convention: I'm not familiar with it.
>
>
> axis=-1


OK, so Numarray (currently) and Numeric use axis=0, SciPy uses axis=-1 
and there is some desire to use axis=ALL instead.

One advantage of ALL is that it breaks everyone's code equally, so there 
wouldn't be any charges of favoritism <0.8 wink>.

I can't come up with any way to reconcile the three, but I can suggest a 
transition strategy whatever the decision. Supply an option so that one 
can require axis arguments to all calls to reduce. Then it's relatively 
easy to track down all the reduce calls and fix the ones that are 
broken. Something like numarray.setRequireReduceAxisArg(True).

FWIW, it wouldn't bother me much to use SciPy's default here: supporting 
SciPy is a worthwhile goal and I think SciPy's choice here is a 
reasonable one. Another alternative that wouldn't bother me much is "In 
the face of ambiguity, refuse the temptation to guess". That is, always 
require axis arguments for multidimensional arrays. While not backwards 
compatible, this would make the transition relatively easy, since uses 
that might fail would raise exceptions.
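
The transition strategy could be sketched roughly as follows. This is
not an existing numarray API: set_require_reduce_axis_arg and sum_reduce
are hypothetical names, and the reduction works on 2-D nested lists only
to keep the example self-contained.

```python
_MISSING = object()    # sentinel: distinguishes "no axis given" from axis=None
_require_axis = False  # module-level switch, off by default


def set_require_reduce_axis_arg(flag):
    """Turn the 'explicit axis required' check on or off."""
    global _require_axis
    _require_axis = bool(flag)


def sum_reduce(rows, axis=_MISSING):
    """Sum a 2-D nested list along axis 0 or -1."""
    if axis is _MISSING:
        if _require_axis:
            # Refuse to guess: callers relying on the default are flagged.
            raise TypeError("sum_reduce() requires an explicit axis argument")
        axis = 0  # legacy default
    if axis == 0:
        return [sum(column) for column in zip(*rows)]
    if axis in (1, -1):
        return [sum(row) for row in rows]
    raise ValueError("axis must be 0, 1 or -1 for a 2-D input")


m = [[1, 2, 3],
     [4, 5, 6]]
legacy = sum_reduce(m)            # relies on the default axis
set_require_reduce_axis_arg(True)
explicit = sum_reduce(m, axis=-1)  # still fine: the axis is spelled out
```

Running a test suite with the switch turned on makes every call that
depends on the default fail loudly instead of silently changing its
answer.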

-tim


From rkern at ucsd.edu  Sun Oct 31 16:01:04 2004
From: rkern at ucsd.edu (Robert Kern)
Date: Sun Oct 31 16:01:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <418564AE.6050206@cox.net>
References: <1099073854.4904.321.camel@localhost.localdomain> <4183E208.6050001@cox.net> <4183F168.3060205@ucsd.edu> <418564AE.6050206@cox.net>
Message-ID: <41857B53.5010308@ucsd.edu>

Tim Hochberg wrote:
> Robert Kern wrote:
> 
>> Tim Hochberg wrote:
>>
>>> Could you describe the SciPy axis convention: I'm not familiar with it.
>>
>> axis=-1
> 
> OK, so Numarray (currently) and Numeric use axis=0,

Well, sometimes.  :-)

> SciPy uses axis=-1 

I should note that this convention is for Scipy-defined functions. With 
one unfortunate exception (cumsum), Scipy does not overwrite Numeric's 
axis default for Numeric-defined functions.

-- 
Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
  Are the graves of dreams allowed to die."
   -- Richard Harter


From faheem at email.unc.edu  Fri Oct  1 10:19:02 2004
From: faheem at email.unc.edu (Faheem Mitha)
Date: Fri Oct  1 10:19:02 2004
Subject: [Numpy-discussion] random number facilities in numarray and main
 Python libs
In-Reply-To: <982cfc7f.8876956d.8220100@expms6.cites.uiuc.edu>
References: <982cfc7f.8876956d.8220100@expms6.cites.uiuc.edu>
Message-ID: <Pine.LNX.4.61.0410011312170.22431@Chrestomanci>


On Fri, 1 Oct 2004, Bruce Southey wrote:

> Hi,
>
> I presume that you have R and can build the standalone library. I have
> attached my SWIG Smath.i , the SWIG Smath_wrap.c and the
> Smath.py files.  With these last two files, you shouldn't need SWIG.
>
> Note that I have not touched the void functions here as I have yet to check
> how these work in SWIG. Also, there are a few functions in the R header that
> are only headers.  Eventually someone has to fix these and add suitable
> documentation in some package.

I'm not sure what you mean by void functions.

> If you have SWIG you can directly use the Smath.i file - while SWIG can take
> a .h file directly it would not work in Python. So I just edited the header
> file into a .i file.
>
> The following is my process using Linux (I don't know about other platforms):
>
> 0) Have swig installed and built the R math library
> 1) $ swig -python Smath.i
> 2) $ gcc -c Smath_wrap.c -I/usr/local/include/python2.3
> -I/home/bsouthey/Rproject/R-1.9.1/src/nmath
> -I/home/bsouthey/Rproject/R-1.9.1/include
> 3) $ ld -shared Smath_wrap.o -o _Smath.so -lm -lRmath
> -L/home/bsouthey/Rproject/R-1.9.1/src/nmath/standalone
>
> Of course you must change the include (-I) and library (-L) paths to where
> python lives and the standalone Rmath library lives.

Thanks. I'm particularly interested in knowing how you interface with the 
random number generator at the top (Python) level. Can you supply an 
example?

Specifically, I'm looking for the following method.

1) When C/C++ code called, reads seed from python random state.

2) Does its stuff.

3) Writes seed back to python level when it exits.

R has this built in, but here one needs to build one's own mechanism. This 
is complicated by the fact that Numarray and the base Python random 
library use different RNG mechanisms, so one has to choose which one to 
use. Which one did you use?
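
On the Python side, the three steps might look like the sketch below.
It assumes the base Python random module is the chosen generator, and
draw_with_state is a made-up stand-in for the wrapped C/C++ routine; how
the state would actually cross into C is not shown.

```python
import random


def draw_with_state(state, n):
    """Stand-in for the C/C++ routine: restore state, draw, hand state back."""
    rng = random.Random()
    rng.setstate(state)                         # 1) read the state from the Python level
    samples = [rng.random() for _ in range(n)]  # 2) do the work
    return samples, rng.getstate()              # 3) return the advanced state


random.seed(12345)
state = random.getstate()
first, state = draw_with_state(state, 3)
second, state = draw_with_state(state, 3)

# Write the state back so the Python-level stream continues from here.
random.setstate(state)
```

Because the state is threaded through every call, two successive calls
yield the same numbers as six consecutive draws from the Python-level
generator would have.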

                                                                 Faheem.


From jmiller at stsci.edu  Fri Oct  1 10:21:04 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct  1 10:21:04 2004
Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric and amd64]
Message-ID: <1096651226.9400.25.camel@halloween.stsci.edu>


From fccoelho at fiocruz.br  Fri Oct  1 13:06:10 2004
From: fccoelho at fiocruz.br (=?iso-8859-1?q?Fl=E1vio_Code=E7o_Coelho?=)
Date: Fri, 1 Oct 2004 17:06:10 +0000
Subject: [Matplotlib-users] warning: Numeric and amd64
Message-ID: <200410011706.10524.fccoelho@fiocruz.br>

Hi,

look at this:

>>> from RandomArray import *

>>> normal(2,2,10)
 array([ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.])

This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a 32bit 
P4 and it ran fine.
Has anyone else seen this before?

For those that didn't understand, the normal function as called above,  is 
supposed to give me ten samples from a normal distribution with mean = 2 and 
standard deviation = 2
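
A quick sanity check for this kind of failure can be sketched with the
standard library. It does not use RandomArray or numarray; random.gauss
stands in for normal(), and the sample size and seed are arbitrary
choices for the example.

```python
import random
import statistics

rng = random.Random(0)
samples = [rng.gauss(2.0, 2.0) for _ in range(10000)]

sample_mean = statistics.mean(samples)
sample_std = statistics.stdev(samples)

# A generator that returns the mean every time, as in the output above,
# would produce a single distinct value and a zero standard deviation.
all_identical = len(set(samples)) == 1

print(sample_mean, sample_std, all_identical)
```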

luckily:

>>> from numarray.random_array import *

>>> normal(2,2,10)
array([-0.04525638,  4.31467819, -0.17468357,  5.29377031,  0.84202135,
        5.29593539,  4.69651532,  1.61354655,  1.10839236,  1.7743317 ])

If anybody still needed a reason for switching to numarray, there you go!

If anybody here subscribes to the numeric or numarray mailing lists (i.e. if they 
even exist) could you please forward this message to them?

Flavio


_______________________________________________
Matplotlib-users mailing list
Matplotlib-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users



From jdhunter at ace.bsd.uchicago.edu  Fri Oct  1 10:33:02 2004
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Fri Oct  1 10:33:02 2004
Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric
 and amd64]
In-Reply-To: <1096651226.9400.25.camel@halloween.stsci.edu> (Todd Miller's
 message of "01 Oct 2004 13:20:26 -0400")
References: <1096651226.9400.25.camel@halloween.stsci.edu>
Message-ID: <m2vfdu1j96.fsf@mother.paradise.lost>

>>>>> "Todd" == Todd Miller <jmiller at stsci.edu> writes:

    >>>> from RandomArray import *

    >>>> normal(2,2,10)
    Todd>  array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.])

I get this too on a 64bit Opteron 250.

The root of the problem appears to be

  >>> from RandomArray import standard_normal
  >>> standard_normal(10)
  array([  5.31046164e-315,   1.57997427e-314,   5.16421382e-315,   5.22924144e-315,
              1.59247813e-314,   1.58920141e-314,   5.23691141e-315,
              5.24305935e-315,   5.20686204e-315,   1.58739568e-314])


But MLab.randn, which uses a different approach, works fine.

I have this gnawing feeling I've seen this before, but I can't
remember ....

JDH


From a.schmolck at gmx.net  Fri Oct  1 11:34:01 2004
From: a.schmolck at gmx.net (Alexander Schmolck)
Date: Fri Oct  1 11:34:01 2004
Subject: [Numpy-discussion] dot!=matrixmultiply bug when dotblas is
 present
In-Reply-To: <4159BCA5.6090101@colorado.edu> (Fernando Perez's message of
 "Tue, 28 Sep 2004 13:33:57 -0600")
References: <4159BCA5.6090101@colorado.edu>
Message-ID: <yfs1xgil25h.fsf@black4.ex.ac.uk>

Fernando Perez <Fernando.Perez at colorado.edu> writes:

> Hi all,
>
> I found something today a bit unpleasant: if you install numeric without
> any BLAS support, 'matrixmultiply is dot==True', so they are fully
> interchangeable.  However, to my surprise, if you build numeric with the blas
> optimizations, they are NOT identical.  

Oops, my bad (I submitted the patch and while pretty much all the real coding
was done by Richard Everson this is my oversight).

> The reason is a bug in Numeric.py. After defining dot, the code reads:
>
> #This is obsolete, don't use in new code
> matrixmultiply = dot

On the other hand, it gently nudges people to no longer use the obsoleted
matrixmultiply ;)
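
The mechanism behind this kind of bug can be sketched in a few lines.
This is not the Numeric source: slow_dot and blas_dot are placeholders,
and it assumes the optimized dot is bound after the alias line, which
the quoted excerpt does not show.

```python
def slow_dot(a, b):
    """Placeholder for the plain implementation."""
    return "slow"


def blas_dot(a, b):
    """Placeholder for the BLAS-backed implementation."""
    return "fast"


dot = slow_dot
matrixmultiply = dot  # the alias captures the object `dot` names right now
dot = blas_dot        # a later rebinding of `dot` does not touch the alias

stale_result = matrixmultiply(None, None)
alias_was_stale = matrixmultiply is not dot

# Creating the alias after the final binding keeps the two names identical.
matrixmultiply = dot
fixed_result = matrixmultiply(None, None)
```

An assignment copies a reference to the current function object, so the
order of the two bindings decides which implementation the alias calls.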


> In [4]: timing 1,dot,a,b
> ------> timing(1,dot,a,b)
> Out[4]: 0.55591500000000005
>
> In [5]: timing 1,matrixmultiply,a,b
> ------> timing(1,matrixmultiply,a,b)
> Out[5]: 68.142640999999998
>
> In [6]: _/__
> Out[6]: 122.57744619231356
>
> Pretty significant difference...

Yup, someone should incorporate optional atlas dot support into numarray if it
hasn't happened already (won't be me, IIRC it took some convincing to get this
into Numeric and I won't be using numarray for anything real in the near
future).

cheers,

alex


From stephen.walton at csun.edu  Fri Oct  1 11:37:01 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Fri Oct  1 11:37:01 2004
Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric
	and amd64]
In-Reply-To: <m2vfdu1j96.fsf@mother.paradise.lost>
References: <1096651226.9400.25.camel@halloween.stsci.edu>
	 <m2vfdu1j96.fsf@mother.paradise.lost>
Message-ID: <1096655567.2678.2.camel@localhost.localdomain>

On Fri, 2004-10-01 at 09:43, John Hunter wrote:

> The root of the problem appears to be
> 
>   >>> from RandomArray import standard_normal
>   >>> standard_normal(10)
>   array([  5.31046164e-315,   1.57997427e-314,
> I have this gnawing feeling I've seen this before, but I can't
> remember ....

Those values look suspiciously like what one sees if one reads a
big-endian Float as little-endian or vice versa.  I saw similar numbers
recently when using pytables on a big-endian HDF5 (which generated a bug
report for numarray if you recall).

Is the Opteron big-endian?
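
What a byte-order mix-up does to a double can be shown with the struct
module. This only illustrates the effect described above; it does not
establish that byte order is what went wrong in RandomArray, and the
value 1.0 is an arbitrary choice.

```python
import struct

value = 1.0

# Serialize as a big-endian 8-byte double ...
big_endian_bytes = struct.pack(">d", value)

# ... then read the same bytes back with the wrong and the right byte order.
misread = struct.unpack("<d", big_endian_bytes)[0]
roundtrip = struct.unpack(">d", big_endian_bytes)[0]

print(roundtrip, misread)
```

Read with the wrong byte order, 1.0 comes back as a tiny denormal
number, the same general kind of value as in the output quoted above.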


From Fernando.Perez at colorado.edu  Fri Oct  1 11:51:00 2004
From: Fernando.Perez at colorado.edu (Fernando Perez)
Date: Fri Oct  1 11:51:00 2004
Subject: [Numpy-discussion] dot!=matrixmultiply bug when dotblas is present
In-Reply-To: <yfs1xgil25h.fsf@black4.ex.ac.uk>
References: <4159BCA5.6090101@colorado.edu> <yfs1xgil25h.fsf@black4.ex.ac.uk>
Message-ID: <415DA6D7.4070407@colorado.edu>

Alexander Schmolck wrote:
> Fernando Perez <Fernando.Perez at colorado.edu> writes:
> 
> 
>>Hi all,
>>
>>I found something today a bit unpleasant: if you install numeric without
>>any BLAS support, 'matrixmultiply is dot==True', so they are fully
>>interchangeable.  However, to my surprise, if you build numeric with the blas
>>optimizations, they are NOT identical.  
> 
> 
> Oops, my bad (I submitted the patch and while pretty much all the real coding
> was done by Richard Everson this is my oversight).

No prob.  It's been fixed in Numeric 23.5, so no more worries.

>>Pretty significant difference...
> 
> 
> Yup, someone should incorporate optional atlas dot support into numarray if it
> hasn't happened already (won't be me, IIRC it took some convincing to get this
> into Numeric and I won't be using numarray for anything real in the near
> future).

I'll leave that question to the numarray guys, I have no idea where it stands 
in terms of blas/atlas support.  I certainly hope it has it or that this 
optimization can be brought in, as it makes a huge difference for the large 
array case.

Best,

f


From perry at stsci.edu  Fri Oct  1 11:57:02 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Oct  1 11:57:02 2004
Subject: [Numpy-discussion] dot!=matrixmultiply bug when dotblas is present
In-Reply-To: <415DA6D7.4070407@colorado.edu>
References: <4159BCA5.6090101@colorado.edu> <yfs1xgil25h.fsf@black4.ex.ac.uk> <415DA6D7.4070407@colorado.edu>
Message-ID: <52083A9C-13DB-11D9-B931-000A95B68E50@stsci.edu>

On Oct 1, 2004, at 2:49 PM, Fernando Perez wrote:

> Alexander Schmolck wrote:
>>> Pretty significant difference...
>> Yup, someone should incorporate optional atlas dot support into 
>> numarray if it
>> hasn't happened already (won't be me, IIRC it took some convincing to 
>> get this
>> into Numeric and I won't be using numarray for anything real in the 
>> near
>> future).
>
> I'll leave that question to the numarray guys, I have no idea where it 
> stands in terms of blas/atlas support.  I certainly hope it has it or 
> that this optimization can be brought in, as it makes a huge 
> difference for the large array case.
>
> Best,
>
> f

I'm not sure when it will get done, but we are working on the early stages
of getting scipy working with numarray. You should see visible signs of that
within a month (i.e., at least some parts of scipy working with numarray).
It will probably take months to finish though.

Perry


From pearu at cens.ioc.ee  Fri Oct  1 12:44:58 2004
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Fri Oct  1 12:44:58 2004
Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric
 and amd64]
In-Reply-To: <1096651226.9400.25.camel@halloween.stsci.edu>
Message-ID: <Pine.LNX.4.21.0410012234160.9973-100000@cens.kybi>

On 1 Oct 2004, Todd Miller wrote:

> look at this:
>
> >>> from RandomArray import *
>
> >>> normal(2,2,10)
>  array([ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.])
>
> This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a
> 32bit P4 and it ran fine.
> Has anyone else seen this before?

Yes. I just fixed a similar issue in scipy.stats.rand module. Below is the
corresponding patch for Numeric Src/ranlibmodule.c that fixes the issue
for Opteron.

Regards,
Pearu

*** ranlibmodule.c      Fri Oct  1 22:29:57 2004
--- ranlibmodule.c.orig Fri Oct  1 22:12:13 2004
***************
*** 47,49 ****
      case 0:
!       *out_ptr = (double) ((float (*)(void)) fun)();
        break;
--- 47,49 ----
      case 0:
!       *out_ptr = (double) ((double (*)()) fun)();
        break;
***************
*** 81,83 ****
    case 1:
!     if( !PyArg_ParseTuple(args, "lf|i", &int_arg, &float_arg, &n) ) {
        return NULL;
--- 81,83 ----
    case 1:
!     if( !PyArg_ParseTuple(args, "if|i", &int_arg, &float_arg, &n) ) {
        return NULL;
***************
*** 213,215 ****
  
!   if( !PyArg_ParseTuple(args, "lO|i", &num_trials, &priors_object, &n) ) {
      return NULL;
--- 213,215 ----
  
!   if( !PyArg_ParseTuple(args, "iO|i", &num_trials, &priors_object, &n) ) {
      return NULL;
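
A note on reading the patch, followed by a small sketch. Judging by the
timestamps in the diff header, the "***" side (ranlibmodule.c) appears to be
the fixed file and the "---" side (ranlibmodule.c.orig) the original, so the
hunks read in reverse. The fix then consists of two things: calling the
generator through a pointer to a function returning float rather than
double, and parsing into the integer arguments with "l" (C long) rather than
"i" (C int), presumably because those variables are declared long.

The sketch below only imitates the first problem with the standard struct
module. It assumes the four bytes next to the float are zero, which a real
call does not guarantee, and the helper name is made up for illustration.

```python
import struct


def float_bits_read_as_double(value):
    """Store a 4-byte float, then read 8 bytes at that spot as a double.

    The upper four bytes are assumed to be zero here; in a real call they
    would hold whatever happened to be in the register.
    """
    raw = struct.pack("<f", value) + b"\x00" * 4
    return struct.unpack("<d", raw)[0]


for x in (1.0, 2.5, -0.7):
    print(x, "->", float_bits_read_as_double(x))

# Native sizes of C int and C long on the machine running this script.
# Where they differ, "i" and "l" in PyArg_ParseTuple are not interchangeable.
print("sizeof(int)  =", struct.calcsize("i"))
print("sizeof(long) =", struct.calcsize("l"))
```

The printed values are positive denormals of the same order of magnitude as
the ones in the original report, whatever the sign of the input.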


From jmiller at stsci.edu  Fri Oct  1 13:35:07 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct  1 13:35:07 2004
Subject: [Numpy-discussion] [Fwd: [Matplotlib-users] warning: Numeric
	and amd64]
In-Reply-To: <Pine.LNX.4.21.0410012234160.9973-100000@cens.kybi>
References: <Pine.LNX.4.21.0410012234160.9973-100000@cens.kybi>
Message-ID: <1096662489.15037.1.camel@halloween.stsci.edu>

Thanks Pearu.

For some unknown reason, numarray.random_array already had the fixes, 
but I applied the patch to Numeric CVS.

Regards,
Todd

On Fri, 2004-10-01 at 15:38, Pearu Peterson wrote:
> On 1 Oct 2004, Todd Miller wrote:
> 
> > look at this:
> >
> > >>> from RandomArray import *
> >
> > >>> normal(2,2,10)
> >  array([ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.])
> >
> > This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a
> > 32bit P4 and it ran fine.
> > Has anyone else seen this before?
> 
> Yes. I just fixed a similar issue in scipy.stats.rand module. Below is the
> corresponding patch for Numeric Src/ranlibmodule.c that fixes the issue
> for Opteron.
> 
> Regards,
> Pearu
> 
> *** ranlibmodule.c      Fri Oct  1 22:29:57 2004
> --- ranlibmodule.c.orig Fri Oct  1 22:12:13 2004
> ***************
> *** 47,49 ****
>       case 0:
> !       *out_ptr = (double) ((float (*)(void)) fun)();
>         break;
> --- 47,49 ----
>       case 0:
> !       *out_ptr = (double) ((double (*)()) fun)();
>         break;
> ***************
> *** 81,83 ****
>     case 1:
> !     if( !PyArg_ParseTuple(args, "lf|i", &int_arg, &float_arg, &n) ) {
>         return NULL;
> --- 81,83 ----
>     case 1:
> !     if( !PyArg_ParseTuple(args, "if|i", &int_arg, &float_arg, &n) ) {
>         return NULL;
> ***************
> *** 213,215 ****
>   
> !   if( !PyArg_ParseTuple(args, "lO|i", &num_trials, &priors_object, &n) ) {
>       return NULL;
> --- 213,215 ----
>   
> !   if( !PyArg_ParseTuple(args, "iO|i", &num_trials, &priors_object, &n) ) {
>       return NULL;


From faheem at email.unc.edu  Fri Oct  1 22:28:41 2004
From: faheem at email.unc.edu (Faheem Mitha)
Date: Fri Oct  1 22:28:41 2004
Subject: [Numpy-discussion] numarray.random_array number generation in C code
Message-ID: <Pine.LNX.4.61.0410020003050.30139@Chrestomanci>

Dear People,

I want to write some C++ code to link with Python, using the 
Boost.Python interface. I need to generate random numbers in the C++ 
code, and I was wondering as to the best way of doing this.

Note that it is important that the random number generation interoperate 
seamlessly with Python, in the sense that the behavior of the calls to 
the RNG is the same whether calls are made at the C level or the Python 
level. I hope the reasons why this is important are obvious.

I was thinking that the method should go like this.

1) When the C/C++ code is called, it reads the seed from the Python random
state.

2) It does its stuff.

3) It writes the seed back to the Python level when it exits.

After doing a little investigation of the numarray.random_array python 
library and associated extension modules, it seems possible that the 
answer is simpler than I had supposed. However, I would appreciate it if 
someone would tell me if my understanding is incorrect in some places.

Summary: It seems that I can just call all the C entry point routines 
defined in ranlib.h, without worrying about getting or setting seeds.

Rationale:

The structure of this random number facility has three parts, all under 
Packages/RandomArray2.

1) low-level C routines: Packages/RandomArray2/Src/com.c and 
Packages/RandomArray2/Src/ranlib.c.

com.c: basic RNG stuff; getting and setting seeds etc.
ranlib.c: Random number generator algorithms for different distributions 
etc.

2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c.

This interfaces the stuff in com.c and ranlib.c.

3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py.

This wraps the C interface. In most cases it does not do much else besides 
some basic argument error checking.

From my perspective, the important thing is that the random number seed is 
only defined at C level as a static object, all the RNG stuff happens at C 
level, and the Python code just calls the C code as necessary. (I'm 
sketchy about the details of what is defined as the seed etc.)

This is in contrast with the R RNG facility (the only other RNG facility I 
am familiar with), which uses the calls GetRNGstate() and PutRNGstate() to 
read and write the seed, which is defined at R level.

Therefore, the upshot is that the C routines in ranlib.h read and write 
the same seed as the python level functions do, so no special action is 
necessary with regard to the seed.
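
The reasoning above can be illustrated with a toy model. This is only a
sketch: the hidden _state variable stands in for the static seed kept at C
level, the names set_seed, c_level_uniform and python_level_uniform are
hypothetical, and the multiplicative generator used here is not the
algorithm used by ranlib. The point is just that when both layers go through
the same hidden state, interleaving calls from either layer continues one
and the same stream.

```python
# One hidden generator state shared by everything in this module.
_MODULUS = 2147483647
_state = [12345]


def set_seed(seed):
    """Reset the shared state (seed must be between 1 and _MODULUS - 1)."""
    _state[0] = seed


def _next_raw():
    # One step of a simple multiplicative congruential generator.
    _state[0] = (_state[0] * 16807) % _MODULUS
    return _state[0]


def c_level_uniform():
    """Stands in for a routine called directly from compiled code."""
    return _next_raw() / _MODULUS


def python_level_uniform(n):
    """Stands in for the Python wrapper, which just calls the C routine."""
    return [c_level_uniform() for _ in range(n)]


set_seed(42)
mixed = [c_level_uniform()] + python_level_uniform(2) + [c_level_uniform()]

set_seed(42)
plain = python_level_uniform(4)

print(mixed == plain)
```

Whether the real extension behaves this way depends on the compiled code
linking against the same copy of the static state as ranlibmodule, which
this toy cannot show.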

Is this correct?

In any case, it would be nice if something like the above was documented, 
so lost souls like myself don't have to go trawling through the source 
code to figure out what is going on. Of course it is nice that the source 
code is available, otherwise even that would be impossible.

R documents this stuff in the "Writing R Extensions" manual, online at 
http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray 
manual could have a small section about this too.

                                                         Regards, Faheem.


From fccoelho at gmail.com  Mon Oct  4 07:59:12 2004
From: fccoelho at gmail.com (Flavio Coelho)
Date: Mon Oct  4 07:59:12 2004
Subject: [Numpy-discussion] Bug Compiling Numeric on amd64
Message-ID: <d9af7a8804100407482601cae4@mail.gmail.com>

Hi,

look at this:

>>> from RandomArray import *

>>> normal(2,2,10)
 array([ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.])

This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a 32bit 
P4 and it ran fine.
Has anyone else seen this before?


luckily:

>>> from numarray.random_array import *

>>> normal(2,2,10)
array([-0.04525638,  4.31467819, -0.17468357,  5.29377031,  0.84202135,
        5.29593539,  4.69651532,  1.61354655,  1.10839236,  1.7743317 ])

Both modules were compiled on my gentoo box with:

gcc version 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6)

any comments?

Flavio
-- 
I use Linux daily to UP my productivity -- Microsoft, UP yours!


From jmiller at stsci.edu  Mon Oct  4 09:21:32 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Oct  4 09:21:32 2004
Subject: [Numpy-discussion] Bug Compiling Numeric on amd64
In-Reply-To: <d9af7a8804100407482601cae4@mail.gmail.com>
References: <d9af7a8804100407482601cae4@mail.gmail.com>
Message-ID: <1096906220.7641.55.camel@localhost.localdomain>

On Mon, 2004-10-04 at 10:48, Flavio Coelho wrote:
> Hi,
> 
> look at this:
> 
> >>> from RandomArray import *
> 
> >>> normal(2,2,10)
>  array([ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.])
> 
> This is Numeric 23.1 compiled on my AMD64!!! I ran the same tests on a 32bit 
> P4 and it ran fine.
> Has anyone else seen this before?
> 

This was discussed here briefly last week after I forwarded your post
from matplotlib-users.  Pearu Peterson posted a patch which he had
already applied to SciPy, and I applied it to Numeric on SourceForge.
Thanks for raising the issue.

Regards,
Todd

> 
> luckily:
> 
> >>> from numarray.random_array import *
> 
> >>> normal(2,2,10)
> array([-0.04525638,  4.31467819, -0.17468357,  5.29377031,  0.84202135,
>         5.29593539,  4.69651532,  1.61354655,  1.10839236,  1.7743317 ])
> 
> Both modules were compiled on my gentoo box with:
> 
> gcc version 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6)
> 
> any comments?
> 
> Flavio
-- 


From Fernando.Perez at colorado.edu  Mon Oct  4 10:59:49 2004
From: Fernando.Perez at colorado.edu (Fernando Perez)
Date: Mon Oct  4 10:59:49 2004
Subject: [Numpy-discussion] Small bug in MA with arrays of rank > 1
Message-ID: <41618DFD.7030106@colorado.edu>

Hi all,

a while back I noticed a small problem with MA for rank 2 (and larger) arrays.
Here's a simple example:

In [1]: a=RA.random((3,3))

In [2]: a
Out[2]:
array([[ 0.002542,  0.70301 ,  0.705466],
        [ 0.467305,  0.381492,  0.655857],
        [ 0.103372,  0.776988,  0.466528]])

In [3]: import MA

In [4]: a
Out[4]:
[[ 0.002542, 0.70301 , 0.705466,]
  [ 0.467305, 0.381492, 0.655857,]
  [ 0.103372, 0.776988, 0.466528,]]

The bug is that the commas at the end of each line are coming _before_ the 
closing bracket, instead of after.  This seemingly trivial problem turns out 
to be pretty serious for me, because I use this string representation to 
export python arrays into Mathematica files, by simply replacing [] with {} 
(and playing some other tricks).
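
As a possible workaround, the export can be built from the nested data
rather than from the printed representation, so that it does not depend on
where the commas land. The sketch below is an assumption about what such an
export could look like, not the code actually used: to_mathematica is a
hypothetical helper, and it expects plain nested lists (for instance the
result of calling tolist() on the array).

```python
def to_mathematica(obj):
    """Render nested lists or tuples as a Mathematica list literal."""
    if isinstance(obj, (list, tuple)):
        return "{" + ", ".join(to_mathematica(item) for item in obj) + "}"
    # Scalars fall through to their Python representation.
    return repr(obj)


rows = [[0.002542, 0.70301, 0.705466],
        [0.467305, 0.381492, 0.655857]]
print(to_mathematica(rows))
```

One caveat: very small or very large floats are printed by Python in a form
like 1e-05, which Mathematica does not read as a number, so a real exporter
would need to rewrite the exponent as well.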

Unfortunately, this bug means I can't use MA, which is otherwise great because 
of the way it gracefully handles the case where you accidentally say

A

when A is some monster array.  With MA, instead of your CPU getting killed for 
10 minutes, you get a nice summary of A's dimensions and typecode.

Anyway, it would be great if one of the gurus had a chance to fix this one.

Best,

f


From graik at web.de  Tue Oct  5 10:44:13 2004
From: graik at web.de (Raik Grünberg)
Date: Tue Oct  5 10:44:13 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
Message-ID: <200410051941.29807.graik@web.de>

Hi there,

I've just translated a package for molecular modelling, which makes extensive 
use of Numeric, from Numeric to numarray. The outcome is somewhat negative - 
for now we are basically going to postpone the transition - the reasons might 
be interesting for the list and the numarray developers out there (who are 
doing a brave job!).

Speed:
A typical task in our package is the least-squares fitting of a large array of 
coordinate frames ( N1 x N2 x 3) onto a set of reference or average 
coordinates (using a sub-set of coordinates for the matching). The example I 
looked at (500 x 876 x 3 items) took 1.3 s with Numeric and 4.7 s with 
numarray. The main culprits for the slow-down were:
* compress() - factor 10
* average() - factor 7 (average() is missing from Numeric and I hence had to 
write a little function myself)
* LinearAlgebra.singular_value_decomposition() - factor 10
but a lot of extra time is also spent in ufunc.py and various numarraycore.py 
routines.
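
A per-function breakdown like the one above can be obtained with the
standard profiler. The sketch below is only an illustration of that
workflow: average and fit_frames are hypothetical pure-Python stand-ins, not
the routines from the package, and the data is synthetic.

```python
import cProfile
import io
import pstats


def average(rows):
    """Hypothetical stand-in for a hand-written column average."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fit_frames(frames):
    """Hypothetical stand-in for the per-frame work being timed."""
    return [average(frame) for frame in frames]


# Synthetic data: 100 frames of 50 three-component coordinates.
frames = [[[float(i + j + k) for k in range(3)] for j in range(50)]
          for i in range(100)]

profiler = cProfile.Profile()
profiler.enable()
result = fit_frames(frames)
profiler.disable()

# Collect the five entries with the largest cumulative time.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

Running the same script once against each array package and comparing the
two reports is one way to see which calls account for the slow-down.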

Memory efficiency:
I hoped numarray would solve some of the Out-of-memory problems that I get 
with Numeric but it turns out that it is rather less memory efficient for my 
kind of applications. Slicing an array that takes up 800MB on disc just about 
runs through with Numeric (and heavy swapping) but gives an Out-of-memory 
with numarray.

Suggestions:
OK, it's easy to make clever comments without contributing any real work...
- compress(), take(), etc, really need some optimization
- a C-coded average() routine would be helpful
- faster LinearAlgebra routines are necessary

Our sysadmin noted that unlike Numeric, numarray is not using any external 
math libraries (like LAPACK) that have been speed-optimized for decades and 
are available in CPU-optimized variants (e.g. ATLAS). It's probably difficult 
to match this efficiency with any new code ...

Greetings
Raik

PS:
I didn't find any useful HowTo for the translation from Numeric to numarray. 
The practical issues were the different nonzero() return value, the more 
restrictive boolean comparison, that take doesn't support 'O' arrays any 
longer, and the missing average().

-- 
-----------------------------------------------------
Raik Grünberg		| Bioinformatique Structurale
				| Institut Pasteur
				| Paris, France
-----------------------------------------------------


From southey at uiuc.edu  Tue Oct  5 11:33:27 2004
From: southey at uiuc.edu (Bruce Southey)
Date: Tue Oct  5 11:33:27 2004
Subject: [Numpy-discussion] numarray.random_array
 number generation in C code
Message-ID: <6d0c2265.8aa2891d.81a0300@expms6.cites.uiuc.edu>

Hi,  
It is rather hard to suggest anything without more detail on what you want to 
actually do.  As you describe it, why do you need the 'seed' returned? It 
would only make sense if you were going in and out of Python multiple times - 
a somewhat undesirable situation due to the overhead costs.  
  
I see at least three options:  
1) Do everything in Python/numarray. 
 
2) Do parts in Python and the other in C/C++. 
   For example, pass a matrix of random numbers to your code from Python. The 
'seed' never needs to leave Python.   
 
3) Do it all in C/C++ - pass the 'seed' into your code that includes the 
random number generator(s) - there is C/C++ code around for this. Do your stuff 
and then return the 'seed' back with whatever else is required.  
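
Option 3 can be sketched with the generator from the Python standard
library standing in for the compiled one. This is an illustration under
that assumption only: worker is a hypothetical stand-in for the C/C++
routine, and random.Random is not the generator used by numarray.

```python
import random


def worker(state, n):
    """Restore the generator state, draw n numbers, hand the state back."""
    rng = random.Random()
    rng.setstate(state)
    draws = [rng.random() for _ in range(n)]
    return draws, rng.getstate()


master = random.Random(2004)      # generator owned by the calling level
reference = random.Random(2004)   # same seed, never handed to the worker

draws_a, state = worker(master.getstate(), 3)
master.setstate(state)            # write the state back at the caller's level
draws_b = [master.random() for _ in range(2)]

expected = [reference.random() for _ in range(5)]
print(draws_a + draws_b == expected)
```

If the state is not written back, the caller's next draws repeat the numbers
the worker already used, which is the situation the round trip is meant to
avoid.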
  
You can email me privately if you want. 
 
Bruce 
  
   
---- Original message ----  
>Date: Sat, 2 Oct 2004 01:23:21 -0400 (EDT)  
>From: Faheem Mitha <faheem at email.unc.edu>    
>Subject: [Numpy-discussion] numarray.random_array number generation in C code    
>To: numpy-discussion <numpy-discussion at lists.sourceforge.net>  
>  
>  
>Dear People,  
>  
>I want to write some C++ code to link with Python, using the   
>Boost.Python interface. I need to generate random numbers in the C++   
>code, and I was wondering as to the best way of doing this.  
>  
>Note that it is important that the random number generation interoperate   
>seamlessly with Python, in the sense that the behavior of the calls to   
>the RNG is the same whether calls are made at the C level or the Python   
>level. I hope the reasons why this is important are obvious.  
>  
>I was thinking that the method should go like this.  
>  
>1) When C/C++ code called, reads seed from python random state.  
>  
>2) Does its stuff.  
>  
>3) Writes seed back to python level when it exits.  
>  
>After doing a little investigation of the numarray.random_array python   
>library and associated extension modules, it seems possible that the   
>answer is simpler than I had supposed. However, I would appreciate it if   
>someone would tell me if my understanding is incorrect in some places.  
>  
>Summary: It seems that I can just call all the C entry point routines   
>defined in ranlib.h, without worrying about getting or setting seeds.  
>  
>Rationale:  
>  
>The structure of this random number facility has three parts, all files in   
>Packages/RandomArray2/Src.  
>  
>1) low-level C routines: Packages/RandomArray2/Src/com.c and   
>Packages/RandomArray2/Src/ranlib.c.  
>  
>com.c: basic RNG stuff; getting and setting seeds etc.  
>ranlib.c: Random number generator algorithms for different distributions   
>etc.  
>  
>2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c.  
>  
>This interfaces the stuff in com.c and ranlib.c.  
>  
>3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py.  
>  
>This wraps the C interface. In most cases it does not do much else besides   
>some basic argument error checking.  
>  
>From my perspective, the important thing is that the random number seed is   
>only defined at C level as a static object, all the RNG stuff happens at C   
>level, and the Python code just calls the C code as necessary. (I'm   
>sketchy about the details of what is defined as the seed etc.)  
>  
>This is in contrast with the R RNG facility (the only other RNG facility I   
>am familiar with), which uses macros SetRNGstate() and GetRNGstate() to   
>read and write the seed, which is defined at R level.  
>  
>Therefore, the upshot is that the C routines in ranlib.h read and write   
>the same seed as the python level functions do, so no special action is   
>necessary with regard to the seed.  
>  
>Is this correct?  
>  
>In any case, it would be nice if something like the above was documented,   
>so lost souls like myself don't have to go trawling through the source   
>code to figure out what is going on. Of course it is nice that the source   
>code is available, otherwise even that would be impossible.  
>  
>R documents this stuff in the "Writing R Extensions" manual, online at   
>http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray   
>manual could have a small section about this too.  
>  
>                                                         Regards, Faheem.  
>  
>  
>  
  
 
From stephen.walton at csun.edu  Tue Oct  5 12:20:01 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Tue Oct  5 12:20:01 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <200410051941.29807.graik@web.de>
References: <200410051941.29807.graik@web.de>
Message-ID: <1097003873.13715.17.camel@freyer.sfo.csun.edu>

On Tue, 2004-10-05 at 10:41, Raik Grünberg wrote:

> Our sysadmin noted that unlike Numeric, numarray is not using any external 
> math libraries (like LAPACK) that have been speed-optimized for decades and 
> are available in CPU-optimized variants (e.g. ATLAS). It's probably difficult 
> to match this efficiency with any new code ...

This is a key point.  Have a look at addons.py in numarray, some
previous comments on this list, and build numarray with the line

env USE_LAPACK=1 python setup.py build

after editing addons.py appropriately.  You should see a major speed
improvement.

-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge

From dd55 at cornell.edu  Tue Oct  5 13:02:01 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Tue Oct  5 13:02:01 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <1097003873.13715.17.camel@freyer.sfo.csun.edu>
References: <200410051941.29807.graik@web.de> <1097003873.13715.17.camel@freyer.sfo.csun.edu>
Message-ID: <200410051600.38254.dd55@cornell.edu>

On Tuesday 05 October 2004 03:17 pm, Stephen Walton wrote:
> On Tue, 2004-10-05 at 10:41, Raik Grünberg wrote:
> > Our sysadmin noted that unlike Numeric, numarray is not using any
> > external math libraries (like LAPACK) that have been speed-optimized for
> > decades and are available in CPU-optimized variants (e.g. ATLAS). It's
> > probably difficult to match this efficiency with any new code ...
>
> This is a key point.  Have a look at addons.py in numarray, some
> previous comments on this list, and build numarray with the line
>
> env USE_LAPACK=1 python setup.py build
>
> after editing addons.py appropriately.  You should see a major speed
> improvement.

I would kindly suggest updating the numarray documentation. In the section on 
installation, it is easy to overlook the option to compile against existing 
libraries. That is explained in section 16, which appears to be out of date. 
The code listed in Packages/LinearAlgebra2/setup.py has been moved to 
addons.py, correct?

-- 

Darren


From jmiller at stsci.edu  Tue Oct  5 13:37:42 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Oct  5 13:37:42 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <200410051600.38254.dd55@cornell.edu>
References: <200410051941.29807.graik@web.de>
	 <1097003873.13715.17.camel@freyer.sfo.csun.edu>
	 <200410051600.38254.dd55@cornell.edu>
Message-ID: <1097008567.27149.140.camel@halloween.stsci.edu>

On Tue, 2004-10-05 at 16:00, Darren Dale wrote:
> On Tuesday 05 October 2004 03:17 pm, Stephen Walton wrote:
> > On Tue, 2004-10-05 at 10:41, Raik Grünberg wrote:
> > > Our sysadmin noted that unlike Numeric, numarray is not using any
> > > external math libraries (like LAPACK) that have been speed-optimized for
> > > decades and are available in CPU-optimized variants (e.g. ATLAS). It's
> > > probably difficult to match this efficiency with any new code ...
> >
> > This is a key point.  Have a look at addons.py in numarray, some
> > previous comments on this list, and build numarray with the line
> >
> > env USE_LAPACK=1 python setup.py build
> >
> > after editing addons.py appropriately.  You should see a major speed
> > improvement.
> 
> I would kindly suggest updating the numarray documentation. 

Thanks, will do.

> In the section on 
> installation, it is easy to overlook the option to compile against existing 
> libraries. That is explained in section 16, which appears to be out of date. 
> The code listed in Packages/LinearAlgebra2/setup.py has been moved to 
> addons.py, correct?

That's correct.

Regards,
Todd


From faheem at email.unc.edu  Tue Oct  5 15:44:36 2004
From: faheem at email.unc.edu (Faheem Mitha)
Date: Tue Oct  5 15:44:36 2004
Subject: [Numpy-discussion] numarray.random_array number generation in
 C code
In-Reply-To: <6d0c2265.8aa2891d.81a0300@expms6.cites.uiuc.edu>
References: <6d0c2265.8aa2891d.81a0300@expms6.cites.uiuc.edu>
Message-ID: <Pine.LNX.4.61.0410051448220.16160@Chrestomanci>


On Tue, 5 Oct 2004, Bruce Southey wrote:

> Hi,
> It is rather hard to suggest anything without more detail on what you want to
> actually do.

I could give you more details if you were interested.

> As you describe it, why do you need the 'seed' returned? It would only 
> make sense if you were going in and out of Python multiple times - a 
> somewhat undesirable situation due to the overhead costs.

Not really. One might (and I frequently do) want to run the same function 
(which in this case might be all in C++ code), interactively with 
different parameters. The kind of thing that I'm doing is akin to 
exploratory data analysis, and the specific code in question is a 
stochastic search algorithm. Doing all this in C++ would not be very 
interactive. Also, one often wants to postprocess data output using Python 
scripts. This involves multiple calls to C++ code, and would be impossible 
to do using C++, since one has to call other Python libraries.

> I see at least three options:

> 1) Do everything in Python/numarray.

That's my current situation.

> 2) Do parts in Python and the other in C/C++.
>   For example, pass a matrix of random numbers to your code from Python. The
> 'seed' never needs to leave Python.

This doesn't work very well unless you know in advance how many random 
numbers are needed (not the case, for example, for stochastic search 
algorithms), and in any case is a rather clumsy way to do things. No 
offense intended.
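
To make the point concrete, here is a minimal sketch of a search whose
number of random draws is only known once it has finished. It is a made-up
example, not the algorithm in question: stochastic_search, its target and
its max_steps limit are all hypothetical, and the standard library generator
stands in for the numarray one.

```python
import random


def stochastic_search(rng, target=0.999, max_steps=100000):
    """Draw until a value reaches `target`; return the best value and the
    number of draws, which cannot be known before the search has run."""
    best = 0.0
    draws = 0
    while best < target and draws < max_steps:
        draws += 1
        best = max(best, rng.random())
    return best, draws


rng = random.Random(7)
for run in range(3):
    best, draws = stochastic_search(rng)
    print(run, draws, round(best, 5))
```

Each run consumes a different number of draws, so a matrix of random numbers
prepared in advance would be either too small or mostly wasted.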

> 3) Do it all in C/C++ - pass the 'seed' into your code that includes the
> random number generator(s) - there is C/C++ code around for this. Do you stuff
> and then return the 'seed' back with whatever else is required.

Yes, but part of the point of mixed programming is that you have an 
interpreted front end which can easily hook into other routines. Also, in 
this case, you would not be passing the seed in, since there is nothing to 
pass it in from. One would simply call system time or something similar to 
obtain the seed.

> You can email me privately if you want.

I'll keep sending this to the list unless someone objects, since I think 
this is of some general interest.

Really, my main question was whether my understanding of how to use the 
Numarray random number facilities in C was correct or not.

                                                                Faheem.


From stephen.walton at csun.edu  Tue Oct  5 16:15:31 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Tue Oct  5 16:15:31 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <d9af7a88041005160040938eeb@mail.gmail.com>
References: <200410051941.29807.graik@web.de>
	 <1097003873.13715.17.camel@freyer.sfo.csun.edu>
	 <d9af7a88041005160040938eeb@mail.gmail.com>
Message-ID: <1097018077.22092.15.camel@freyer.sfo.csun.edu>

On Tue, 2004-10-05 at 16:00, Flavio Coelho wrote:
> I wrote 
> > env USE_LAPACK=1 python setup.py build
> > 
> > after editing addons.py appropriately.  You should see a major speed
> > improvement.
> > 
>  
> 
> If that is the case, why is it not the default?, at least when LAPACK
> is installed?

Well, I won't pretend to speak for the developers on this one.  But I
strongly suspect it is just too hard to find all possible LAPACK
distributions;  the default numarray setup should be self contained even
if somewhat slower.  The current version of Numeric also defaults to its
own built-in BLAS and requires editing setup.py to use a different one.

-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge

From perry at stsci.edu  Tue Oct  5 17:30:58 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Oct  5 17:30:58 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <1097018077.22092.15.camel@freyer.sfo.csun.edu>
Message-ID: <NEBBIJKBMLDBLNCEEFOCEEOJFHAA.perry@stsci.edu>

Steve Walton wrote:
> On Tue, 2004-10-05 at 16:00, Flavio Coelho wrote:
> > I wrote 
> > > env USE_LAPACK=1 python setup.py build
> > > 
> > > after editing addons.py appropriately.  You should see a major speed
> > > improvement.
> > > 
> >  
> > 
> > If that is the case, why is it not the default?, at least when LAPACK
> > is installed?
> 
> Well, I won't pretend to speak for the developers on this one.  But I
> strongly suspect it is just too hard to find all possible LAPACK
> distributions;  the default numarray setup should be self contained even
> if somewhat slower.  The current version of Numeric also defaults to its
> own built-in BLAS and requires editing setup.py to use a different one.
> 
Well, it's been a while, and Todd handled that aspect of porting those
from Numeric, but if I recall correctly, the situation was the same
there, and I think Steve is correct. It was to provide the basic 
functionality as part of the distribution without requiring other
installations. If you needed better performance, you jump through a
couple more hoops. But requiring it to use LAPACK makes life more difficult
for those who were looking for a self contained and easy to install
solution.

Perry


From perry at stsci.edu  Tue Oct  5 17:40:51 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Oct  5 17:40:51 2004
Subject: [Numpy-discussion] Numeric to numarray experiences
In-Reply-To: <200410051941.29807.graik@web.de>
Message-ID: <NEBBIJKBMLDBLNCEEFOCKEOJFHAA.perry@stsci.edu>

I hadn't seen this until now. It's hard for us to understand
exactly the reasons for the slower performance with such large
arrays. Could you send us the code and an indication of
what inputs and parameters were used so we could try to figure
out why some of these problems exist (we can check the specific
functions you mention, but I want to make sure you aren't
iterating over array slices or such). It's not obvious to
me why you are having out of memory errors and this may help.

Perry Greenfield

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net
> [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Raik
> Grünberg
> Sent: Tuesday, October 05, 2004 1:41 PM
> To: numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] Numeric to numarray experiences
>
>
> Hi there,
>
> I've just translated a package for molecular modelling, which makes
> extensive use of Numeric, from Numeric to numarray. The outcome is somewhat
> negative - for now we are basically going to postpone the transition - the
> reasons might be interesting for the list and the numarray developers out
> there (who are doing a brave job!).
>
> Speed:
> A typical task in our package is the least-squares fitting of a large array
> of coordinate frames ( N1 x N2 x 3) onto a set of reference or average
> coordinates (using a sub-set of coordinates for the matching). The example
> I looked at (500 x 876 x 3 items) took 1.3 s with Numeric and 4.7 s with
> numarray. The main culprits for the slow-down were:
> * compress() - factor 10
> * average() - factor 7 (average() is missing from numarray and I hence had
>   to write a little function myself)
> * LinearAlgebra.singular_value_decomposition() - factor 10
> but a lot of extra time is also spent in ufunc.py and various
> numarraycore.py routines.
>
> Memory efficiency:
> I hoped numarray would solve some of the Out-of-memory problems that I get
> with Numeric but it turns out that it is rather less memory efficient for
> my kind of applications. Slicing an array that takes up 800MB on disc just
> about runs through with Numeric (and heavy swapping) but gives an
> Out-of-memory error with numarray.
>
> Suggestions:
> OK, it's easy to make clever comments without contributing any real work...
> - compress(), take(), etc, really need some optimization
> - a C-coded average() routine would be helpful
> - faster LinearAlgebra routines are necessary
>
> Our sysadmin noted that unlike Numeric, numarray is not using any external
> math libraries (like LAPACK) that have been speed-optimized for decades and
> are available in CPU-optimized variants (e.g. ATLAS). It's probably
> difficult to match this efficiency with any new code ...
>
> Greetings
> Raik
>
> PS:
> I didn't find any useful HowTo for the translation from Numeric to
> numarray. The practical issues were the different nonzero() return value,
> the more restrictive boolean comparison, that take doesn't support 'O'
> arrays any longer, and the missing average().
>
> --
> -----------------------------------------------------
> Raik Grünberg		| Bioinformatique Structurale
> 				| Institut Pasteur
> 				| Paris, France
> -----------------------------------------------------
>


From perry at stsci.edu  Tue Oct  5 18:14:00 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Oct  5 18:14:00 2004
Subject: [Numpy-discussion] numarray.random_array number generation in C code
In-Reply-To: <Pine.LNX.4.61.0410020003050.30139@Chrestomanci>
Message-ID: <NEBBIJKBMLDBLNCEEFOCCEOKFHAA.perry@stsci.edu>

Faheem Mitha wrote:

> Dear People,
>
> I want to write some C++ code to link with Python, using the
> Boost.Python interface. I need to generate random numbers in the C++
> code, and I was wondering as to the best way of doing this.
>
> Note that it is important that the random number generation interoperate
> seamlessly with Python, in the sense that the behavior of the calls to
> the RNG is the same whether calls are made at the C level or the Python
> level. I hope the reasons why this is important are obvious.
>
> I was thinking that the method should go like this.
>
> 1) When C/C++ code called, reads seed from python random state.
>
> 2) Does its stuff.
>
> 3) Writes seed back to python level when it exits.
>
> After doing a little investigation of the numarray.random_array python
> library and associated extension modules, it seems possible that the
> answer is simpler than I had supposed. However, I would appreciate it if
> someone would tell me if my understanding is incorrect in some places.
>
> Summary: It seems that I can just call all the C entry point routines
> defined in ranlib.h, without worrying about getting or setting seeds.
>
> Rationale:
>
> The structure of this random number facility has three parts, all files in
> Packages/RandomArray2/Src.
>
> 1) low-level C routines: Packages/RandomArray2/Src/com.c and
> Packages/RandomArray2/Src/ranlib.c.
>
> com.c: basic RNG stuff; getting and setting seeds etc.
> ranlib.c: Random number generator algorithms for different distributions
> etc.
>
> 2) Python to C interface: Packages/RandomArray2/Src/ranlibmodule.c.
>
> This interfaces the stuff in com.c and ranlib.c.
>
> 3) Python wrapper: Packages/RandomArray2/Lib/RandomArray2.py.
>
> This wraps the C interface. In most cases it does not do much else besides
> some basic argument error checking.
>
> From my perspective, the important thing is that the random number seed is
> only defined at C level as a static object, all the RNG stuff happens at C
> level, and the Python code just calls the C code as necessary. (I'm
> sketchy about the details of what is defined as the seed etc.)
>
> This is in contrast with the R RNG facility (the only other RNG facility I
> am familiar with), which uses macros GetRNGstate() and PutRNGstate() to
> read and write the seed, which is defined at R level.
>
> Therefore, the upshot is that the C routines in ranlib.h read and write
> the same seed as the python level functions do, so no special action is
> necessary with regard to the seed.
>
> Is this correct?
>
> In any case, it would be nice if something like the above was documented,
> so lost souls like myself don't have to go trawling through the source
> code to figure out what is going on. Of course it is nice that the source
> code is available, otherwise even that would be impossible.
>
> R documents this stuff in the "Writing R Extensions" manual, online at
> http://cran.r-project.org/doc/manuals/R-exts.pdf. Perhaps the Numarray
> manual could have a small section about this too.
>
>                                                          Regards, Faheem.
>
I'm not sure I understand what you want to do. Do you want to link
directly to the extension code from your C++ code? If so I'm wondering
why. It would make the most sense if the C++ code needed to obtain
small numbers of random numbers in some iterative loop, and you wish
to use the same random number library that numarray is using.
Otherwise, I would normally obtain the random number array
in python, then call the C++ extension. Perhaps I didn't read carefully
enough. Normally linking to an extension module involves some hacks
that I'm not sure were done for the randomarray module (the gory
details are in the python docs for extension modules), Todd can
check on that, I'm not sure I will have time (a superficial check
seems to indicate that it doesn't support direct linking, though
one could link to the underlying library I suppose).

As an aside, it is likely that a better module can be done as some
have suggested, we just took what Numeric had at the time. Doing that
is not a high priority with us at the moment (anyone else want to
tackle that?). Right now integration with scipy is our biggest
priority so things like this will have to take a back seat for
a while.

Furthermore, we did what we needed to to port these modules from
Numeric, but that didn't necessarily make us experts in how they
worked. I wish we were, but we've generally been directing our
energy elsewhere. I'd presume that the sensible way for the module
to work is to initialize its seed from a time-based seed in the
absence of any other seed initialization, and to keep the seed
state in the extension module, but I could be wrong.

Perry


From faheem at email.unc.edu  Tue Oct  5 18:41:02 2004
From: faheem at email.unc.edu (Faheem Mitha)
Date: Tue Oct  5 18:41:02 2004
Subject: [Numpy-discussion] numarray.random_array number generation in
 C code
In-Reply-To: <NEBBIJKBMLDBLNCEEFOCCEOKFHAA.perry@stsci.edu>
References: <NEBBIJKBMLDBLNCEEFOCCEOKFHAA.perry@stsci.edu>
Message-ID: <Pine.LNX.4.61.0410052108420.16761@Chrestomanci>


On Tue, 5 Oct 2004, Perry Greenfield wrote:

> I'm not sure I understand what you want to do. Do you want to link 
> directly to the extension code from your C++ code?

Yes.

> If so I'm wondering why. It would make the most sense if the C++ code 
> needed to obtain small numbers of random numbers in some iterative loop, 
> and you wish to use the same random number library that numarray is 
> using.

I need to obtain an arbitrary (not known in advance) number of random 
numbers in the C++ code.

I'm thinking of using the same random number library mostly because I 
assumed that using the same seed across the python/C interface would be 
supported. This is how it works in R (the only other place I have used 
this). Also, I had been using the same routines in the Python code I'm 
trying to convert to C++, so it would be a relatively smooth transfer.

If I was to use a pure C/C++ library, I'd have to worry about copying the 
seed back and forth between Python and C. Is this what I'll have to do 
then?

> Otherwise, I would normally obtain the random number array in python, 
> then call the C++ extension.

Yes, this is what everyone suggests. But in my case, the number of random 
variates required is not known in advance. I get the feeling this 
situation does not arise very often for most people, but I work with 
stochastic processes which terminate according to some stopping criterion, 
and that is the standard situation in this case.

Also generating these numbers in Python would give rise to serious 
performance issues.

> Perhaps I didn't read carefully enough. Normally linking to an extension 
> module involves some hacks that I'm not sure were done for the 
> randomarray module (the gory details are in the python docs for 
> extension modules), Todd can check on that, I'm not sure I will have 
> time (a superficial check seems to indicate that it doesn't support 
> direct linking, though one could link to the underlying library I 
> suppose).

Hmm. Well, this is unwelcome news. You mean I cannot link to ranlib.so? I 
assumed that including the ranlib.h header and linking my C++ module 
against ranlib.so would be enough. I suppose that was too optimistic.

> As an aside, it is likely that a better module can be done as some
> have suggested, we just took what Numeric had at the time. Doing that
> is not a high priority with us at the moment (anyone else want to
> tackle that?). Right now integration with scipy is our biggest
> priority so things like this will have to take a back seat for
> a while.

> Furthermore, we did what we needed to to port these modules from
> Numeric, but that didn't necessarily make us experts in how they
> worked. I wish we were, but we've generally been directing our
> energy elsewhere. I'd presume that the sensible way for the module
> to work is to initialize its seed from a time-based seed in the
> absence of any other seed initialization, and to keep the seed
> state in the extension module, but I could be wrong.

Yes. That is how R does it, anyway. Specifically, you declare the seed 
static, and then it persists across the Python/C interface. That is what I 
thought you had in the numarray code. Would it be hard to make it work 
like this?

I'm no expert either.

                                                             Faheem.
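
A minimal sketch of the read-state / do-stuff / write-state pattern discussed
above. It uses Python's standard random module as a stand-in, not
numarray.random_array or ranlib, and the stopping rule and the name
run_until_stop are hypothetical. It shows a consumer that draws a number of
variates that is not known in advance, with the generator state saved and
restored between calls:

```python
import random


def run_until_stop(rng, threshold=0.95, max_draws=10000):
    """Draw uniforms until one exceeds `threshold` (hypothetical stopping rule)."""
    draws = []
    for _ in range(max_draws):
        u = rng.random()
        draws.append(u)
        if u > threshold:
            break
    return draws


rng = random.Random(12345)        # explicit seed, so the run is repeatable
first = run_until_stop(rng)       # consumes an unknown number of variates

saved = rng.getstate()            # "write the seed back" after the call
second = run_until_stop(rng)      # continues the same stream

rng.setstate(saved)               # "read the seed" before calling again
replay = run_until_stop(rng)      # reproduces `second` exactly
```

When the state is held in one place and every caller draws from it, no
explicit copying is needed between calls; getstate() and setstate() are only
required if a second, separate generator has to be kept in step.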


From southey at uiuc.edu  Wed Oct  6 07:01:38 2004
From: southey at uiuc.edu (Bruce Southey)
Date: Wed Oct  6 07:01:38 2004
Subject: [Numpy-discussion] numarray.random_array
 number generation in C code
Message-ID: <ba8a1339.8b0d6083.832bd00@expms6.cites.uiuc.edu>

Hi,  
My understanding is that you can use the Ranlib, R math, and GNU Scientific
libraries in the manner you suggest, or directly include the random number
generator in your code. Usually you define a seed, which provides the same
pseudo-random number stream every time it is used. If you don't use a
seed then it is usually impossible to get the same stream of pseudo-random
numbers. So I do not understand why you need to keep the same random number
state. Not to mention that the common generators do repeat, some sooner than
others.
  
In your response to Perry, you indicate that you do not need an array of
random numbers but rather a stream of random numbers. This is very different,
and I think you need to refine your algorithm to identify which parts need to
be in C/C++ and which need to be in Python/numarray. Since you currently have
Python code, I would profile it to see which parts actually need extending -
sometimes Python is surprisingly quick at some things (like using
dictionaries). Providing those parts may be more fruitful to you
than my vague responses.
 
Regards 
Bruce 
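
The point about seeds can be illustrated with a short sketch. It uses
Python's standard random module purely as an example generator (the thread is
about ranlib, which is not used here), and the helper name stream is
hypothetical:

```python
import random


def stream(seed, n):
    """Return the first n uniform variates produced from `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


same_a = stream(42, 5)
same_b = stream(42, 5)    # same seed: identical pseudo-random stream
other = stream(43, 5)     # different seed: a different stream
```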
  
---- Original message ----  
>Date: Tue, 5 Oct 2004 18:43:48 -0400 (EDT)  
>From: Faheem Mitha <faheem at email.unc.edu>    
>Subject: Re: [Numpy-discussion] numarray.random_array number generation in C code
>To: Bruce Southey <southey at uiuc.edu>  
>Cc: numpy-discussion <numpy-discussion at lists.sourceforge.net>  
>  
>  
>  
>On Tue, 5 Oct 2004, Bruce Southey wrote:  
>  
>> Hi,  
>> It is rather hard to suggest anything without more detail on what you want
>> to actually do.
>  
>I could give you more details if you were interested.  
>  
>> As you describe it, why do you need the 'seed' returned? It would only   
>> make sense if you were going in and out of Python multiple times - a   
>> somewhat undesirable situation due to the overhead costs.  
>  
>Not really. One might (and I frequently do) want to run the same function   
>(which in this case might be all in C++ code), interactively with   
>different parameters. The kind of thing that I'm doing is akin to   
>exploratory data analysis, and the specific code in question is a   
>stochastic search algorithm. Doing all this in C++ would not be very   
>interactive. Also, one often wants to postprocess data output using Python   
>scripts. This involves multiple calls to C++ code, and would be impossible   
>to do using C++, since one has to call other Python libraries.  
>  
>> I see at least three options:
>  
>> 1) Do everything in Python/numarray.  
>  
>That's my current situation.  
>  
>> 2) Do parts in Python and the other in C/C++.  
>>   For example, pass a matrix of random numbers to your code from Python.
>> The 'seed' never needs to leave Python.
>  
>This doesn't work very well unless you know in advance how many random   
>numbers are needed (not the case, for example, for stochastic search   
>algorithms), and in any case is a rather clumsy way to do things. No   
>offense intended.  
>  
>> 3) Do it all in C/C++ - pass the 'seed' into your code that includes the
>> random number generator(s) - there is C/C++ code around for this. Do your
>> stuff and then return the 'seed' back with whatever else is required.
>  
>Yes, but part of the point of mixed programming is that you have an   
>interpreted front end which can easily hook into other routines. Also, in   
>this case, you would not be passing the seed in, since there is nothing to   
>pass it in from. One would simply call system time or something similar to   
>obtain the seed.  
>  
>> You can email me privately if you want.  
>  
>I'll keep sending this to the list unless someone objects, since I think   
>this is of some general interest.  
>  
>Really, my main question was to whether my understanding of how to use the   
>Numarray random number facilities in C was correct or not.  
>  
>                                                                Faheem.  
  
 
From jmiller at stsci.edu  Wed Oct  6 23:47:31 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Oct  6 23:47:31 2004
Subject: [Numpy-discussion] numarray.random_array number generation in
	C code
In-Reply-To: <NEBBIJKBMLDBLNCEEFOCCEOKFHAA.perry@stsci.edu>
References: <NEBBIJKBMLDBLNCEEFOCCEOKFHAA.perry@stsci.edu>
Message-ID: <1097073394.31512.76.camel@halloween.stsci.edu>

On Tue, 2004-10-05 at 21:10, Perry Greenfield wrote:
> I'm not sure I understand what you want to do. Do you want to link
> directly to the extension code from your C++ code? If so I'm wondering
> why. It would make the most sense if the C++ code needed to obtain
> small numbers of random numbers in some iterative loop, and you wish
> to use the same random number library that numarray is using.
> Otherwise, I would normally obtain the random number array
> in python, then call the C++ extension. Perhaps I didn't read carefully
> enough. Normally linking to an extension module involves some hacks
> that I'm not sure were done for the randomarray module (the gory
> details are in the python docs for extension modules), Todd can
> check on that, 

I checked and there's no C level export of the ranlib interface, at
least not in the "hacked" sense of an extension module C-API where the
linkage is made indirect via an API pointer and bizarre macros.

> I'm not sure I will have time (a superficial check
> seems to indicate that it doesn't support direct linking, though
> one could link to the underlying library I suppose).

Ordinary C linkage to numarray.random_array.ranlib2 may be supported
since as an extension it is also a shared library, but I've never tried
it myself and I wonder if it would actually work. If anyone has tried
something like that I'd be interested in hearing how it turned out. 
Without a really compelling reason,  I'd avoid it myself.

Regards,
Todd


From dd55 at cornell.edu  Sun Oct 10 12:51:58 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Sun Oct 10 12:51:58 2004
Subject: [Numpy-discussion] ieeespecial
Message-ID: <200410101547.18413.dd55@cornell.edu>

Hello,

I am getting invalid numeric result exceptions when dividing a complex array 
by zero. Is this the desired behavior?

Also, while trying to find a way around the above problem, I ran 
ieeespecial.test and got the following output. I am running numarray 1.1 on 
python 2.3.3. Todd, this might be correlated with the numerix package in 
matplotlib. I tried importing numarray and ieeespecial without matplotlib and 
the ieeespecial.test was successful.

Thanks,

Darren


In [31]: ieeespecial.test()
Out[31]: inf
*****************************************************************
Failure in example:
inf    # the repr() of inf may vary from platform to platform
from line #6 of numarray.ieeespecial
Expected: inf
Got:
Out[31]: nan
*****************************************************************
Failure in example:
nan    # the repr() of nan may vary from platform to platform
from line #8 of numarray.ieeespecial
Expected: nan
Got:
Out[31]: (array([0, 2]), array([0, 3]))
*****************************************************************
Failure in example: getinf(b)
from line #20 of numarray.ieeespecial
Expected: (array([0, 2]), array([0, 3]))
Got:
Out[31]:
array([[ 999.,    1.,    2.,    3.],
       [   4.,    5.,    6.,    7.],
       [   8.,    9.,   10.,  999.],
       [  12.,   13.,   14.,   15.]])
*****************************************************************
Failure in example: a
from line #26 of numarray.ieeespecial
Expected:
array([[ 999.,    1.,    2.,    3.],
       [   4.,    5.,    6.,    7.],
       [   8.,    9.,   10.,  999.],
       [  12.,   13.,   14.,   15.]])
Got:
Out[31]: (array([0, 1, 2]), array([1, 2, 3]))
*****************************************************************
Failure in example: getnan(a)
from line #35 of numarray.ieeespecial
Expected: (array([0, 1, 2]), array([1, 2, 3]))
Got:
*****************************************************************
1 items had failures:
   5 of  11 in numarray.ieeespecial
***Test Failed*** 5 failures.
Out[31]: (5, 11)


-- 

Darren


From dd55 at cornell.edu  Sun Oct 10 13:57:43 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Sun Oct 10 13:57:43 2004
Subject: [Numpy-discussion] ieeespecial
In-Reply-To: <200410101547.18413.dd55@cornell.edu>
References: <200410101547.18413.dd55@cornell.edu>
Message-ID: <200410101653.51172.dd55@cornell.edu>


On Sunday 10 October 2004 03:47 pm, Darren Dale wrote:
> Hello,
>
> I am getting invalid numeric result exceptions when dividing a complex
> array by zero. Is this the desired behavior?
>
> Also, while trying to find a way around the above problem, I ran
> ieeespecial.test and got the following output. I am running numarray 1.1 on
> python 2.3.3. Todd, this might be correlated with the numerix package in
> matplotlib. I tried importing numarray and ieeespecial without matplotlib
> and the ieeespecial.test was successful.
>

On a related note, ieeespecial.getnan appears to be incompatible with complex 
arrays, see below. I didn't mention in my last email that I built numarray 
against my existing blas/lapack libraries. Will this change the behavior on my 
system from the default?

Thanks,
Darren

>>> from numarray import *
>>> from numarray.ieeespecial import *
>>> b=arange(10,typecode=Complex64)
>>> a=b/0
Warning: Encountered invalid numeric result(s)  in divide
>>> a
array([              nan             +nanj,
                     nan             +nanj,
                     nan             +nanj,
                     nan             +nanj,
                     nan             +nanj,
                     nan             +nanj,
                     nan             +nanj,
                     nan             +nanj,
                     nan             +nanj,
                     nan             +nanj])
>>> getnan(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 117, in getnan
    return _spec.index(a, _spec.NAN)
  File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 95, in index
    return _na.nonzero(mask(a, msk))
  File "/usr/lib/python2.3/site-packages/numarray/ieeespecial.py", line 87, in mask
    f = _na.ieeemask(a, m)
  File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 883, in _cache_miss2
    mode, win1, win2, wout, cfunc, ufargs = \
  File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 929, in _setup
    convtype1, convtype2, outtype, ucfunc \
  File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 471, in _typematch
    newInputSignature = (self._typePromoter(intype, atypelist),)*2
  File "/usr/lib/python2.3/site-packages/numarray/ufunc.py", line 498, in _typePromoter
    raise TypeError("unable to find type to promote to")
TypeError: unable to find type to promote to

>>> getnan(a.real)
(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),)
>>>  
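
The session above suggests a workaround: test the real and imaginary parts
rather than the complex values themselves. A minimal sketch of that idea in
plain Python follows. It uses the standard cmath module on a list instead of
numarray arrays, and the name nan_indices is hypothetical; with numarray the
analogous step would presumably be to combine getnan(a.real) with the same
call on the imaginary part, which is an assumption not tested here.

```python
import cmath


def nan_indices(values):
    """Indices of entries whose real or imaginary part is NaN."""
    # cmath.isnan() is true if either component of the number is NaN,
    # and it also accepts plain floats.
    return [i for i, z in enumerate(values) if cmath.isnan(z)]


nan = float("nan")
data = [complex(1, 2), complex(nan, 0), complex(3, nan), 4.0, nan]
found = nan_indices(data)
```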


From aisaac at american.edu  Sun Oct 10 15:57:18 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Sun Oct 10 15:57:18 2004
Subject: [Numpy-discussion] documentation error
Message-ID: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu>

In the Numeric manual, there are two different definitions of the
'diagonal' function.  The second definition appears to be incorrect.

p.39:
diagonal(a, k=0, axis1=0, axis2 = 1)
returns the entries along the k th diagonal of a (k is an
offset from the main diagonal). This is designed for 2d
arrays. For larger arrays, it will return the diagonal of
each 2d sub-array.

p.44
diagonal(a, offset=0, axis1=0, axis2=1)
The diagonal function takes an array a, and returns an array
of rank 1 containing all of the elements of a such that the
difference between their indices along the specified axes is
equal to the specified offset. With the default values, this
corresponds to all of the elements of the diagonal of a
along the last two axes.

fwiw,
Alan Isaac
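
For the 2d case, where the two definitions agree, the index rule quoted from
p.44 can be sketched in plain Python. This is an illustration on nested
lists, not Numeric's implementation, and the sign convention (column index
minus row index equals the offset) is an assumption:

```python
def diagonal(a, offset=0):
    """Elements a[i][j] of a 2d nested list with j - i == offset."""
    return [row[i + offset]
            for i, row in enumerate(a)
            if 0 <= i + offset < len(row)]


a = [[0, 1, 2],
     [3, 4, 5],
     [6, 7, 8]]

main = diagonal(a)         # main diagonal
upper = diagonal(a, 1)     # one step above the main diagonal
lower = diagonal(a, -1)    # one step below the main diagonal
```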


From jmiller at stsci.edu  Sun Oct 10 17:43:34 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Sun Oct 10 17:43:34 2004
Subject: [Numpy-discussion] ieeespecial
In-Reply-To: <200410101653.51172.dd55@cornell.edu>
References: <200410101547.18413.dd55@cornell.edu>
	 <200410101653.51172.dd55@cornell.edu>
Message-ID: <1097454870.3741.48.camel@localhost.localdomain>

On Sun, 2004-10-10 at 16:53, Darren Dale wrote:
> On Sunday 10 October 2004 03:47 pm, Darren Dale wrote:
> > Hello,
> >
> > I am getting invalid numeric result exceptions when dividing a complex
> > array by zero. Is this the desired behavior?
> >
> > Also, while trying to find a way around the above problem, I ran
> > ieeespecial.test and got the following output. I am running numarray 1.1 on
> > python 2.3.3. Todd, this might be correlated with the numerix package in
> > matplotlib. I tried importing numarray and ieeespecial without matplotlib
> > and the ieeespecial.test was successful.
> >
> 
> On a related note, ieeespecial.getnan appears to be incompatible with complex 
> arrays, see below. 

Thanks for pointing this out.  It's an oversight in the implementation
of ieeespecial and I'll fix it.

> I didn't mention in my last email that I built numarray for 
> my existing blas/lapack libraries, will this change the behavior on my system 
> from the default?

Regarding ieeespecial and complex division by zero, I am pretty sure
blas/lapack linkage is irrelevant.  But... I very rarely link with an
external blas/lapack,  so if there is an issue, I'm unlikely to have
come across it myself.  Still, off the top of my head,  blas/lapack is
unrelated.

Regards,
Todd



From dd55 at cornell.edu  Sun Oct 10 18:08:10 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Sun Oct 10 18:08:10 2004
Subject: [Numpy-discussion] ieeespecial
In-Reply-To: <1097454560.3741.41.camel@localhost.localdomain>
References: <200410101547.18413.dd55@cornell.edu> <1097454560.3741.41.camel@localhost.localdomain>
Message-ID: <200410102103.42221.dd55@cornell.edu>

On Sunday 10 October 2004 08:29 pm, you wrote:
> On Sun, 2004-10-10 at 15:47, Darren Dale wrote:
> > Hello,
> >
> > I am getting invalid numeric result exceptions when dividing a complex
> > array by zero. Is this the desired behavior?
>
> This is what I would have expected,  and examining the definition I have
> for complex division in numarray/Include/numarray/numcomplex.h,  I don't
> see a problem.   The definition should probably be checked by an extra
> set of eyes.  Looks OK to me.

Hi Todd,

Sorry, I wasn't clear. I was wondering if it should raise a divide-by-zero
exception and return an inf, as the real data types do, instead of an invalid
numeric result and a nan.  As it stands now, we have to handle division by
zero differently for different data types if we need to filter or replace
such values.
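
One way to sidestep the type-specific handling, sketched in plain Python
rather than numarray (the helper name replace_nonfinite is made up for
illustration and is not part of any library): treat inf and nan alike and
test with cmath.isfinite(), which accepts real as well as complex values.

```python
import cmath


def replace_nonfinite(values, fill=0.0):
    """Return a copy of values with every inf or nan entry replaced by fill.

    cmath.isfinite() accepts floats as well as complex numbers, so the
    same test covers both data types.
    """
    return [v if cmath.isfinite(v) else fill for v in values]


reals = [1.0, float("inf"), -2.5]
cplx = [1 + 2j, complex(float("nan"), float("nan")), 3j]

print(replace_nonfinite(reals))
print(replace_nonfinite(cplx))
```

The same filter then works whether the division produced an inf (real
case) or a nan (complex case).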

Thanks,
Darren


From jmiller at stsci.edu  Sun Oct 10 18:44:38 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Sun Oct 10 18:44:38 2004
Subject: [Numpy-discussion] ieeespecial
In-Reply-To: <200410101547.18413.dd55@cornell.edu>
References: <200410101547.18413.dd55@cornell.edu>
Message-ID: <1097454560.3741.41.camel@localhost.localdomain>

On Sun, 2004-10-10 at 15:47, Darren Dale wrote:
> Hello,
> 
> I am getting invalid numeric result exceptions when dividing a complex array 
> by zero. Is this the desired behavior?

This is what I would have expected,  and examining the definition I have
for complex division in numarray/Include/numarray/numcomplex.h,  I don't
see a problem.   The definition should probably be checked by an extra
set of eyes.  Looks OK to me.

> Also, while trying to find a way around the above problem, I ran 
> ieeespecial.test and got the following output. I am running numarray 1.1 on 
> python 2.3.3. Todd, this might be correlated with the numerix package in 
> matplotlib. I tried importing numarray and ieeespecial without matplotlib and 
> the ieeespecial.test was successful.
> 

I tried this with an ordinary Python shell and ieeespecial.test()
completed without errors.  Looking at your test output,  I noticed it
was skewed, and guessed there was an I/O synchronization issue messing
up doctest.  I tried the same test under IPython w/o matplotlib and
duplicated your results,  so I think the problem is an IPython/doctest
issue.  

Regards,
Todd

> Thanks,
> 
> Darren
> 
> 
> In [31]: ieeespecial.test()
> Out[31]: inf
> *****************************************************************
> Failure in example:
> inf    # the repr() of inf may vary from platform to platform
> from line #6 of numarray.ieeespecial
> Expected: inf
> Got:
> Out[31]: nan
> *****************************************************************
> Failure in example:
> nan    # the repr() of nan may vary from platform to platform
> from line #8 of numarray.ieeespecial
> Expected: nan
> Got:
> Out[31]: (array([0, 2]), array([0, 3]))
> *****************************************************************
> Failure in example: getinf(b)
> from line #20 of numarray.ieeespecial
> Expected: (array([0, 2]), array([0, 3]))
> Got:
> Out[31]:
> array([[ 999.,    1.,    2.,    3.],
>        [   4.,    5.,    6.,    7.],
>        [   8.,    9.,   10.,  999.],
>        [  12.,   13.,   14.,   15.]])
> *****************************************************************
> Failure in example: a
> from line #26 of numarray.ieeespecial
> Expected:
> array([[ 999.,    1.,    2.,    3.],
>        [   4.,    5.,    6.,    7.],
>        [   8.,    9.,   10.,  999.],
>        [  12.,   13.,   14.,   15.]])
> Got:
> Out[31]: (array([0, 1, 2]), array([1, 2, 3]))
> *****************************************************************
> Failure in example: getnan(a)
> from line #35 of numarray.ieeespecial
> Expected: (array([0, 1, 2]), array([1, 2, 3]))
> Got:
> *****************************************************************
> 1 items had failures:
>    5 of  11 in numarray.ieeespecial
> ***Test Failed*** 5 failures.
> Out[31]: (5, 11)
-- 


From aisaac at american.edu  Sun Oct 10 18:59:17 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Sun Oct 10 18:59:17 2004
Subject: [Numpy-discussion] location of tutorial
Message-ID: <Mahogany-0.66.0-1780-20041010-215449.00@american.edu>

p.29 of the Numeric manual refers to 
http://www.python.org/doc/tut/functional.html
which no longer exists.  I suggest substituting
http://docs.python.org/tut/tut.html

fwiw,
Alan Isaac


From jmiller at stsci.edu  Mon Oct 11 04:28:51 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Oct 11 04:28:51 2004
Subject: [Numpy-discussion] ieeespecial
In-Reply-To: <200410102103.42221.dd55@cornell.edu>
References: <200410101547.18413.dd55@cornell.edu>
	 <1097454560.3741.41.camel@localhost.localdomain>
	 <200410102103.42221.dd55@cornell.edu>
Message-ID: <1097493501.2619.26.camel@localhost.localdomain>

On Sun, 2004-10-10 at 21:03, Darren Dale wrote:
> On Sunday 10 October 2004 08:29 pm, you wrote:
> > On Sun, 2004-10-10 at 15:47, Darren Dale wrote:
> > > Hello,
> > >
> > > I am getting invalid numeric result exceptions when dividing a complex
> > > array by zero. Is this the desired behavior?
> >
> >
> > This is what I would have expected,  and examining the definition I have
> > for complex division in numarray/Include/numarray/numcomplex.h,  I don't
> > see a problem.   The definition should probably be checked by an extra
> > set of eyes.  Looks OK to me.
> 
> Hi Todd,
> 
> Sorry, I wasn't clear. I was wondering if it should raise a divide by zero 
> exception and return an inf, as the real data types do, instead of an invalid 
> numeric result and a nan.  As it stands now, we have to handle divide by zero 
> differently for different data types, if we need to filter/replace such 
> values.

Numarray's error handling system is pretty flexible, and can raise
exceptions on divide by zero if configured properly, or can ignore them
altogether.  See section 4.9 in the numarray-1.1 manual here:

http://prdownloads.sourceforge.net/numpy/numarray-1.1.pdf?download

It's an interesting question regarding the inf vs. nan.   Looking at the
complex division macro (NUM_CDIV) in numcomplex.h,  I don't understand
why we're getting nans now and not infs;  it might be a bug in the
macro,  but I don't see it.
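
One possible explanation, offered as an assumption since the macro body is
not shown in this thread: if the division follows the textbook formula
(a+bj)/(c+dj) = ((ac+bd) + (bc-ad)j) / (c*c+d*d), then a zero divisor makes
both numerators and the denominator zero, and IEEE 0/0 is nan rather than
inf.  The sketch below emulates IEEE division in plain Python (ieee_div and
textbook_cdiv are illustrative names, not numarray functions).

```python
import math


def ieee_div(x, y):
    """Divide two floats following IEEE 754 rules instead of raising."""
    if y != 0.0:
        return x / y
    if x == 0.0 or math.isnan(x):
        return math.nan                      # 0/0 is an invalid operation
    # nonzero / zero gives a signed infinity
    return math.copysign(math.inf, x) * math.copysign(1.0, y)


def textbook_cdiv(a, b, c, d):
    """Compute (a+bj)/(c+dj) with the textbook formula."""
    den = c * c + d * d
    return ieee_div(a * c + b * d, den), ieee_div(b * c - a * d, den)


re, im = textbook_cdiv(1.0, 0.0, 0.0, 0.0)   # (1+0j) / (0+0j)
print(re, im)
print(ieee_div(1.0, 0.0))                    # the real-valued case
```

Under that assumption the complex case yields nan in both components while
the real case yields inf, which matches the behaviour Darren describes.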

Regards,
Todd


From stephen.walton at csun.edu  Mon Oct 11 20:16:55 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Mon Oct 11 20:16:55 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu>
References: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu>
Message-ID: <1097550159.2568.5.camel@localhost.localdomain>

On Sun, 2004-10-10 at 11:33, Alan G Isaac wrote:
> In the Numeric manual, there are two different definitions of the
> 'diagonal' function.  The second definition appears to be incorrect.
> 
> p.39:
> diagonal(a, k=0, axis1=0, axis2 = 1)

> p.44
> diagonal(a, offset=0, axis1=0, axis2=1)

Are you sure?  On my system, it appears that the second definition is
correct in both Numeric 23.3 and numarray 1.1.


From a.schmolck at gmx.net  Tue Oct 12 02:40:55 2004
From: a.schmolck at gmx.net (Alexander Schmolck)
Date: Tue Oct 12 02:40:55 2004
Subject: [Numpy-discussion] A disconnected numarray rant
Message-ID: <yfsu0t0b7lk.fsf@black4.ex.ac.uk>

Hi,

I'm taking a 1 month break from computers (i.e. I will be completely
off-line), and I have to catch a train in an hour; but I've recently bitten
the bullet and made a matrix class I've been using for some time work with
numarray; I've written down a number of things that occurred to me while I was
doing it, including some things which I think are bugs in numarray, so I
thought at least posting the bugs would be a useful service; the rest is very
raw and essentially unedited cut-and-paste of these notes -- sorry about that
and I hope it doesn't contain anything particularly offensive.

P.S. just dumped the code for the matrix class (nummat) at
http://www.dcs.ex.ac.uk/~aschmolc/Stuff/

'as

The following are my notes:


Things that fairly clearly seem to be bugs:
    - numarray.Int32 etc. can't be pickled
    - ``a = array(1+0j); a.imag = a.real * 10`` => IndexError
    - array(0, type=Float64) + 1e3000  => `inf` with right error modes
      but  array(0, type=Float32) + 1e3000 => `OverflowError`
    - numarray.array(10)/numarray.array(0) => 0 
    - numarray.array(10000000000000L) => array(1316134912)
    - numarray.where(0,1,0) => array([0])
    - l = [1,2,3]; numarray.put(l,numarray.array([1,2,0]),[0,0,0]); l => [1, 2, 3]
      a = array([1,2,3]); numarray.put(a,numarray.array([1,2,0]),[0,0,0]); a => array([0, 0, 0])
    - repr(numarray.array([],typecode='i')) (etc. etc.) => "numarray.array([])"
    - getattr(array([1,2,3]), '_aligned') => SystemError
    - obscure: numarray.where(0, matrix(568, convert_scalars=True),2) =>
      ValueError (tries __len__ which fails, as len(array(568)) also fails)

Numeric incompatibilities (that are either undocumented or bug-like)

- numarray.array('a', typecode='O') => TypeError (object arrays)
- for extra fun try: numarray.array(1, type=numarray.Object) => RuntimeError
  something entirely different
- nonzero is completely incompatible
- shape(None) etc. no longer works (IMHO a bug)
- cross_correlate & average missing
- left_shift et al missing
- numarray.sqrt(a,a) is None (*not* the result, as it used to be)
- num.put(a, [0,1,2,3], [10,20]) style behavior seems unavailable (without numarray.numeric)
  put(array([[ 0.,  1.,  2.], [ 3.,  4.,  5.]]), [1, 4], [10,40]) fails
- boolean testing (not even bool(array(0)) works; I'm not sure this is good)

- Generally different handling of rank0-arrays; e.g. ``type(num.array(1.0) +
  0) is float``; one potentially very nasty gotcha are inplace operations
  (e.g. a**=2) which have totally different semantics for python scalars and
  rank0 arrays, which, unlike Attribute errors on ``a.shape``, can lead to
  nasty bugs in corner cases (e.g. when a reduction just infrequently yields
  scalar ``a``) -- I think this should be mentioned in a gotchas section
  (another possible entry would be the need to use .copy() to **save** memory
  on slicing and 1xN, Nx1 matrices versus vectors (people are not used to
  thinking properly about rank from mathematical training or matlab
  exposure)).

- asarray downcasts arrays (e.g.: asarray(array([1.,2.,3.]),'i'))

- numarray.ones(-5) => MemoryError (ValueError would be nicer)
- numarray.ones(2.0), numarray.ones([2]) fail (cf. numarray.range(2.0))
      b=num.array([[1,2,3,4],[5,6,7,8]]*2)
      assert eq(num.diagonal(b), [1,6,3,8])
      assert eq(num.diagonal(b, -1), [5,2,7])
      c = num.array([b,b])
      assert eq(num.diagonal(c,1), [[2,7,4], [2,7,4]])
- no a.toscalar() !!!
- matrixmultiply in the docs
- what's the point of swapaxes (i.e. why not have a generalized in-place
  transpose?)
- what's the point of innerproduct?


- indexing by a list is different from indexing by tuple (I haven't had time
  to look closely at the docs whether that's intentional)

- doesn't know about Numeric's bizarre '\x0b' typecode
- numarray.sqrt.reduce([]) raises (sensibly) TypeError, not ValueError

- len(array(1)) or array(1)[0] won't work anymore (understandable, but
  should be documented)
- (should maximum, minimum reduce to -inf and inf?)
- <built-in method reduce of _BinaryUFunc object at 0x82dfc9c> is not
  a very helpful repr; should be possible to get to the ufunc itself
- as in Numeric numarray.maximum.reduce(numarray.array([0,-0.])) => -0.0
- __array__ protocol no longer supported (how can a non-derived class convert
  itself efficiently to an array?)
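
The in-place gotcha from the rank-0 item above can be shown without
numarray.  In this sketch Rank0 is a hypothetical stand-in for a rank-0
array (a mutable box with an in-place power operator), not a numarray
class; the point is only that ``a **= 2`` rebinds a local name for a Python
float but mutates the caller's object for anything that implements
__ipow__.

```python
class Rank0:
    """Hypothetical stand-in for a rank-0 array: a mutable box around one value."""

    def __init__(self, value):
        self.value = value

    def __ipow__(self, exponent):
        # In-place: mutate the existing object and return it.
        self.value **= exponent
        return self


def square(a):
    a **= 2      # rebinds the local name for a float, mutates a Rank0
    return a


x = 3.0
square(x)
print(x)             # the caller's float is untouched

r = Rank0(3.0)
square(r)
print(r.value)       # the caller's object was modified
```

A function written and tested with arrays can therefore silently stop
modifying its argument on the rare call where a reduction hands it a plain
scalar.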


Documentation Gotchas
- p. 34 IMO row vector is used incorrectly; row and column vectors are really
     matrices (i.e. have rank 2) so ``array([[1,2,3]])`` would be a row vector

- No proper explanation of differences between Numeric and numarray, or
  numarray.numeric module differences to proper (e.g. argmin)

- No migration and best-practice advice (e.g. there should be a standard way
  for packages which work with both numarray and numeric as backends to let
  the user choose his preference; how about setting an environment var NumPy
  or something?)


Waffle
------

- there *really* ought to be an array equality function (with optional
  tolerance); it's quite difficult to get right for a normal user (nans;
  zero-size arrays etc.) and it's often required, especially for testing

- a rank-preserving reduction would be nice as an option -- e.g. to subtract
  out or divide by the reduced portion (which currently won't work, e.g. for
  columns, without adding a unit dimension by hand).
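
To make the corner cases in the first item concrete, here is a minimal
sketch for flat sequences of floats in plain Python.  The name seq_equal
and its parameters are assumptions made for illustration; nothing like it
is claimed to exist in numarray.

```python
import math


def seq_equal(a, b, tol=0.0, nan_equal=True):
    """Compare two flat float sequences element by element.

    Zero-size inputs compare equal, infinities must match exactly, and
    nans compare equal to each other only when nan_equal is true.
    """
    a, b = list(a), list(b)
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if math.isnan(x) or math.isnan(y):
            if not (nan_equal and math.isnan(x) and math.isnan(y)):
                return False
        elif x == y:
            continue                     # also covers matching infinities
        elif math.isinf(x) or math.isinf(y):
            return False                 # inf versus anything else
        elif abs(x - y) > tol:
            return False
    return True


print(seq_equal([], []))
print(seq_equal([1.0, math.nan], [1.0, math.nan]))
print(seq_equal([1.0], [1.0 + 1e-9], tol=1e-6))
```

Even this small version needs explicit branches for nan, inf and empty
input, which is the argument for providing one tested implementation.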

Design

  The (AFAICS) benefit-free but downside-rich introduction of `type`
  ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

  Is there any reason that Typecode objects that compare as desired to the
  relevant strings ("i", "d") wouldn't have done? Now there is an explosion
  and confusion of interfaces -- some numpy code will now only accept
  type(code)s as "typecode" keyword parameter (even in numarray! see
  numarray.mlab!) and other stuff

  Never mind that type already is a highly overused word in the python world.


  The big method bloat.
  '''''''''''''''''''''

  As it says in the Numeric manual introductions there were "good reasons" for
  "very few array methods" -- now there are **56** public methods and 8 public
  attributes (public == not starting with '_'); of those 56 methods about 11
  are accessors and of the rest about half are redundant or worse (i.e. they
  either also exist as numarray functions (argmin, argmax, diagonal, ...) or
  they really ought to be functions (mean, stddev) or they are quite confusing
  (``a.min``, ``a.max`` which behave quite differently from ``a.argmin`` and
  ``a.argmax``, never mind ``numarray.minimum``) or simply utterly pointless
  (``a.nelements`` == ``a.size``)).

  - argmin, argmax : what's wrong with numarray.argmin, numarray.argmax??? Why
    do argmin/argmax and max/min have completely different interfaces??? If
    there really is a need for these (there isn't), then if anything a.min
    and a.max should be called a.flatmin, a.flatmax

  - diagonal, mean, nelements, nonzero, ...

  - perversely the **only** function that I can think of that could have
    sensibly become a method hasn't: ``put`` (it used to work only on arrays
    under Numeric and not without reason, so making it a method would have
    been sensible; numarray.put of course also "works" on non-arrays, it just
    doesn't do anything with them)


  Test Code
  '''''''''
  numtest.py doesn't inspire full confidence (it's about 1000 lines of actual
  code but it doesn't seem that clearly structured and AFAICT contains no
  single loop (and that despite the diversity of shapes, types etc. that exist
  in numarray -- why not try something slightly more systematic?)).


From aisaac at american.edu  Tue Oct 12 07:03:18 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Tue Oct 12 07:03:18 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <1097550159.2568.5.camel@localhost.localdomain>
References: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu>
 <1097550159.2568.5.camel@localhost.localdomain>
Message-ID: <Mahogany-0.66.0-1780-20041012-100006.00@american.edu>

> On Sun, 2004-10-10 at 11:33, Alan G Isaac wrote:
>> In the Numeric manual, there are two different definitions of the
>> 'diagonal' function.  The second definition appears to be incorrect.


On Mon, 11 Oct 2004, Stephen Walton apparently wrote:
> Are you sure?  On my system, it appears that the second definition is
> correct in both Numeric 23.3 and numarray 1.1.


You did not quote the problematic portion:
        The diagonal function takes an array a, and returns
        an array of rank 1 ... With the default values, this
        corresponds to all of the elements of the diagonal
        of a along the last two axes.

Contrast:
>>> import Numeric
>>> Numeric.__version__
'23.1'
>>> x=[[[1,2],[3,4]],[[5,6],[7,8]]]
>>> Numeric.diagonal(x)
array([[1, 4],
       [5, 8]])
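
The output above holds one diagonal per 2-D block, taken along the last two
axes, so the result has rank 2 rather than rank 1.  A plain-Python sketch
reproduces it (block_diagonals is a name made up here; this is not
Numeric's implementation):

```python
def block_diagonals(x):
    """For a rank-3 nested list, return the main diagonal of each 2-D block."""
    return [[block[i][i] for i in range(min(len(block), len(block[0])))]
            for block in x]


x = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
result = block_diagonals(x)
print(result)   # [[1, 4], [5, 8]]: one diagonal per block
```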

fwiw,
Alan Isaac


From stephen.walton at csun.edu  Tue Oct 12 08:42:04 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Tue Oct 12 08:42:04 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <Mahogany-0.66.0-1780-20041012-100006.00@american.edu>
References: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu>
	 <1097550159.2568.5.camel@localhost.localdomain>
	 <Mahogany-0.66.0-1780-20041012-100006.00@american.edu>
Message-ID: <1097595580.24491.4.camel@freyer.sfo.csun.edu>

On Tue, 2004-10-12 at 07:00, Alan G Isaac wrote:

> On Mon, 11 Oct 2004, Stephen Walton apparently wrote:
> > Are you sure?  On my system, it appears that the second definition is
> > correct in both Numeric 23.3 and numarray 1.1.
> 
> 
> You did not quote the problematic portion:
>         The diagonal function takes an array a, and returns
>         an array of rank 1 ...

Ah, I thought you were referring to the fact that, in the first version
in the documentation, the second, named argument is given as "k" but in
the second version it is "offset". A look at the source reveals the
second keyword name is the correct one.

-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge

From aisaac at american.edu  Tue Oct 12 12:25:01 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Tue Oct 12 12:25:01 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <1097595580.24491.4.camel@freyer.sfo.csun.edu>
References: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu><1097550159.2568.5.camel@localhost.localdomain><Mahogany-0.66.0-1780-20041012-100006.00@american.edu><1097595580.24491.4.camel@freyer.sfo.csun.edu>
Message-ID: <Mahogany-0.66.0-1780-20041012-152445.00@american.edu>

> On Tue, 2004-10-12 at 07:00, Alan G Isaac wrote:
>> You did not quote the problematic portion:
>>         The diagonal function takes an array a, and returns
>>         an array of rank 1 ...


On Tue, 12 Oct 2004, Stephen Walton apparently wrote:
> A look at the source reveals the
> second keyword name is the correct one.


OK then, we have a double problem.
The first version gives the correct description
but uses the wrong keyword.
The second version gives the wrong description
but uses the correct keyword.

So, how do we file a documentation bug?

Cheers,
Alan Isaac


From perry at stsci.edu  Tue Oct 12 12:31:17 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Oct 12 12:31:17 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <Mahogany-0.66.0-1780-20041012-152445.00@american.edu>
Message-ID: <NEBBIJKBMLDBLNCEEFOCOEPOFHAA.perry@stsci.edu>

> So, how do we file a documentation bug?
> 
> Cheers,
> Alan Isaac
>
I'd say just like any other kind of bug.

Perry


From jmiller at stsci.edu  Tue Oct 12 12:40:19 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Oct 12 12:40:19 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <Mahogany-0.66.0-1780-20041012-152445.00@american.edu>
References: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu>
	 <1097550159.2568.5.camel@localhost.localdomain>
	 <Mahogany-0.66.0-1780-20041012-100006.00@american.edu>
	 <1097595580.24491.4.camel@freyer.sfo.csun.edu>
	 <Mahogany-0.66.0-1780-20041012-152445.00@american.edu>
Message-ID: <1097609991.30171.556.camel@halloween.stsci.edu>

On Tue, 2004-10-12 at 12:40, Alan G Isaac wrote:
> > On Tue, 2004-10-12 at 07:00, Alan G Isaac wrote:
> >> You did not quote the problematic portion:
> >>         The diagonal function takes an array a, and returns
> >>         an array of rank 1 ...
> 
> 
> 
> On Tue, 12 Oct 2004, Stephen Walton apparently wrote:
> > A look at the source reveals the
> > second keyword name is the correct one.
> 
> 
> OK then, we have a double problem.
> The first version gives the correct description
> but uses the wrong keyword.
> The second version gives the wrong description
> but uses the correct keyword.
> 
> So, how do we file a documentation bug?
> 
Go here:

http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse

then "Submit New", and set the "category" to "documentation".

Regards,
Todd

> Cheers,
> Alan Isaac
> 
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
-- 


From pearu at scipy.org  Wed Oct 13 06:02:48 2004
From: pearu at scipy.org (Pearu Peterson)
Date: Wed Oct 13 06:02:48 2004
Subject: [Numpy-discussion] ANN: SciPy 0.3.2 Released
Message-ID: <Pine.LNX.4.61.0410130804100.11778@scipy.org>

Hi,

Scipy 0.3.2 has been released and binaries are available from the 
scipy.org site:

   http://www.scipy.org

Scipy 0.3.2 is a bug fix release of Scipy 0.3 including the following new 
features:

- wxPython 2.5 support
- reading/writing dense/sparse matrices in Matrix Market format
- iterative solvers, new functions sqrtm, hessenberg
- Constrained Optimization BY Linear Approximation
- discrete Boltzmann, Planck, Levy distributions
- Scipy tests pass now also on 64-bit systems and Mac OSX
etc.

The complete release notes can be found here:

   http://www.scipy.org/download/scipy_release_notes_0.3.2.html

Best regards,

Pearu

BTW Scipy is:
-------------
Scipy is an open source library of scientific tools for Python. Scipy 
supplements the popular Numeric module, gathering a variety of high level 
science and engineering modules together as a single package.

Scipy includes modules for graphics and plotting, optimization, 
integration, special functions, signal and image processing, genetic 
algorithms, ODE solvers, and others.


From jmiller at stsci.edu  Wed Oct 13 14:35:08 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Oct 13 14:35:08 2004
Subject: [Numpy-discussion] A disconnected numarray rant
In-Reply-To: <yfsu0t0b7lk.fsf@black4.ex.ac.uk>
References: <yfsu0t0b7lk.fsf@black4.ex.ac.uk>
Message-ID: <1097703239.631.923.camel@halloween.stsci.edu>

Hi Alexander,

Thanks for taking the time to provide us with feedback.  I've responded
to many of your points below.  [and in the interest of keeping the text
bloat down, I've interjected my own comments in brackets--Perry]

On Tue, 2004-10-12 at 05:37, Alexander Schmolck wrote: 
> Hi,
> 
> I'm taking a 1 month break from computers (i.e. I will be completely
> off-line), and I have to catch a train in an hour; but I've recently bitten
> the bullet and made a matrix class I've been using for some time work with
> numarray; I've written down a number of things that occurred to me while I was
> doing it, including some things which I think are bugs in numarray, so I
> thought at least posting the bugs would be a useful service; the rest is very
> raw and essentially unedited cut-and-paste of these notes -- sorry about that
> and I hope it doesn't contain anything particularly offensive.
> 
> P.S. just dumped the code for the matrix class (nummat) at
> http://www.dcs.ex.ac.uk/~aschmolc/Stuff/
> 
> 'as
> 
> The following are my notes:
> 
> 
> Things that fairly clearly seem to be bugs:
>     - numarray.Int32 etc. can't be pickled

Known limitation,  but OK.   Arrays can be pickled, as can Numeric
typecodes so I'm not sure how critical this omission is. 

>     - ``a = array(1+0j); a.imag = a.real * 10`` => IndexError
>     - array(0, type=Float64) + 1e3000  => `inf` with right error modes
>       but  array(0, type=Float32) + 1e3000 => `OverflowError`
>     - numarray.array(10)/numarray.array(0) => 0
>     - numarray.array(10000000000000L) => array(1316134912)
>     - numarray.where(0,1,0) => array([0])

There seems to be an infinity of rank-0 issues and so little
justification for having them that at one point we considered ripping
them out altogether.  Noted,  but low priority.

[Amen. If I had known the problems that rank-0 arrays would cause
I think I would have excluded them. I'm not sure I see the need for
them now that coercion rules have changed and there are helper functions
to change scalars into rank-1 len-1 arrays, which serve almost all other
purposes. I'm interested in seeing what real purpose they serve now (I
understand the backward compatibility issue, but backward compatibility is
not the be all and end all for numarray; more on that later)]
>     - l = [1,2,3]; numarray.put(l,numarray.array([1,2,0]),[0,0,0]); l => [1, 2, 3]

Should raise a TypeError I guess. 
>       a = array([1,2,3]); numarray.put(a,numarray.array([1,2,0]),[0,0,0]); a => array([0, 0, 0])

I don't see what's wrong here. 
>     - repr(numarray.array([],typecode='i')) (etc. etc.) => "numarray.array([])"

Zero length arrays are rather like rank-0 arrays: low priority. 
Agreed... this is a small wart. 
>     - getattr(array([1,2,3]), '_aligned') => SystemError

Interesting.  I've been thinking about ripping out the _align and
_contiguous self-test hacks for a long time.  You've made up my mind. 
>     - obscure: numarray.where(0, matrix(568, convert_scalars=True),2) =>
>       ValueError (tries __len__ which fails, as len(array(568)) also fails)

I think this may boil down to "no where() for object arrays".
numarray.where() can't handle object arrays and there is no
numarray.objects.where().  Not implemented yet. 
> Numeric incompatibilities (that are either undocumented or bug-like)

The best Numeric compatibility in numarray comes from:

import numarray.numeric as Numeric

It's still not perfect,  but it is more compatible than ordinary
numarray. 
> - numarray.array('a', typecode='O') => TypeError (object arrays)
> - for extra fun try: numarray.array(1, type=numarray.Object) => RuntimeError
>   something entirely different

Object arrays in numarray do not have the synergy they have in
Numeric.   In particular,  numarray.array() can't create them, only
numarray.objects.array(). [At the time we added object arrays, we 
noticed that they were not safe in Numeric; that is, Numeric was not
properly handling reference counts of objects in arrays for at least
some operations and it was possible to segfault object arrays. This may
have changed since then; we haven't had a chance to check the current
status. But the point is that handling object arrays safely is a lot
more than just loading them with object pointers. Any function that can
set values in arrays needs to handle their refcounts, and that isn't all
that trivial. We took a short cut of using a Python implementation for
object arrays that doesn't have all the old functionality, but also
didn't have the problems that they did at the time.] 
> - nonzero is completely incompatible

numarray.numeric covers this.

numarray's nonzero() is more powerful, capable of handling
multidimensional arrays,  so it returns a tuple of values rather than a
single value.  It's unfortunate that we chose to use the name nonzero()
for the "new" function;  it has the right interface and the wrong name.
Keep in mind though,  our compatibility goals have grown immensely since
we started. 
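
For readers unfamiliar with the tuple form, the following plain-Python
sketch mimics the shape of the result for a 2-D input: one index list per
axis.  It is an illustration only (nonzero_2d is a made-up name), not how
numarray implements nonzero().

```python
def nonzero_2d(rows):
    """Return (row_indices, col_indices) of the true entries of a 2-D nested list."""
    hits = [(i, j) for i, row in enumerate(rows)
            for j, value in enumerate(row) if value]
    if not hits:
        return ([], [])
    row_idx, col_idx = zip(*hits)
    return (list(row_idx), list(col_idx))


mask = [[0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1]]
print(nonzero_2d(mask))   # ([0, 1, 2], [1, 2, 3])
```

The two lists pair up element by element, so the true entries sit at
(0, 1), (1, 2) and (2, 3).
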
> - shape(None) etc. no longer works (IMHO a bug)

This may be related to the object array synergy.   I think
numarray.asarray() is the problem here, since it doesn't know how to
create object arrays.
> - cross_correlate & average missing

I think cross_correlate is in numarray.convolve.correlate.  It was a
conscious choice not to put it in core numarray.  Average has never been
implemented and should be, especially since it has different semantics
than the mean() method. 
> - left_shift et al missing

These were renamed lshift and rshift.   Note that << works fine.
Synonyms should probably be added. 
> - numarray.sqrt(a,a) is None (*not* the result, as it used to be)

What do you want here?  What we have now is, IMO, correct. [Amen. This 
was intentionally changed from Numeric.] 
> - num.put(a, [0,1,2,3], [10,20]) style behavior seems unavailable (without numarray.numeric)

I wasn't exactly sure what the expected behavior was for this,  but
guessed it was some kind of repeat.  If that's what the behavior was,
Perry and I don't really like it.  Besides,  numarray.numeric.put *is*
Numeric.put, modulo numarray underpinnings.

>   put(array([[ 0.,  1.,  2.], [ 3.,  4.,  5.]]), [1, 4], [10,40]) fails

numarray.put() does have different semantics for multi-dimensional
destinations... you need multi-dimensional indexes (i.e. a tuple of
index arrays).  Again,  there's now numarray.numeric.put().

> - boolean testing (not even bool(array(0)) works; I'm not sure this is good)

[I am. This was a clear and explicit decision to not replicate Numeric 
behavior. I'm convinced that it is the right decision. There is just too
much confusion about what the truth value of an array should be. Helper
functions should be used to make it unambiguous.] 

> - Generally different handling of rank0-arrays; e.g. ``type(num.array(1.0) +
>   0) is float``; one potentially very nasty gotcha are inplace operations
>   (e.g. a**=2) which have totally different semantics for python scalars and
>   rank0 arrays, which, unlike Attribute errors on ``a.shape``, can lead to
>   nasty bugs in corner cases (e.g. when a reduction just infrequently yields
>   scalar ``a``) -- I think this should be mentioned in a gotchas section

We have areduce() for this case, which always returns an array. 
>   (another possible entry would be the need to use .copy() to **save** memory
>   on slicing and 1xN, Nx1 matrices versus vectors (people are not used to
>   thinking properly about rank from mathematical training or matlab
>   exposure)).

[You will need to elaborate about what you mean here. E.g., as to the 
first: I'm guessing you mean when a slice is taken and then the original
array is deleted. But it isn't clear.] 
> - asarray downcasts arrays (e.g.: asarray(array([1.,2.,3.]),'i'))

True enough.  Is there some reason why the method should silently
succeed (I know we wanted that) and the function should not? 
> - numarray.ones(-5) => MemoryError (ValueError would be nicer)

Easy to change. 
> - numarray.ones(2.0),

This fails, and that's fine by me.  The idea of floating point shapes
seems bogus. 
> numarray.ones([2])

AFAIK, this works, and should work. 
> fail (cf. numarray.range(2.0))

IMHO, arange() is a special case and not really equivalent to
numarray.ones(). 
>       b=num.array([[1,2,3,4],[5,6,7,8]]*2)
>       assert eq(num.diagonal(b), [1,6,3,8])
>       assert eq(num.diagonal(b, -1), [5,2,7])
>       c = num.array([b,b])
>       assert eq(num.diagonal(c,1), [[2,7,4], [2,7,4]])
> - no a.toscalar() !!!

a.toscalar() is written a[()] in numarray.
[This is one method that shouldn't be there IMO. What would people 
expect it to do for arrays with  len>1 ?]   
> - matrixmultiply in the docs

OK. 
> - what's the point of swapaxes (i.e. why not have a generalized in-place
>   transpose?)

It's a very common function in implementation of numarray/Numeric.
[In many cases it is far easier to use than a generalized transpose
(which does exist, but requires all axes to be explicitly given)] 
> - what's the point of innerproduct?

Compatibility. [For a while the flavor is: "dammit, why aren't you
compatible?" Now it's: "dammit, why are you compatible?"] 
> - indexing by a list is different from indexing by tuple (I haven't had time
>   to look closely at the docs whether that's intentional)

It's intentional.  Indexing by a list is "array" indexing.  Indexing by
a tuple is not.  Thus, a 3D array by [1,2,3] is pulling out 2D blocks,
while (1,2,3) is pulling out a single scalar. [In particular, tuples
have a special meaning for indexing; this distinction is unavoidable 
since it is a Python language issue.] 
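A minimal sketch of the language-level point, in plain Python rather than
numarray: the interpreter packs comma-separated subscripts into a tuple
before calling __getitem__, so a[1,2,3] and a[(1,2,3)] look identical to the
array, whereas a list arrives as a list. The Probe class is a hypothetical
stand-in used only for illustration.

```python
class Probe(object):
    """Hypothetical stand-in that reports the key it was indexed with."""

    def __getitem__(self, key):
        return key


p = Probe()
comma_key = p[1, 2, 3]     # comma-separated subscripts
tuple_key = p[(1, 2, 3)]   # explicit tuple: same key object type and value
list_key = p[[1, 2, 3]]    # list: arrives unchanged, so it can mean something else

print(type(comma_key).__name__, type(tuple_key).__name__, type(list_key).__name__)
```

Because the first two forms cannot be told apart, an array package can only
give lists (not tuples) the separate "array indexing" meaning.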
> - doesn't know about Numeric's bizzarre '\x0b' typecode

Me either.  Should we add this? [Not unless there is a good reason.
What's it for? Why are you using it (particularly since you called it
bizarre)?] 
> - numarray.sqrt.reduce([]) raises (sensibly) TypeError, not ValueError

Got lucky I guess. 
> - len(array(1)) or array(1)[0] won't work anymore (understandable, but
>   should be documented)

OK. 
> - (should maximum, minimum reduce to -inf and inf?)

Don't they? 
> - <built-in method reduce of _BinaryUFunc object at 0x82dfc9c> is not
>   a very helpful repr; should be possible to get to the ufunc itself

Doesn't this comment fly in the face of Python itself?
[I imagine it is possible, but why? repr(dir) doesn't give you a usable
function creator, nor does it work in Numeric.] 
> - as in Numeric numarray.maximum.reduce(numarray.array([0,-0.])) => 
> -0.0

Talk about fine points...  noted.  I think the problem is that 0.0 ==
-0.0,  so there's no way for the reduction to get it right without
adding special code to look for this case, and that isn't gonna happen
without a strong case being made. [Again, a very good case needs to be
made for handling this. I doubt that it is important to many, and as
Todd mentions, not easy to handle.] 
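A small sketch in plain Python (not numarray) of why comparison alone cannot
pick the "right" zero: the two zeros compare equal, so which one a max-style
reduction returns depends only on operand order. math.copysign is used here
just to expose the sign bit; the helper name sign_of is made up for the
example.

```python
import math


def sign_of(x):
    """Return +1.0 or -1.0 according to the sign bit, even for zeros."""
    return math.copysign(1.0, x)


pos, neg = 0.0, -0.0
zeros_equal = (pos == neg)   # the two zeros compare equal

# max() keeps the first of several equal candidates, so the operand order
# decides which zero comes back.
first = max(pos, neg)
second = max(neg, pos)

print(zeros_equal, sign_of(first), sign_of(second))
```

Getting a consistent answer would need an explicit sign-bit check like the
one in sign_of, which is the special-case code mentioned above.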
> - __array__ protocol no longer supported (how can a non-derived class convert
>   itself efficiently to an array?)

Maybe an old-timer can explain how this worked for Numeric.  I think
this is only partially implemented in numarray and that maybe we need to
add a check for an __array__() method to numarray.array(). 
> Documentation Gotchas
> - p. 34 IMO row vector is used incorrectly; row and column vectors are
>      really matrices (i.e. have rank 2) so ``array([[1,2,3]])`` would be a
>      row vector

Sounds reasonable. 
> - No proper explanation of differences between Numeric and numarray,
> or
>   numarray.numeric module differences to proper (e.g. argmin)

If there is,  I don't know where it is.  Noted,  but I'm not really an
encyclopedia of these facts myself. 
> - No migration and best-practice advice (e.g. there should be a standard way
>   for packages which work with both numarray and numeric as backends to let
>   the user choose his preference; how about setting an environment var NumPy
>   or something?)

We're just working this out ourselves. [Let me elaborate more. We 
haven't really had much experience yet porting tons of Numeric code (MA
is about the only example). We are working on scipy now so I expect that
in a few months we will know much better what the most important porting
issues are. At the moment, this is better documented by others.] 
> Waffle
[meaning?] 
> ------
> 
> - there *really* ought to be an array equality function (with optional
>   tolerance); it's quite difficult to get right for a normal user (nans;
>   zero-size arrays etc.) and it's often required, especially for testing

You're right.  Want to submit one? [Make sure it isn't dependent on the
underlying C compiler's libraries for testing floating point special 
values!] 
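A rough sketch of what such a function might look like, written in plain
Python over flat sequences rather than against the numarray API. The name
sequences_close, its parameters, and the choice to treat NaNs in matching
positions as equal are all assumptions made for illustration. NaN is detected
with x != x, which does not rely on the C library's floating point
classification.

```python
INF = float("inf")


def is_nan(x):
    """NaN is the only float value that is not equal to itself."""
    return x != x


def sequences_close(a, b, rtol=1e-5, atol=1e-8, nan_equal=True):
    """Compare two flat sequences of floats element by element."""
    if len(a) != len(b):
        return False                  # length mismatch; two empty inputs match
    for x, y in zip(a, b):
        if is_nan(x) or is_nan(y):
            if not (nan_equal and is_nan(x) and is_nan(y)):
                return False
            continue
        if x == y:                    # also covers equal infinities
            continue
        if abs(x) == INF or abs(y) == INF:
            return False              # an infinity never matches anything else
        if abs(x - y) > atol + rtol * abs(y):
            return False
    return True


nan = float("nan")
print(sequences_close([], []))
print(sequences_close([1.0, nan, INF], [1.0 + 1e-9, nan, INF]))
print(sequences_close([1.0, 2.0], [1.0, 2.1]))
print(sequences_close([1.0], [INF]))
```

The explicit infinity check matters: without it, a finite value compared
against inf would pass, because the tolerance term rtol * abs(y) is itself
infinite.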
> - rank preserving reduction as an option would be nice -- e.g. to
>   subtract out or divide by the reduced portion (which currently won't
>   work, e.g. for columns, without adding a unit-dimension by hand).

Sounds like an interesting idea,  but also method bloat. 
> Design
> 
>   The (AFAICS) benefit-free but downside-rich introduction of `type`
>   ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
> 
>   Is there any reason that Typecode objects that compare as desired to the
>   relevant strings ("i", "d") wouldn't have done? Now there is an explosion
>   and confusion of interfaces -- some numpy code will now only accept
>   type(code)s as "typecode" keyword parameter (even in numarray! see
>   numarray.mlab!) and other stuff
> 
>   Never mind that type already is a highly overused word in the python world.

Personally,  I like type because it's succinct and we have type objects,
not single character codes.  More importantly,  Perry likes type, and
the bottom line is that it's his shot to call and he's called it.
[We wrestled with this a while. Given that the representation of the
type had changed from a character code, typecode is clearly misleading
and inappropriate. It is there only for backward compatibility; for new
code to be used under numarray only, people shouldn't use it. Type
certainly seemed by far the most descriptive and accurate term. It does
have the drawback of overloading the type function. Other considerations
were things like atype, but type is what we went with.] 
>   The big method bloat.
>   '''''''''''''''''''''
> 
>   As it says in the Numeric manual introductions there were "good 
> reasons" for

I actually don't buy the reasons myself.  Some methods are natural,
convenient, and good so I need to hear more voices arguing this point
before I'll budge.  Clearly there is *some* bloat,  but identifying what
to ax is more difficult.  I suppose we could do a vote to clean this up.
>   "very few array methods" -- now there are **56** public methods and 
> 8 public
>   attributes (public == not starting with '_'); of those 56 methods 
> about 11
>   are accessors and of the rest about half are redundant or worse 
> (i.e. they
>   either also exist as numarray functions (argmin, argmax, diagonal, 
> ...) or

Which of the public attributes do you have a problem with?

Which accessors? 
>   they really ought to be functions (mean, stddev) or they are quite 
> confusing

The need for these is common so I thought it would be good to add them.
Functions could be added as well. 
>   (``a.min``, ``a.max``

These require tricks to get right so we added them.  The doc-strings
explain what they do. 
> which behave quite differently from ``a.argmin`` and
>   ``a.argmax``,

Good point.  These are inconsistent with min and max, which were added
independently at a later date.  I'm thinking we should deprecate the
argmin and argmax methods,  which I added hoping to do polymorphism for
strings and records and if I recall correctly never did anyway.

IMHO,  min(), max(), mean(), and stddev() are simple, useful, and should
remain. 
> never mind ``numarray.minimum``) or

min != minimum, and because it is a little tricky to get right, we
codified it as a method. 
> simply utterly pointless
>   (``a.nelements`` == ``a.size``)).

I added nelements() because I needed it and didn't know about
a.size()... simple as that.  a.size() came later for compatibility
only. [I'll argue that nelements is far clearer in meaning. What
does size mean? Total bytes? Total number of elements? Sorry,
I disagree on this one.] 
> If there really is a need for these (there isn't) if anything a.min 
> and a.max
>     should be called a.flatmin, a.flatmax

flatmin is certainly clear,  but the min/max docstrings also explain it
with no fuzz. 
>   - diagonal, mean, nelements, nonzero, ...

nonzero(), and diagonal() I could care less about so they
can probably be deprecated and removed.  I like mean(). 
>   - perversely the **only** function that I can think of that could have
>     sensibly become a method hasn't: ``put`` (it used to work only on arrays
>     under Numeric and not without reason, so making it a method would have
>     been sensible; numarray.put of course also "works" on non-arrays, it just
>     doesn't do anything with them)

Well,  we need the numarray.put() function for compatibility, and
there's already a more succinct syntax for put(), which is array based
indexing so I don't see any point in adding a put() method. 
>   Test Code
>   '''''''''
>   numtest.py doesn't inspire full confidence (it's about 1000 lines of
> actual
>   code but it doesn't seem that clearly structured and AFAICT contains
> no
>   single loop (and that despite the diversity of shapes, types etc. 
> that exist
>   in numarray -- why not try something slightly more systematic?))

Testing could certainly be better.  unittest might work better for this
kind of thing than doctest.  I agree that we should test for a wider
variety of shapes, types, sizes, and behaviors but it takes time and
effort to do it so it hasn't been done yet.  There's little doubt we'd
find bugs and the system would be better for it. [On the other hand,
is it the most important thing to do next? Any volunteers to improve the
test suite? It may not be the most complete and systematic one out 
there, but it's at least as good as the one for Numeric ;-)]

There's a lot of input here.  We'll see what we can do.  Thanks again.

Regards,
Todd

[A few more editorial comments. When we started numarray, compatibility 
was not high on the list of priorities, so the initial implementation
didn't focus on it. A number of the problems you point out reflect that
origin. While it is more important now, it isn't the only guide. We seek
compatibility when there is no strong reason to be incompatible. But
there are a number of issues where we definitely wanted different
behavior (if it were to be completely compatible, we wouldn't have
bothered in the first place; we needed some changes).

Given the odd corners you've run into, it makes me curious to see the
code that generated this; particularly with regard to rank-0 arrays.
If I get a chance I'll take a look at the link you provided.
I wonder if it is typical of what other users will encounter or not.
I guess our experience in porting scipy will give us a better indication.

To summarize what we see as work that should be done to address the points
made:

rank-0 issues:
1) a.imag doesn't work
2) array(0, type=Float64) + 1e3000  => `inf` with right error modes
      but  array(0, type=Float32) + 1e3000 => `OverflowError`
3) numarray.array(10)/numarray.array(0) => 0
4) numarray.array(10000000000000L) => array(1316134912)
5) numarray.where(0,1,0) => array([0])
6) documentation of behavior (how to turn into scalar, that len and [0]
      indexing doesn't work, etc.)

Others

1) puts into lists should raise TypeError
   l = [1,2,3]; numarray.put(l,numarray.array([1,2,0]),[0,0,0]); l => [1, 2, 3]
2) repr for zero length arrays needs to show type and other info.
3) rip out _align and _contiguous self-test hacks
4) improved object array handling (e.g., where and the like)
5) average function
6) change MemoryError to ValueError for ones(-5)
7) document matrixmultiply
8) support for __array__ protocol?
9) Documentation fix for p34 row vector usage.
10) Numeric to numarray conversion guide
11) Better tests

Most of these are not likely to get immediate attention as our focus 
now is on integrating scipy. To the extent they make it easier to do,
their priority may be raised. There are a lot of "should"s but we have
limited resources just like anyone else; we can't do it all at once.]


From jmiller at stsci.edu  Thu Oct 14 06:11:22 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Oct 14 06:11:22 2004
Subject: [Numpy-discussion] character arrays supported by C API?
In-Reply-To: <Pine.LNX.4.61.0410140405380.2667@Chrestomanci>
References: <Pine.LNX.4.61.0409150021150.20224@Chrestomanci>
	 <1095253587.4624.380.camel@halloween.stsci.edu>
	 <Pine.LNX.4.61.0410140405380.2667@Chrestomanci>
Message-ID: <1097759076.4219.39.camel@halloween.stsci.edu>

On Thu, 2004-10-14 at 04:20, Faheem Mitha wrote:
> On Wed, 15 Sep 2004, Todd Miller wrote:
> 
> > On Wed, 2004-09-15 at 00:52, Faheem Mitha wrote:
> >> Dear People,
> >>
> >> Are character arrays supported by the Numarray C API? My impression from
> >> the documentation is no, but I would appreciate a confirmation. Thanks.
> >>
> >>                                                                 Faheem.
> >
> > Yes and no.  CharArray is not as well supported from C as NumArray;
> > there are no easy to call functions which will convert a nested sequence
> > of strings into a CharArray.
> >
> > However,  it is possible to call the Python functions in the CharArray
> > module from C,  and a pre-existing CharArray is a PyArrayObject so it
> > can be manipulated in C as a struct;  its shape and strides are
> > visible,  its itemsize is the length of the string, etc.
> >
> > What is it you want to do?   What functions do you think would help?
> 
> Hi. Sorry about the slow reply.
> 
> What I want to do is extremely simple. I want to convert (in C++) a C++ 
> character array to a CharArray. The simplest way of doing this would be to 
> create an array of the appropriate size, and write character strings into 
> it element by element.
> 
> So, a utility function which creates a character array of appropriate 
> dimensions would be useful. A utility function which converts a list 
> of strings into a Character Array would also be desirable.
> 
> Currently I am having to work around this limitation by returning lists of 
> strings back to Python. I'd prefer to not have to do that.

That's a sensible addition,  but right now,  such a function does not
exist, and I don't have time to add it myself.  The way to achieve this
without C-API support by CharArray is to do a Python callback.  The
steps in C would be roughly:

0. Import the numarray.strings module.  PyImport_ImportModule().

1. Get the module's dictionary object.  PyModule_GetDict().

2. Get a pointer to CharArray by looking it up in the dictionary.  
PyDict_GetItemString().

3. Construct an argument tuple which contains the constructor
parameters.  Py_BuildValue().

4. Call the constructor using the arg tuple.  The return value is the
CharArray.  PyObject_CallFunction().

Similar steps are done for NumArray in the current C-API in newarray.ch
in NA_NewAllFromBuffer().  
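
To show the shape of that sequence before writing the C, here are the same
five steps sketched in Python, each commented with the C call it corresponds
to. Since numarray may not be installed, the standard library's array module
and its array constructor stand in for numarray.strings and CharArray; the
constructor arguments are likewise placeholders, not CharArray's real
signature.

```python
import importlib

# 0. Import the module (PyImport_ImportModule in C).
module = importlib.import_module("array")    # stand-in for "numarray.strings"

# 1. Get the module's dictionary (PyModule_GetDict).
module_dict = vars(module)

# 2. Look the constructor up by name (PyDict_GetItemString).
constructor = module_dict["array"]           # stand-in for "CharArray"

# 3. Build the argument tuple (Py_BuildValue).
args = ("b", [104, 105])                     # placeholder constructor arguments

# 4. Call the constructor with the argument tuple; the result is the new object.
result = constructor(*args)

print(result.tolist())
```

In the C version each step also needs a NULL check, and the objects returned
by PyModule_GetDict() and PyDict_GetItemString() are borrowed references.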

Regards,
Todd


From akulla at comcast.net  Thu Oct 14 06:44:20 2004
From: akulla at comcast.net (akulla at comcast.net)
Date: Thu Oct 14 06:44:20 2004
Subject: [Numpy-discussion] Slow operation of nd_image.generic_filter
Message-ID: <101420041338.1510.416E8157000325EE000005E622007456720E04049A050E@comcast.net>

Hi all,

Is it expected that the following function takes more than 25 seconds to run on an array of shape (256, 480)? 

...
def myFunc(anArray, winSize=5):
    return numarray.nd_image.generic_filter(\
                input=anArray,
                function=lambda win: win.mean(),
                size=winSize,
                mode='constant')
...

Python 2.3, numarray 1.0 (XP, P4)

Regards,
Alban


From falted at pytables.org  Fri Oct 15 04:27:55 2004
From: falted at pytables.org (Francesc Alted)
Date: Fri Oct 15 04:27:55 2004
Subject: [Numpy-discussion] numarray and ATLAS
Message-ID: <200410151318.40035.falted@pytables.org>

Hi,

Perhaps this subject comes up too often, but I'm having problems getting
numarray to use ATLAS instead of the included mini-lapack.

I've installed ATLAS 3.6.0 on my pentium IV machine. I've made it a
completely featured LAPACK by following the instructions in:

http://math-atlas.sourceforge.net/errata.html#completelp

and I'm pretty sure that the resulting library works. Now, after exporting
USE_LAPACK and setting the appropriate directory for lapack_dirs in addons.py,
the compilation went well (however, I can see that lapack_litemodule.c is
still being compiled, and I don't know if that's normal or not). The command
I've used to install is:

$ python setup.py install --gencode --home=/users/exp/alted/bin-i686

And the error that happens during the test phase follows:

$ python
Python 2.3.4 (#1, Jul 22 2004, 20:47:54)
[GCC 3.3.2 20031022 (Red Hat Linux 3.3.2-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numarray.testall as testall
>>> testall.test()
numarray:                               ((0, 1199), (0, 1199))
numarray.records:                       (0, 48)
numarray.strings:                       (0, 176)
numarray.memmap:                        (0, 82)
numarray.objects:                       (0, 105)
numarray.memorytest:                    (0, 16)
numarray.examples.convolve:             ((0, 20), (0, 20), (0, 20), (0, 20))
numarray.convolve:                      (0, 52)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/users/exp/alted/bin-i686/lib/python/numarray/testall.py", line 24, in test
    result = eval(p+".test()")
  File "<string>", line 0, in ?
  File "/users/exp/alted/bin-i686/lib/python/numarray/fft/FFT.py", line 326, in test
    import dtest
  File "/users/exp/alted/bin-i686/lib/python/numarray/fft/dtest.py", line 238, in ?
    import numarray.random_array as random_array
  File "/users/exp/alted/bin-i686/lib/python/numarray/random_array/__init__.py", line 7, in ?
    from RandomArray2 import *
  File "/users/exp/alted/bin-i686/lib/python/numarray/random_array/RandomArray2.py", line 3, in ?
    import numarray.linear_algebra as linalg
  File "/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/__init__.py", line 1, in ?
    from LinearAlgebra2 import *
  File "/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/LinearAlgebra2.py", line 23, in ?
    import lapack_lite2
ImportError:
/users/exp/alted/bin-i686/lib/python/numarray/linear_algebra/lapack_lite2.so:
undefined symbol: dgesdd_

I've checked that dgesdd symbol exists on my liblapack.a:

$ strings ~/bin-i686/lib/atlas/liblapack.a | grep dgesdd
dgesdd.o/       1097832195  2514  515   100644  13788     `

but not a dgesdd_, as you can see. 

Am I missing something?

-- 
Francesc Alted


From falted at pytables.org  Fri Oct 15 10:07:40 2004
From: falted at pytables.org (Francesc Alted)
Date: Fri Oct 15 10:07:40 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <200410151318.40035.falted@pytables.org>
References: <200410151318.40035.falted@pytables.org>
Message-ID: <200410151903.41288.falted@pytables.org>

Hi,

Despite the fact that some errors arise, I've tested the numarray build
linked against ATLAS, and it doesn't seem to get the expected ATLAS
boost:

>>> import timeit
>>> t1 = timeit.Timer("m3=numarray.dot(m1,m2)", "import numarray;dim1=500;m1=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32);m2=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32)")
>>> t1.repeat(3,10)
[3.7274820804595947, 3.8542821407318115, 3.7117569446563721]

However, Numeric seems to get it:

>>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');Numeric.reshape(m2,(dim1,dim1))")
>>> t3.repeat(3,10)
[0.0093162059783935547, 0.0096318721771240234, 0.0092968940734863281]

i.e. almost 300 times faster than numarray

Is anyone getting the acceleration boost with numarray & ATLAS?

Cheers,

-- 
Francesc Alted


From dd55 at cornell.edu  Fri Oct 15 14:18:41 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Fri Oct 15 14:18:41 2004
Subject: [Numpy-discussion] how to deal with large arrays
Message-ID: <200410151714.38492.dd55@cornell.edu>

Hello,

I have two 2D arrays, Q is 3-by-q and R is r-by-3. At each q, I need to sum 
q(r) over R, so I take a dot product RQ and then sum along one axis to get a 
1-by-q result.

I'm doing this with dot products because it is much faster than the equivalent 
for or while loop. The intermediate r-by-q array can get very large though 
(200MB in my case), so I was wondering if there is a better way to go about 
it?

If not, I can slice up R and deal with it one chunk at a time, then the 
intermediate arrays fit within the available system resources. Would somebody 
offer a suggestion of how to do this intelligently? Should the intermediate 
array be about the size of the processor cache, some fraction of the 
available memory, or is there something else I need to consider?

Thank you,
Darren


From tim.hochberg at cox.net  Fri Oct 15 15:11:05 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Fri Oct 15 15:11:05 2004
Subject: [Numpy-discussion] how to deal with large arrays
In-Reply-To: <200410151714.38492.dd55@cornell.edu>
References: <200410151714.38492.dd55@cornell.edu>
Message-ID: <41704A3C.5080802@cox.net>

Darren Dale wrote:

>Hello,
>
>I have two 2D arrays, Q is 3-by-q and R is r-by-3. At each q, I need to sum 
>q(r) over R, so I take a dot product RQ and then sum along one axis to get a 
>1-by-q result.
>
>I'm doing this with dot products because it is much faster than the equivalent 
>for or while loop. The intermediate r-by-q array can get very large though 
>(200MB in my case), so I was wondering if there is a better way to go about 
>it?
>  
>
I think so. I believe you are doing something like this:

   result_1 = na.sum(na.dot(R,Q), 0)

I'm fairly certain (but I urge you to double check), that this reduces to:

    result_2 = na.dot(na.sum(R, 0), Q)

which will take up much less intermediate storage and be faster to boot. 
In more quasi-mathematical notation:

   result_1 => sum_i sum_j R_ij Q_jk = sum_j sum_i R_ij Q_jk
            = sum_j Q_jk sum_i R_ij => result_2

A quick test seems to confirm this:

import numarray as na
from numarray import random_array

q = 10
r = 12

R = random_array.random((r,3))
Q = random_array.random((3,q))

x1 = na.sum(na.dot(R,Q), 0)
x2 = na.dot(na.sum(R, 0), Q)

print na.allclose(x1, x2)


-tim


>If not, I can slice up R and deal with it one chunk at a time, then the 
>intermediate arrays fit within the available system resources. Would somebody 
>offer a suggestion of how to do this intelligently? Should the intermediate 
>array be about the size of the processor cache, some fraction of the 
>available memory, or is there something else I need to consider?
>
>Thank you,
>Darren
>
>
>_______________________________________________
>Numpy-discussion mailing list
>Numpy-discussion at lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>
>  
>


From dd55 at cornell.edu  Fri Oct 15 16:29:03 2004
From: dd55 at cornell.edu (Darren Dale)
Date: Fri Oct 15 16:29:03 2004
Subject: [Numpy-discussion] how to deal with large arrays
In-Reply-To: <41704A3C.5080802@cox.net>
References: <200410151714.38492.dd55@cornell.edu> <41704A3C.5080802@cox.net>
Message-ID: <200410151927.54005.dd55@cornell.edu>

Thank you for your response, Tim,

On Friday 15 October 2004 06:07 pm, Tim Hochberg wrote:
> Darren Dale wrote:
> >Hello,
> >
> >I have two 2D arrays, Q is 3-by-q and R is r-by-3. At each q, I need to
> > sum q(r) over R, so I take a dot product RQ and then sum along one axis
> > to get a 1-by-q result.
> >
> >I'm doing this with dot products because it is much faster than the
> > equivalent for or while loop. The intermediate r-by-q array can get very
> > large though (200MB in my case), so I was wondering if there is a better
> > way to go about it?
>
> I'm fairly certain (but I urge you to double check), that this reduces to:
>
>     result_2 = na.dot(na.sum(R, 0), Q)
>

Yes. As usual, I left out a bit of information that turned out to be 
important. See below 

A modified test:

from numarray import *
from numarray import random_array

q = 10
r = 12

R = random_array.random((r,3))
Q = random_array.random((3,q))

x1 = sum( exp(1j*dot(R,Q)), 0) #note complex argument to exp()
x2 = exp(1j*dot(sum(R, 0), Q))

print allclose(x1, x2)

The complex arithmetic changes things, since exp() is not linear and the sum
can no longer be moved inside it. I am still learning how to keep my code
efficient. The following loop is actually almost as fast as using the large
dot product:

phase = zeros(len(Q[0]),'D')  # complex accumulator; a real 'd' array cannot hold the complex sums
for i in range(len(Q[0])):
    phase[i] = phase[i] + sum(exp(1j*dot(R,Q[:,i])), 0)

If q=1000 and r=2500, the for loop takes about 13% longer than the dot product 
method. Incredibly, if q=10,000 and r=2500, the for loop is 17% faster. So I 
am going to use it instead. Apparently I had some other time sink in my 
original test.

from numarray import *
from numarray import random_array
from time import clock

q = 10000
r = 2500

R = random_array.random((r,3))
Q = random_array.random((3,q))

t0 = clock()
x1 = sum(exp(1j*dot(R,Q)), 0) #note complex argument to exp()
t1 = clock()
dt1 = t1-t0

phase = zeros(len(Q[0]),'D')  # complex accumulator; a real 'd' array cannot hold the complex sums
for i in range(len(Q[0])):
    phase[i] = phase[i] + sum(exp(1j*dot(R,Q[:,i])), 0)
    
t2 = clock()
dt2 = t2-t1

print (dt2-dt1)/dt1
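
A middle ground between the two versions timed above is to process R in
blocks of rows: exp() has to be applied to every element, but the sum over r
is additive, so partial sums from each block can simply be accumulated. The
sketch below uses plain Python lists and cmath as a stand-in for numarray
arrays; the function names and the chunk_rows parameter are made up for
illustration, and with real arrays each block would be a slice
R[start:start+chunk_rows] fed to the dot-product version.

```python
import cmath
import random


def phase_sum(R, Q_cols):
    """For each column q of Q, sum exp(1j * dot(row, q)) over the rows of R."""
    return [sum(cmath.exp(1j * sum(a * b for a, b in zip(row, q))) for row in R)
            for q in Q_cols]


def phase_sum_chunked(R, Q_cols, chunk_rows):
    """Same result, accumulated over blocks of at most chunk_rows rows of R."""
    total = [0j] * len(Q_cols)
    for start in range(0, len(R), chunk_rows):
        part = phase_sum(R[start:start + chunk_rows], Q_cols)
        total = [t + p for t, p in zip(total, part)]
    return total


random.seed(0)
r, q = 12, 10
R = [[random.random() for _ in range(3)] for _ in range(r)]
Q_cols = [[random.random() for _ in range(3)] for _ in range(q)]  # columns of Q

full = phase_sum(R, Q_cols)
chunked = phase_sum_chunked(R, Q_cols, chunk_rows=5)
max_diff = max(abs(a - b) for a, b in zip(full, chunked))
print(max_diff)
```

The intermediate storage then scales with chunk_rows * q instead of r * q, so
the block size can be tuned to whatever fits comfortably in memory.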

-- 

Darren


From falted at pytables.org  Sat Oct 16 04:29:02 2004
From: falted at pytables.org (Francesc Alted)
Date: Sat Oct 16 04:29:02 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <200410151903.41288.falted@pytables.org>
References: <200410151318.40035.falted@pytables.org> <200410151903.41288.falted@pytables.org>
Message-ID: <200410161327.47485.falted@pytables.org>

On Friday 15 October 2004 19:03, Francesc Alted wrote:
> >>> import timeit
> >>> t1 = timeit.Timer("m3=numarray.dot(m1,m2)", "import numarray;dim1=500;m1=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32);m2=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32)")
> >>> t1.repeat(3,10)
> [3.7274820804595947, 3.8542821407318115, 3.7117569446563721]
> 
> However, Numeric seems to get it:
> 
> >>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');Numeric.reshape(m2,(dim1,dim1))")
> >>> t3.repeat(3,10)
> [0.0093162059783935547, 0.0096318721771240234, 0.0092968940734863281]
> 
> i.e. almost 300 times faster than numarray

Ooops! The Numeric test had a bug in it. The correct test would be:

>>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');m1=Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');m2=Numeric.reshape(m2,(dim1,dim1))")
>>> t3.repeat(3,10)
[0.47363090515136719, 0.47403502464294434, 0.47770595550537109]

which is 8 times faster, more or less, than numarray (or Numeric) without
ATLAS.
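
For the record, a back-of-the-envelope count of why the buggy version looked
so fast. With the result of reshape() discarded, m1 and m2 stayed 1-D with
dim1*dim1 elements each, so (assuming Numeric.dot computes a plain inner
product for 1-D operands) the timed statement did far less work than a real
matrix product. The variable names below are only for this illustration.

```python
dim1 = 500

# 1-D case: one multiply-add per element of the length dim1*dim1 vectors.
inner_product_ops = dim1 * dim1

# 2-D case: dim1 multiply-adds for each of the dim1*dim1 output elements.
matrix_product_ops = dim1 * dim1 * dim1

ratio = matrix_product_ops // inner_product_ops
print(inner_product_ops, matrix_product_ops, ratio)
```

So the earlier Numeric figure was timing roughly dim1 times fewer operations,
not an ATLAS speedup.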

Just to clarify things ;)

-- 
Francesc Alted


From aisaac at american.edu  Sat Oct 16 15:53:01 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Sat Oct 16 15:53:01 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <1097609991.30171.556.camel@halloween.stsci.edu>
References: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu><1097550159.2568.5.camel@localhost.localdomain><Mahogany-0.66.0-1780-20041012-100006.00@american.edu><1097595580.24491.4.camel@freyer.sfo.csun.edu><Mahogany-0.66.0-1780-20041012-152445.00@american.edu><1097609991.30171.556.camel@halloween.stsci.edu>
Message-ID: <Mahogany-0.66.0-1432-20041016-185312.00@american.edu>

On 12 Oct 2004, Todd Miller apparently wrote:
> Go here:
> http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse
> then "Submit New", and set the "category" to "documentation".


Done.

Thanks,
Alan Isaac


From aisaac at american.edu  Sat Oct 16 15:53:02 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Sat Oct 16 15:53:02 2004
Subject: [Numpy-discussion] matrixmultiply: return type
Message-ID: <Mahogany-0.66.0-1432-20041016-185314.00@american.edu>

Being new to numerical Python applications,
I was a little puzzled/concerned when I read
http://sourceforge.net/tracker/index.php?func=detail&aid=984368&group_id=1369&atid=450446
I *think* the answer is:
matrixmultiply will always return an array.

Is there a stable view about what type of object will be
returned by matrixmultiply?  Currently, to my initial
surprise, it returns an array when the arguments are
matrices.  Is this stable?

Might an optional argument to specify the return type
be desirable?

Thank you,
Alan Isaac


From jmiller at stsci.edu  Sat Oct 16 18:27:04 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Sat Oct 16 18:27:04 2004
Subject: [Numpy-discussion] documentation error
In-Reply-To: <Mahogany-0.66.0-1432-20041016-185312.00@american.edu>
References: <Mahogany-0.66.0-1780-20041010-185314.00@american.edu>
	 <1097550159.2568.5.camel@localhost.localdomain>
	 <Mahogany-0.66.0-1780-20041012-100006.00@american.edu>
	 <1097595580.24491.4.camel@freyer.sfo.csun.edu>
	 <Mahogany-0.66.0-1780-20041012-152445.00@american.edu>
	 <1097609991.30171.556.camel@halloween.stsci.edu>
	 <Mahogany-0.66.0-1432-20041016-185312.00@american.edu>
Message-ID: <1097976412.3744.159.camel@localhost.localdomain>

On Sat, 2004-10-16 at 17:17, Alan G Isaac wrote:
> On 12 Oct 2004, Todd Miller apparently wrote:
> > Go here:
> > http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse
> > then "Submit New", and set the "category" to "documentation".
> 
> 
> Done.
> 
> Thanks,
> Alan Isaac

As it turns out,  I misdirected you.  The above link is for numarray
bugs.  This link is for Numeric bugs:

http://sourceforge.net/tracker/?group_id=1369&atid=101369

I moved the diagonal doc bug report to the Numeric bugs tracker.

Regards,
Todd


From jmiller at stsci.edu  Sat Oct 16 18:50:04 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Sat Oct 16 18:50:04 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <200410161327.47485.falted@pytables.org>
References: <200410151318.40035.falted@pytables.org>
	 <200410151903.41288.falted@pytables.org>
	 <200410161327.47485.falted@pytables.org>
Message-ID: <1097977801.3744.184.camel@localhost.localdomain>

On Sat, 2004-10-16 at 07:27, Francesc Alted wrote:
> On Friday 15 October 2004 19:03, Francesc Alted wrote:
> > >>> import timeit
> > >>> t1 = timeit.Timer("m3=numarray.dot(m1,m2)", "import numarray;dim1=500;m1=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32);m2=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32)")
> > >>> t1.repeat(3,10)
> > [3.7274820804595947, 3.8542821407318115, 3.7117569446563721]
> > 
> > However, Numeric seems to get it:
> > 
> > >>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');Numeric.reshape(m2,(dim1,dim1))")
> > >>> t3.repeat(3,10)
> > [0.0093162059783935547, 0.0096318721771240234, 0.0092968940734863281]
> > 
> > i.e. almost 300 times faster than numarray
> 
> Ooops! The Numeric test had a bug on it. The correct test would be:
> 
> >>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');m1=Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');m2=Numeric.reshape(m2,(dim1,dim1))")
> >>> t3.repeat(3,10)
> [0.47363090515136719, 0.47403502464294434, 0.47770595550537109]
> 
> which is 8 times faster, more or less, than numarray (or Numeric) without
> ATLAS.
> 
> Just to clarify things ;)
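
The effect of the bug corrected in the quoted test can be reproduced in a
small sketch. It uses present-day NumPy as a stand-in for Numeric, which is
an assumption: the names below are NumPy's, not Numeric's. When the result
of reshape() is discarded, both operands stay one-dimensional, so dot()
computes a single inner product rather than a matrix product, which would
explain the very small timings in the first Numeric run.

```python
import numpy as np

dim1 = 50  # smaller than the 500 used in the thread, to keep the sketch quick

# Buggy setup: reshape() returns a new array, and the result is thrown away.
m1 = np.arange(dim1 * dim1, dtype=np.float32)
np.reshape(m1, (dim1, dim1))       # m1 itself is still 1-D
buggy = np.dot(m1, m1)             # inner product of two vectors: a scalar

# Corrected setup: keep the reshaped array.
m2 = np.reshape(m1, (dim1, dim1))
fixed = np.dot(m2, m2)             # genuine matrix product

print(m1.shape, np.ndim(buggy))    # (2500,) 0
print(m2.shape, fixed.shape)       # (50, 50) (50, 50)
```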

Hi Francesc,

I don't think numarray dot() will pick up any boost at all from ATLAS
because it's not written to do it.   Besides that,  there are two
performance problems I know of with numarray's dot() which may dominate
or dilute any ATLAS benefits:

1. dot() requires array creation.

2. dot() requires array copies.

Because it has a class hierarchy and a memory buffer object,  numarray
is at a disadvantage for (1).  (2) just hasn't been optimized yet for
noncontiguous arrays which (I think) are always present when dot()
starts with two contiguous array parameters.

Regards,
Todd


From stephen.walton at csun.edu  Sun Oct 17 17:35:03 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Sun Oct 17 17:35:03 2004
Subject: [Numpy-discussion] New LAPACK and ScaLAPACK planned
Message-ID: <1098059497.5110.5.camel@localhost.localdomain>

From volume 4 #37 of the NA-Digest mailing list.  I hope this is of
enough interest to this list to justify the cross post.

From dongarra at cs.utk.edu  Fri Oct 15 04:10:44 2004
From: dongarra at cs.utk.edu (Jack Dongarra)
Date: Fri, 15 Oct 2004 04:10:44 -0400
Subject: New Release of LAPACK and ScaLAPACK Planned
Message-ID: <mailman.41.1490332408.18468.numpy-discussion@python.org>

New Release of LAPACK and ScaLAPACK planned.

We are pleased to announce that we recently received NSF funding for new
releases of the LAPACK and ScaLAPACK linear algebra libraries.
The proposal pointed out the new and better algorithms that have been
developed by many people in the community since the first  releases of
these libraries, as well as more obvious gaps and possible improvements.

The proposal listed a large number of activities, which we now need to
prioritize. There are a number of design decisions that still need to be
made, for which we are interested in your input. For this purpose, we
would like to remind you of a web page to collect your input that we
originally announced on NA-Digest while we were preparing the proposal:

    http://icl.cs.utk.edu/lapack-survey.html

In addition to the questions on that form, we are interested in your
opinion on all aspects of the proposal, a copy of which you may find at:

    http://www.cs.berkeley.edu/~demmel/Sca-LAPACK-Proposal.pdf

Thanks,
Jim Demmel and Jack Dongarra
-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, CSU Northridge



From falted at pytables.org  Mon Oct 18 01:30:01 2004
From: falted at pytables.org (Francesc Alted)
Date: Mon Oct 18 01:30:01 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <1097977801.3744.184.camel@localhost.localdomain>
References: <200410151318.40035.falted@pytables.org> <200410161327.47485.falted@pytables.org> <1097977801.3744.184.camel@localhost.localdomain>
Message-ID: <200410181029.14879.falted@pytables.org>

Hi Todd,

On Sunday 17 October 2004 03:50, Todd Miller wrote:
> I don't think numarray dot() will pick up any boost at all from ATLAS
> because it's not written to do it.   Besides that,  there are two
> performance problems I know of with numarray's dot() which may dominate
> or dilute any ATLAS benefits:
> 
> 1. dot() requires array creation.

Yes, but my guess is that for large arrays, this time should be negligible
compared with the multiplication time.

> 2. dot() requires array copies.

Mmm, you mean even for well-behaved arrays? Sorry, but I don't understand
why.

May I ask if there is any plan to complete a better integration of external
LAPACK libraries in numarray, or is this considered low priority?

Never mind, I don't need this functionality right now. It's just that I'm
preparing a series of 'hands-on' sessions about Python and Scientific
Computing, and I was trying to understand the current advantages and
limitations of numarray compared with NumPy.

Cheers,

-- 
Francesc Alted


From jmiller at stsci.edu  Mon Oct 18 04:53:01 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Oct 18 04:53:01 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <200410181029.14879.falted@pytables.org>
References: <200410151318.40035.falted@pytables.org>
	 <200410161327.47485.falted@pytables.org>
	 <1097977801.3744.184.camel@localhost.localdomain>
	 <200410181029.14879.falted@pytables.org>
Message-ID: <1098100329.3741.96.camel@localhost.localdomain>

On Mon, 2004-10-18 at 04:29, Francesc Alted wrote:
> Hi Todd,
> 
> On Sunday 17 October 2004 03:50, Todd Miller wrote:
> > I don't think numarray dot() will pick up any boost at all from ATLAS
> > because it's not written to do it.   Besides that,  there are two
> > performance problems I know of with numarray's dot() which may dominate
> > or dilute any ATLAS benefits:
> > 
> > 1. dot() requires array creation.
> 
> Yes, but my guess is that for large arrays, this time should be negligible
> compared with the multiplication time.
> 

Probably true.  I should measure this.  For small computations,  it's an
issue.

> > 2. dot() requires array copies.
> 
> Mmm, you mean even for well-behaved arrays? Sorry, but I don't understand
> why.

I looked at this some this morning,  trying to figure out why this is a
problem only for numarray.  It turns out that Numeric strides its arrays
to get around the copy.   When I implemented numarray,  I chose not to
stride because I thought it would be too slow...  Recently I realized
that one input array to dot() is *always* transposed and therefore
likely noncontiguous and therefore copied.  I think it's now possible to
simply port the Numeric code so I'll look into that.
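
The point about the transposed operand can be illustrated with a short
sketch. It uses present-day NumPy rather than numarray or Numeric, so the
attribute and function names are assumptions relative to this thread. A
transpose is only a view with swapped strides, so it is no longer
C-contiguous; code that insists on contiguous input has to copy it, while
code that walks the strides does not.

```python
import numpy as np

a = np.arange(6, dtype=np.float32).reshape(2, 3)
t = a.T  # transposed view: same memory, strides swapped

print(a.flags['C_CONTIGUOUS'], t.flags['C_CONTIGUOUS'])  # True False
print(a.strides, t.strides)                              # (12, 4) (4, 12)

# A routine that requires contiguous input must make a copy like this one.
c = np.ascontiguousarray(t)
print(c.flags['C_CONTIGUOUS'], np.shares_memory(c, a))   # True False
```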

> May I ask if there is any plan to complete a better integration of external
> LAPACK libraries in numarray, or is this considered low priority?

Perry may answer this.  I have no immediate plans for it...  it does
sound like enough people need this that it should be done.

Regards,
Todd


From perry at stsci.edu  Mon Oct 18 05:21:02 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Mon Oct 18 05:21:02 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <1098100329.3741.96.camel@localhost.localdomain>
References: <200410151318.40035.falted@pytables.org> <200410161327.47485.falted@pytables.org> <1097977801.3744.184.camel@localhost.localdomain> <200410181029.14879.falted@pytables.org> <1098100329.3741.96.camel@localhost.localdomain>
Message-ID: <C98A8957-20FF-11D9-8F26-000A95B68E50@stsci.edu>

On Oct 18, 2004, at 7:52 AM, Todd Miller wrote:

> On Mon, 2004-10-18 at 04:29, Francesc Alted wrote:
>
>> May I ask if there is any plan to complete a better integration of
>> external LAPACK libraries in numarray, or is this considered low
>> priority?
>
> Perry may answer this.  I have no immediate plans for it...  it does
> sound like enough people need this that it should be done.
>
Like Todd says, it does sound like this needs to be done. I think it
takes a back seat to doing the scipy integration in general, but will
need to be addressed soon thereafter.

Perry


From frank.horowitz at csiro.au  Mon Oct 18 23:33:03 2004
From: frank.horowitz at csiro.au (Frank Horowitz)
Date: Mon Oct 18 23:33:03 2004
Subject: [Numpy-discussion] Numeric Underflow Exceptions: Recommendations?
Message-ID: <1098167541.8538.48.camel@localhost>

Hi all,

Using Numeric 23.5 I've been bitten by the dreaded 'floating point
underflow throws an "OverflowError: math range error" instead of
silently returning zero' bug.

My setup is Debian unstable (Sid) on an i386, and I am using Debian's
binary package "python-numeric".

I understand from googling past discussions that this is (used to be?)
phase-of-the-moon stuff, depending mostly upon architecture, options at
libm compilation time of libc6. Several references to a trick of adding
"-lieee" to the link list succeeding in taming the bug were mentioned
around the era of Python2.0. 

My questions are these: Is there some higher level way of dealing with
underflow now in Numeric? Or am I going to have to track down wherever
"-lieee" has disappeared to in Debian, and recompile Numeric in the
hopes that that still cures the problem?

Any other tricks up people's sleeves for dealing with this? (I already
know about exp_safe in Fernando Perez' IPython/numutils.py, BTW. I'm
kind of hoping for a library level fix though, since my code is littered
with "Numeric.exp()" calls.)
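
For reference, the wrapper approach mentioned above can be sketched as
follows. This is not the exp_safe from IPython/numutils.py; exp_clipped is
a hypothetical helper, and it is written against present-day NumPy rather
than Numeric 23.5, both of which are assumptions. The idea is to clamp the
argument at the point where exp() would drop below the smallest normal
double, and to return exactly zero there.

```python
import numpy as np

# Smallest argument for which exp() is still a normal double (about -708.4).
_MIN_EXP_ARG = np.log(np.finfo(np.float64).tiny)

def exp_clipped(x):
    """exp() that returns 0.0 where the true result would underflow.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=np.float64)
    # Clamp from below so exp() is never asked for a result below 'tiny'.
    out = np.exp(np.maximum(x, _MIN_EXP_ARG))
    # Replace the clamped entries by an exact zero.
    return np.where(x < _MIN_EXP_ARG, 0.0, out)

result = exp_clipped([-1000.0, 0.0])
print(result)
```

Such a helper still has to be called in place of every exp() call, so it
does not answer the wish for a library-level fix.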

TIA for any help you might be able to provide!

Cheers,

	Frank Horowitz


From falted at pytables.org  Tue Oct 19 01:35:05 2004
From: falted at pytables.org (Francesc Alted)
Date: Tue Oct 19 01:35:05 2004
Subject: [Numpy-discussion] numarray and ATLAS
In-Reply-To: <1098100329.3741.96.camel@localhost.localdomain>
References: <200410151318.40035.falted@pytables.org> <200410181029.14879.falted@pytables.org> <1098100329.3741.96.camel@localhost.localdomain>
Message-ID: <200410191034.08018.falted@pytables.org>

On Monday 18 October 2004 13:52, Todd Miller wrote:
> > > 1. dot() requires array creation.
> > 
> > Yes, but my guess is that for large arrays, this time should be negligible
> > compared with the multiplication time.
> > 
> 
> Probably true.  I should measure this.  For small computations,  it's an
> issue.

Well, for small arrays ATLAS (or any other optimized LAPACK library) probably
can't do much better than lapack lite, so I think you should not worry
about this anyway.

> Perry may answer this.  I have no immediate plans for it...  it does
> sound like enough people need this that it should be done.

Ok. Thanks for information,

-- 
Francesc Alted


From flin at broadpark.no  Wed Oct 20 02:23:30 2004
From: flin at broadpark.no (Frank Lindseth)
Date: Wed Oct 20 02:23:30 2004
Subject: [Numpy-discussion] Problems compiling numeric using python2.4 and VS.Net 2003
Message-ID: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC>

Hi,
 
I need numeric in a python2.4 / win32 project.
 
Is there a binary installer somewhere?
 
I tried to compile it from source but ran into the following problem (see
below):
Where are the libs supposed to come from?
 
- Frank
 
C:\users\frankl\download\Numeric-23.5>c:\Python24\python.exe setup.py install
running install
running build
running build_py
running build_ext
building 'lapack_lite' extension
C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe /DLL /nologo /INCREMENTAL:NO /LIBPATH:/usr/lib/atlas /LIBPATH:c:\Python24\libs /LIBPATH:c:\Python24\PCBuild lapack.lib cblas.lib f77blas.lib atlas.lib g2c.lib /EXPORT:initlapack_lite build\temp.win32-2.4\Release\Src\lapack_litemodule.obj /OUT:build\lib.win32-2.4\lapack_lite.pyd /IMPLIB:build\temp.win32-2.4\Release\Src\lapack_lite.lib
LINK : fatal error LNK1181: cannot open input file 'lapack.lib'
error: command '"C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe"' failed with exit status 1181
 
C:\users\frankl\download\Numeric-23.5>
 
 

From stephen.walton at csun.edu  Wed Oct 20 09:22:37 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Wed Oct 20 09:22:37 2004
Subject: [Numpy-discussion] Problems compiling numeric using python2.4
	and VS.Net 2003
In-Reply-To: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC>
References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC>
Message-ID: <1098288859.7182.11.camel@sunspot.csun.edu>

On Wed, 2004-10-20 at 11:17 +0200, Frank Lindseth wrote:

> LINK : fatal error LNK1181: cannot open input file 'lapack.lib'

Edit setup.py, setting the variables library_dirs_list and
libraries_list to empty lists, and try again.

List:  shouldn't this be the default?  Right now Numeric looks for ATLAS
by default.

-- 
Stephen Walton, Professor of Physics and Astronomy,
California State University, Northridge
stephen.walton at csun.edu


From flin at broadpark.no  Wed Oct 20 11:46:27 2004
From: flin at broadpark.no (flin at broadpark.no)
Date: Wed Oct 20 11:46:27 2004
Subject: [Numpy-discussion] Problems compiling numeric using python2.4	and VS.Net 2003
In-Reply-To: <1098288859.7182.11.camel@sunspot.csun.edu>
References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC> <1098288859.7182.11.camel@sunspot.csun.edu>
Message-ID: <1098297610.4176b10a3b4d2@webmail.broadpark.no>

Thank you for the reply, Stephen.
I did as you suggested:
library_dirs_list = []
libraries_list = [] 
#library_dirs_list = ['/usr/lib/atlas']
#libraries_list = ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c'] 

but it still won't install (see below).
Any suggestions?


C:\users\frankl\download\Numeric-23.5>c:\Python24\python.exe setup.py install
running install
running build
running build_py
running build_ext
building 'lapack_lite' extension
C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe /DLL /nologo /INCREMENTAL:NO /LIBPATH:c:\Python24\libs /LIBPATH:c:\Python24\PCBuild /EXPORT:initlapack_lite build\temp.win32-2.4\Release\Src\lapack_litemodule.obj /OUT:build\lib.win32-2.4\lapack_lite.pyd /IMPLIB:build\temp.win32-2.4\Release\Src\lapack_lite.lib
   Creating library build\temp.win32-2.4\Release\Src\lapack_lite.lib and object build\temp.win32-2.4\Release\Src\lapack_lite.exp
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgeev_ referenced in function _lapack_lite_dgeev
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dsyevd_ referenced in function _lapack_lite_dsyevd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zheevd_ referenced in function _lapack_lite_zheevd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgelsd_ referenced in function _lapack_lite_dgelsd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgesv_ referenced in function _lapack_lite_dgesv
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgesdd_ referenced in function _lapack_lite_dgesdd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dgetrf_ referenced in function _lapack_lite_dgetrf
lapack_litemodule.obj : error LNK2019: unresolved external symbol _dpotrf_ referenced in function _lapack_lite_dpotrf
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgeev_ referenced in function _lapack_lite_zgeev
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgelsd_ referenced in function _lapack_lite_zgelsd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgesv_ referenced in function _lapack_lite_zgesv
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgesdd_ referenced in function _lapack_lite_zgesdd
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zgetrf_ referenced in function _lapack_lite_zgetrf
lapack_litemodule.obj : error LNK2019: unresolved external symbol _zpotrf_ referenced in function _lapack_lite_zpotrf
build\lib.win32-2.4\lapack_lite.pyd : fatal error LNK1120: 14 unresolved externals
error: command '"C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin\link.exe"' failed with exit status 1120

C:\users\frankl\download\Numeric-23.5>


Quoting Stephen Walton <stephen.walton at csun.edu>:

> On Wed, 2004-10-20 at 11:17 +0200, Frank Lindseth wrote:
> 
> > LINK : fatal error LNK1181: cannot open input file 'lapack.lib'
> 
> Edit setup.py, setting the variables library_dirs_list and
> libraries_list to empty lists, and try again.
> 
> List:  shouldn't this be the default?  Right now Numeric looks for ATLAS
> by default.
> 
> -- 
> Stephen Walton, Professor of Physics and Astronomy,
> California State University, Northridge
> stephen.walton at csun.edu
> 
> 


From stephen.walton at csun.edu  Wed Oct 20 12:09:00 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Wed Oct 20 12:09:00 2004
Subject: [Numpy-discussion] Problems compiling numeric using
	python2.4	and VS.Net 2003
In-Reply-To: <1098297610.4176b10a3b4d2@webmail.broadpark.no>
References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC>
	 <1098288859.7182.11.camel@sunspot.csun.edu>
	 <1098297610.4176b10a3b4d2@webmail.broadpark.no>
Message-ID: <1098299055.7182.33.camel@sunspot.csun.edu>

On Wed, 2004-10-20 at 20:40 +0200, flin at broadpark.no wrote:
> Thank you for the reply, Stephen.
> I did as you suggested:
> library_dirs_list = []
> libraries_list = [] 
> #library_dirs_list = ['/usr/lib/atlas']
> #libraries_list = ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c'] 
> 
> but it still won't install (see below).
> Any suggestions?

I'm guessing you still have files left over from last time.  On Unix,
you can run the 'makeclean.sh' script.  On Windows, manually deleting
the directories listed in that script (they are all called build) should
do the trick.  Then try the 'setup.py build' again.

-- 
Stephen Walton, Professor of Physics and Astronomy,
California State University, Northridge
stephen.walton at csun.edu


From flin at broadpark.no  Wed Oct 20 15:59:45 2004
From: flin at broadpark.no (flin at broadpark.no)
Date: Wed Oct 20 15:59:45 2004
Subject: [Numpy-discussion] Problems compiling numeric using	python2.4	and VS.Net 2003
In-Reply-To: <1098299055.7182.33.camel@sunspot.csun.edu>
References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC>  <1098288859.7182.11.camel@sunspot.csun.edu>  <1098297610.4176b10a3b4d2@webmail.broadpark.no> <1098299055.7182.33.camel@sunspot.csun.edu>
Message-ID: <1098312989.4176ed1d83de1@webmail.broadpark.no>

Thanks again Stephen.
Still no success.
I deleted the whole Numeric-directory-tree,
unzipped a newly downloaded src-file,
edited the setup.py as you suggested,
tried to run the installer,
same error.

I'm not sure what to do next?
(Why can't somebody just make a binary installer for python2.4?
After all, it's in beta now...)

- Frank

--------

Quoting Stephen Walton <stephen.walton at csun.edu>:

> On Wed, 2004-10-20 at 20:40 +0200, flin at broadpark.no wrote:
> > Thank you for the reply, Stephen.
> > I did as you suggested:
> > library_dirs_list = []
> > libraries_list = [] 
> > #library_dirs_list = ['/usr/lib/atlas']
> > #libraries_list = ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c'] 
> > 
> > but it still won't install (see below).
> > Any suggestions?
> 
> I'm guessing you still have files left over from last time.  On Unix,
> you can run the 'makeclean.sh' script.  On Windows, manually deleting
> the directories listed in that script (they are all called build) should
> do the trick.  Then try the 'setup.py build' again.
> 
> -- 
> Stephen Walton, Professor of Physics and Astronomy,
> California State University, Northridge
> stephen.walton at csun.edu
> 
> 


From stephen.walton at csun.edu  Wed Oct 20 16:47:52 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Wed Oct 20 16:47:52 2004
Subject: [Numpy-discussion] Problems compiling numeric
	using	python2.4	and VS.Net 2003
In-Reply-To: <1098312989.4176ed1d83de1@webmail.broadpark.no>
References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC>
	 <1098288859.7182.11.camel@sunspot.csun.edu>
	 <1098297610.4176b10a3b4d2@webmail.broadpark.no>
	 <1098299055.7182.33.camel@sunspot.csun.edu>
	 <1098312989.4176ed1d83de1@webmail.broadpark.no>
Message-ID: <1098315982.7159.2.camel@freyer.sfo.csun.edu>

On Wed, 2004-10-20 at 15:56, flin at broadpark.no wrote:
> Thanks again Stephen.
> Still no success.

Sorry.  Being a Linux user I'm afraid I can't help much.

> I'm not sure what to do next?

Download SciPy from http://www.scipy.org/?  It is much more than you
actually need, being all of Scientific Python as well as Numeric, but at
least it's an all-in-one installer.

-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge

From mdehoon at ims.u-tokyo.ac.jp  Wed Oct 20 21:40:04 2004
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Wed Oct 20 21:40:04 2004
Subject: [Numpy-discussion] Problems compiling numeric using	python2.4
 and VS.Net 2003
In-Reply-To: <1098312989.4176ed1d83de1@webmail.broadpark.no>
References: <000001c4b685$a7b6d7e0$0302a8c0@LDg24sPC>  <1098288859.7182.11.camel@sunspot.csun.edu>  <1098297610.4176b10a3b4d2@webmail.broadpark.no> <1098299055.7182.33.camel@sunspot.csun.edu> <1098312989.4176ed1d83de1@webmail.broadpark.no>
Message-ID: <41772BE1.5020403@ims.u-tokyo.ac.jp>

flin at broadpark.no wrote:
> Thanks again Stephen.
> Still no success.
> I deleted the whole Numeric-directory-tree,
> unzipped a newly downloaded src-file,
> edited the setup.py as you suggested,
> tried to run the installer,
> same error.

Previously I managed to compile Numeric for Python 2.4 on Windows, using the 
MinGW compiler and Atlas. If you still need it, I can send you the binaries.
> 
> I'm not sure what to do next?
> (Why can't somebody just make a binary installer for python2.4?
> After all, it's in beta now...)

There is a bug in Python 2.4 that prevents users from running bdist_wininst to 
create a binary installer. python setup.py install fails too. See bug 1021756 on 
sourceforge.

--Michiel, U Tokyo.

-- 
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon


From stark at tuebingen.mpg.de  Wed Oct 20 23:49:23 2004
From: stark at tuebingen.mpg.de (Sebastian Stark)
Date: Wed Oct 20 23:49:23 2004
Subject: [Numpy-discussion] Re: numarray and ATLAS
Message-ID: <200410210846.09275.stark@tuebingen.mpg.de>

> Perhaps this is a too recurrent subject, but I'm having problems when
> making numarray use ATLAS instead of the mini-lapack included.

I had to change lapack_libs and lapack_dirs in addons.py to read: 

  lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm']
  lapack_dirs = ['/usr/local/lib/ATLAS']

I have all my .a files in /usr/local/lib/ATLAS so I can control which ones I'm 
actually linking against.

mosel ~ % ls -l /usr/local/lib/ATLAS
total 14608
-rw-r--r--    1 root     staff     7952316 Oct 20 10:03 libatlas.a
-rw-r--r--    1 root     staff      277592 Oct 20 10:03 libcblas.a
-rw-r--r--    1 root     staff      261060 Oct 20 10:45 libf2c.a
-rw-r--r--    1 root     staff      353278 Oct 20 10:03 libf77blas.a
-rw-r--r--    1 root     staff     5734736 Oct 20 10:42 liblapack.a
-rw-r--r--    1 root     staff      324968 Oct 20 10:03 libtstatlas.a


-Sebastian

(and yes, I get a significant speed boost from ATLAS)

-- 
Sebastian Stark -- http://www.kyb.tuebingen.mpg.de/~stark
Max Planck Institute for Biological Cybernetics
Spemannstr. 38, 72076 Tuebingen
Phone: +49 7071 601 555 -- Fax: +49 7071 601 552


From stark at tuebingen.mpg.de  Wed Oct 20 23:56:14 2004
From: stark at tuebingen.mpg.de (Sebastian Stark)
Date: Wed Oct 20 23:56:14 2004
Subject: [Numpy-discussion] indexing on uninitialized arrays
Message-ID: <200410210852.02285.stark@tuebingen.mpg.de>

In matlab I can do:

>> x = []

x =

     []

>> x(2) = 1.4

x =

         0    1.4000

>> x(2,4) = 2.9

x =

         0    1.4000         0         0
         0         0         0    2.9000


which means x expands as necessary depending on "how far" my indexing goes. 

Now I'm thinking about how to realize this with numarray. I could imagine 
defining a derived array type "SelfInflatingArray" which catches the IndexError 
exception and then does the right thing. Any better ideas?
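
One way the idea could look is sketched below. It is written against
present-day NumPy rather than numarray, wraps an array instead of deriving
from the array type, and checks the bounds up front instead of catching
IndexError; all three are assumptions made to keep the sketch short. The
class is hypothetical, handles integer indices only, and uses 0-based
indexing, so MATLAB's x(2,4) becomes x[1, 3].

```python
import numpy as np

class SelfInflatingArray:
    """Hypothetical 2-D wrapper that zero-pads on out-of-range assignment."""

    def __init__(self):
        self.data = np.zeros((0, 0))

    def __setitem__(self, index, value):
        # A single integer index addresses row 0, as in the MATLAB session.
        row, col = index if isinstance(index, tuple) else (0, index)
        rows = max(self.data.shape[0], row + 1)
        cols = max(self.data.shape[1], col + 1)
        if (rows, cols) != self.data.shape:
            # Allocate a larger zero-filled array and copy the old block in.
            grown = np.zeros((rows, cols))
            r, c = self.data.shape
            grown[:r, :c] = self.data
            self.data = grown
        self.data[row, col] = value

x = SelfInflatingArray()
x[1] = 1.4       # MATLAB: x(2) = 1.4
x[1, 3] = 2.9    # MATLAB: x(2,4) = 2.9
print(x.data)
```

Every growth step copies the whole array, so filling a large array one
element at a time this way is quadratic in the number of elements.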


-Sebastian

-- 
Sebastian Stark -- http://www.kyb.tuebingen.mpg.de/~stark
Max Planck Institute for Biological Cybernetics
Spemannstr. 38, 72076 Tuebingen
Phone: +49 7071 601 555 -- Fax: +49 7071 601 552


From falted at pytables.org  Thu Oct 21 00:32:31 2004
From: falted at pytables.org (Francesc Alted)
Date: Thu Oct 21 00:32:31 2004
Subject: [Numpy-discussion] Re: numarray and ATLAS
In-Reply-To: <200410210846.09275.stark@tuebingen.mpg.de>
References: <200410210846.09275.stark@tuebingen.mpg.de>
Message-ID: <200410210929.18477.falted@pytables.org>

On Thursday 21 October 2004 08:46, Sebastian Stark wrote:
> 
> > Perhaps this is a too recurrent subject, but I'm having problems when
> > making numarray use ATLAS instead of the mini-lapack included.
> 
> I had to change lapack_libs and lapack_dirs in addons.py to read: 
> 
>   lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm']
>   lapack_dirs = ['/usr/local/lib/ATLAS']

I've done something similar:

    lapack_libs = ['lapack', 'cblas', 'f77blas', 'atlas']
    lapack_dirs = ['/usr/local/atlas/Linux_P4SSE2_full/lib']

Mmm, I can see that you have added 'f2c'. However, I don't have it
installed. Could that be the cause that tests would not pass in my case?

> (and yes, I get a significant speed boost from ATLAS)

Great, it's good to know that.

Thank you very much for your feedback,

-- 
Francesc Alted


From rkern at ucsd.edu  Thu Oct 21 02:05:51 2004
From: rkern at ucsd.edu (Robert Kern)
Date: Thu Oct 21 02:05:51 2004
Subject: [Numpy-discussion] Re: numarray and ATLAS
In-Reply-To: <200410210929.18477.falted@pytables.org>
References: <200410210846.09275.stark@tuebingen.mpg.de> <200410210929.18477.falted@pytables.org>
Message-ID: <41777422.8040205@ucsd.edu>

Francesc Alted wrote:
> On Thursday 21 October 2004 08:46, Sebastian Stark wrote:
> 
>>>Perhaps this is a too recurrent subject, but I'm having problems when
>>>making numarray use ATLAS instead of the mini-lapack included.
>>
>>I had to change lapack_libs and lapack_dirs in addons.py to read: 
>>
>>  lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm']
>>  lapack_dirs = ['/usr/local/lib/ATLAS']
> 
> 
> I've done something similar:
> 
>     lapack_libs = ['lapack', 'cblas', 'f77blas', 'atlas']
>     lapack_dirs = ['/usr/local/atlas/Linux_P4SSE2_full/lib']
> 
> Mmm, I can see that you have added 'f2c'. However, I don't have it
> installed. Could that be the cause that tests would not pass in my case?

If you are compiling with gcc, add 'g2c' after 'f77blas'. It's g77's 
FORTRAN runtime library.

-- 
Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
  Are the graves of dreams allowed to die."
   -- Richard Harter


From falted at pytables.org  Thu Oct 21 02:33:28 2004
From: falted at pytables.org (Francesc Alted)
Date: Thu Oct 21 02:33:28 2004
Subject: [Numpy-discussion] Re: numarray and ATLAS
In-Reply-To: <41777422.8040205@ucsd.edu>
References: <200410210846.09275.stark@tuebingen.mpg.de> <200410210929.18477.falted@pytables.org> <41777422.8040205@ucsd.edu>
Message-ID: <200410211126.42729.falted@pytables.org>

On Thursday 21 October 2004 10:32, Robert Kern wrote:
> > Mmm, I can see that you have added 'f2c'. However, I don't have it
> > installed. Could that be the cause that tests would not pass in my case?
> 
> If you are compiling with gcc, add 'g2c' after 'f77blas'. It's g77's 
> FORTRAN runtime library.

Yeah, that did the trick! So for a gcc compiler, this works just fine:

lapack_libs = ['lapack', 'f77blas', 'g2c', 'cblas', 'atlas', 'm']

Many thanks!,

-- 
Francesc Alted


From stephen.walton at csun.edu  Thu Oct 21 10:59:05 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Thu Oct 21 10:59:05 2004
Subject: [Numpy-discussion] Counting array elements
Message-ID: <1098381332.8249.12.camel@freyer.sfo.csun.edu>

Is there some simple way of counting the number of array elements which
satisfy a certain condition?  It is easy to do

A[A<=1].sum()

to sum all the values of A which are less than or equal to 1, but there doesn't seem
to be a count() method.  I tried

(A<=1).sum()

but this throws an exception at numarray 1.1.  If I try

sum(A<=value)

I have to nest multiple sums if A has rank greater than 1, plus the sum
overflows if A is large, apparently because boolean gets treated as
Int8.  (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)).  You get
zero.)  The following works:

array(A<=1024,type=Int32).sum()

but is awkward.  Am I missing an obvious better alternative?  If not,
I'm going to file an RFE :-) .

-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge

From Chris.Barker at noaa.gov  Thu Oct 21 11:33:03 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Thu Oct 21 11:33:03 2004
Subject: [Numpy-discussion] Re: numarray and ATLAS
In-Reply-To: <41777422.8040205@ucsd.edu>
References: <200410210846.09275.stark@tuebingen.mpg.de> <200410210929.18477.falted@pytables.org> <41777422.8040205@ucsd.edu>
Message-ID: <4177FFF6.40006@noaa.gov>

Robert Kern wrote:
> Francesc Alted wrote:

>>> I had to change lapack_libs and lapack_dirs in addons.py to read:
>>>  lapack_libs = ['lapack', 'f77blas', 'f2c', 'cblas', 'atlas', 'm']
>>>  lapack_dirs = ['/usr/local/lib/ATLAS']

>> I've done something similar:
>>
>>     lapack_libs = ['lapack', 'cblas', 'f77blas', 'atlas']
>>     lapack_dirs = ['/usr/local/atlas/Linux_P4SSE2_full/lib']

For what it's worth, this is what worked for me on Gentoo Linux:
     lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'm']

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                     		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From jmiller at stsci.edu  Thu Oct 21 11:33:46 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Oct 21 11:33:46 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
Message-ID: <1098383430.3644.4.camel@halloween.stsci.edu>

On Thu, 2004-10-21 at 13:55, Stephen Walton wrote:
> Is there some simple way of counting the number of array elements which
> satisfy a certain condition?  It is easy to do
> 
> A[A<=1].sum()
> 
> to sum all the values of A which are less than 1, but there doesn't seem
> to be a count() method.  I tried
> 
> (A<=1).sum()
>
> but this throws an exception at numarray 1.1.  If I try

This works now in CVS and will be part of numarray-1.2.  Another more
tedious approach which works for numarray-1.1 is:

(A <= 1).astype('Int32').sum()

> sum(A<=value)
> 
> I have to nest multiple sums if A has rank greater than 1, plus the sum
> overflows if A is large, apparently because boolean gets treated as
> Int8.  (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)).  You get
> zero.)  The following works:
> 
> array(A<=1024,type=Int32).sum()
> 
> but is awkward.  Am I missing an obvious better alternative?  If not,
> I'm going to file an RFE :-) .

I don't think there's any need for an RFE, provided you're satisfied
with (A<=1).sum().

Regards,
Todd


From rkern at ucsd.edu  Thu Oct 21 12:22:20 2004
From: rkern at ucsd.edu (Robert Kern)
Date: Thu Oct 21 12:22:20 2004
Subject: [Numpy-discussion] argmin and unsigned types
Message-ID: <41780BE5.4070009@ucsd.edu>

argmin locates the minimum by finding the maximum of the negative of the 
input. Unfortunately, for unsigned arrays, the negative has nothing to 
do with the actual numerical negative.

Example:

 >>> from numarray import *
 >>> a = arange(10).astype(UInt8)
 >>> print a
[0 1 2 3 4 5 6 7 8 9]
 >>> print -a
[  0 255 254 253 252 251 250 249 248 247]
 >>> argmin(a)
1

We need a separate argmin to handle these arrays properly.
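
The behaviour above can be mimicked in plain Python. This is only a
sketch, assuming that negating a UInt8 value wraps modulo 256 as the
printed output suggests. The names argmin_via_negation and argmin_direct
are illustrative, not numarray functions.

```python
def negate_uint8(values):
    """Negate each value the way an unsigned 8-bit type would (modulo 256)."""
    return [(-v) % 256 for v in values]


def argmax(values):
    """Index of the first largest element."""
    return max(range(len(values)), key=lambda i: values[i])


def argmin_via_negation(values):
    """Locate the minimum as the maximum of the wrapped negative."""
    return argmax(negate_uint8(values))


def argmin_direct(values):
    """Locate the minimum by comparing the values themselves."""
    return min(range(len(values)), key=lambda i: values[i])


a = list(range(10))
print(negate_uint8(a))         # [0, 255, 254, 253, 252, 251, 250, 249, 248, 247]
print(argmin_via_negation(a))  # 1, the wrong answer reported above
print(argmin_direct(a))        # 0
```

Comparing the values directly avoids the negation step altogether, which
is one way a separate argmin for unsigned types could work.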

-- 
Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
  Are the graves of dreams allowed to die."
   -- Richard Harter


From jmiller at stsci.edu  Thu Oct 21 15:04:04 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Oct 21 15:04:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098383430.3644.4.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <1098383430.3644.4.camel@halloween.stsci.edu>
Message-ID: <1098396116.3644.129.camel@halloween.stsci.edu>

On Thu, 2004-10-21 at 14:30, Todd Miller wrote:
> On Thu, 2004-10-21 at 13:55, Stephen Walton wrote:
> > Is there some simple way of counting the number of array elements which
> > satisfy a certain condition?  It is easy to do
> > 
> > A[A<=1].sum()
> > 
> > to sum all the values of A which are less than 1, but there doesn't seem
> > to be a count() method.  I tried
> > 
> > (A<=1).sum()
> >
> > but this throws an exception at numarray 1.1.  If I try
> 
> This works now in CVS and will be part of numarray-1.2.  

Stephen tried this and it turns out my earlier statement was untrue,
(A<=1).sum() doesn't do anything reasonable, even in CVS.  The problem
is that sum() is written (without direct C support) to conserve
storage.  As a result,  it doesn't do implicit 
> Another more
> tedious approach which works for numarray-1.1 is:
> 
> (A <= 1).astype('Int32').sum()
> 

There's also a prettier approach that works for 1.1 that I forgot about:

(A <= 1).sum('Int32')

> > sum(A<=value)
> > 
> > I have to nest multiple sums if A has rank greater than 1, plus the sum
> > overflows if A is large, apparently because boolean gets treated as
> > Int8.  (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)).  You get
> > zero.)  The following works:
> > 
> > array(A<=1024,type=Int32).sum()
> > 
> > but is awkward.  Am I missing an obvious better alternative?  If not,
> > I'm going to file an RFE :-) .
> 
> I don't think there's any need for an RFE, provided you're satisfied
> with (A<=1).sum().
> 
> Regards,
> Todd
> 
> 
> 
> -------------------------------------------------------
> This SF.net email is sponsored by: IT Product Guide on ITManagersJournal
> Use IT products in your business? Tell us what you think of them. Give us
> Your Opinions, Get Free ThinkGeek Gift Certificates! Click to find out more
> http://productguide.itmanagersjournal.com/guidepromo.tmpl
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
-- 


From jmiller at stsci.edu  Thu Oct 21 15:08:52 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Oct 21 15:08:52 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098396116.3644.129.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <1098383430.3644.4.camel@halloween.stsci.edu>
	 <1098396116.3644.129.camel@halloween.stsci.edu>
Message-ID: <1098396420.28271.0.camel@halloween.stsci.edu>

On Thu, 2004-10-21 at 18:01, Todd Miller wrote:
> On Thu, 2004-10-21 at 14:30, Todd Miller wrote:
> > On Thu, 2004-10-21 at 13:55, Stephen Walton wrote:
> > > Is there some simple way of counting the number of array elements which
> > > satisfy a certain condition?  It is easy to do
> > > 
> > > A[A<=1].sum()
> > > 
> > > to sum all the values of A which are less than 1, but there doesn't seem
> > > to be a count() method.  I tried
> > > 
> > > (A<=1).sum()
> > >
> > > but this throws an exception at numarray 1.1.  If I try
> > 
> > This works now in CVS and will be part of numarray-1.2.  
> 
> Stephen tried this and it turns out my earlier statement was untrue,
> (A<=1).sum() doesn't do anything reasonable, even in CVS.  The problem
> is that sum() is written (without direct C support) to conserve
> storage.  As a result,  it doesn't do implicit 
> > Another more
> > tedious approach which works for numarray-1.1 is:
> > 
> > (A <= 1).astype('Int32').sum()
> > 
> 
> There's also a prettier approach that works for 1.1 that I forgot about:
> 
> (A <= 1).sum('Int32')
> 
> > > sum(A<=value)
> > > 
> > > I have to nest multiple sums if A has rank greater than 1, plus the sum
> > > overflows if A is large, apparently because boolean gets treated as
> > > Int8.  (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)).  You get
> > > zero.)  The following works:
> > > 
> > > array(A<=1024,type=Int32).sum()
> > > 
> > > but is awkward.  Am I missing an obvious better alternative?  If not,
> > > I'm going to file an RFE :-) .
> > 
> > I don't think there's any need for an RFE, provided you're satisfied
> > with (A<=1).sum().
> > 
> > Regards,
> > Todd
> > 
> > 
> > 
-- 


From jmiller at stsci.edu  Thu Oct 21 15:11:23 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Oct 21 15:11:23 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098396116.3644.129.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <1098383430.3644.4.camel@halloween.stsci.edu>
	 <1098396116.3644.129.camel@halloween.stsci.edu>
Message-ID: <1098396569.28351.0.camel@halloween.stsci.edu>

On Thu, 2004-10-21 at 18:01, Todd Miller wrote:
> On Thu, 2004-10-21 at 14:30, Todd Miller wrote:
> > On Thu, 2004-10-21 at 13:55, Stephen Walton wrote:
> > > Is there some simple way of counting the number of array elements which
> > > satisfy a certain condition?  It is easy to do
> > > 
> > > A[A<=1].sum()
> > > 
> > > to sum all the values of A which are less than 1, but there doesn't seem
> > > to be a count() method.  I tried
> > > 
> > > (A<=1).sum()
> > >
> > > but this throws an exception at numarray 1.1.  If I try
> > 
> > This works now in CVS and will be part of numarray-1.2.  
> 
> Stephen tried this and it turns out my earlier statement was untrue,
> (A<=1).sum() doesn't do anything reasonable, even in CVS.  The problem
> is that sum() is written (without direct C support) to conserve
> storage.  As a result,  it doesn't do implicit 
> > Another more
> > tedious approach which works for numarray-1.1 is:
> > 
> > (A <= 1).astype('Int32').sum()
> > 
> 
> There's also a prettier approach that works for 1.1 that I forgot about:
> 
> (A <= 1).sum('Int32')
> 
> > > sum(A<=value)
> > > 
> > > I have to nest multiple sums if A has rank greater than 1, plus the sum
> > > overflows if A is large, apparently because boolean gets treated as
> > > Int8.  (Try A=arange(1024,shape=(32,32));sum(sum(A<=1024)).  You get
> > > zero.)  The following works:
> > > 
> > > array(A<=1024,type=Int32).sum()
> > > 
> > > but is awkward.  Am I missing an obvious better alternative?  If not,
> > > I'm going to file an RFE :-) .
> > 
> > I don't think there's any need for an RFE, provided you're satisfied
> > with (A<=1).sum().
> > 
> > Regards,
> > Todd
> > 
> > 
> > 
-- 


From jmiller at stsci.edu  Thu Oct 21 16:41:29 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Oct 21 16:41:29 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098396569.28351.0.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <1098383430.3644.4.camel@halloween.stsci.edu>
	 <1098396116.3644.129.camel@halloween.stsci.edu>
	 <1098396569.28351.0.camel@halloween.stsci.edu>
Message-ID: <1098401959.3744.34.camel@localhost.localdomain>

On Thu, 2004-10-21 at 18:09, Todd Miller wrote:
> On Thu, 2004-10-21 at 18:01, Todd Miller wrote:
> > On Thu, 2004-10-21 at 14:30, Todd Miller wrote:
> > > On Thu, 2004-10-21 at 13:55, Stephen Walton wrote:
> > > > Is there some simple way of counting the number of array elements which
> > > > satisfy a certain condition?  It is easy to do
> > > > 
> > > > A[A<=1].sum()
> > > > 
> > > > to sum all the values of A which are less than 1, but there doesn't seem
> > > > to be a count() method.  I tried
> > > > 
> > > > (A<=1).sum()
> > > >
> > > > but this throws an exception at numarray 1.1.  If I try
> > > 
> > > This works now in CVS and will be part of numarray-1.2.  
> > 
> > Stephen tried this and it turns out my earlier statement was untrue,
> > (A<=1).sum() doesn't do anything reasonable, even in CVS.  The problem
> > is that sum() is written (without direct C support) to conserve
> > storage.  As a result,  it doesn't do implicit 

<drum roll> up-casting. </drum roll>

I'm pretty sure this was a conscious and discussed choice (this is
actually the 2nd time sum() has been wrong).  IMHO, the typing for sum()
should be revised because it is too dangerous the way it is now.  

Regards,
Todd


From nwagner at mecha.uni-stuttgart.de  Fri Oct 22 02:17:16 2004
From: nwagner at mecha.uni-stuttgart.de (Nils Wagner)
Date: Fri Oct 22 02:17:16 2004
Subject: [Numpy-discussion] Problems with complex matrices
Message-ID: <4178CEFF.2050608@mecha.uni-stuttgart.de>

Hi all,

Another bug is revealed

Traceback (most recent call last):
  File "complex_it.py", line 6, in ?
    res=dot(A,x)-r
  File "/usr/lib/python2.3/site-packages/Numeric/dotblas/__init__.py", line 55, in dot
    if multiarray.array(a).shape == () or multiarray.array(b).shape == ():
TypeError: a float is required
 
Nils

-------------- next part --------------
A non-text attachment was scrubbed...
Name: complex_it.py
Type: text/x-python
Size: 139 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20041022/c13d8954/attachment-0001.py>

From Sebastien.deMentendeHorne at electrabel.com  Fri Oct 22 02:44:46 2004
From: Sebastien.deMentendeHorne at electrabel.com (Sebastien.deMentendeHorne at electrabel.com)
Date: Fri Oct 22 02:44:46 2004
Subject: [Numpy-discussion] Problems with complex matrices
Message-ID: <035965348644D511A38C00508BF7EAEB145CB168@seacex03.eib.electrabel.be>

gmres returns a tuple so you should have used
res = dot(A, x[0]) - r

seb
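
A short sketch of the unpacking issue, in plain Python. The function
solve_stub below is hypothetical and only imitates a solver that returns
a (solution, status) pair, which is how the reply describes gmres. It is
not the SciPy routine, and the meaning of the status value is an
assumption.

```python
def solve_stub(matrix, rhs):
    """Hypothetical stand-in for an iterative solver returning (solution, status)."""
    # For a diagonal matrix the solution is just rhs[i] / matrix[i][i].
    solution = [rhs[i] / matrix[i][i] for i in range(len(rhs))]
    status = 0  # assumed convention: 0 means the solver converged
    return solution, status


def dot(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


A = [[2.0, 0.0], [0.0, 4.0]]
r = [2.0, 8.0]

result = solve_stub(A, r)   # a tuple, not a vector
x, status = result          # unpack it, or index with result[0]

residual = [ax - ri for ax, ri in zip(dot(A, x), r)]
print(residual)
```

Passing the whole tuple to dot() is what fails. Unpacking first keeps the
solution vector and the status flag apart.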

> -----Original Message-----
> From: Nils Wagner [mailto:nwagner at mecha.uni-stuttgart.de]
> Sent: vendredi 22 octobre 2004 11:13
> To: SciPy Users List; numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] Problems with complex matrices
> 
> 
> Hi all,
> 
> Another bug is revealed
> 
> Traceback (most recent call last):
>   File "complex_it.py", line 6, in ?
>     res=dot(A,x)-r
>   File 
> "/usr/lib/python2.3/site-packages/Numeric/dotblas/__init__.py", 
> line 55, in dot
>     if multiarray.array(a).shape == () or 
> multiarray.array(b).shape == ():
> TypeError: a float is required
>  
> Nils
> 
> 


From Chris.Barker at noaa.gov  Fri Oct 22 11:07:32 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri Oct 22 11:07:32 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098392607.8249.20.camel@freyer.sfo.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
Message-ID: <41794B47.4090909@noaa.gov>

Stephen Walton wrote:

 > There is a difference between the sum() Ufunc and the sum() method which
 > is not mentioned in the documentation:  the function works along an
 > axis, while the method works on the whole array.  That is, A.sum() and
 > A.flat.sum() are equivalent regardless of the rank of A.


Bummer. I was hoping this was a move to a more object-oriented style, 
rather than different functionality. Also, it's pretty confusing 
terminology, particularly if it's not documented! Why not .SumAll() or 
something?

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                     		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From rowen at u.washington.edu  Fri Oct 22 11:20:36 2004
From: rowen at u.washington.edu (Russell E Owen)
Date: Fri Oct 22 11:20:36 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <41794B47.4090909@noaa.gov>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	
 <417809B9.5000108@noaa.gov>
 <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov>
Message-ID: <p06200510bd9eff3962b0@[128.95.99.44]>

At 11:02 AM -0700 2004-10-22, Chris Barker wrote:
>Stephen Walton wrote:
>
>>  There is a difference between the sum() Ufunc and the sum() method which
>>  is not mentioned in the documentation:  the function works along an
>>  axis, while the method works on the whole array.  That is, A.sum() and
>>  A.flat.sum() are equivalent regardless of the rank of A.
>
>
>Bummer. I was hoping this was a move to a more object-oriented 
>style, rather than different functionality. Also, it's pretty 
>confusing terminology, particularly if it's not documented! Why not 
>.SumAll() or something?

I agree. Numarray is already confusing enough without identically 
named functions and methods that do different things. (nElements and 
size are another pet peeve, with size used in several places and 
nElements appearing exactly once. Though I am grateful to whoever 
added size as a workalike for nElements; formerly you had to know 
what kind of array you had before you knew how to find out how many 
elements it had.)

-- Russell


From strawman at astraw.com  Fri Oct 22 11:25:58 2004
From: strawman at astraw.com (Andrew Straw)
Date: Fri Oct 22 11:25:58 2004
Subject: [Numpy-discussion] floating point exception weirdness
In-Reply-To: <411A08FA.7000601@astraw.com>
References: <4119BBFC.6020304@astraw.com> <1092221365.3752.32.camel@localhost.localdomain> <411A08FA.7000601@astraw.com>
Message-ID: <41795006.1040807@astraw.com>

I've isolated a bug I first reported on this mailing list in August.  
I've now confined it to a small code snippet using entirely open-source 
software (previously I saw it while using Intel's IPP).  In a nutshell, 
importing numarray.ieeespecial triggers a floating point exception 
(which kills my program) when I call Numeric's 
singular_value_decomposition() function:

import Numeric
from LinearAlgebra import singular_value_decomposition

want_FPE = True  # set to False to skip the numarray.ieeespecial import

if want_FPE:
    import numarray.ieeespecial

A= [[-5.7, 2.2, -0.53, 46.0],
    [-2.3, -5.5, -1.0, 1091.0],
    [5.9, 1.4, -0.1, -142.0],
    [-1.3, 5.7, -1.5, 2673.0]]
A=Numeric.array(A)
u,s,v = singular_value_decomposition(A) # FPE triggered here

Here's my setup:

$ python
Python 2.3.4 (#2, Sep 24 2004, 08:39:09)
[GCC 3.3.4 (Debian 1:3.3.4-12)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> import Numeric
 >>> Numeric.__version__
'23.6'
 >>> import numarray
 >>> numarray.__version__
'1.2a'

$ gcc -v
Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.4/specs
Configured with: ../src/configure -v 
--enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr 
--mandir=/usr/share/man --infodir=/usr/share/info 
--with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared 
--with-system-zlib --enable-nls --without-included-gettext 
--enable-__cxa_atexit --enable-clocale=gnu --enable-debug 
--enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
Thread model: posix
gcc version 3.3.4 (Debian 1:3.3.4-13)

Now, for the clue:  the above error is ONLY triggered when I compile 
Numeric to use system blas and friends, not when I use lapack_lite 
included with Numeric.  This leads me to suspect it is related to the 
SSE2 unit -- I have Debian sarge's atlas3-base, atlas3-sse, atlas3-sse2, 
blas, lapack, lapack3, and refblas3 packages installed on my P4 machine.

So, to propose a hypothesis: numarray.ieeespecial sets the FPE bit in 
the SSE2 hardware, but for some reason this does not raise SIGFPE.  
However, when the next call that touches SSE2 happens, the kernel sees 
that error bit and throws the signal.  Does this explanation make 
sense?  Is it easy to fix?

Cheers!
Andrew


From jmiller at stsci.edu  Fri Oct 22 14:19:17 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct 22 14:19:17 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <p06200510bd9eff3962b0@[128.95.99.44]>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
Message-ID: <1098479844.29804.260.camel@halloween.stsci.edu>

On Fri, 2004-10-22 at 14:17, Russell E Owen wrote:
> At 11:02 AM -0700 2004-10-22, Chris Barker wrote:
> >Stephen Walton wrote:
> >
> >>  There is a difference between the sum() Ufunc and the sum() method which
> >>  is not mentioned in the documentation:  the function works along an
> >>  axis, while the method works on the whole array.  That is, A.sum() and
> >>  A.flat.sum() are equivalent regardless of the rank of A.
> >
> >
> >Bummer. I was hoping this was a move to a more object-oriented 
> >style, rather than different functionality. Also, it's pretty 
> >confusing terminology, particularly if it's not documented! Why not 
> >.SumAll() or something?

sumAll() would certainly be better.

Unless there are objections,  I'll rename the current sum() method to
sumAll() and re-write sum() to give a deprecation warning before calling
sumAll().  Eventually,  it'll go away altogether.
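
The renaming plan could look roughly like this. It is a sketch only,
using a toy class in place of the numarray array type and the standard
warnings module. The class name ToyArray and the warning text are
assumptions.

```python
import warnings


class ToyArray:
    """Toy stand-in for an array type, holding a nested list of numbers."""

    def __init__(self, rows):
        self.rows = rows

    def sumAll(self):
        """Sum every element of the (rank-2, in this toy) array."""
        return sum(sum(row) for row in self.rows)

    def sum(self):
        """Old name: warn, then delegate to sumAll()."""
        warnings.warn("sum() method is deprecated; use sumAll()",
                      DeprecationWarning, stacklevel=2)
        return self.sumAll()


a = ToyArray([[1, 2], [3, 4]])

# Record warnings so the deprecation is visible even if filtered by default.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_result = a.sum()

print(old_result, a.sumAll(), caught[0].category.__name__)
```

Existing callers keep getting the same number during the transition and
only see the warning.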

I reviewed the discussion of the sum() result type from a year ago:
"[Numpy-discussion] sum and mean methods behaviour".  We discussed sum()
in depth and AFAIK I implemented the recommendations.  The results need
to be documented.

By default,  sum() now uses the maximum type of the type family of the
array, so families Bool, Integer, UnsignedInteger, Float, or Complex 
result in max types Bool, Int64, UInt64, Float64, Complex64.  I'm not
sure why we segregated Bool and it looks like a mistake to me now.  I'm
thinking the Bool "family" should just go away and be re-classified as
UnsignedInteger.  These ideas are captured by the
numerictypes.MaximumType() function which is also potentially useful for
any reduction.
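
The family-to-maximum-type rule described above can be written down as a
small table. This is a sketch assembled from the list in the paragraph,
not the actual numerictypes.MaximumType() implementation. Both
dictionaries and the function maximum_type_sketch are hypothetical, and
the table of type names is deliberately partial.

```python
# Mapping taken from the description above: type family -> largest type in it.
MAX_TYPE_BY_FAMILY = {
    "Bool": "Bool",
    "Integer": "Int64",
    "UnsignedInteger": "UInt64",
    "Float": "Float64",
    "Complex": "Complex64",
}

# Hypothetical, partial table of which family some type names belong to.
FAMILY_OF_TYPE = {
    "Bool": "Bool",
    "Int8": "Integer", "Int16": "Integer", "Int32": "Integer", "Int64": "Integer",
    "UInt8": "UnsignedInteger", "UInt16": "UnsignedInteger",
    "Float32": "Float", "Float64": "Float",
    "Complex32": "Complex", "Complex64": "Complex",
}


def maximum_type_sketch(type_name, bool_as_unsigned=False):
    """Return the accumulator type a reduction would use for type_name."""
    family = FAMILY_OF_TYPE[type_name]
    # Optional variant: treat Bool as UnsignedInteger, as suggested above.
    if bool_as_unsigned and family == "Bool":
        family = "UnsignedInteger"
    return MAX_TYPE_BY_FAMILY[family]


print(maximum_type_sketch("Int8"))                         # Int64
print(maximum_type_sketch("Bool"))                         # Bool
print(maximum_type_sketch("Bool", bool_as_unsigned=True))  # UInt64
```

With the Bool family kept separate, a Bool sum stays Bool. Folding it
into UnsignedInteger would give counts room to grow.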

> I agree. Numarray is already confusing enough without identically 
> named functions and methods that do different things. 

True enough.  This'll be fixed.

> (nElements and 
> size are another pet peeve, with size used in several places and 
> nElements appearing exactly once. Though I am grateful to whoever 
> added size as a workalike for nElements; formerly you had to know 
> what kind of array you had before you knew how to find out how many 
> elements it had.)

I'm not sure what you mean here.  When I grepped,  I got 52 hits for
nelements() in the numarray source, let alone what users have done with
it.  Right now,  IMHO, it's not clearly broken and there are bigger fish
to fry.

Regards,
Todd


From stephen.walton at csun.edu  Fri Oct 22 14:37:05 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Fri Oct 22 14:37:05 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <p06200510bd9eff3962b0@[128.95.99.44]>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
Message-ID: <1098480955.11372.19.camel@freyer.sfo.csun.edu>

On Fri, 2004-10-22 at 11:17, Russell E Owen wrote about the sum() Ufunc
vs. the sum() method:

> Numarray is already confusing enough without identically 
> named functions and methods that do different things

When I went through the Numarray docs and made suggestions for
improvements (see the list I posted at Sourceforge), I didn't make any
comments about functional changes, only what the documentation said. 
Since the sum() method is documented using 1-D arrays, you can't tell
that it in fact behaves differently than the sum() Ufunc.  On
reflection, I also agree that the Ufuncs and methods should behave the
same way.

Why do you say 'numarray is confusing'?  What in the docs would help
un-confuse it, in your view?

-- 
Stephen Walton <stephen.walton at csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge

From rowen at u.washington.edu  Fri Oct 22 14:48:03 2004
From: rowen at u.washington.edu (Russell E Owen)
Date: Fri Oct 22 14:48:03 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	
 <417809B9.5000108@noaa.gov>
 <1098392607.8249.20.camel@freyer.sfo.csun.edu>	
 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
 <1098479844.29804.260.camel@halloween.stsci.edu>
Message-ID: <p06200515bd9f2b4fb808@[128.95.99.44]>

At 5:17 PM -0400 2004-10-22, Todd Miller wrote:
>On Fri, 2004-10-22 at 14:17, Russell E Owen wrote:
>>  I agree. Numarray is already confusing enough without identically
>>  named functions and methods that do different things.
>
>True enough.  This'll be fixed.

Great!

>>  (nElements and
>>  size are another pet peeve, with size used in several places and
>>  nElements appearing exactly once. Though I am grateful to whoever
>>  added size as a workalike for nElements; formerly you had to know
>>  what kind of array you had before you knew how to find out how many
>>  elements it had.)
>
>I'm not sure what you mean here.  When I grepped,  I got 52 hits for
>nelements() in the numarray source, let alone what users have done with
>it.  Right now,  IMHO, it's not clearly broken and there are bigger fish
>to fry.

Since you ask...

I'm counting the number of implementations in the public interface of 
the numarray package. There are four implementations of size 
(including the numarray array method, which is simply a synonym for 
nelements), but only one implementation of nelements.

When I started using numarray, the following was true:

* numarray had a function named size.
* numarray.ma had the same function
* numarray.ma arrays had method size
* All of these worked the same way:
   size(array, axis=None)
     size  returns the number of elements in an array or
     along the specified axis.

BUT numarray arrays had no method size. Instead there was a method 
nelements, which did the same thing as size, but had no "axis" 
argument.


This was very confusing, and I got tripped up badly because I was 
trying to count array elements and was using both "normal" numarray 
arrays and masked arrays. I filed PR 934514 and some kind soul 
patched the problem by making size a synonym for nelements.

There is a bit of residual mess because the new size does not have 
the axis argument. And then there's the historical clutter of two 
ways to do the same thing, but presumably one just lives with that. 
Though it seems a bit strange to me not to deprecate nelements and 
stop using it internally.
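
For reference, the size(array, axis=None) behaviour quoted above can be
sketched on nested lists. The helper names shape_of and size are
illustrative, and rectangular input is assumed.

```python
def shape_of(nested):
    """Return the shape of a rectangular nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0] if nested else None
    return tuple(shape)


def size(nested, axis=None):
    """Number of elements in total, or along one axis if axis is given."""
    shape = shape_of(nested)
    if axis is not None:
        return shape[axis]
    total = 1
    for extent in shape:
        total *= extent
    return total


a = [[1, 2, 3], [4, 5, 6]]
print(size(a), size(a, 0), size(a, 1))  # 6 2 3
```

A method without the axis argument corresponds to the first call only.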

-- Russell


From Fernando.Perez at colorado.edu  Fri Oct 22 14:50:04 2004
From: Fernando.Perez at colorado.edu (Fernando Perez)
Date: Fri Oct 22 14:50:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]> <1098479844.29804.260.camel@halloween.stsci.edu>
Message-ID: <41797FFE.8090802@colorado.edu>

Todd Miller wrote:

> sumAll() would certainly be better.
> 
> Unless there are objections,  I'll rename the current sum() method to
> sumAll() and re-write sum() to give a deprecation warning before calling
> sumAll().  Eventually,  it'll go away altogether.

silly, minor nit: can we avoid mixed case names? Either sum_all or SumAll? I'm 
not too fond of CamelCase, but camelCase looks even worse to me :)

As I said, it's just a minor nit.  I don't know if there's an official naming 
policy for numarray, so please don't get angry at me if my comment is out of 
place.

Best,

f


From Chris.Barker at noaa.gov  Fri Oct 22 15:12:01 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri Oct 22 15:12:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]> <1098479844.29804.260.camel@halloween.stsci.edu>
Message-ID: <4179853F.8040800@noaa.gov>

Todd Miller wrote:

> By default,  sum() now uses the maximum type of the type family of the
> array, so families Bool, Integer, UnsignedInteger, Float, or Complex 
> result in max types Bool, Int64, UInt64, Float64, Complex64.  I'm not
> sure why we segregated Bool and it looks like a mistake to me now.  I'm
> thinking the Bool "family" should just go away and be re-classified as
> UnsignedInteger.

Well, I think that the idea of a bool being different than an int is 
often useful. In this case, we want Bool to behave like an integer, so 
that we can use some version of sum() to add up all the true values. 
This is handy, but maybe we need more complete support for boolean 
arrays, rather than getting rid of them. For instance, there could be a 
NumTrue() function or method, for this case. I would probably maintain 
the easy conversion of a Bool array to an Int array, for when you really 
do need to do math with them.

We'd want a complete set, many of which already exist. A few off the top 
of my head:

sometrue
alltrue
numtrue

Maybe mirrors for false:
somefalse
allfalse
numfalse

What else would be needed? My vote would be for all of these to be 
methods of a Bool array, but I'm partial to methods over functions anyway.

On the other hand, Python itself subclasses bool from int, so 
maybe there's little point.
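
On plain Python sequences the six reductions listed above come down to a
few built-ins. The function names follow the list in this message; the
bodies are a sketch, not numarray code. The last line shows bool being a
subclass of int in Python.

```python
def sometrue(flags):
    """True if at least one element is true."""
    return any(flags)


def alltrue(flags):
    """True if every element is true."""
    return all(flags)


def numtrue(flags):
    """Count the elements that are true."""
    return sum(1 for f in flags if f)


def somefalse(flags):
    """True if at least one element is false."""
    return not all(flags)


def allfalse(flags):
    """True if every element is false."""
    return not any(flags)


def numfalse(flags):
    """Count the elements that are false."""
    return sum(1 for f in flags if not f)


flags = [x <= 1 for x in [0, 1, 2, 3]]   # [True, True, False, False]
print(sometrue(flags), alltrue(flags), numtrue(flags))     # True False 2
print(somefalse(flags), allfalse(flags), numfalse(flags))  # True False 2
print(issubclass(bool, int), True + True)                  # True 2
```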

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                     		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From aisaac at american.edu  Fri Oct 22 15:14:07 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Fri Oct 22 15:14:07 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098479844.29804.260.camel@halloween.stsci.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu><417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu><41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
 <1098479844.29804.260.camel@halloween.stsci.edu>
Message-ID: <Mahogany-0.66.0-1388-20041022-181444.00@american.edu>

On 22 Oct 2004, Todd Miller apparently wrote:
> sumAll() would certainly be better.

> Unless there are objections,  I'll rename the current sum() method to
> sumAll() and re-write sum() to give a deprecation warning before calling
> sumAll().  Eventually,  it'll go away altogether.

Just two thoughts from a new user.
i. I agree that .sumAll is better than the current name
confusion.
ii. even better, I propose, would be for .sum to take
an axis argument, with default matching the sum function,
and possible value axis="all".

For the transition, the axis argument can be required.
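
The proposal could be sketched as a single function on a rank-2 nested
list. The name sum_with_axis is hypothetical, the default axis=0 is
assumed to match the sum function, and only rank-2 input is handled.

```python
def sum_with_axis(rows, axis=0):
    """Sum a rectangular list of lists along axis 0, axis 1, or over everything."""
    if axis == "all":
        return sum(sum(row) for row in rows)
    if axis == 0:
        # Add the rows together, column by column.
        return [sum(column) for column in zip(*rows)]
    if axis == 1:
        # Add up each row separately.
        return [sum(row) for row in rows]
    raise ValueError("axis must be 0, 1 or 'all'")


a = [[1, 2, 3], [4, 5, 6]]
print(sum_with_axis(a))          # [5, 7, 9]
print(sum_with_axis(a, axis=1))  # [6, 15]
print(sum_with_axis(a, "all"))   # 21
```

One name then covers both behaviours, and the caller states which one is
wanted.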

fwiw,
Alan Isaac 


From rowen at u.washington.edu  Fri Oct 22 15:19:02 2004
From: rowen at u.washington.edu (Russell E Owen)
Date: Fri Oct 22 15:19:02 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098480955.11372.19.camel@freyer.sfo.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	
 <417809B9.5000108@noaa.gov>
 <1098392607.8249.20.camel@freyer.sfo.csun.edu>	
 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
 <1098480955.11372.19.camel@freyer.sfo.csun.edu>
Message-ID: <p06200517bd9f307eeee5@[128.95.99.44]>

At 2:35 PM -0700 2004-10-22, Stephen Walton wrote:
>On Fri, 2004-10-22 at 11:17, Russell E Owen wrote about the sum() Ufunc
>vs. the sum() method:
>
>>  Numarray is already confusing enough without identically
>>  named functions and methods that do different things
>
>When I went through the Numarray docs and made suggestions for
>improvements (see the list I posted at Sourceforge), I didn't make any
>comments about functional changes, only what the documentation said.
>Since the sum() method is documented using 1-D arrays, you can't tell
>that it in fact behaves differently than the sum() Ufunc.  On
>reflection, I also agree that the Ufuncs and methods should behave the
>same way.
>
>Why do you say 'numarray is confusing'?  What in the docs would help
>un-confuse it, in your view?

OK, since I seem to be in a grumpy mood today, here are some examples 
(probably nothing new here):
- I'll expose my ignorance, but I find the take stuff and fancy 
indexing nearly incomprehensible. I've tried to follow the examples 
(several times--i.e. every time I need to do something fancy), but 
generally I either flail around until I find something that works, or 
give up and write a C extension.

- I'd like to write C/C++ code that would work on multiple array 
types. This seems a natural use of C++ templates, but that doesn't 
seem to be "how it's done". I hate to think how the internal code is 
managing this without being a horrible spaghetti of code repeated 
for each array type.

The nd_image package is the closest I've come to finding source code 
that makes any sense to me in this area. But it uses so many 
custom-defined specialized functions that I figured it was just too 
much work to figure out w/out a manual (and risky to rely on these 
functions since they are internal to the package).

So I gave up and just support the one data type I really need now. 
Very disappointing.

- Important functions are sometimes buried in a non-obvious (to me) 
sub-package.

For example: try to find the location at which an array has a 
minimum value (if there's more than one such point, pick any). You'd 
think it'd be a standard numarray function, wouldn't you? After all, 
you can ask for the minimum value. Now try to find it.

Well, I started out by trying to figure out how to get argmin to do 
the job. Horrible.

Fortunately I finally found minimum_position buried in nd_image.

- Masked arrays are not integrated. Thus a lot of important filtering 
and stuff simply cannot be done on masked data without writing custom 
extensions. For instance I'd like to do a median-filter that ignores 
masked data (taking the median of non-masked data only).

- For 2-d images x and y are reversed. I know this isn't going to 
change, but it is a headache every time I have to write new image 
processing code.

- I keep wanting more support for dealing with arrays of indices, 
e.g. "give me all the indices for which this is true", then use that 
to process the data in an array. Numarray seems to do that kind of 
operation in an entirely different way, suggesting I'm not "with it" 
on the underlying philosophy. Unfortunately no really good examples 
come to mind at the moment (it's been awhile since I've created new 
code using numarray), though I was fairly well convinced that if I 
had enough support for this I could code an efficient radial profile 
function w/out using a C extension.

-- Russell


From perry at stsci.edu  Fri Oct 22 16:50:01 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Oct 22 16:50:01 2004
Subject: [Numpy-discussion] In case there are any questions about numarray...
In-Reply-To: <p06200517bd9f307eeee5@[128.95.99.44]>
Message-ID: <NEBBIJKBMLDBLNCEEFOCOECAFIAA.perry@stsci.edu>

Todd and I will be away most of next week at a conference
and will likely not have a chance to respond to questions about
numarray or continue the current discussions about the
proper numarray interface or improvements to the
documentation. 

Perry 


From aisaac at american.edu  Fri Oct 22 19:17:02 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Fri Oct 22 19:17:02 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <4179853F.8040800@noaa.gov>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]> <1098479844.29804.260.camel@halloween.stsci.edu><4179853F.8040800@noaa.gov>
Message-ID: <Mahogany-0.66.0-1768-20041022-221742.00@american.edu>

More new user feedback ...

On Fri, 22 Oct 2004, Chris Barker apparently wrote:
> Well, I think that the idea of a bool being different than
> an int is often useful.

Yes.  E.g., applications to directed graphs.


> we can use some version of sum() to add up all the
> true values.

Unclear, but given the existence of sometrue,
it seems natural enough to let sum treat a Bool as an
integer.  Products work naturally, of course.

> I would probably maintain
> the easy conversion of a Bool array to an Int array, for when you really
> do need to do math with them.

I would rephrase this.
Boolean arrays have a naturally different math,
which it would be nice to have supported.
It would also be nice to easily convert to Int,
when that representation captures the math needed.

> We'd want a complete set, many of which already exist. A few off the top
> of my head:
> sometrue
> alltrue
> numtrue       

I'd just let sum handle numtrue.

> Maybe mirrors for false:
> somefalse, allfalse, numfalse

I'd just rely on alltrue, sometrue, and (size less sum) for these.
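
The reductions above can be sketched in a few lines. This is a minimal
pure-Python illustration, assuming a plain list of bools stands in for a
Bool array; the variable names are made up for the example and are not
numarray functions.

```python
# Plain Python list standing in for a Bool array (an assumption for illustration).
flags = [True, False, True, True, False]

num_true = sum(flags)               # bools add up as 0 and 1, so sum counts the Trues
num_false = len(flags) - num_true   # "size less sum"
some_false = not all(flags)         # somefalse expressed through alltrue
all_false = not any(flags)          # allfalse expressed through sometrue

print(num_true, num_false, some_false, all_false)
```

Only sum, any (sometrue) and all (alltrue) are needed here, which is the
point being argued: the "false" mirrors come for free.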

fwiw,
Alan


From stephen.walton at csun.edu  Fri Oct 22 22:23:02 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Fri Oct 22 22:23:02 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <p06200517bd9f307eeee5@[128.95.99.44]>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
	 <1098480955.11372.19.camel@freyer.sfo.csun.edu>
	 <p06200517bd9f307eeee5@[128.95.99.44]>
Message-ID: <1098508579.3403.6.camel@localhost.localdomain>

I had no idea my innocent question would generate so much discussion.
Mindful that Perry and Todd are at ADASS in Pasadena next week:

On Fri, 2004-10-22 at 15:18 -0700, Russell E Owen wrote: 
> At 2:35 PM -0700 2004-10-22, Stephen Walton wrote:
> >
> >Why do you say 'numarray is confusing'?  What in the docs would help
> >un-confuse it, in your view?
> 
> - I'll expose my ignorance, but I find the take stuff and fancy 
> indexing nearly incomprehensible.

I agree.  It took me much experimentation to figure out exactly how it
worked.  I'd appreciate it very much if you would look at my suggested
rewrite of this section of the documentation at

http://sourceforge.net/tracker/index.php?func=detail&aid=1047889&group_id=1369&atid=101369

and give me any further thoughts for clarification (post them as
comments to the bug report itself).

> - I'd like to write C/C++ code that would work on multiple array 
> types.

I can't help much here, other than to say that C and C++ are pretty low
level languages, not well suited for this level of abstraction.

> - Important functions are sometimes buried in a non-obvious (to me) 
> sub-package.
> For example: try to find the location at which an array has a 
> minimum value

The current index to the documentation seems to include only the
function names but not concepts, which is a problem.  I myself was
trying to remember how to do type conversion;  there is no entry in the
index for 'conversion' or 'coercion' and I finally grepped my local copy
of the HTML files to re-find astype(). 

> - Masked arrays are not integrated.

I haven't tried these yet personally, but I agree that such a feature is
a very important one.  IRAF got partway along on this but didn't finish
it either.

Having said that, my workaround/technique for both MATLAB and numarray
is to simply put NaNs in the places where there is no valid data and do
something like

sum(sum(A(~isnan(A))))

This is MATLAB syntax of course.  Something similar in numarray would go
a long way to helping me.  For example, I have full disk solar images
and I'd like to be able to operate on just the sunspot pixels, or just
the sky pixels, in a straightforward way.
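
For comparison, here is a minimal pure-Python sketch of the same idea,
assuming nested lists stand in for a rank-2 image. The names nansum,
masked_sum and spots are hypothetical and chosen only for this example;
they are not numarray functions.

```python
import math

def nansum(a):
    """Sum every element of a rank-2 nested list, skipping NaN entries."""
    return sum(v for row in a for v in row if not math.isnan(v))

def masked_sum(a, mask):
    """Sum only the elements whose mask entry is True."""
    return sum(v for row, mrow in zip(a, mask)
               for v, m in zip(row, mrow) if m)

nan = float("nan")
A = [[1.0, nan],
     [2.5, 4.0]]
spots = [[True, False],
         [False, True]]   # hypothetical "sunspot pixel" mask

print(nansum(A))             # 7.5
print(masked_sum(A, spots))  # 5.0
```

The second function is the more general form: the NaN test is just one
way of producing the mask.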

> - For 2-d images x and y are reversed.

Are you referring to the fact that C and numarray are row major and
Fortran is column major?  Or to how images get displayed in the various
plot packages?

> - I keep wanting more support for dealing with arrays of indices, 
> e.g. "give me all the indices for which this is true", then use that 
> to process the data in an array. Numarray seems to do that kind of 
> operation in an entirely different way, suggesting I'm not "with it" 
> on the underlying philosophy.

There are two ways to do this, both of which work.  For example:

A=arange(25)
sum(A[A<=7])

will work just as you expect.  A bool array used as an index picks out
those values for which the bool is True.  Essentially identical syntax
now works in MATLAB too.  If you want an index array instead:

>>> index=where(A<7)
>>> A[index]

will do the trick.  For arrays of rank greater than 1:

>>> A=arange(25,shape=(5,5))
>>> where(A<7)
(array([0, 0, 0, 0, 0, 1, 1]), array([0, 1, 2, 3, 4, 0, 1]))

which is a tuple of two arrays that can be used to index A:

>>> ind1,ind2=where(A<7)
>>> A[ind1,ind2]
array([0, 1, 2, 3, 4, 5, 6])
>>> A[ind1,ind2]=[6,5,4,3,2,1,0]	# assignment works too
>>> A
array([[ 6,  5,  4,  3,  2],
       [ 1,  0,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

Does this help?

-- 
Stephen Walton <stephen.walton at csun.edu>
Physics & Astronomy CSUN


From verveer at embl-heidelberg.de  Sat Oct 23 04:14:04 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Sat Oct 23 04:14:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <p06200517bd9f307eeee5@[128.95.99.44]>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]> <1098480955.11372.19.camel@freyer.sfo.csun.edu> <p06200517bd9f307eeee5@[128.95.99.44]>
Message-ID: <9633F2FA-24E4-11D9-B9D4-000D932805AC@embl-heidelberg.de>

I thought I'd just give my point of view on this, since I do believe we 
should give these points some thought.

On Oct 23, 2004, at 12:18 AM, Russell E Owen wrote:

> OK, since I seem to be in a grumpy mood today, here are some examples 
> (probably nothing new here):
> - I'll expose my ignorance, but I find the take stuff and fancy 
> indexing nearly incomprehensible. I've tried to follow the examples 
> (several times--i.e. every time I need to do something fancy), but 
> generally I either flail around until I find something that works, or 
> give up and write a C extension.

I agree, it is very complicated. I always have trouble understanding 
what is going on when I use take and indexing. More documentation may 
help.

> - I'd like to write C/C++ code that would work on multiple array 
> types. This seems a natural use of C++ templates, but that doesn't 
> seem to be "how it's done". I hate to think how the internal code is 
> managing this without being a horrible spaghetti of code repeated for 
> each array type.

This is a good point. If you look at examples for implementing 
something in C, you always see that the code only handles a single data 
type, usually converting all input to double type. That is not always a 
good way to write an extension if you want it to be of generic use 
(e.g. the FFT module does not handle 32-bit floating point well, which 
is a problem for big arrays). Some support for writing functions that 
handle multiple data types would be good.

> The nd_image package is the closest I've come to finding source code 
> that makes any sense to me in this area. But it uses so many 
> custom-defined specialized functions that I figured it was just too 
> much work to figure out w/out a manual (and risky to rely on these 
> functions since they are internal to the package).

The internal nd_image C functions are indeed not exported and should 
not be used to implement extensions. That is going to stay that way 
since I do not plan to document these, and in any case, exposing such 
functions is not the purpose of the module.

On the other hand, some of the techniques used may be generally useful. 
I could try to factor some of the functions and macros out and write 
something up on how to use them to write extensions that handle 
multiple data types.

> So I gave up and just support the one data type I really need now. 
> Very disappointing.

Yes, it should be easier to do this, I agree. Using C macros as a 'poor 
man's' templating system is in fact not too complicated (although pretty 
ugly).

Another approach that I have tried to use in nd_image is to provide 
generic functions that take a python or a C function to implement 
functionality. For instance to implement an arbitrary filter function 
in nd_image  you only need to implement a function that calculates the 
filter at one point. You then call a generic filter function that does 
the heavy lifting of dealing with multiple array types,  iterating over 
the array, dealing with borders and such, applying the function at each 
array element. The filter function can be in python, but can also be a 
C function, communicated by a CObject.
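
The division of labour described above can be sketched as follows. This
is a minimal pure-Python illustration for the 1-D case only, assuming
constant-value padding at the borders and an odd window size;
generic_filter1d and local_mean are hypothetical names, not the actual
nd_image interface.

```python
def generic_filter1d(data, func, size=3, fill=0.0):
    """Apply func to every length-`size` neighbourhood of data.

    The generic part handles iteration and borders (padded with `fill`);
    func only has to compute the filter value at one point.
    """
    half = size // 2
    padded = [fill] * half + list(data) + [fill] * half
    return [func(padded[i:i + size]) for i in range(len(data))]

def local_mean(window):
    """User-supplied point function: mean of one neighbourhood."""
    return sum(window) / len(window)

signal = [3, 6, 9, 12]
print(generic_filter1d(signal, local_mean))  # [3.0, 6.0, 9.0, 7.0]
print(generic_filter1d(signal, max))         # [6, 9, 12, 12]
```

Any callable that maps a window to a single value can be plugged in, so
a new filter needs no iteration or border code of its own.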

Maybe some functions of this type could be provided with the numarray 
package. This could simplify writing extensions a lot. Would there be 
interest in a package of such functions? If there is, I could think 
about it a bit more, and propose (and implement) something in the form 
of an extension.

> - Important functions are sometimes buried in a non-obvious (to me) 
> sub-package.
>
> For example: try to find the location at which an array has a minimum 
> value (if there's more than one such point, pick any). You'd think 
> it'd be a standard numarray function, wouldn't you? After all, you can 
> ask for the minimum value. Now try to find it.

Agreed, this bothered me too.

> Well, I started out by trying to figure out how to get argmin to do 
> the job. Horrible.
>
> Fortunately I finally found minimum_position buried in nd_image.

It is there because numarray did not provide it... But it is also there 
because it offers much functionality that would not be appropriate for 
the main package. It is part of the object measurement functions. A 
simpler, possibly more efficient routine should maybe be part of the 
main package.
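
A simpler routine of that kind might look roughly like this. It is a
minimal pure-Python sketch, assuming a nested list stands in for a
rank-2 array: take the argmin of the flattened data and convert the flat
index back to (row, column). The names argmin_flat and min_position are
hypothetical, not numarray or nd_image functions.

```python
def argmin_flat(values):
    """Index of the first smallest element of a flat sequence."""
    return min(range(len(values)), key=values.__getitem__)

def min_position(a):
    """(row, col) of a smallest element of a rank-2 nested list."""
    ncols = len(a[0])
    flat = [v for row in a for v in row]     # row-major flattening
    return divmod(argmin_flat(flat), ncols)  # flat index -> (row, col)

A = [[4, 9, 7],
     [3, 1, 8]]
print(min_position(A))  # (1, 1)
```

If several elements share the minimum, this returns the first one in
row-major order, which satisfies the "pick any" requirement.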

> - Masked arrays are not integrated. Thus a lot of important filtering 
> and stuff simply cannot be done on masked data without writing custom 
> extensions. For instance I'd like to do a median-filter that ignores 
> masked data (taking the median of non-masked data only).

I agree very much! To be honest, I do not like the ma package much. I 
don't like the idea of having to use a separate package with a 
different array type that duplicates the functionality in the main 
package. I think it would be much better if all functions (where it 
makes sense) in numarray would accept an optional mask argument. To me 
it makes more sense to provide the mask with the operation, not as part 
of the array like in ma (a package like ma could still be layered on 
top.) I realize it would be a lot of work to make all numarray 
functions mask aware, but it is something to think about maybe.
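
As an illustration of passing the mask with the operation, here is a
minimal pure-Python sketch of the median filter Russell asked for,
restricted to 1-D data. It assumes that True in the mask means "valid"
and that windows are simply truncated at the borders;
masked_median_filter is a hypothetical name, not an existing numarray
or nd_image function.

```python
from statistics import median

def masked_median_filter(data, mask, size=3):
    """1-D median filter that ignores samples whose mask entry is False.

    Returns None where a window contains no valid samples.
    """
    half = size // 2
    out = []
    for i in range(len(data)):
        lo, hi = max(0, i - half), min(len(data), i + half + 1)
        window = [data[k] for k in range(lo, hi) if mask[k]]
        out.append(median(window) if window else None)
    return out

data = [1, 100, 3, 4, 5]
valid = [True, False, True, True, True]   # the outlier 100 is masked out
print(masked_median_filter(data, valid))  # [1, 2.0, 3.5, 4, 4.5]
```

The array itself stays an ordinary array; only the call carries the
mask.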

> - For 2-d images x and y are reversed. I know this isn't going to 
> change, but it is a headache every time I have to write new image 
> processing code.

This is not really a problem I think, but you have to get used to it. 
If you treat the last dimension always as X and the first as Y, you 
have the same layout in memory as is usual in most image processing 
software. So X corresponds to axis=1 and Y to axis=0. Or use axis=-1 
and axis=-2.
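
A small sketch of that convention, assuming a nested list stands in for
a rank-2 image stored in row-major order (the variable names are only
for illustration):

```python
# 3 rows (Y, first index) by 4 columns (X, last index).
image = [[10, 11, 12, 13],
         [20, 21, 22, 23],
         [30, 31, 32, 33]]

height, width = len(image), len(image[0])
x, y = 3, 1

pixel = image[y][x]      # index as [y][x]: Y first, X last
offset = y * width + x   # position of that pixel in row-major memory

print(pixel, offset)     # 23 7
```

Neighbouring X values are adjacent in memory, which is the layout most
image processing software expects.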

Cheers, Peter


From aisaac at american.edu  Sat Oct 23 12:01:04 2004
From: aisaac at american.edu (Alan G Isaac)
Date: Sat Oct 23 12:01:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <Mahogany-0.66.0-1388-20041022-181444.00@american.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu><417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu><41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]><1098479844.29804.260.camel@halloween.stsci.edu><Mahogany-0.66.0-1388-20041022-181444.00@american.edu>
Message-ID: <Mahogany-0.66.0-1768-20041023-150116.00@american.edu>

On Fri, 22 Oct 2004 Alan G Isaac apparently wrote:
> Just two thoughts from a new user.
> i. I agree that .sumAll is better than the current name
> confusion.
> ii. even better, I propose, would be for .sum to take
> an axis argument, with default matching the sum function,
> and possible value axis="all".
> For the transition, the axis argument can be required.


That should have been: axis=None
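
A minimal pure-Python sketch of the proposed semantics, assuming nested
lists stand in for a rank-2 array and that the default axis is 0 as in
the sum function; sum_axis is a hypothetical name, not a numarray
function.

```python
def sum_axis(a, axis=0):
    """Sum a rank-2 nested list along one axis, or over everything if axis is None."""
    if axis is None:
        return sum(sum(row) for row in a)      # total over all elements
    if axis == 0:
        return [sum(col) for col in zip(*a)]   # collapse the rows
    if axis == 1:
        return [sum(row) for row in a]         # collapse the columns
    raise ValueError("axis must be 0, 1 or None for a rank-2 array")

A = [[1, 2, 3],
     [4, 5, 6]]
print(sum_axis(A))             # [5, 7, 9]
print(sum_axis(A, axis=1))     # [6, 15]
print(sum_axis(A, axis=None))  # 21
```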

fwiw,
Alan Isaac


From stephen.walton at csun.edu  Sun Oct 24 19:22:03 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Sun Oct 24 19:22:03 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <41797FFE.8090802@colorado.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
	 <1098479844.29804.260.camel@halloween.stsci.edu>
	 <41797FFE.8090802@colorado.edu>
Message-ID: <1098670236.1907.21.camel@localhost.localdomain>

On Fri, 2004-10-22 at 14:47, Fernando Perez wrote:

> silly, minor nit: can we avoid mixed case names? Either sum_all or SumAll? I'm 
> not too fond of CamelCase, but camelCase looks even worse to me :)

I agree with Fernando about CamelCase (which among other things
seriously bites one when moving from case-sensitive to case-insensitive
OS's).  But I want to make a broader point:

I don't think we need sumall.  The methods and the functions should
simply work the same way.  If one wants sumall, use A.flat.sum() or, if
you can't use the methods or attributes on your old version of Python,
sum(ravel(A)).  If you start writing sumall, then you'll need meanall,
stdall, prodall, etc, etc.  The flat attribute and ravel function/method
already provide all the needed functionality.

Just trying to save Todd some work.

Steve


From verveer at embl-heidelberg.de  Mon Oct 25 01:37:05 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 01:37:05 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098670236.1907.21.camel@localhost.localdomain>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain>
Message-ID: <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de>

On 25 Oct 2004, at 04:17, Stephen Walton wrote:

> On Fri, 2004-10-22 at 14:47, Fernando Perez wrote:
>
>> silly, minor nit: can we avoid mixed case names? Either sum_all or 
>> SumAll? I'm
>> not too fond of CamelCase, but camelCase looks even worse to me :)
>
> I agree with Fernando about CamelCase (which among other things
> seriously bites one when moving from case-sensitive to case-insensitive
> OS's).  But I want to make a broader point:
>
> I don't think we need sumall.  The methods and the functions should
> simply work the same way.  If one wants sumall, use A.flat.sum() or, if
> you can't use the methods or attributes on your old version of Python,
> sum(ravel(A)).  If you start writing sumall, then you'll need meanall,
> stdall, prodall, etc, etc.  The flat attribute and ravel 
> function/method
> already provide all the needed functionality.

I think this may be inefficient, because ravel and flat may make a copy 
of the data. Also I think using flat/ravel in such a way is plain ugly 
and a complex way to do it.

But I do agree that it is not a good idea to introduce another set of 
names. In my opinion functions that calculate a statistic like sum 
should return the total in the first place, rather than over a single 
axis. But I guess it is too late to change that for sum, because of 
backward compatibility.

Cheers, Peter


From stephen.walton at csun.edu  Mon Oct 25 09:20:02 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Mon Oct 25 09:20:02 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
	 <1098479844.29804.260.camel@halloween.stsci.edu>
	 <41797FFE.8090802@colorado.edu>
	 <1098670236.1907.21.camel@localhost.localdomain>
	 <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
Message-ID: <1098721171.19183.12.camel@sunspot.csun.edu>

On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote:
> On 25 Oct 2004, at 04:17, Stephen Walton wrote:
> >
> > I don't think we need sumall.  The methods and the functions should
> > simply work the same way.  If one wants sumall, use A.flat.sum() or, if
> > you can't use the methods or attributes on your old version of Python,
> > sum(ravel(A)).
>
> I think this may be inefficient, because ravel and flat may make a copy 
> of the data. Also I think using flat/ravel in such a way is plain ugly 
> and a complex way to do it.

You may be right about the copying, I couldn't say.  I don't think
sum(ravel(A)) looks any worse than sum(sum(sum(A))) for a rank 3 array,
but ugly is in the eye of the beholder.

> In my opinion functions that calculate a statistic like sum 
> should return the total in the first place, rather than over a single 
> axis.

It depends on the data.  I use rank-2 arrays which are images and are
therefore homogeneous.  Even there, though, I often want the sum of all
rows or all columns.  For heterogeneous data (e.g., columns of different
Y's as a function of X), the present sum() makes sense.  In other words,
we will always need ways to sum over just one dimension and over all
dimensions.  By analogy with MATLAB (I'm guessing), sum() in Numeric and
numarray does a one-D sum.

-- 
Stephen Walton, Professor of Physics and Astronomy,
California State University, Northridge
stephen.walton at csun.edu


From tim.hochberg at cox.net  Mon Oct 25 09:32:01 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Mon Oct 25 09:32:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098721171.19183.12.camel@sunspot.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>	 <1098479844.29804.260.camel@halloween.stsci.edu>	 <41797FFE.8090802@colorado.edu>	 <1098670236.1907.21.camel@localhost.localdomain>	 <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu>
Message-ID: <417D2A3C.7010108@cox.net>

Stephen Walton wrote:

>On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote:
>
>>On 25 Oct 2004, at 04:17, Stephen Walton wrote:
>>
>>>I don't think we need sumall.  The methods and the functions should
>>>simply work the same way.  If one wants sumall, use A.flat.sum() or, if
>>>you can't use the methods or attributes on your old version of Python,
>>>sum(ravel(A)).
>>
>>I think this may be inefficient, because ravel and flat may make a copy 
>>of the data. Also I think using flat/ravel in such a way is plain ugly 
>>and a complex way to do it.
>
>You may be right about the copying, I couldn't say.  I don't think
>sum(ravel(A)) looks any worse than sum(sum(sum(A))) for a rank 3 array,
>but ugly is in the eye of the beholder.
>
I'm not sure how feasible it is, but I'd much rather have an efficient, 
non-copying, 1-D view of a noncontiguous array (from an enhanced 
version of flat or ravel or whatever) than a bunch of extra methods. The 
former allows all of the standard methods to just work efficiently using 
sum(ravel(A)) or sum(A.flat) [and max and min, etc]. Making special 
whole-array methods for everything just leads to method explosion.

-tim


>>In my opinion functions that calculate a statistic like sum 
>>should return the total in the first place, rather than over a single 
>>axis.
>
>It depends on the data.  I use rank-2 arrays which are images and are
>therefore homogeneous.  Even there, though, I often want the sum of all
>rows or all columns.  For heterogeneous data (e.g., columns of different
>Y's as a function of X), the present sum() makes sense.  In other words,
>we will always need ways to sum over just one dimension and over all
>dimensions.  By analogy with MATLAB (I'm guessing), sum() in Numeric and
>numarray does a one-D sum.


From stephen.walton at csun.edu  Mon Oct 25 09:35:06 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Mon Oct 25 09:35:06 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098721171.19183.12.camel@sunspot.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>
	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>
	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>
	 <1098479844.29804.260.camel@halloween.stsci.edu>
	 <41797FFE.8090802@colorado.edu>
	 <1098670236.1907.21.camel@localhost.localdomain>
	 <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
	 <1098721171.19183.12.camel@sunspot.csun.edu>
Message-ID: <1098722079.19183.22.camel@sunspot.csun.edu>

On Mon, 2004-10-25 at 09:19 -0700, Stephen Walton wrote:
> On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote:
>
> > I think this may be inefficient, because ravel and flat may make a copy 
> > of the data. Also I think using flat/ravel in such a way is plain ugly 
> > and a complex way to do it.
> 
> You may be right about the copying, I couldn't say. 

I just looked at the source (numeric-1.1/Lib/generic.py).  The comment
to the ravel() function states that it returns a view, not a copy;  but
it calls reshape() which does make a copy if the input array is not
contiguous.  I just tested this:

A=arange(25,shape=(5,5))
A.transpose()		# now A is not contiguous
v=ravel(A)
A[2,2]=-17
v			# verifies that v did not change.

So, in the above, it does look like ravel() made a copy, and your fears
about inefficiency are warranted.  Another test shows that changing
ravel(A) to A.flat above also results in a copy.  Mayhaps we need
sumall() after all.
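
One way to see why the transposed array cannot be flattened without a
copy is to look at the memory offsets a row-major walk would visit. This
is a minimal pure-Python sketch, assuming a conventional strided layout
with strides counted in elements; it is not taken from the numarray
source, and element_offsets and has_flat_view are hypothetical names.

```python
def element_offsets(shape, strides):
    """Memory offsets (in elements) visited by a row-major walk over a rank-2 array."""
    rows, cols = shape
    return [i * strides[0] + j * strides[1]
            for i in range(rows) for j in range(cols)]

def has_flat_view(shape, strides):
    """True if the walk advances by one constant step, so a 1-D strided view can describe it."""
    offsets = element_offsets(shape, strides)
    steps = {b - a for a, b in zip(offsets, offsets[1:])}
    return len(steps) <= 1

print(has_flat_view((5, 5), (5, 1)))  # contiguous 5x5: True
print(has_flat_view((5, 5), (1, 5)))  # transposed 5x5: False
```

For the transposed case the walk steps by 5 within a row and jumps back
by 19 at each row end, so no single stride describes it; a 1-D result
then needs either a copy or an iterator-style object rather than a plain
strided view.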

-- 
Stephen Walton, Professor of Physics and Astronomy,
California State University, Northridge
stephen.walton at csun.edu


From verveer at embl-heidelberg.de  Mon Oct 25 09:44:04 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 09:44:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098721171.19183.12.camel@sunspot.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu>
Message-ID: <0BC8D972-26A5-11D9-9F77-000A95C92C8E@embl-heidelberg.de>

On 25 Oct 2004, at 18:19, Stephen Walton wrote:

> On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote:
>> On 25 Oct 2004, at 04:17, Stephen Walton wrote:
>>>
>>> I don't think we need sumall.  The methods and the functions should
>>> simply work the same way.  If one wants sumall, use A.flat.sum() or, 
>>> if
>>> you can't use the methods or attributes on your old version of 
>>> Python,
>>> sum(ravel(A)).
>>
>> I think this may be inefficient, because ravel and flat may make a 
>> copy
>> of the data. Also I think using flat/ravel in such a way is plain ugly
>> and a complex way to do it.
>
> You may be right about the copying, I couldn't say.  I don't think
> sum(ravel(A)) looks any worse than sum(sum(sum(A))) for a rank 3 array,
> but ugly is in the eye of the beholder.

It does not look worse, I agree with that! But I would argue it should 
have been sum(A) in the first place to sum over all axes... The sumall 
would not have been needed, and summing over one axis (or a subset of 
axes) could have been implemented as an optional argument to sum().
>
>> In my opinion functions that calculate a statistic like sum
>> should return the total in the first place, rather than over a single
>> axis.
>
> It depends on the data.  I use rank-2 arrays which are images and are
> therefore homogeneous.  Even there, though, I often want the sum of all
> rows or all columns.  For heterogeneous data (e.g., columns of 
> different
> Y's as a function of X), the present sum() makes sense.  In other 
> words,
> we will always need ways to sum over just one dimension and over all
> dimensions.  By analogy with MATLAB (I'm guessing), sum() in Numeric 
> and
> numarray does a one-D sum.

I agree it is a useful feature, and it should still be possible to do 
that using an optional axis argument. Even better, I would love to be 
able to sum over several axes in one go; I find the one-dimensional 
character of reduce limiting. But I digress. In any case, I suppose we 
will stick with the current behaviour for backwards compatibility.

Cheers, Peter


From verveer at embl-heidelberg.de  Mon Oct 25 09:47:01 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 09:47:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098722079.19183.22.camel@sunspot.csun.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu> <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]> <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu> <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <1098722079.19183.22.camel@sunspot.csun.edu>
Message-ID: <60595242-26A5-11D9-9F77-000A95C92C8E@embl-heidelberg.de>

On 25 Oct 2004, at 18:34, Stephen Walton wrote:

> On Mon, 2004-10-25 at 09:19 -0700, Stephen Walton wrote:
>> On Mon, 2004-10-25 at 10:26 +0200, Peter Verveer wrote:
>>
>>> I think this may be inefficient, because ravel and flat may make a 
>>> copy
>>> of the data. Also I think using flat/ravel in such a way is plain 
>>> ugly
>>> and a complex way to do it.
>>
>> You may be right about the copying, I couldn't say.
>
> I just looked at the source (numeric-1.1/Lib/generic.py).  The comment
> to the ravel() function states that it returns a view, not a copy;  but
> it calls reshape() which does make a copy if the input array is not
> contiguous.  I just tested this:
>
> A=arange(25,shape=(5,5))
> A.transpose()		# now A is not contiguous
> v=ravel(A)
> A[2,2]=-17
> v			# verifies that v did not change.
>
> So, in the above, it does look like ravel() made a copy, and your fears
> about inefficiency are warranted.  Another test shows that changing
> ravel(A) to A.flat above also results in a copy.  Mayhaps we need
> sumall() after all.

Yes, we do I guess, but I do not like such things creeping into an 
otherwise elegant package if I may be frank...

Peter


From strang at nmr.mgh.harvard.edu  Mon Oct 25 09:53:00 2004
From: strang at nmr.mgh.harvard.edu (Gary Strangman)
Date: Mon Oct 25 09:53:00 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <417D2A3C.7010108@cox.net>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov>
 <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov>
  <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu>
  <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain>
  <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
 <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net>
Message-ID: <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu>

> I'm not sure how feasible it is, but I'd much rather an efficient, 
> non-copying, 1-D view of a noncontiguous array (from an enhanced version of 
> flat or ravel or whatever) than a bunch of extra methods. The former allows 
> all of the standard methods to just work efficiently using sum(ravel(A)) or 
> sum(A.flat) [ and max and min, etc]. Making special whole array methods for 
> everything just leads to method explosion.

I completely agree with this ... an efficient flat/ravel would seem to 
solve many of the issues being raised. Forgive the potentially naive 
question here, but is there any reason such an efficient, enhanced view 
can't be implemented for the .flat method? I like the concept of .flat, 
but I regularly call functions with arguments that may-or-may-not be 
contiguous. For robustness, such functions _must_ be coded with ravel() 
because .flat fails for non-contiguous arrays. I never fully understood 
why there were two ways of "flattening" in the first place.

Gary

--------------------------------------------------------------
Gary Strangman, PhD        |  Director, Neural Systems Group
Office: 617-724-0662       |  Massachusetts General Hospital
Fax:    617-726-4078       |  149 13th Street, Ste 10018
                            |  Charlestown, MA  02129


From verveer at embl-heidelberg.de  Mon Oct 25 10:09:05 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 10:09:05 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov> <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu>
Message-ID: <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de>

On 25 Oct 2004, at 18:51, Gary Strangman wrote:

>
>> I'm not sure how feasible it is, but I'd much rather an efficient, 
>> non-copying, 1-D view of a noncontiguous array (from an enhanced 
>> version of flat or ravel or whatever) than a bunch of extra methods. 
>> The former allows all of the standard methods to just work 
>> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min, 
>> etc]. Making special whole array methods for everything just leads to 
>> method explosion.
>
> I completely agree with this ... an efficient flat/ravel would seem to 
> solve many of the issues being raised. Forgive the potentially naive 
> question here, but is there any reason such an efficient, enhanced 
> view can't be implemented for the .flat method?

I believe it is not possible without copying data. The strides between 
elements of a noncontiguous array are not always the same, so you 
cannot efficiently view it as a 1D array.
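
As a rough illustration of the stride argument, here is a pure-Python 
sketch (not numarray code; the helper names element_offsets and steps are 
made up for this example, and strides are counted in items rather than 
bytes). It lists the buffer offset of each element in logical order and 
looks at the step between consecutive elements:

```python
from itertools import product

def element_offsets(shape, strides):
    """Yield the buffer offset of each element, visiting indices in row-major order."""
    for index in product(*(range(n) for n in shape)):
        yield sum(i * s for i, s in zip(index, strides))

def steps(offsets):
    """Return the set of distinct steps between consecutive offsets."""
    return {b - a for a, b in zip(offsets, offsets[1:])}

# A contiguous 3x4 array has strides (4, 1): one row is 4 items long.
contiguous = list(element_offsets((3, 4), (4, 1)))
# Its transpose is a 4x3 view of the same buffer with the strides swapped.
transposed = list(element_offsets((4, 3), (1, 4)))

print(steps(contiguous))   # a single step, so a 1-D view with that stride works
print(steps(transposed))   # two different steps, so no single stride fits
```

The transposed case needs two different steps, which is why a plain 
1-D view (one offset, one stride) cannot describe it.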

>  I like the concept of .flat, but I regularly call functions with 
> arguments that may-or-may-not be contiguous. For robustness, such 
> functions _must_ be coded with ravel() because .flat fails for 
> non-contiguous arrays.

Functions should be coded in the first place to take multi-dimensional 
nature into account in my opinion. One of the points of numarray is 
that it is multi-dimensional. If a function can work over multiple 
dimensions, but it only works for 1D arrays, it is broken in my 
opinion. In my opinion sum() _is_ broken, and introducing a separate 
sum_all() is an ugly hack.

> I never fully understood why there were two ways of "flattening" in 
> the first place.

I suppose it is for efficiency reasons: flat may not always work, but 
if it does, it is efficient since it would not need to copy any data.

Peter


From Chris.Barker at noaa.gov  Mon Oct 25 10:10:20 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Mon Oct 25 10:10:20 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1098508579.3403.6.camel@localhost.localdomain>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>	 <1098480955.11372.19.camel@freyer.sfo.csun.edu>	 <p06200517bd9f307eeee5@[128.95.99.44]> <1098508579.3403.6.camel@localhost.localdomain>
Message-ID: <417D3309.9070302@noaa.gov>

A few comments on a number of posts in this thread:

Stephen Walton wrote:
>>- I'd like to write C/C++ code that would work on multiple array 
>>types.
> 
> I can't help much here, other than to say that C and C++ are pretty low
> level languages, not well suited for this level of abstraction.

Well, this is certainly true for C, but not so much for C++. I'm no 
expert, but C++ templates could be very handy here. When the numarray 
project was just getting started, there was some discussion about using 
a template-based array package as the base, perhaps Blitz++. I still 
think this was a great idea, but I think the biggest issue at the time 
was that templates were still not consistently well supported by the wide 
variety of compilers that numarray should work with. Personally I think 
that anything supported by gcc should be fine, as anyone can use gcc on 
virtually any platform, if they want.

Anyway, it's too late to re-write numarray, but maybe a numarray <--> 
blitz++ conversion package would make it easy to write numarray 
extensions with blitz++. Perhaps even integrate it with Boost.Python. 
Another option would be to write a template-based wrapper around the 
existing Numarray objects.

By the way, my other issue with extensions is the difficulty of writing 
extensions that support discontinuous arrays, in addition to multiple 
data types. It seems someone smarter than me could use C++ classes to 
solve this one as well.

Peter Verveer wrote:

> But I do agree that it is not a good idea to introduce another set of 
> names. In my opinion functions that calculate a statistic like sum 
> should return the total in the first place, rather than over a single 
> axis.

Absolutely not! I'm far more likely to want it over a single axis, it's 
the core of "vectorizing" your code. If the data all mean the same 
thing, why aren't you storing it in a 1-d array? That being said, it 
should be easy to do various reductions over all axes, which I think 
.flat() does nicely. I thought .flat() never made a copy: am I wrong?

Stephen Walton wrote:
> It depends on the data.  I use rank-2 arrays which are images and are
> therefore homogeneous.

OK, good example.... I take back some of what I said above!

> By analogy with MATLAB (I'm guessing), sum() in Numeric and
> numarray does a one-D sum.

Except MATLAB does it worse. If your 2-d array happens to have only one 
row, you get the sum over that... yecch!

Tim Hochberg wrote:
> I'm not sure how feasible it is, but I'd much rather an efficient, 
> non-copying, 1-D view of a noncontiguous array (from an enhanced 
> version of flat or ravel or whatever) than a bunch of extra methods. The 
> former allows all of the standard methods to just work efficiently using 
> sum(ravel(A)) or sum(A.flat) [ and max and min, etc]. Making special 
> whole array methods for everything just leads to method explosion.

Hear! Hear! I thought that was exactly what .flat() was for. Shows what 
I know!

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                     		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From rowen at u.washington.edu  Mon Oct 25 10:33:02 2004
From: rowen at u.washington.edu (Russell E Owen)
Date: Mon Oct 25 10:33:02 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu> 
 <417809B9.5000108@noaa.gov>
 <1098392607.8249.20.camel@freyer.sfo.csun.edu> 
 <41794B47.4090909@noaa.gov> <p06200510bd9eff3962b0@[128.95.99.44]> 
 <1098479844.29804.260.camel@halloween.stsci.edu>
 <41797FFE.8090802@colorado.edu> 
 <1098670236.1907.21.camel@localhost.localdomain>
 <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
 <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net>
 <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu>
 <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
Message-ID: <p06200505bda2e56656f4@[128.95.99.44]>

At 7:08 PM +0200 2004-10-25, Peter Verveer wrote:
>On 25 Oct 2004, at 18:51, Gary Strangman wrote:
>
>>
>>>  I'm not sure how feasible it is, but I'd much rather an 
>>>efficient, non-copying, 1-D view of a noncontiguous array (from 
>>>an enhanced version of flat or ravel or whatever) than a bunch of 
>>>extra methods. The former allows all of the standard methods to 
>>>just work efficiently using sum(ravel(A)) or sum(A.flat) [ and max 
>>>and min, etc]. Making special whole array methods for everything 
>>>just leads to method explosion.
>>
>>  I completely agree with this ... an efficient flat/ravel would 
>>seem to solve many of the issues being raised. Forgive the 
>>potentially naive question here, but is there any reason such an 
>>efficient, enhanced view can't be implemented for the .flat method?
>
>I believe it is not possible without copying data. The strides 
>between elements of a noncontiguous array are not always the same, 
>so you cannot efficiently view it as a 1D array.

How about providing an iterator that counts through all the elements 
of an array (e.g. arr.itervalues()). So long as C extensions could 
efficiently make use of such an iterator, I think it'd do the job.

One could also imagine:
- arr.iteritems(), which returned (index, value) for each item
- a mask argument: a boolean array the same shape as the data array; 
True means elide the corresponding value from the data array
- general support for indexing
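
A rough pure-Python sketch of what such iterators might look like is 
below. It is only an illustration: the array is modelled as a flat list 
plus a shape and item strides, the mask is a nested list, and the 
function signatures are assumptions rather than anything numarray 
provides.

```python
from itertools import product

def iteritems(buffer, shape, strides, mask=None):
    """Yield (index, value) in row-major order, skipping indices where mask is True."""
    for index in product(*(range(n) for n in shape)):
        if mask is not None:
            flag = mask
            for i in index:          # walk the nested-list mask down to one bool
                flag = flag[i]
            if flag:
                continue             # True means elide this element
        yield index, buffer[sum(i * s for i, s in zip(index, strides))]

def itervalues(buffer, shape, strides, mask=None):
    """Yield only the values produced by iteritems()."""
    for _, value in iteritems(buffer, shape, strides, mask):
        yield value

buffer = list(range(6))                      # storage of a 2x3 array
shape, strides = (3, 2), (1, 3)              # its transposed, noncontiguous view
mask = [[False, True], [False, False], [True, False]]

all_values = list(itervalues(buffer, shape, strides))
kept_values = list(itervalues(buffer, shape, strides, mask))
total = sum(itervalues(buffer, shape, strides))   # no flattened copy needed
```

The iteration itself never copies the data, but every element costs a 
Python-level step, so whether this is fast enough depends on how much 
work is done per element.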

More generally, I agree that sum should work the same as a function 
and a method, and that an extra axis argument could be a good thing 
(it is so common elsewhere, e.g. size). I'd be tempted to break 
backwards compatibility to fix this, since numarray is still new and 
the current situation is very confusing.

-- Russell


From strang at nmr.mgh.harvard.edu  Mon Oct 25 10:38:01 2004
From: strang at nmr.mgh.harvard.edu (Gary Strangman)
Date: Mon Oct 25 10:38:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov>
 <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov>
 <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu>
 <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain>
 <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
 <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net>
 <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu>
 <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
Message-ID: <Pine.LNX.4.60.0410251322280.27302@gate.nmr.mgh.harvard.edu>

>> I completely agree with this ... an efficient flat/ravel would seem to 
>> solve many of the issues being raised. Forgive the potentially naive 
>> question here, but is there any reason such an efficient, enhanced view 
>> can't be implemented for the .flat method?
>
> I believe it is not possible without copying data. The strides between 
> elements of a noncontiguous array are not always the same, so you cannot 
> efficiently view it as a 1D array.

And it gets even worse for different-stride slices of N-D arrays (though 
I'm not yet ready to say it's impossible to do without copying). Maybe 
it's just me, but it does seem somewhat non-pythonic for a function/method 
to break for an inefficient case, instead of dropping back to less 
efficient (i.e., copying) behavior.
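
To make the "view when possible, copy otherwise" idea concrete, here is 
a sketch using only the standard library (array and memoryview stand in 
for the array buffer; flat_stride and flatten are hypothetical names, 
strides are counted in items, and negative strides are simply treated 
as needing a copy):

```python
from array import array
from itertools import product

def flat_stride(shape, strides):
    """Return the one stride that walks every element in row-major order, or None."""
    dims = [(n, s) for n, s in zip(shape, strides) if n != 1]
    for (_, s_outer), (n_inner, s_inner) in zip(dims, dims[1:]):
        if s_outer != n_inner * s_inner:
            return None
    return dims[-1][1] if dims else 1

def flatten(buf, shape, strides):
    """Return (flat, is_view) for a strided N-D window onto the 1-D memoryview buf."""
    size = 1
    for n in shape:
        size *= n
    step = flat_stride(shape, strides)
    if step is not None and step > 0:
        return buf[0:size * step:step], True      # cheap: shares memory with buf
    offsets = (sum(i * s for i, s in zip(index, strides))
               for index in product(*(range(n) for n in shape)))
    return memoryview(array(buf.format, (buf[o] for o in offsets))), False

storage = array('i', range(6))                    # 2x3 array, row-major
buf = memoryview(storage)
plain, plain_is_view = flatten(buf, (2, 3), (3, 1))   # contiguous case
trans, trans_is_view = flatten(buf, (3, 2), (1, 3))   # transposed case
storage[4] = -17                                  # visible only through the view
```

The caller gets a flat sequence either way; only the is_view flag says 
whether later changes to the original will show through.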

> Functions should be coded in the first place to take multi-dimensional nature 
> into account in my opinion. One of the points of numarray is that it is 
> multi-dimensional. If a function can work over multiple dimensions, but it 
> only works for 1D arrays, it is broken in my opinion. In my opinion sum() 
> _is_ broken, and introducing a separate sum_all() is an ugly hack.

+1. ;-) Hence the thought to make flattening a single "enhanced" 
method/fcn ... to essentially eliminate the need for such ugly hacks. 
Typically, my functions accept N-D arguments, and can operate over a 
user-selected subset of these dimensions. I may pass a whole array, or 
every other column, or whatever. Judging from the history of this thread, 
I think a .flat that is as-efficient-as-possible and also robust to all 
forms of non-contiguity would benefit many, while also reducing the 
learning-curve issues associated with .flat vs ravel().

As for where/when/how to introduce .newandimprovedflat, welllllll, that's 
for another thread. ;-)

Gary

--------------------------------------------------------------
Gary Strangman, PhD        |  Director, Neural Systems Group
Office: 617-724-0662       |  Massachusetts General Hospital
Fax:    617-726-4078       |  149 13th Street, Ste 10018
                            |  Charlestown, MA  02129


From verveer at embl-heidelberg.de  Mon Oct 25 10:42:03 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 10:42:03 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <417D3309.9070302@noaa.gov>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>	 <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>	 <41794B47.4090909@noaa.gov>  <p06200510bd9eff3962b0@[128.95.99.44]>	 <1098480955.11372.19.camel@freyer.sfo.csun.edu>	 <p06200517bd9f307eeee5@[128.95.99.44]> <1098508579.3403.6.camel@localhost.localdomain> <417D3309.9070302@noaa.gov>
Message-ID: <1A9085AC-26AD-11D9-9F77-000A95C92C8E@embl-heidelberg.de>

> Stephen Walton wrote:
>>> - I'd like to write C/C++ code that would work on multiple array 
>>> types.
>> I can't help much here, other than to say that C and C++ are pretty 
>> low
>> level languages, not well suited for this level of abstraction.
>
> Well, this is certainly true for C, but not so much for C++. I'm no 
> expert, but C++ templates could be very handy here. When the numarray 
> project was just getting started, there was some discussion about 
> using a template-based array package as the base, perhaps Blitz++. I 
> still think this was a great idea, but I think the biggest issue at the 
> time was that templates were still not consistently well supported by 
> the wide variety of compilers that numarray should work with. 
> Personally I think that anything supported by gcc should be fine, as 
> anyone can use gcc on virtually any platform, if they want.

I think having the option of using C++ would be cool. But as soon as we 
would 'require' it, I would not develop for numarray anymore. C++ is a 
big pain in my opinion, although I do agree that a well written 
templating system like Blitz++ is nice if you actually use C++.

> Anyway, it's too late to re-write numarray, but maybe a numarray <--> 
> blitz++ conversion package would make it easy to write numarray 
> extensions with blitz++. Perhaps even integrate it with Boost.Python. 
> Another option would be to write a template-based wrapper around the 
> existing Numarray objects.

yes, it would be nice to have the option. There is no reason why there 
could not be a C++ API which would include the use of templates layered 
on top of the current C API for those people that would like to use it.

> By the way, my other issue with extensions is the difficulty of 
> writing extensions that support discontinuous arrays, in addition to 
> multiple data types. It seems someone smarter than me could use C++ 
> classes to solve this one as well.

I had to deal with that problem too in nd_image. It is doable, albeit 
ugly if you depend on plain C. Probably C++ could do it differently and 
more nicely; Blitz++ possibly does. Again, not for me.

> Peter Verveer wrote:
>
>> But I do agree that it is not a good idea to introduce another set of 
>> names. In my opinion functions that calculate a statistic like sum 
>> should return the total in the first place, rather than over a single 
>> axis.
>
> Absolutely not! I'm far more likely to want it over a single axis, 
> it's the core of "vectorizing" your code. If the data all mean the 
> same thing, why aren't you storing it in a 1-d array?

I agree that it is important, I am just saying that both are very 
common operations. Why not support operations over an axis by an 
optional argument? You will often have to specify which axis you want 
anyway.

> That being said, it should be easy to do various reductions over all 
> axes, which I think .flat() does nicely. I thought .flat() never made 
> a copy: am I wrong?

Unfortunately, flattening an array is not always possible without 
copying, due to the fact that arrays may be not contiguous in memory.

> Tim Hochberg wrote:
>> I'm not sure how feasible it is, but I'd much rather an efficient, 
>> non-copying, 1-D view of a noncontiguous array (from an enhanced 
>> version of flat or ravel or whatever) than a bunch of extra methods. 
>> The former allows all of the standard methods to just work 
>> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min, 
>> etc]. Making special whole array methods for everything just leads to 
>> method explosion.
>
> Hear! Hear! I thought that was exactly what .flat() was for. Shows 
> what I know!

It is however not feasible I think to do it efficiently. It seems to me 
that a set of functions is necessary to do things like sum, minimum and 
so on, that work on the whole array. I would also prefer they are not 
methods. Introducing a whole array of sum_all() like functions is also 
not great.

Cheers, Peter


From verveer at embl-heidelberg.de  Mon Oct 25 11:04:01 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 11:04:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <p06200505bda2e56656f4@[128.95.99.44]>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov> <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <p06200505bda2e56656f4@[128.95.99.44]>
Message-ID: <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de>

On 25 Oct 2004, at 19:32, Russell E Owen wrote:

> At 7:08 PM +0200 2004-10-25, Peter Verveer wrote:
>> On 25 Oct 2004, at 18:51, Gary Strangman wrote:
>>
>>>
>>>>  I'm not sure how feasible it is, but I'd much rather an efficient, 
>>>> non-copying, 1-D view of a noncontiguous array (from an enhanced 
>>>> version of flat or ravel or whatever) than a bunch of extra 
>>>> methods. The former allows all of the standard methods to just work 
>>>> efficiently using sum(ravel(A)) or sum(A.flat) [ and max and min, 
>>>> etc]. Making special whole array methods for everything just leads 
>>>> to method explosion.
>>>
>>>  I completely agree with this ... an efficient flat/ravel would seem 
>>> to solve many of the issues being raised. Forgive the potentially 
>>> naive question here, but is there any reason such an efficient, 
>>> enhanced view can't be implemented for the .flat method?
>>
>> I believe it is not possible without copying data. The strides 
>> between elements of a noncontiguous array are not always the same, so 
>> you cannot efficiently view it as a 1D array.
>
> How about providing an iterator that counts through all the elements 
> of an array (e.g. arr.itervalues()). So long as C extensions could 
> efficiently make use of such an iterator, I think it'd do the job.

It would still be slower, because you would need a function call at 
each element that returns a value. Not a problem if you do a lot of 
work at each element, but if you are just adding values you want a 
custom written C function. You can do it at the C level with macros or 
so, (I do that in nd_image) but that would not help at the python 
level.

> One could also imagine:
> - arr.iteritems(), which returned (index, value) for each item
> - a mask argument: a boolean array the same shape as the data array; 
> True means elide the corresponding value from the data array
> - general support for indexing

Essentially you are suggesting to expose iterators at the python level 
that iterate over an array in some predefined way. That is possible, 
but I doubt it will be efficient.

At the C level however, it might be worth thinking about as a way of 
easing writing functions in C. I proposed to do it the other way around 
in an earlier mail: providing a set of generic functions that take a 
python or a C function to be applied at each element. I most likely 
will implement something in that direction, but I should give your idea 
also some thought.

> More generally, I agree that sum should work the same as a function 
> and a method, and that an extra axis argument could be a good thing 
> (it is so common elsewhere, e.g. size). I'd be tempted to break 
> backwards compatibility to fix this, since numarray is still new and 
> the current situation is very confusing.

I would absolutely vote for such a change. Simply because we would like 
a range of such functions, e.g. minimum, maximum, and so on. Even if we 
have to leave sum() as it is, I think we should have the alternatives, 
we would just have to come up with an alternative name for sum(). In 
fact I would consider volunteering implementing these functions.

Peter


From tim.hochberg at cox.net  Mon Oct 25 14:03:03 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Mon Oct 25 14:03:03 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov> <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <p06200505bda2e56656f4@[128.95.99.44]> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de>
Message-ID: <417D69CD.7070604@cox.net>

Peter Verveer wrote:

>
> On 25 Oct 2004, at 19:32, Russell E Owen wrote:
>
>> At 7:08 PM +0200 2004-10-25, Peter Verveer wrote:
>>
>>> On 25 Oct 2004, at 18:51, Gary Strangman wrote:
>>>
>>>>
>>>>>  I'm not sure how feasible it is, but I'd much rather an 
>>>>> efficient, non-copying, 1-D view of a noncontiguous array (from 
>>>>> an enhanced version of flat or ravel or whatever) than a bunch of 
>>>>> extra methods. The former allows all of the standard methods to 
>>>>> just work efficiently using sum(ravel(A)) or sum(A.flat) [ and max 
>>>>> and min, etc]. Making special whole array methods for everything 
>>>>> just leads to method explosion.
>>>>
>>>>
>>>>  I completely agree with this ... an efficient flat/ravel would 
>>>> seem to solve many of the issues being raised. Forgive the 
>>>> potentially naive question here, but is there any reason such an 
>>>> efficient, enhanced view can't be implemented for the .flat method?
>>>
>>>
>>> I believe it is not possible without copying data. The strides 
>>> between elements of a noncontiguous array are not always the same, 
>>> so you cannot efficiently view it as a 1D array.
>>
>>
>> How about providing an iterator that counts through all the elements 
>> of an array (e.g. arr.itervalues()). So long as C extensions could 
>> efficiently make use of such an iterator, I think it'd do the job.
>
>
> It would still be slower, because you would need a function call at 
> each element that returns a value. Not a problem if you do a lot of 
> work at each element, but if you are just adding values you want a 
> custom written C function. You can do it at the C level with macros or 
> so, (I do that in nd_image) but that would not help at the python level.
>
>> One could also imagine:
>> - arr.iteritems(), which returned (index, value) for each item
>> - a mask argument: a boolean array the same shape as the data array; 
>> True means elide the corresponding value from the data array
>> - general support for indexing
>
>
> Essentially you are suggesting to expose iterators at the python level 
> that iterate over an array in some predefined way. That is possible, 
> but I doubt it will be efficient.
>
> At the C level however, it might be worth thinking about as a way of 
> easing writing functions in C. I proposed to do it the other way 
> around in an earlier mail: providing a set of generic functions that 
> take a python or a C function to be applied at each element. I most 
> likely will implement something in that direction, but I should give 
> your idea also some thought.
>
>> More generally, I agree that sum should work the same as a function 
>> and a method, and that an extra axis argument could be a good thing 
>> (it is so common elsewhere, e.g. size). I'd be tempted to break 
>> backwards compatibility to fix this, since numarray is still new and 
>> the current situation is very confusing.
>
>
> I would absolutely vote for such a change. Simply because we would 
> like a range of such functions, e.g. minimum, maximum, and so on. Even 
> if we have to leave sum() as it is, I think we should have the 
> alternatives, we would just have to come up with an alternative name 
> for sum(). In fact I would consider volunteering implementing these 
> functions.

Why the need to break backwards compatibility? If one is going to 
reimplement sum, et al so as to operate on an arbitrary set of axes 
there's no reason one couldn't maintain the current behaviour as the 
default. All that is required is to allow axis to be a number (current 
behaviour), a tuple (reduce across the designated axes) or some special 
value to sum over all (None?, "all"?).
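
Something along these lines, as a pure-Python sketch only (the array is 
modelled as a flat row-major list plus a shape, reduce_sum is a 
hypothetical name, and negative axis numbers are not handled):

```python
from itertools import product

def reduce_sum(data, shape, axis=0):
    """Sum a flat row-major list over one axis, a tuple of axes, or all axes (None).

    Returns (flat_result, result_shape). The default axis=0 keeps the
    current single-axis behaviour.
    """
    if axis is None:
        axes = tuple(range(len(shape)))      # reduce over everything
    elif isinstance(axis, int):
        axes = (axis,)                       # current behaviour: one axis
    else:
        axes = tuple(axis)                   # several designated axes
    kept = [d for d in range(len(shape)) if d not in axes]
    out_shape = tuple(shape[d] for d in kept)
    sums = {}
    # enumerate() over product() gives the row-major flat position of each index.
    for flat, index in enumerate(product(*(range(n) for n in shape))):
        key = tuple(index[d] for d in kept)
        sums[key] = sums.get(key, 0) + data[flat]
    result = [sums.get(key, 0) for key in product(*(range(n) for n in out_shape))]
    return result, out_shape

data = [0, 1, 2, 3, 4, 5]                    # a 2x3 array
by_default = reduce_sum(data, (2, 3))        # same as axis=0
by_rows = reduce_sum(data, (2, 3), axis=1)
by_tuple = reduce_sum(data, (2, 3), axis=(0, 1))
everything = reduce_sum(data, (2, 3), axis=None)
```

Existing calls that pass no axis or a single number keep their meaning; 
only the tuple and None forms are new.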

Having two sum functions with different names is not particularly better 
than the current proposal of a method and a function.

-tim


From verveer at embl-heidelberg.de  Mon Oct 25 15:48:03 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Oct 25 15:48:03 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <417D69CD.7070604@cox.net>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov> <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <p06200505bda2e56656f4@[128.95.99.44]> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <417D69CD.7070604@cox.net>
Message-ID: <E4AF47EA-26D7-11D9-8DC3-000D932805AC@embl-heidelberg.de>

On Oct 25, 2004, at 11:02 PM, Tim Hochberg wrote:

> Peter Verveer wrote:
>
>>
>> On 25 Oct 2004, at 19:32, Russell E Owen wrote:
>>
>>> At 7:08 PM +0200 2004-10-25, Peter Verveer wrote:
>>>
>>>> On 25 Oct 2004, at 18:51, Gary Strangman wrote:
>>>>
>>>>>
>>>>>>  I'm not sure how feasible it is, but I'd much rather an 
>>>>>> efficient, non-copying, 1-D view of a noncontiguous array (from 
>>>>>> an enhanced version of flat or ravel or whatever) than a bunch of 
>>>>>> extra methods. The former allows all of the standard methods to 
>>>>>> just work efficiently using sum(ravel(A)) or sum(A.flat) [ and 
>>>>>> max and min, etc]. Making special whole array methods for 
>>>>>> everything just leads to method explosion.
>>>>>
>>>>>
>>>>>  I completely agree with this ... an efficient flat/ravel would 
>>>>> seem to solve many of the issues being raised. Forgive the 
>>>>> potentially naive question here, but is there any reason such an 
>>>>> efficient, enhanced view can't be implemented for the .flat 
>>>>> method?
>>>>
>>>>
>>>> I believe it is not possible without copying data. The strides 
>>>> between elements of a noncontiguous array are not always the same, 
>>>> so you cannot efficiently view it as a 1D array.
>>>
>>>
>>> How about providing an iterator that counts through all the elements 
>>> of an array (e.g. arr.itervalues()). So long as C extensions could 
>>> efficiently make use of such an iterator, I think it'd do the job.
>>
>>
>> It would still be slower, because you would need a function call at 
>> each element that returns a value. Not a problem if you do a lot of 
>> work at each element, but if you are just adding values you want a 
>> custom written C function. You can do it at the C level with macros or 
>> so, (I do that in nd_image) but that would not help at the python 
>> level.
>>
>>> One could also imagine:
>>> - arr.iteritems(), which returned (index, value) for each item
>>> - a mask argument: a boolean array the same shape as the data array; 
>>> True means elide the corresponding value from the data array
>>> - general support for indexing
>>
>>
>> Essentially you are suggesting to expose iterators at the python 
>> level that iterate over an array in some predefined way. That is 
>> possible, but I doubt it will be efficient.
>>
>> At the C level however, it might be worth thinking about as a way of 
>> easing writing functions in C. I proposed to do it the other way 
>> around in an earlier mail: providing a set of generic functions that 
>> take a python or a C function to be applied at each element. I most 
>> likely will implement something in that direction, but I should give 
>> your idea also some thought.
>>
>>> More generally, I agree that sum should work the same as a function 
>>> and a method, and that an extra axis argument could be a good thing 
>>> (it is so common elsewhere, e.g. size). I'd be tempted to break 
>>> backwards compatibility to fix this, since numarray is still new and 
>>> the current situation is very confusing.
>>
>>
>> I would absolutely vote for such a change. Simply because we would 
>> like a range of such functions, e.g. minimum, maximum, and so on. 
>> Even if we have to leave sum() as it is, I think we should have the 
>> alternatives, we would just have to come up with an alternative name 
>> for sum(). In fact I would consider volunteering implementing these 
>> functions.
>
> Why the need to break backwards compatibility? If one is going to 
> reimplement sum, et al so as to operate on an arbitrary set of axes 
> there's no reason one couldn't maintain the current behaviour as the 
> default.

It seems to me that the behavior one would expect for a function like 
that, would be to apply the operation to the whole array. Not along an 
axis. What would you expect as a new user if you call a minimum() 
function? A single value that is the minimum. So that is the logical 
choice for the default behavior, I would think.

>  All that is required is to allow axis to be a number (current 
> behaviour), a tuple (reduce across the designated axes) or some 
> special value to sum over all (None?, "all"?).

Yes, that would be the idea anyway. The question is what should be the 
default behavior for this type of functions, something I think we 
should not decide based on the current behavior of a single existing 
function, but based on what makes the most sense. That is obviously 
something that can be discussed...

>
> Having two sum functions with different names is not particularly 
> better than the current proposal of a method and a function.

This is certainly true. I would prefer breaking compatibility...

Peter


From Chris.Barker at noaa.gov  Tue Oct 26 09:21:08 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Tue Oct 26 09:21:08 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <E4AF47EA-26D7-11D9-8DC3-000D932805AC@embl-heidelberg.de>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov> <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <p06200505bda2e56656f4@[128.95.99.44]> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <417D69CD.7070604@cox.net> <E4AF47EA-26D7-11D9-8DC3-000D932805AC@embl-heidelberg.de>
Message-ID: <417E7907.9060107@noaa.gov>

Peter Verveer wrote:
> On Oct 25, 2004, at 11:02 PM, Tim Hochberg wrote:
>> Why the need to break backwards compatibility? If one is going to 
>> reimplement sum, et al so as to operate on an arbitrary set of axes 
>> there's no reason one couldn't maintain the current behaviour as the 
>> default.

Great idea!

> It seems to me that the behavior one would expect for a function like 
> that, would be to apply the operation to the whole array. Not along an 
> axis. What would you expect as a new user if you call a minimum() 
> function?  A single value that is the minimum. So that is the logical 
> choice for the default behavior, I would think.

nope. I'd expect it to be along an axis, by default the last one. To me, 
that's what vectorization is all about. Maybe this is because of my 
MATLAB (and now Numeric) background, but it makes the most sense to me 
that a method either returns an array of the same rank, or "reducing" 
methods return an array of rank reduced by one. Having a method return 
the same rank answer, no matter the rank of the input, is weird to me.

This all depends on how you use arrays. I can see that if you tend to 
use a 2-d array to store an image, that the single minimum would seem 
logical, but for many other uses, each dimension has an independent meaning.

> Yes, that would be the idea anyway. The question is what should be the 
> default behavior for this type of functions, something I think we should 
> not decide based on the current behavior of a single existing function, 
> but based on what makes the most sense. That is obviously something that 
> can be discussed...

yup, but frankly, this isn't about just one function, it's really about 
all the reductions: min, max, sum, etc, etc. I think the rule of thumb 
is not to break backward compatibility unless there is a compelling 
reason, and given that it's not clear what is most "natural" in this 
case, keeping the default the same makes the most sense.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                     		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From verveer at embl-heidelberg.de  Tue Oct 26 11:20:02 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Tue Oct 26 11:20:02 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <417E7907.9060107@noaa.gov>
References: <1098381332.8249.12.camel@freyer.sfo.csun.edu>  <417809B9.5000108@noaa.gov> <1098392607.8249.20.camel@freyer.sfo.csun.edu>  <41794B47.4090909@noaa.gov> <p06200510bd9eff3962b0@[128.95.99.44]>  <1098479844.29804.260.camel@halloween.stsci.edu> <41797FFE.8090802@colorado.edu>  <1098670236.1907.21.camel@localhost.localdomain> <92F5E404-265F-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <1098721171.19183.12.camel@sunspot.csun.edu> <417D2A3C.7010108@cox.net> <Pine.LNX.4.60.0410251242130.27302@gate.nmr.mgh.harvard.edu> <7BE8019A-26A8-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <p06200505bda2e56656f4@[128.95.99.44]> <211FDC07-26B0-11D9-9F77-000A95C92C8E@embl-heidelberg.de> <417D69CD.7070604@cox.net> <E4AF47EA-26D7-11D9-8DC3-000D932805AC@embl-heidelberg.de> <417E7907.9060107@noaa.gov>
Message-ID: <8629C0DC-277B-11D9-8DC3-000D932805AC@embl-heidelberg.de>

On Oct 26, 2004, at 6:19 PM, Chris Barker wrote:

> Peter Verveer wrote:
>> It seems to me that the behavior one would expect for a function like 
>> that, would be to apply the operation to the whole array. Not along 
>> an axis. What would you expect as a new user if you call a minimum() 
>> function?  A single value that is the minimum. So that is the logical 
>> choice for the default behavior, I would think.
>
> nope. I'd expect it to be along an axis, by default the last one.

I still do not agree completely with that, I will elaborate more below, 
because I also do not agree anymore with my own earlier writings :-).

But I see your point that this type of operation can be natural 
depending on what you are doing. Sometimes a single value does make 
sense, sometimes not, I think we can agree on that.

>> Yes, that would be the idea anyway. The question is what should be 
>> the default behavior for this type of functions, something I think we 
>> should not decide based on the current behavior of a single existing 
>> function, but based on what makes the most sense. That is obviously 
>> something that can be discussed...
>
> yup, but frankly, this isn't about just one function, it's really 
> about all the reductions: min, max, sum, etc, etc.

Actually no. It seems that sum() is a special case, along with a few 
others. Again: I elaborate on the general case below.

> I think the rule of thumb is not to break backward compatibility 
> unless there is a compelling reason, and given that it's not clear 
> what is most "natural" in this case, keeping the default the same 
> makes the most sense.

I agree. Contrary to what I said before, I think we should keep it 
as it is, for compatibility.

Now to elaborate on the general problem, please correct me if I get 
something wrong. I will use the minimum function as an example and come 
back to sum() later.

If you look at a minimum operation then there are three different 
things you might like to do:

1) An element by element minimum: minimum(a1, a2). This is the current 
behaviour. Like all binary ufuncs of this type, it operates on pairs of 
arrays. So by default it does not do reduction or calculate a single 
minimum. For most ufuncs that is the natural behavior anyway.

2) A reduction: minimum.reduce(a1). The reduce method of ufuncs is 
generally used for reductions. Having to use .reduce makes clear what 
you are doing. Although a bit odd at first sight, I think it is a 
clever way to overload ufuncs names with different functionality.

3) The minimum of the array:  In numarray you do a1.min(). I think in 
Numeric, you have to do something like minimum.reduce(a1.flat), correct 
me if I am wrong. Not nice in both cases...
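
As a minimal sketch of these three cases, in plain Python on nested
lists rather than with Numeric or numarray (the variable names are made
up, and reducing along the first axis is only an assumption for the
illustration):

```python
# Two small 2x3 "arrays" represented as nested lists (illustrative data).
a1 = [[4, 9, 2],
      [7, 1, 8]]
a2 = [[5, 3, 6],
      [0, 2, 9]]

# 1) Element-by-element minimum of two arrays: same shape as the inputs.
elementwise = [[min(x, y) for x, y in zip(r1, r2)] for r1, r2 in zip(a1, a2)]

# 2) Reduction along the first axis: the rank drops by one.
reduced = [min(column) for column in zip(*a1)]

# 3) Minimum of the whole array: a single value.
overall = min(min(row) for row in a1)

print(elementwise)  # [[4, 3, 2], [0, 1, 8]]
print(reduced)      # [4, 1, 2]
print(overall)      # 1
```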

Note that calling a binary ufunc with a single argument will give an 
error: minimum(a1) raises a TypeError. That seems to be a good 
decision, because people seem to have different ideas of what should 
happen: I would expect the minimum of the array, others expect a 
reduction. Generally I guess it was a wise decision not to change the 
meaning of a function depending on whether it has one or two arguments.

The sum() function is an alias for add.reduce. There are a few more of 
these aliases (e.g. product). I would still say that it is a bit 
unfortunate, since not everybody may immediately realize that these 
functions are in fact reductions.

I wonder if one would not be better off without these functions at all, 
after all you can access the functionality through .reduce(). If you 
mind the extra typing, just define your own alias. Can't we shift them 
into numarray.numeric? Just a thought...

In any case, clearly these functions need to stay around as they are 
for compatibility reasons. It is far more productive to add the 
functionality that a few people already proposed: allow reductions over 
multiple axes. I would welcome that, I always found 1D reductions a bit 
limited anyway. Obviously you can do sequential 1D reductions, but that 
can be quite inefficient. As proposed, the axis argument would take 
maybe a list of dimensions, and 'all' or None. I would like to propose 
an additional possibility: like minimum.reduce(), we could have a 
minimum.all() function that reduces over all dimensions (with a 
potentially much more efficient implementation.) We don't need a 
sum_all(a1) then, you would use add.all(a1). I guess this would be 
easily prototyped using sequential reductions, one can worry  about 
efficiency later.
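
A minimal sketch of such a prototype, written in plain Python on nested
lists rather than numarray arrays. The helper names (combine,
reduce_axis, reduce_axes) are hypothetical and not part of any library;
negative axis numbers are not handled, and efficiency is ignored.

```python
import functools
import operator


def combine(op, x, y):
    """Apply a binary op element-wise to two equally shaped nested lists."""
    if isinstance(x, list):
        return [combine(op, xi, yi) for xi, yi in zip(x, y)]
    return op(x, y)


def reduce_axis(op, a, axis=0):
    """Reduce the nested list `a` along a single axis."""
    if axis == 0:
        return functools.reduce(lambda x, y: combine(op, x, y), a)
    return [reduce_axis(op, sub, axis - 1) for sub in a]


def ndim(a):
    """Number of nesting levels, assuming a regular (rectangular) layout."""
    n = 0
    while isinstance(a, list):
        a = a[0]
        n += 1
    return n


def reduce_axes(op, a, axis=0):
    """`axis` may be an int, a tuple of ints, or None meaning all axes."""
    if axis is None:
        axes = tuple(range(ndim(a)))
    elif isinstance(axis, int):
        axes = (axis,)
    else:
        axes = tuple(axis)
    # Reduce the highest axis first so the remaining axis numbers stay valid.
    for ax in sorted(axes, reverse=True):
        a = reduce_axis(op, a, ax)
    return a


a = [[[1, 2], [3, 4]],
     [[5, 6], [7, 8]]]          # shape (2, 2, 2)

print(reduce_axes(operator.add, a, 0))       # [[6, 8], [10, 12]]
print(reduce_axes(operator.add, a, (0, 2)))  # [14, 22]
print(reduce_axes(min, a, None))             # 1
```

Sequential one-axis reductions keep the prototype short; a dedicated
all-axes loop could avoid the intermediate results.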

Sorry for the long story...

Cheers, Peter


From haase at msg.ucsf.edu  Wed Oct 27 09:59:02 2004
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Wed Oct 27 09:59:02 2004
Subject: [Numpy-discussion] bug? in len(arr.flat)
Message-ID: <200410270958.20025.haase@msg.ucsf.edu>

Hi,
I have a (UInt16) 3d data stack and want to get to its underlying buffer (to 
(later) feed it into memmap) ...  
I noticed that len(pr2._flat) is half of len(pr2._data) - like it doesn't 
multiply itemsize in.
>>> pr2.shape
(40, 512, 512)
>>> pr2.flat.shape
(10485760)
>>> 512*512*40
10485760
>>> len(pr2.flat)
10485760
>>> pr2.flat._itemsize
2
>>> len(pr2._data)
20971520
>>> pr2._byteoffset
0

Is this a bug or am I misunderstanding?

Thanks,
Sebastian Haase
  

From strawman at astraw.com  Thu Oct 28 19:21:02 2004
From: strawman at astraw.com (Andrew Straw)
Date: Thu Oct 28 19:21:02 2004
Subject: [Numpy-discussion] floating point exception weirdness
In-Reply-To: <41795006.1040807@astraw.com>
References: <4119BBFC.6020304@astraw.com> <1092221365.3752.32.camel@localhost.localdomain> <411A08FA.7000601@astraw.com> <41795006.1040807@astraw.com>
Message-ID: <4181A8CC.2040807@astraw.com>

Just a small addendum, (which I hope will spur on bug-fixing once Todd 
et al. are back from the conference -- let me know if I should file a 
sourceforge bug report):

Numeric is not necessary to trigger the bug in the below code -- 
numarray is sufficient on its own.  Furthermore, I can confirm that 
merely removing the "atlas3-sse2" Debian package from my system causes 
the code, whether or not numarray.ieeespecial is imported, to run 
without being killed by an FPE.

Andrew Straw wrote:

> I've isolated a bug I first reported on this mailing list in August.  
> I've now confined it to a small code snippet using entirely 
> open-source software (previously I saw it while using Intel's IPP).  
> In a nutshell, importing numarray.ieeespecial triggers a floating 
> point exception (which kills my program) when I call Numeric's 
> singular_value_decomposition() function:
>
> import Numeric
> from LinearAlgebra import singular_value_decomposition
>
> want_FPE = True  # this import is what triggers the FPE in the SVD call below
> if want_FPE:
>    import numarray.ieeespecial
>
> A= [[-5.7, 2.2, -0.53, 46.0],
>    [-2.3, -5.5, -1.0, 1091.0],
>    [5.9, 1.4, -0.1, -142.0],
>    [-1.3, 5.7, -1.5, 2673.0]]
> A=Numeric.array(A)
> u,s,v = singular_value_decomposition(A) # FPE triggered here
>
> Here's my setup:
>
> $ python
> Python 2.3.4 (#2, Sep 24 2004, 08:39:09)
> [GCC 3.3.4 (Debian 1:3.3.4-12)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import Numeric
> >>> Numeric.__version__
> '23.6'
> >>> import numarray
> >>> numarray.__version__
> '1.2a'
>
> $ gcc -v
> Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.4/specs
> Configured with: ../src/configure -v 
> --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang 
> --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info 
> --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared 
> --with-system-zlib --enable-nls --without-included-gettext 
> --enable-__cxa_atexit --enable-clocale=gnu --enable-debug 
> --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
> Thread model: posix
> gcc version 3.3.4 (Debian 1:3.3.4-13)
>
> Now, for the clue:  the above error is ONLY triggered when I compile 
> Numeric to use system blas and friends, not when I use lapack_lite 
> included with Numeric.  This leads me to suspect it is related to the 
> SSE2 unit -- I have Debian sarge's atlas3-base, atlas3-sse, 
> atlas3-sse2, blas, lapack, lapack3, and refblas3 packages installed on 
> my P4 machine.
>
> So, to propose a hypothesis: numarray.ieeespecial sets the FPE bit in 
> the SSE2 hardware, but for some reason this does not raise SIGFPE.  
> However, when the next call that touches SSE2 happens, the kernel sees 
> that error bit and throws the signal.  Does this explanation make 
> sense?  Is it easy to fix?
>
> Cheers!
> Andrew
>
>
>


From stevech1097 at yahoo.com.au  Thu Oct 28 21:56:30 2004
From: stevech1097 at yahoo.com.au (Steve Chaplin)
Date: Thu Oct 28 21:56:30 2004
Subject: [Numpy-discussion] Re: floating point exception weirdness (Andrew Straw)
In-Reply-To: <E1CNNnd-0004TT-4x@sc8-sf-list2.sourceforge.net>
References: <E1CNNnd-0004TT-4x@sc8-sf-list2.sourceforge.net>
Message-ID: <1099025806.2742.23.camel@f1>

> Just a small addendum, (which I hope will spur on bug-fixing once Todd 
> et al. are back from the conference -- let me know if I should file a 
> sourceforge bug report):

I've not read all this thread so I don't know the full background. But I
had a floating point / SSE problem using numarray.

It turned out to be a glibc not numarray problem and was solved by
upgrading glibc.
http://sources.redhat.com/bugzilla/show_bug.cgi?id=10
There was also a SourceForge bug report but I can't locate it.

Regards
Steve


From jmiller at stsci.edu  Fri Oct 29 06:27:11 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct 29 06:27:11 2004
Subject: [Numpy-discussion] bug? in len(arr.flat)
In-Reply-To: <200410270958.20025.haase@msg.ucsf.edu>
References: <200410270958.20025.haase@msg.ucsf.edu>
Message-ID: <1099056380.4904.12.camel@localhost.localdomain>

On Wed, 2004-10-27 at 12:58, Sebastian Haase wrote:
> Hi,
> I have a (UInt16) 3d data stack and want to get to its underlying buffer (to 
> (later) feed it into memmap) ...  
> I noticed that len(pr2._flat) is half of len(pr2._data) - like it doesn't 
> multiply itemsize in.
> >>> pr2.shape
> (40, 512, 512)
> >>> pr2.flat.shape
> (10485760)
> >>> 512*512*40
> 10485760
> >>> len(pr2.flat)
> 10485760
> >>> pr2.flat._itemsize
> 2
> >>> len(pr2._data)
> 20971520
> >>> pr2._byteoffset
> 0
> 
> Is this a bug 

No.

> or am I misunderstanding?

Yes.  _data is "an object which supports the buffer protocol".  In this
context,  it is effectively a string and thus the product of the total
number of elements and the itemsize.  (We'll ignore for now the fact
that not every array uses the entire buffer.)  In contrast, shape(.flat)
is only the total number of elements and is independent of itemsize.
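
The same relationship can be seen with Python's standard array module.
This is only an illustrative stand-in, not numarray: the small shape and
the names stack and raw are assumptions made for the example.

```python
from array import array

# Stand-in for a small unsigned 16-bit stack of shape (2, 3, 4), stored flat.
shape = (2, 3, 4)
n_elements = shape[0] * shape[1] * shape[2]
stack = array("H", range(n_elements))   # "H" is an unsigned short

# The raw buffer is measured in bytes, the array itself in elements.
raw = stack.tobytes()

print(len(stack))       # number of elements
print(stack.itemsize)   # bytes per element (platform dependent, usually 2)
print(len(raw))         # number of elements * itemsize
```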

Regards,
Todd


From haase at msg.ucsf.edu  Fri Oct 29 09:03:25 2004
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Fri Oct 29 09:03:25 2004
Subject: [Numpy-discussion] bug? in len(arr.flat)
In-Reply-To: <1099056380.4904.12.camel@localhost.localdomain>
References: <200410270958.20025.haase@msg.ucsf.edu> <1099056380.4904.12.camel@localhost.localdomain>
Message-ID: <200410290902.25410.haase@msg.ucsf.edu>

Of course !  sorry I forgot.

Thanks,
Sebastian


On Friday 29 October 2004 06:26 am, Todd Miller wrote:
> On Wed, 2004-10-27 at 12:58, Sebastian Haase wrote:
> > Hi,
> > I have a (UInt16) 3d data stack and want to get to its underlying buffer
> > (to (later) feed it into memmap) ...
> > I noticed that len(pr2._flat) is half of len(pr2._data) - like it doesn't
> > multiply itemsize in.
> >
> > >>> pr2.shape
> >
> > (40, 512, 512)
> >
> > >>> pr2.flat.shape
> >
> > (10485760)
> >
> > >>> 512*512*40
> >
> > 10485760
> >
> > >>> len(pr2.flat)
> >
> > 10485760
> >
> > >>> pr2.flat._itemsize
> >
> > 2
> >
> > >>> len(pr2._data)
> >
> > 20971520
> >
> > >>> pr2._byteoffset
> >
> > 0
> >
> > Is this a bug
>
> No.
>
> > or am I misunderstanding?
>
> Yes.  _data is "an object which supports the buffer protocol".  In this
> context,  it is effectively a string and thus the product of the total
> number of elements and the itemsize.  (We'll ignore for now the fact
> that not every array uses the entire buffer.)  In contrast, shape(.flat)
> is only the total number of elements and is independent of itemsize.
>
> Regards,
> Todd
>
>
>
>


From jmiller at stsci.edu  Fri Oct 29 11:19:14 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Oct 29 11:19:14 2004
Subject: [Numpy-discussion] Counting array elements
Message-ID: <1099073854.4904.321.camel@localhost.localdomain>

I have returned from our astronomical data systems conference and I am
going to take a short cut and summarize what I saw as the key
developments of this thread.  I apologize for not responding sooner and
individually but the web-mail system I use isn't effective for
conducting any kind of discussion.  You guys did a great job sorting
this out this week.  I marked my key points with **.  The rest is
probably only for people with a lot of patience.

** I've finally come to terms with the fact that functions are the right
way to do numarray rather than methods.   The arguments in the Numeric
manual are no more persuasive now than they ever were,  but Stephen
Walton's remarks about method explosion finally convinced me that the
"real" reason for using functions is that methods combine every
new feature under the umbrella of a single namespace, the NumArray
class.  Using functions lets us partition things into modules which can
be used selectively and makes a more extensible and understandable
system.  Thanks Stephen.

A couple people remarked that using .flat might solve everything with
something like a.flat.sum() or sum(ravel(a)).  This gets to the original
motivation for the sum() method, which was the codification of a simple
and storage efficient technique for reducing noncontiguous arrays.  The
first point is that a non-contiguous array cannot generally be reshaped
without making a copy.   The basic idea of the sum() method is to do
*two* reductions,  the first, along a single axis,  results in a smaller
contiguous array.  In the case of astronomical images which are
generally square or at least non-degenerate,  the reduction result is a
*much* smaller array.  The second reduction handles all the remaining
dimensions since .flat is guaranteed to work because the array is
contiguous.  The end result is a complete sum() without writing
additional ufuncs or making an array copy.
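
The two-step idea can be sketched in plain Python.  This is not
numarray's implementation: the flat list buf, the element-based strides
and the function name two_step_sum are all assumptions made for the
illustration.

```python
def two_step_sum(buf, offset, shape, strides):
    """Sum a 2-D strided view of a flat buffer without copying the view.

    `strides` are counted in elements (not bytes) for simplicity.
    """
    rows, cols = shape
    row_stride, col_stride = strides

    # Step 1: reduce along axis 0 into a small, contiguous list of partial sums.
    partial = [0] * cols
    for i in range(rows):
        base = offset + i * row_stride
        for j in range(cols):
            partial[j] += buf[base + j * col_stride]

    # Step 2: the partial result is contiguous, so a flat reduction finishes the job.
    return sum(partial)


# A 4x4 "image" stored row by row; the view takes every other row and column.
buf = list(range(16))
total = two_step_sum(buf, offset=0, shape=(2, 2), strides=(8, 2))
print(total)  # elements 0, 2, 8, 10 -> 20
```

Only the list of partial sums is allocated, and it has one entry per
column of the view.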

There was understandable confusion about why .flat is sometimes allowed
to fail.  Since it is an attribute,  we thought it inappropriate to make
it return a copy of the source array and chose instead to raise an
exception.  In contrast, it is reasonable for the ravel() function to
return a completely different array, so it always works.  (I just
noticed that ravel() is not named flat()).  Some of our more
contemporary thinkers suggested using iterators to produce a .flat which
always works.  If anyone has an idea how to make this work with good
performance,  please let me know;  I don't.
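
A loose analogy from the Python standard library, offered only as an
illustration and not as a description of numarray: a strided memoryview
is non-contiguous, so reinterpreting it in place with cast() is refused,
while tobytes() always succeeds because it returns a copy.

```python
buf = bytearray(range(8))
view = memoryview(buf)[::2]      # every other byte: a non-contiguous view

print(view.contiguous)           # False

# Like .flat on a non-contiguous array: an in-place reinterpretation is refused.
try:
    view.cast("B")
    cast_refused = False
except TypeError:
    cast_refused = True

# Like ravel(): always works, at the cost of copying the selected elements.
copied = view.tobytes()

print(cast_refused)              # True
print(list(copied))              # [0, 2, 4, 6]
```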

** Tim Hochberg pointed out that we can overload the reduction (and not
accumulation?) axis parameter with an "all" or a tuple describing a
sequence of axes to reduce along.  My perception was that there was a
consensus behind this and in any case I'm in agreement with Tim.  Alan
Isaac pointed out that None might be better here than "all" and I
agree.  At this point,  I think sumAll() is dead, the sum() method will
be deprecated, and the reductions should be expanded as Tim suggested.

** Peter Verveer made some comments about the expectations of a naive
user regarding reductions, namely that "all" should be the default.   My
own experience bears this out,  and I am torn about what to do here. 
Chris Barker pointed out the need for backward compatibility with
Numeric,  and given the current numarray goal of supporting SciPy,  this
need is growing stronger and more complex.  SciPy uses yet another axis
convention.  If anyone has any ideas how to handle these multiple
conventions with elegance,  let me know.

A number of people commented on our naming conventions, an issue which
we have side stepped for the moment with sumAll().  My impression is
that, for better or worse, numarray uses the lowerUpper() version of
Camel case.  I think this is very much a matter of personal taste and
don't claim to have any.   My guess is that numarray is probably
inconsistent at the moment, in part because lowerUpper() often
degenerates into merely lower() which degenerates into confusion. 

Regards,
Todd


From verveer at embl-heidelberg.de  Sat Oct 30 08:39:28 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Sat Oct 30 08:39:28 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1099073854.4904.321.camel@localhost.localdomain>
References: <1099073854.4904.321.camel@localhost.localdomain>
Message-ID: <BE9A7550-2A89-11D9-B5CD-000D932805AC@embl-heidelberg.de>

> ** Peter Verveer made some comments about the expectations of a naive
> user regarding reductions, namely that "all" should be the default.   
> My
> own experience bears this out,  and I am torn about what to do here.
> Chris Barker pointed out the need for backward compatibility with
> Numeric,  and given the current numarray goal of supporting SciPy,  
> this
> need is growing stronger and more complex.  SciPy uses yet another axis
> convention.  If anyone has any ideas how to handle these multiple
> conventions with elegance,  let me know.

Numarray should probably be either completely compatible in every small 
detail, or we could take the opportunity to change what we believe was 
the wrong choice. Not sure what is really best, although personally I 
feel breaking compatibility is fine if the result is better. Is there 
not already a sub-package numeric within numarray that provides Numeric 
compatibility? Such a package could at  least provide wrappers with 
compatible behavior for people who need that.

Peter


From tim.hochberg at cox.net  Sat Oct 30 11:49:36 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sat Oct 30 11:49:36 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1099073854.4904.321.camel@localhost.localdomain>
References: <1099073854.4904.321.camel@localhost.localdomain>
Message-ID: <4183E208.6050001@cox.net>

Todd Miller wrote:

[SNIP]

>** Tim Hochberg pointed out that we can overload the reduction (and not
>accumulation?) 
>
It seems possible. It's probably marginally useful at best. However, it 
might be worth doing if not too painful, just so that the accumulate and 
reduce signatures match.

>axis parameter with an "all" or a tuple describing a
>sequence of axes to reduce along.  My perception was that there was a
>consensus behind this and in any case I'm in agreement with Tim.  Alan
>Isaac pointed out that None might be better here than "all" and I
>agree.  
>
Using None to mean ALL seems a little perverse to me, but I'll grant 
that using an existing singleton makes things simpler. I'll just point 
out that it would also be possible to define an ALL singleton and use that.

Very tangential: it's too bad that '...' can't be typed more places: the 
natural spelling for ALL is [...] as in:
    add.reduce(a, axis=[...])
Sadly, that won't work.

>At this point,  I think sumAll() is dead, the sum() method will
>be deprecated, and the reductions should be expanded as Tim suggested.
>
>** Peter Verveer made some comments about the expectations of a naive
>user regarding reductions, namely that "all" should be the default.   My
>own experience bears this out,  and I am torn about what to do here. 
>  
>
I suspect that one's experience here depends on your typical problem 
domain. If one does a lot of 2D work, ALL would seem to be the natural 
choice. If you use a lot of arrays of vectors, as I do, -1 is the 
natural choice. At this point I can't recall a case where ALL would have 
been the natural choice for me.

In addition to backwards compatibility, one argument for not using ALL 
as the default is that it makes little or no sense for accumulate. 
Having the default for reduce be ALL, but that for accumulate be -1 (for 
instance) would be confusing.
 

>Chris Barker pointed out the need for backward compatibility with
>Numeric,  
>
I'd think that the importance of backward compatibility with not just 
Numeric, but with Numarray itself has been underrated. Changing the 
default for reduce / sum is particularly insidious since many uses 
will fail silently, producing the wrong answer, but continuing to run.  
This means that all instances of sum, product and reduce will need to be 
inspected and corrected. Having 10k LOC that use Numarray, I'll be a bit 
irked if this gets changed without a better justification than what I've 
seen thus far.
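
To make the silent-failure point concrete, here is a small sketch in
plain Python on nested lists (not numarray). The helper add_reduce is
hypothetical. On a square array, switching the default axis changes the
numbers but not the shape, so nothing downstream raises.

```python
def add_reduce(a, axis):
    """Sum a 2D nested list along axis 0 (down columns) or -1 (across rows)."""
    if axis == 0:
        return [sum(col) for col in zip(*a)]
    if axis == -1:
        return [sum(row) for row in a]
    raise ValueError("this sketch only handles axis 0 or -1")


a = [[1, 2], [3, 4]]
old_default = add_reduce(a, axis=0)    # [4, 6]
new_default = add_reduce(a, axis=-1)   # [3, 7]

# Same length, different numbers: downstream code sees nothing unusual.
print(len(old_default) == len(new_default))  # True
print(old_default == new_default)            # False
```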

>and given the current numarray goal of supporting SciPy,  this
>need is growing stronger and more complex.  SciPy uses yet another axis
>convention.  If anyone has any ideas how to handle these multiple
>conventions with elegance,  let me know.
>  
>
Could you describe the SciPy axis convention: I'm not familiar with it.

[SNIP]

-tim


From gazzar at email.com  Sun Oct 31 04:22:01 2004
From: gazzar at email.com (Gary Ruben)
Date: Sun Oct 31 04:22:01 2004
Subject: [Numpy-discussion] vector cross product
Message-ID: <20041031121856.E2DDC1CE304@ws1-6.us4.outblaze.com>

Not that I have a really urgent need, but is there a reason that nice, fast C-based vector operations aren't implemented in Numeric or numarray? I notice Fernando Perez has a cross product as a useful SciPy weave example on his site. I've also seen comments elsewhere about Numpy's lack of a cross product. eg. <http://mail.python.org/pipermail/python-list/2004-March/213878.html>
I'm using Konrad Hinsen's Scientific Python for the convenience value of his Vector class, which also provides a nice angle() method but it bothers me that it's implemented in native Python. The Vector type in vpython probably does it 'properly', but I don't use it just for the convenience since it adds an extra dependency to my code.
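
For reference, this is roughly what a native-Python cross product and
angle look like. It is only a sketch: the function names cross and
angle are chosen here for illustration and this is not the code from
Scientific Python's Vector class or from vpython.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors given as sequences of numbers."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def angle(u, v):
    """Angle in radians between two nonzero 3-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


print(cross((1, 0, 0), (0, 1, 0)))  # (0, 0, 1)
print(angle((1, 0, 0), (0, 1, 0)))  # about 1.5708, i.e. pi/2
```

Every multiplication here goes through the interpreter, which is the
overhead a C-based implementation would avoid.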

comments?
Gary R.


From perry at stsci.edu  Sun Oct 31 09:22:28 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Sun Oct 31 09:22:28 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <1099073854.4904.321.camel@localhost.localdomain>
Message-ID: <NEBBIJKBMLDBLNCEEFOCEEDDFIAA.perry@stsci.edu>

Todd Miller wrote:
> 
> There was understandable confusion about why .flat is sometimes allowed
> to fail.  Since it is an attribute,  we thought it inappropriate to make
> it return a copy of the source array and chose instead to raise an
> exception.  In contrast, it is reasonable for the ravel() function to
> return a completely different array, so it always works.  (I just
> noticed that ravel() is not named flat()).  Some of our more
> contemporary thinkers suggested using iterators to produce a .flat which
> always works.  If anyone has an idea how to make this work with good
> performance,  please let me know;  I don't.
> 
This aspect of flat can be considered a wart. There are three different
desired behaviors depending on who you talk to. For efficiency reasons,
some only want flat (and even ravel) to work if the array is already
contiguous; that is, they don't want copies unless they ask for them.
Others want it to always work, producing a copy if necessary but
otherwise for it to return a view. Yet others always want a copy.
So, are three different versions needed? Or options to a function?
The drawback of .flat (as an attribute) is there is only one choice
for behavior. For a function (or a method) we could modify the
behavior with a keyword argument. Personally, I would rather .flat
always work, even if it means returning a copy. Is there any 
consensus on how this problem should be handled?
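
As a rough sketch of the keyword-argument option, the three desired
behaviors could be selected by one parameter. Everything below is
hypothetical: StridedView is a toy 1-D view over a Python list, and
flatten() and its copy= values are invented for illustration, not an
existing numarray interface.

```python
class StridedView(object):
    """Toy 1-D view into a shared list: `size` items starting at `offset`."""

    def __init__(self, buffer, offset=0, step=1, size=None):
        self.buffer = buffer
        self.offset = offset
        self.step = step
        self.size = len(buffer[offset::step]) if size is None else size

    def is_contiguous(self):
        # In this toy model, contiguous simply means unit step.
        return self.step == 1

    def tolist(self):
        return [self.buffer[self.offset + i * self.step]
                for i in range(self.size)]


def flatten(view, copy="if-needed"):
    """copy='never': view or error; 'if-needed': view if possible; 'always': copy."""
    if copy not in ("never", "if-needed", "always"):
        raise ValueError("copy must be 'never', 'if-needed' or 'always'")
    if view.is_contiguous() and copy != "always":
        return view  # no data copied
    if copy == "never":
        raise ValueError("not contiguous and copy='never' was requested")
    return StridedView(view.tolist())  # fresh, contiguous copy


data = [0, 1, 2, 3, 4, 5]
whole = StridedView(data)
every_other = StridedView(data, step=2)
print(flatten(whole) is whole)                 # True
print(flatten(every_other).tolist())           # [0, 2, 4]
print(flatten(whole, copy="always") is whole)  # False
```

An attribute cannot take such a parameter, which is why only a function
or method can offer all three behaviors.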

> ** Peter Verveer made some comments about the expectations of a naive
> user regarding reductions, namely that "all" should be the default.   My
> own experience bears this out,  and I am torn about what to do here. 
> Chris Barker pointed out the need for backward compatibility with
> Numeric,  and given the current numarray goal of supporting SciPy,  this
> need is growing stronger and more complex.  SciPy uses yet another axis
> convention.  If anyone has any ideas how to handle these multiple
> conventions with elegance,  let me know.
> 
I find this issue particularly vexing as well. Let's be clear about 
this, scipy changes the behavior of Numeric to produce a new flavor.
What should numarray do? Follow the scipy behavior or the Numeric
behavior? Or should there be a scipy/numarray flavor vs the more
Numeric compatible numarray? Note, we never intended numarray to be
100% compatible with Numeric since there were aspects we thought
should be changed (e.g., scalar/array type coercions). Yet there
appear to be two camps of the Numeric community. Some sort of 
survey may be in order here. Is scipy where all the new growth is
now? Should we just adopt the axis convention used there? I'd
very much prefer not to proliferate any more flavors of behavior
and just settle on one.

> A number of people commented on our naming conventions, an issue which
> we have side stepped for the moment with sumAll().  My impression is
> that, for better or worse, numarray uses the lowerUpper() version of
> Camel case.  I think this is very much a matter of personal taste and
> don't claim to have any.   My guess is that numarray is probably
> inconsistent at the moment, in part because lowerUpper() often
> degenerates into merely lower() which degenerates into confusion. 
> 
How much of the public interface uses camelCase? I don't think
all that much if any. It seems to me the inclination of scipy
is to avoid it and I'm happy with that. The internal implementation
is a different issue, and there I think Todd is right that it 
probably is somewhat inconsistent on that front.

Perry


From perry at stsci.edu  Sun Oct 31 09:30:28 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Sun Oct 31 09:30:28 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <BE9A7550-2A89-11D9-B5CD-000D932805AC@embl-heidelberg.de>
Message-ID: <NEBBIJKBMLDBLNCEEFOCMEDDFIAA.perry@stsci.edu>

Peter Verveer wrote:

> Numarray should probably be either completely compatible in every small 
> detail, or we could take the opportunity to change what we believe was 

Well, as I mentioned before having numarray match Numeric in every
small detail is not going to happen (and even there, which flavor?
the original Numeric or the scipy version?). We've been pretty clear
about where incompatibilities were deliberate. But on the other hand,
that leaves many other choices that could be revisited if enough 
people support them. The problem is that no matter what is done,
I suspect some people are going to be inconvenienced since there
is already (without numarray) a split in the community because
of scipy. 

> the wrong choice. Not sure what is really best, although personally I 
> feel breaking compatibility is fine if the result is better. Is there 
> not already a sub-package numeric within numarray that provides Numeric 
> compatibility? Such a package could at  least provide wrappers with 
> compatible behavior for people who need that.
> 
At the moment the numeric module provides more Numeric compatibility
(but not complete). In matplotlib we use a module called numerix to
provide a uniform interface to both Numeric and numarray (along with
prohibitions on use of certain features that don't exist in the other).
We are looking at scipy_base now that undoubtedly will highlight
similar cases where we will suggest internal reorganization to 
do the same sort of thing that was done for matplotlib.

For those that intend to use numarray only now and forever, one is
free to use all the features they desire. But there still is the
behavior issue of those things that are currently incompatible like
the axis issue.

Perry


From tim.hochberg at cox.net  Sun Oct 31 14:24:01 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sun Oct 31 14:24:01 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <4183F168.3060205@ucsd.edu>
References: <1099073854.4904.321.camel@localhost.localdomain> <4183E208.6050001@cox.net> <4183F168.3060205@ucsd.edu>
Message-ID: <418564AE.6050206@cox.net>

Robert Kern wrote:

> Tim Hochberg wrote:
>
>> Could you describe the SciPy axis convention: I'm not familiar with it.
>
>
> axis=-1


OK, so Numarray (currently) and Numeric use axis=0, SciPy uses axis=-1 
and there is some desire to use axis=ALL instead.

One advantage of ALL is that it breaks everyone's code equally, so there 
wouldn't be any charges of favoritism <0.8 wink>.

I can't come up with any way to reconcile the three, but I can suggest a 
transition strategy whatever the decision. Supply an option so that one 
can require axis arguments to all calls to reduce. Then it's relatively 
easy to track down all the reduce calls and fix the ones that are 
broken. Something like numarray.setRequireReduceAxisArg(True).

FWIW, it wouldn't bother me much to use SciPy's default here: supporting 
SciPy is a worthwhile goal and I think SciPy's choice here is a 
reasonable one. Another alternative that wouldn't bother me much is "In 
the face of ambiguity, refuse the temptation to guess". That is, always 
require axis arguments for multidimensional arrays. While not backwards 
compatible, this would make the transition relatively easy, since uses 
that might fail would raise exceptions.
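
A minimal sketch of that transition switch, in plain Python on nested
lists. setRequireReduceAxisArg is only the name suggested above and
does not exist in numarray; reduce_add, _MISSING and _require_axis are
likewise hypothetical.

```python
_require_axis = False  # module-level switch (hypothetical)
_MISSING = object()    # marks "caller did not pass axis"


def setRequireReduceAxisArg(flag):
    """Turn the 'explicit axis required' check on or off."""
    global _require_axis
    _require_axis = bool(flag)


def reduce_add(a, axis=_MISSING):
    """Sum a 2D nested list along axis 0 or -1."""
    if axis is _MISSING:
        if _require_axis:
            raise TypeError("reduce_add() requires an explicit axis argument")
        axis = 0  # legacy default
    if axis == 0:
        return [sum(col) for col in zip(*a)]
    if axis == -1:
        return [sum(row) for row in a]
    raise ValueError("this sketch only handles axis 0 or -1")


a = [[1, 2], [3, 4]]
print(reduce_add(a))           # [4, 6], legacy default still applies

setRequireReduceAxisArg(True)
try:
    reduce_add(a)              # now flagged instead of guessing
except TypeError as exc:
    print("caught:", exc)
print(reduce_add(a, axis=-1))  # [3, 7]
```

Running a test suite with the switch on turns every implicit-axis call
into an exception, so each one can be found and given an explicit axis.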

-tim


From rkern at ucsd.edu  Sun Oct 31 16:01:04 2004
From: rkern at ucsd.edu (Robert Kern)
Date: Sun Oct 31 16:01:04 2004
Subject: [Numpy-discussion] Counting array elements
In-Reply-To: <418564AE.6050206@cox.net>
References: <1099073854.4904.321.camel@localhost.localdomain> <4183E208.6050001@cox.net> <4183F168.3060205@ucsd.edu> <418564AE.6050206@cox.net>
Message-ID: <41857B53.5010308@ucsd.edu>

Tim Hochberg wrote:
> Robert Kern wrote:
> 
>> Tim Hochberg wrote:
>>
>>> Could you describe the SciPy axis convention: I'm not familiar with it.
>>
>> axis=-1
> 
> OK, so Numarray (currently) and Numeric use axis=0,

Well, sometimes.  :-)

> SciPy uses axis=-1 

I should note that this convention is for Scipy-defined functions. With 
one unfortunate exception (cumsum), Scipy does not overwrite Numeric's 
axis default for Numeric-defined functions.

-- 
Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
  Are the graves of dreams allowed to die."
   -- Richard Harter