From falted at pytables.org Thu Jul 1 01:51:39 2004 From: falted at pytables.org (Francesc Alted) Date: Thu Jul 1 01:51:39 2004 Subject: [Numpy-discussion] Speeding up wxPython/numarray In-Reply-To: <1088632048.7526.204.camel@halloween.stsci.edu> References: <40E31B31.7040105@cox.net> <1088632048.7526.204.camel@halloween.stsci.edu> Message-ID: <200407011048.01929.falted@pytables.org> On Wednesday, 30 June 2004 at 23:47, Todd Miller wrote: > > There were a couple of other things I tried that resulted in additional > > small speedups, but the tactics I used were too horrible to reproduce > > here. The main one of interest is that all of the calls to > > NA_updateDataPtr seem to burn some time. However, I don't have any idea > > what one could do about that. > > Francesc Alted had the same comment about NA_updateDataPtr a while ago. > I tried to optimize it then but didn't get anywhere. NA_updateDataPtr() > should be called at most once per extension function (more is > unnecessary but not harmful) but needs to be called at least once as a > consequence of the way the buffer protocol doesn't give locked > pointers. FYI I'm still refusing to call NA_updateDataPtr() in a specific part of my code that requires as much speed as possible. It works just fine from numarray 0.5 on (numarray 0.4 gave a segmentation fault on that). However, Todd already warned me about that and told me that this is unsafe. Nevertheless, I'm using the optimization for read-only purposes (i.e. they are not accessible to users) over numarray objects, and that *seems* to be safe (at least I have not had a single problem since numarray 0.5). I know that I'm walking on the cutting edge, but life is dangerous anyway ;). By the way, that optimization gives me a 70% improvement in element access to NumArray elements. It would be very nice if you can finally achieve additional performance with your recent bet :). 
Good luck!, -- Francesc Alted From haase at msg.ucsf.edu Thu Jul 1 09:06:24 2004 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Thu Jul 1 09:06:24 2004 Subject: [Numpy-discussion] Numarray header PEP In-Reply-To: <20040701053355.M99698@grenoble.cnrs.fr> References: <1088451653.3744.200.camel@localhost.localdomain> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> Message-ID: <200407010904.25498.haase@msg.ucsf.edu> On Wednesday 30 June 2004 11:33 pm, gerard.vermeulen at grenoble.cnrs.fr wrote: > On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote > > > So... you use the "meta" code to provide package specific ordinary > > (not-macro-fied) functions to keep the different versions of the > > Present() and isArray() macros from conflicting. > > > > It would be nice to have a standard approach for using the same > > "extension enhancement code" for both numarray and Numeric. The PEP > > should really be expanded to provide an example of dual support for one > > complete and real function, guts and all, so people can see the process > > end-to-end; Something like a simple arrayprint. That process needs > > to be refined to remove as much tedium and duplication of effort as > > possible. The idea is to make it as close to providing one > > implementation to support both array packages as possible. I think it's > > important to illustrate how to partition the extension module into > > separate compilation units which correctly navigate the dual > > implementation mine field in the easiest possible way. > > > > It would also be nice to add some logic to the meta-functions so that > > which array package gets used is configurable. We did something like > > that for the matplotlib plotting software at the Python level with > > the "numerix" layer, an idea I think we copied from Chaco. 
> > The kind of dispatch I think might be good to support configurability looks like this:
> >
> > PyObject *
> > whatsThis(PyObject *dummy, PyObject *args)
> > {
> >     PyObject *result, *what = NULL;
> >     if (!PyArg_ParseTuple(args, "O", &what))
> >         return 0;
> >     switch(PyArray_Which(what)) {
> >     USE_NUMERIC:
> >         result = Numeric_whatsThis(what); break;
> >     USE_NUMARRAY:
> >         result = Numarray_whatsThis(what); break;
> >     USE_SEQUENCE:
> >         result = Sequence_whatsThis(what); break;
> >     }
> >     Py_INCREF(Py_None);
> >     return Py_None;
> > }
> >
> > In the above, I'm picturing a separate .c file for Numeric_whatsThis and for Numarray_whatsThis. It would be nice to streamline that to one .c and a process which somehow (simply) produces both functions.
> >
> > Or, ideally, the above would be done more like this:
> >
> > PyObject *
> > whatsThis(PyObject *dummy, PyObject *args)
> > {
> >     PyObject *result, *what = NULL;
> >     if (!PyArg_ParseTuple(args, "O", &what))
> >         return 0;
> >     switch(Numerix_Which(what)) {
> >     USE_NUMERIX:
> >         result = Numerix_whatsThis(what); break;
> >     USE_SEQUENCE:
> >         result = Sequence_whatsThis(what); break;
> >     }
> >     Py_INCREF(Py_None);
> >     return Py_None;
> > }
> >
> > Here, a common Numerix implementation supports both numarray and Numeric from a single simple .c. The extension module would do "#include numerix/arrayobject.h" and "import_numerix()" and otherwise just call PyArray_* functions.
> >
> > The current stumbling block is that numarray is not binary compatible with Numeric... so numerix in C falls apart. I haven't analyzed every symbol and struct to see if it is really feasible... but it seems like it is *almost* feasible, at least for typical usage.
> >
> > So, in a nutshell, I think the dual implementation support you demoed is important and we should work up an example and kick it around to make sure it's the best way we can think of doing it. 
> > Then we should add a section to the PEP describing dual support as well. > > I would never apply numarray code to Numeric arrays and the inverse. It > looks dangerous and I do not know if it is possible. The first thing > coming to mind is that numarray and Numeric arrays refer to different type > objects (this is what my pep module uses to differentiate them). So, even > if numarray and Numeric are binary compatible, any 'alien' code referring > to the 'Python-standard part' of the type objects may lead to surprises. A > PEP proposing hacks will raise eyebrows at least. > > Secondly, most people use Numeric *or* numarray and not both. > > So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out > (NINO) Of course, Numeric or numarray output can be a user option if NINO > does not apply. (explicit safe conversion between Numeric and numarray is > possible if really needed). > > I'll try to flesh out the demo with real functions in the way you indicated > (going as far as I consider safe). > > The problem of coding the Numeric (or numarray) functions in more than > a single source file has also been addressed. > > It may take 2 weeks because I am off to a conference next week. > > Regards -- Gerard Hi all, first, I would like to state that I don't understand much of this discussion; so the only comment I wanted to make is that IF this were possible, to make (C/C++) code that can live with both Numeric and numarray, then I think it would be used more and more - think: transition phase !! (e.g. someone could start making the FFTW part of scipy numarray friendly without having to switch everything at once [hint ;-)] ) These were just my 2 cents. 
Cheers, Sebastian Haase From jmiller at stsci.edu Thu Jul 1 09:44:13 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 1 09:44:13 2004 Subject: [Numpy-discussion] Numarray header PEP In-Reply-To: <20040701053355.M99698@grenoble.cnrs.fr> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> Message-ID: <1088700210.14402.17.camel@halloween.stsci.edu> On Thu, 2004-07-01 at 02:33, gerard.vermeulen at grenoble.cnrs.fr wrote: > On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote > > > > So... you use the "meta" code to provide package specific ordinary > > (not-macro-fied) functions to keep the different versions of the > > Present() and isArray() macros from conflicting. > > > > It would be nice to have a standard approach for using the same > > "extension enhancement code" for both numarray and Numeric. The PEP > > should really be expanded to provide an example of dual support for one > > complete and real function, guts and all, so people can see the process > > end-to-end; Something like a simple arrayprint. That process needs > > to be refined to remove as much tedium and duplication of effort as > > possible. The idea is to make it as close to providing one > > implementation to support both array packages as possible. I think it's > > important to illustrate how to partition the extension module into > > separate compilation units which correctly navigate the dual > > implementation mine field in the easiest possible way. > > > > It would also be nice to add some logic to the meta-functions so that > > which array package gets used is configurable. We did something like > > that for the matplotlib plotting software at the Python level with > > the "numerix" layer, an idea I think we copied from Chaco. 
> > The kind of dispatch I think might be good to support configurability looks like this:
> >
> > PyObject *
> > whatsThis(PyObject *dummy, PyObject *args)
> > {
> >     PyObject *result, *what = NULL;
> >     if (!PyArg_ParseTuple(args, "O", &what))
> >         return 0;
> >     switch(PyArray_Which(what)) {
> >     USE_NUMERIC:
> >         result = Numeric_whatsThis(what); break;
> >     USE_NUMARRAY:
> >         result = Numarray_whatsThis(what); break;
> >     USE_SEQUENCE:
> >         result = Sequence_whatsThis(what); break;
> >     }
> >     Py_INCREF(Py_None);
> >     return Py_None;
> > }
> >
> > In the above, I'm picturing a separate .c file for Numeric_whatsThis and for Numarray_whatsThis. It would be nice to streamline that to one .c and a process which somehow (simply) produces both functions.
> >
> > Or, ideally, the above would be done more like this:
> >
> > PyObject *
> > whatsThis(PyObject *dummy, PyObject *args)
> > {
> >     PyObject *result, *what = NULL;
> >     if (!PyArg_ParseTuple(args, "O", &what))
> >         return 0;
> >     switch(Numerix_Which(what)) {
> >     USE_NUMERIX:
> >         result = Numerix_whatsThis(what); break;
> >     USE_SEQUENCE:
> >         result = Sequence_whatsThis(what); break;
> >     }
> >     Py_INCREF(Py_None);
> >     return Py_None;
> > }
> >
> > Here, a common Numerix implementation supports both numarray and Numeric from a single simple .c. The extension module would do "#include numerix/arrayobject.h" and "import_numerix()" and otherwise just call PyArray_* functions.
> >
> > The current stumbling block is that numarray is not binary compatible with Numeric... so numerix in C falls apart. I haven't analyzed every symbol and struct to see if it is really feasible... but it seems like it is *almost* feasible, at least for typical usage.
> >
> > So, in a nutshell, I think the dual implementation support you demoed is important and we should work up an example and kick it around to make sure it's the best way we can think of doing it. 
> > Then we should add a section to the PEP describing dual support as well. > > > I would never apply numarray code to Numeric arrays and the inverse. It looks > dangerous and I do not know if it is possible. I think that's definitely the marching orders for now... but you gotta admit, it would be nice. > The first thing coming > to mind is that numarray and Numeric arrays refer to different type objects > (this is what my pep module uses to differentiate them). So, even if > numarray and Numeric are binary compatible, any 'alien' code referring to > the 'Python-standard part' of the type objects may lead to surprises. > A PEP proposing hacks will raise eyebrows at least. I'm a little surprised it took someone to talk me out of it... I'll just concede that this was probably a bad idea. > Secondly, most people use Numeric *or* numarray and not both. A class of question which will arise for developers is this: "X works with Numeric, but X doesn't work with numarray." The reverse also happens occasionally. For this reason, being able to choose would be nice for developers. > So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out (NINO) > Of course, Numeric or numarray output can be a user option if NINO does not > apply. When I first heard it, I thought NINO was a good idea, with the limitation that it doesn't apply when a function produces an array without consuming any. But... there is another problem with NINO that Perry Greenfield pointed out: with multiple arguments, there can be a mix of array types. For this reason, it makes sense to be able to coerce all the inputs to a particular array package. This form might look more like:

switch(PyArray_Which()) {
case USE_NUMERIC:
    result = Numeric_doit(a1, a2, a3); break;
case USE_NUMARRAY:
    result = Numarray_doit(a1, a2, a3); break;
case USE_SEQUENCE:
    result = Sequence_doit(a1, a2, a3); break;
}

One last thing: I think it would be useful to be able to drive the code into sequence mode with arrays. 
This would enable easy benchmarking of the performance improvement. > (explicit safe conversion between Numeric and numarray is possible > if really needed). > >I'll try to flesh out the demo with real functions in the way you indicated > (going as far as I consider safe). > > The problem of coding the Numeric (or numarray) functions in more than > a single source file has also been addressed. > > It may take 2 weeks because I am off to a conference next week. Excellent. See you in a couple weeks. Regards, Todd From jmiller at stsci.edu Thu Jul 1 09:59:01 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 1 09:59:01 2004 Subject: [Numpy-discussion] Speeding up wxPython/numarray In-Reply-To: <40E3462A.9080303@cox.net> References: <40E31B31.7040105@cox.net> <1088632048.7526.204.camel@halloween.stsci.edu> <40E3462A.9080303@cox.net> Message-ID: <1088701077.14402.20.camel@halloween.stsci.edu> On Wed, 2004-06-30 at 19:00, Tim Hochberg wrote: > By this do you mean the "#if PY_VERSION_HEX >= 0x02030000 " that is > wrapped around _ndarray_item? If so, I believe that it *is* getting > compiled, it's just never getting called. > > What I think is happening is that the class NumArray inherits its > sq_item from PyClassObject. In particular, I think it picks up > instance_item from Objects/classobject.c. This appears to be fairly > expensive and, I think, ends up calling tp_as_mapping->mp_subscript. > Thus, _ndarray's sq_item slot never gets called. All of this is pretty > iffy since I don't know this stuff very well and I didn't trace it all > the way through. However, it explains what I've seen thus far. > > This is why I ended up using the horrible hack. I'm resetting NumArray's > sq_item to point to _ndarray_item instead of instance_item. I believe > that access at the python level goes through mp_subscript, so it > shouldn't be affected, and only objects at the C level should notice and > they should just get the faster sq_item. 
You will notice that there are > an awful lot of I thinks in the above paragraphs though... Ugh... Thanks for explaining this. > >>I then optimized _ndarray_item (code > >>at end). This halved the execution time of my arbitrary benchmark. This > >>trick may have horrible, unforeseen consequences so use at your own risk. > >> > >> > > > >Right now the sq_item hack strikes me as somewhere between completely > >unnecessary and too scary for me! Maybe if python-dev blessed it. > > > > > Yes, very scary. And it occurs to me that it will break subclasses of > NumArray if they override __getitem__. When these subclasses are > accessed from C they will see nd_array's sq_item instead of the > overridden getitem. However, I think I also know how to fix it. But > it does point out that it is very dangerous and there are probably dark > corners of which I'm unaware. Asking on Python-List or PyDev would > probably be a good idea. > > The nonscary, but painful, fix would be to rewrite NumArray in C. Non-scary to whom? > >This optimization looks good to me. > > > > > Unfortunately, I don't think the optimization to sq_item will affect > much since NumArray appears to override it with > > >>Finally I commented out the __del__ method in numarraycore. This resulted > >>in an additional speedup of 64% for a total speed up of 240%. Still not > >>close to 10x, but a large improvement. However, this is obviously not > >>viable for real use, but it's enough of a speedup that I'll try to see > >>if there's any way to move the shadow stuff back to tp_dealloc. > >> > >> > > > >FYI, the issue with tp_dealloc may have to do with which mode Python is > >compiled in, --with-pydebug, or not. One approach which seems like it > >ought to work (just thought of this!) is to add an extra reference in C > >to the NumArray instance __dict__ (from NumArray.__init__ and stashed > >via a new attribute in the PyArrayObject struct) and then DECREF it as > >the last part of the tp_dealloc. 
> > > > > That sounds promising. I looked at this some, and while INCREFing __dict__ may be the right idea, I forgot that there *is no* Python NumArray.__init__ anymore. So the INCREF needs to be done in C without doing any getattrs; this seems to mean calling a private _PyObject_GetDictPtr function to get a pointer to the __dict__ slot which can be dereferenced to get the __dict__. > [SNIP] > > > > >Well, be picking out your beer. > > > > > I was only about half right, so I'm not sure I qualify... We could always reduce your wages to a 12-pack... Todd From gerard.vermeulen at grenoble.cnrs.fr Thu Jul 1 11:39:08 2004 From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen) Date: Thu Jul 1 11:39:08 2004 Subject: [Numpy-discussion] Numarray header PEP In-Reply-To: <1088700210.14402.17.camel@halloween.stsci.edu> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <1088700210.14402.17.camel@halloween.stsci.edu> Message-ID: <20040701203739.31f80e02.gerard.vermeulen@grenoble.cnrs.fr> On 01 Jul 2004 12:43:31 -0400 Todd Miller wrote: > A class of question which will arise for developers is this: "X works > with Numeric, but X doesn't work with numarray." The reverse also > happens occasionally. For this reason, being able to choose would be > nice for developers. > > > So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out (NINO) > > Of course, Numeric or numarray output can be a user option if NINO does not > > apply. > > When I first heard it, I thought NINO was a good idea, with the > limitation that it doesn't apply when a function produces an array > without consuming any. But... 
there is another problem with NINO that > Perry Greenfield pointed out: with multiple arguments, there can be a > mix of array types. For this reason, it makes sense to be able to > coerce all the inputs to a particular array package. This form might > look more like: > > switch(PyArray_Which()) { > case USE_NUMERIC: > result = Numeric_doit(a1, a2, a3); break; > case USE_NUMARRAY: > result = Numarray_doit(a1, a2, a3); break; > case USE_SEQUENCE: > result = Sequence_doit(a1, a2, a3); break; > } > > One last thing: I think it would be useful to be able to drive the code > into sequence mode with arrays. This would enable easy benchmarking of > the performance improvement. > > > (explicit safe conversion between Numeric and numarray is possible > > if really needed). Yeah, when I wrote 'if really needed', I was hoping to shift the responsibility of coercion (or conversion) to the Python programmer (my lazy side telling me that it can be done in pure Python). You talked me into doing it in C :-) Regards -- Gerard From tim.hochberg at cox.net Thu Jul 1 11:52:05 2004 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Jul 1 11:52:05 2004 Subject: [Numpy-discussion] Speeding up wxPython/numarray In-Reply-To: <1088701077.14402.20.camel@halloween.stsci.edu> References: <40E31B31.7040105@cox.net> <1088632048.7526.204.camel@halloween.stsci.edu> <40E3462A.9080303@cox.net> <1088701077.14402.20.camel@halloween.stsci.edu> Message-ID: <40E45D3C.7020501@cox.net> Todd Miller wrote: >On Wed, 2004-06-30 at 19:00, Tim Hochberg wrote: > > >>>> >>>> >>>> >>>FYI, the issue with tp_dealloc may have to do with which mode Python is >>>compiled in, --with-pydebug, or not. One approach which seems like it >>>ought to work (just thought of this!) is to add an extra reference in C >>>to the NumArray instance __dict__ (from NumArray.__init__ and stashed >>>via a new attribute in the PyArrayObject struct) and then DECREF it as >>>the last part of the tp_dealloc. 
>>> >>> >>> >>> >>That sounds promising. >> >> > <> > I looked at this some, and while INCREFing __dict__ may be the right > idea, I forgot that there *is no* Python NumArray.__init__ anymore. > > So the INCREF needs to be done in C without doing any getattrs; this > seems to mean calling a private _PyObject_GetDictPtr function to get a > pointer to the __dict__ slot which can be dereferenced to get the > __dict__. Might there be a simpler way? Since you're putting an extra attribute on the PyArrayObject structure anyway, wouldn't it be possible to just stash _shadows there instead of the reference to the dictionary? It appears that the only time _shadows is accessed from python is in __del__. If it were instead an attribute on ndarray, the dealloc problem would go away since the responsibility for deallocing it would fall to ndarray. Since everything else accesses it from C, that shouldn't be much of a problem and should speed that stuff up as well. -tim From cjw at sympatico.ca Thu Jul 1 12:59:01 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Thu Jul 1 12:59:01 2004 Subject: [Numpy-discussion] Numarray header PEP In-Reply-To: <200407010904.25498.haase@msg.ucsf.edu> References: <1088451653.3744.200.camel@localhost.localdomain> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <200407010904.25498.haase@msg.ucsf.edu> Message-ID: <40E46CD3.9090802@sympatico.ca> Sebastian Haase wrote: >On Wednesday 30 June 2004 11:33 pm, gerard.vermeulen at grenoble.cnrs.fr wrote: > > >>On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote >> >> >> >>>So... you use the "meta" code to provide package specific ordinary >>>(not-macro-fied) functions to keep the different versions of the >>>Present() and isArray() macros from conflicting. >>> >>>It would be nice to have a standard approach for using the same >>>"extension enhancement code" for both numarray and Numeric. 
The PEP >>>should really be expanded to provide an example of dual support for one >>>complete and real function, guts and all, so people can see the process >>>end-to-end; Something like a simple arrayprint. That process needs >>>to be refined to remove as much tedium and duplication of effort as >>>possible. The idea is to make it as close to providing one >>>implementation to support both array packages as possible. I think it's >>>important to illustrate how to partition the extension module into >>>separate compilation units which correctly navigate the dual >>>implementation mine field in the easiest possible way. >>> >>>It would also be nice to add some logic to the meta-functions so that >>>which array package gets used is configurable. We did something like >>>that for the matplotlib plotting software at the Python level with >>>the "numerix" layer, an idea I think we copied from Chaco. The kind >>>of dispatch I think might be good to support configurability looks like >>>this: >>> >>>PyObject * >>>whatsThis(PyObject *dummy, PyObject *args) >>>{ >>> PyObject *result, *what = NULL; >>> if (!PyArg_ParseTuple(args, "O", &what)) >>> return 0; >>> switch(PyArray_Which(what)) { >>> USE_NUMERIC: >>> result = Numeric_whatsThis(what); break; >>> USE_NUMARRAY: >>> result = Numarray_whatsThis(what); break; >>> USE_SEQUENCE: >>> result = Sequence_whatsThis(what); break; >>> } >>> Py_INCREF(Py_None); >>> return Py_None; >>>} >>> >>>In the above, I'm picturing a separate .c file for Numeric_whatsThis >>>and for Numarray_whatsThis. It would be nice to streamline that to one >>>.c and a process which somehow (simply) produces both functions. 
>>> >>>Or, ideally, the above would be done more like this: >>> >>>PyObject * >>>whatsThis(PyObject *dummy, PyObject *args) >>>{ >>> PyObject *result, *what = NULL; >>> if (!PyArg_ParseTuple(args, "O", &what)) >>> return 0; >>> switch(Numerix_Which(what)) { >>> USE_NUMERIX: >>> result = Numerix_whatsThis(what); break; >>> USE_SEQUENCE: >>> result = Sequence_whatsThis(what); break; >>> } >>> Py_INCREF(Py_None); >>> return Py_None; >>>} >>> >>>Here, a common Numerix implementation supports both numarray and Numeric >>>from a single simple .c. The extension module would do "#include >>>numerix/arrayobject.h" and "import_numerix()" and otherwise just call >>>PyArray_* functions. >>> >>>The current stumbling block is that numarray is not binary compatible >>>with Numeric... so numerix in C falls apart. I haven't analyzed >>>every symbol and struct to see if it is really feasible... but it >>>seems like it is *almost* feasible, at least for typical usage. >>> >>>So, in a nutshell, I think the dual implementation support you >>>demoed is important and we should work up an example and kick it >>>around to make sure it's the best way we can think of doing it. >>>Then we should add a section to the PEP describing dual support as well. >>> >>> >>I would never apply numarray code to Numeric arrays and the inverse. It >>looks dangerous and I do not know if it is possible. The first thing >>coming to mind is that numarray and Numeric arrays refer to different type >>objects (this is what my pep module uses to differentiate them). So, even >>if numarray and Numeric are binary compatible, any 'alien' code referring >>the the 'Python-standard part' of the type objects may lead to surprises. A >>PEP proposing hacks will raise eyebrows at least. >> >>Secondly, most people use Numeric *or* numarray and not both. >> >>So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out >>(NINO) Of course, Numeric or numarray output can be a user option if NINO >>does not apply. 
(explicit safe conversion between Numeric and numarray is >>possible if really needed). >> >>I'll try to flesh out the demo with real functions in the way you indicated >>(going as far as I consider safe). >> >>The problem of coding the Numeric (or numarray) functions in more than >>a single source file has also been addressed. >> >>It may take 2 weeks because I am off to a conference next week. >> >>Regards -- Gerard >> >> > >Hi all, >first, I would like to state that I don't understand much of this discussion; >so the only comment I wanted to make is that IF this were possible, to make >(C/C++) code that can live with both Numeric and numarray, then I think it >would be used more and more - think: transition phase !! (e.g. someone could >start making the FFTW part of scipy numarray friendly without having to >switch everything at once [hint ;-)] ) > >These were just my 2 cents. >Cheers, >Sebastian Haase > > I feel lower on the understanding tree with respect to what is being proposed in the draft PEP, but would still like to offer my 2 cents worth. I get the feeling that numarray is being bent out of shape to fit Numeric. It was my understanding that Numeric had certain weaknesses which made it unacceptable as a Python component and that numarray was intended to provide the same or better functionality within a pythonic framework. numarray has not achieved the expected performance level to date, but progress is being made and I believe that, for larger arrays, numarray has been shown to be superior to Numeric - please correct me if I'm wrong here. The shock came for me when Todd Miller said: <> I looked at this some, and while INCREFing __dict__ may be the right idea, I forgot that there *is no* Python NumArray.__init__ anymore. Wasn't it the intent of numarray to work towards the full use of the Python class structure to provide the benefits which it offers? The Python class has two constructors and one destructor. 
The constructors are __init__ and __new__, the latter only provides the shell of an instance which later has to be initialized. In version 0.9, which I use, there is no __new__, but there is a new function which has a functionality similar to that intended for __new__. Thus, with this change, numarray appears to be moving further away from being pythonic. Colin W From jmiller at stsci.edu Thu Jul 1 13:03:12 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 1 13:03:12 2004 Subject: [Numpy-discussion] Speeding up wxPython/numarray In-Reply-To: <40E45D3C.7020501@cox.net> References: <40E31B31.7040105@cox.net> <1088632048.7526.204.camel@halloween.stsci.edu> <40E3462A.9080303@cox.net> <1088701077.14402.20.camel@halloween.stsci.edu> <40E45D3C.7020501@cox.net> Message-ID: <1088712102.14402.73.camel@halloween.stsci.edu> On Thu, 2004-07-01 at 14:51, Tim Hochberg wrote: > Todd Miller wrote: > > >On Wed, 2004-06-30 at 19:00, Tim Hochberg wrote: > > > > > >>>> > >>>> > >>>> > >>>FYI, the issue with tp_dealloc may have to do with which mode Python is > >>>compiled in, --with-pydebug, or not. One approach which seems like it > >>>ought to work (just thought of this!) is to add an extra reference in C > >>>to the NumArray instance __dict__ (from NumArray.__init__ and stashed > >>>via a new attribute in the PyArrayObject struct) and then DECREF it as > >>>the last part of the tp_dealloc. > >>> > >>> > >>> > >>> > >>That sounds promising. > >> > >> > > <> > > I looked at this some, and while INCREFing __dict__ may be the right > > idea, I forgot that there *is no* Python NumArray.__init__ anymore. > > > > So the INCREF needs to be done in C without doing any getattrs; this > > seems to mean calling a private _PyObject_GetDictPtr function to get a > > pointer to the __dict__ slot which can be dereferenced to get the > > __dict__. > > Might there be a simpler way? 
Since you're putting an extra attribute on > the PyArrayObject structure anyway, wouldn't it be possible to just > stash _shadows there instead of the reference to the dictionary? _shadows is already in the struct. The root problem (I recall) is not the loss of self->_shadows, it's the loss of self->__dict__ before self can be copied onto self->_shadows. The cause of the problem appeared to me to be the tear-down order of self: the NumArray part appeared to be torn down before the _numarray part, and the tp_dealloc needs to do a Python callback where a half-destructed object just won't do. To really know what the problem is, I need to stick tp_dealloc back in and see what breaks. I'm pretty sure the problem was a missing instance __dict__, but my memory is quite fallible. Todd From Chris.Barker at noaa.gov Thu Jul 1 13:18:01 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jul 1 13:18:01 2004 Subject: [Numpy-discussion] How to read data from text files fast? In-Reply-To: <20040701053355.M99698@grenoble.cnrs.fr> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> Message-ID: <40E470D9.8060603@noaa.gov> Hi all, I'm looking for a way to read data from ascii text files quickly. I've found that using the standard python idioms like:

data = zeros((M,N), Float)
for i in range(M):
    data[i] = map(float, file.readline().split())

Can be pretty slow. What I'd like is something like Matlab's fscanf:

data = fscanf(file, "%g", [M,N])

I may have the syntax a little wrong, but the gist is there. What Matlab does is keep recycling the format string until the desired number of elements have been read. It is quite flexible, and ends up being pretty fast. 
Has anyone written something like this for Numeric (or numarray, but I'd prefer Numeric at this point) ? I was surprised not to find something like this in SciPy, maybe I didn't look hard enough. If no one has done this, I guess I'll get started on it.... -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Fernando.Perez at colorado.edu Thu Jul 1 13:28:01 2004 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Jul 1 13:28:01 2004 Subject: [Numpy-discussion] How to read data from text files fast? In-Reply-To: <40E470D9.8060603@noaa.gov> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <40E470D9.8060603@noaa.gov> Message-ID: <40E473A9.5040109@colorado.edu> Chris Barker wrote: > Hi all, > > I'm looking for a way to read data from ascii text files quickly. I've > found that using the standard python idioms like: > > data = array((M,N),Float) > for in range(N): > data.append(map(float,file.readline().split())) > > Can be pretty slow. What I'd like is something like Matlab's fscanf: > > data = fscanf(file, "%g", [M,N] ) > > I may have the syntax a little wrong, but the gist is there. What Matlab > does keep recycling the format string until the desired number of > elements have been read. > > It is quite flexible, and ends up being pretty fast. > > Has anyone written something like this for Numeric (or numarray, but I'd > prefer Numeric at this point) ? > > I was surprised not to find something like this in SciPy, maybe I didn't > look hard enough. scipy.io.read_array? I haven't timed it, because it's been 'fast enough' for my needs. 
For reading binary data files, I have this little utility which is basically a wrapper around Numeric.fromstring (N below is Numeric imported 'as N'). Note that it can read binary .gz files directly, a _huge_ gain for very sparse files representing 3d arrays (I can read a 400k gz file which blows up to ~60MB when unzipped in no time at all, while reading the unzipped file is very slow):

import gzip  # needed for the .gz branch below

def read_bin(fname, dims, typecode, recast_type=None, offset=0, verbose=0):
    """Read in a binary data file. Does NOT check for endianness issues.

    Inputs:
      fname - can be .gz
      dims (nx1,nx2,...,nxd)
      typecode
      recast_type
      offset=0: # of bytes to skip in file *from the beginning* before data starts
    """
    # config parameters
    item_size = N.zeros(1, typecode).itemsize()  # size in bytes
    data_size = N.product(N.array(dims)) * item_size
    # read in data
    if fname.endswith('.gz'):
        data_file = gzip.open(fname)
    else:
        data_file = file(fname)
    data_file.seek(offset)
    data = N.fromstring(data_file.read(data_size), typecode)
    data_file.close()
    data.shape = dims
    if verbose:
        #print 'Read',data_size/item_size,'data points. Shape:',dims
        print 'Read', N.size(data), 'data points. Shape:', dims
    if recast_type is not None:
        data = data.astype(recast_type)
    return data

HTH, f From squirrel at WPI.EDU Thu Jul 1 13:37:13 2004 From: squirrel at WPI.EDU (Christopher T King) Date: Thu Jul 1 13:37:13 2004 Subject: [Numpy-discussion] numarray and SMP Message-ID: (I originally posted this in comp.lang.python and was redirected here) In a quest to speed up numarray computations, I tried writing a 'threaded array' class for use on SMP systems that would distribute its workload across the processors. I hit a snag when I found out that since the Python interpreter is not reentrant, this effectively disables parallel processing in Python.
I've come up with two solutions to this problem, both involving numarray's C functions that perform the actual vector operations: 1) Surround the C vector operations with Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS, thus allowing the vector operations (which don't access Python structures) to run in parallel with the interpreter. Python glue code would take care of threading and locking. 2) Move the parallelization into the C vector functions themselves. This would likely get poorer performance (a chain of vector operations couldn't be combined into one threaded operation). I'd much rather do #1, but will playing around with the interpreter state like that cause any problems? Update from original posting: I've partially implemented method #1 for Float64s. Running on four 2.4GHz Xeons (possibly two with hyperthreading?), I get about a 30% speedup while dividing 10 million Float64s, but a small (<10%) slowdown doing addition or multiplication. The operation was repeated 100 times, with the threads created outside of the loop (i.e. the threads weren't recreated for each iteration). Is there really that much overhead in Python? I can post the code I'm using and the numarray patch if it's requested. From gerard.vermeulen at grenoble.cnrs.fr Thu Jul 1 13:40:07 2004 From: gerard.vermeulen at grenoble.cnrs.fr (gerard.vermeulen at grenoble.cnrs.fr) Date: Thu Jul 1 13:40:07 2004 Subject: [Numpy-discussion] Numarray header PEP In-Reply-To: <40E46CD3.9090802@sympatico.ca> References: <1088451653.3744.200.camel@localhost.localdomain> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <200407010904.25498.haase@msg.ucsf.edu> <40E46CD3.9090802@sympatico.ca> Message-ID: <20040701200934.M74616@grenoble.cnrs.fr> On Thu, 01 Jul 2004 15:58:11 -0400, Colin J. 
Williams wrote > Sebastian Haase wrote: > > >On Wednesday 30 June 2004 11:33 pm, gerard.vermeulen at grenoble.cnrs.fr wrote: > > > > > >>On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote > >> > >> > >> > >>>So... you use the "meta" code to provide package specific ordinary > >>>(not-macro-fied) functions to keep the different versions of the > >>>Present() and isArray() macros from conflicting. > >>> > >>>It would be nice to have a standard approach for using the same > >>>"extension enhancement code" for both numarray and Numeric. The PEP > >>>should really be expanded to provide an example of dual support for one > >>>complete and real function, guts and all, so people can see the process > >>>end-to-end; Something like a simple arrayprint. That process needs > >>>to be refined to remove as much tedium and duplication of effort as > >>>possible. The idea is to make it as close to providing one > >>>implementation to support both array packages as possible. I think it's > >>>important to illustrate how to partition the extension module into > >>>separate compilation units which correctly navigate the dual > >>>implementation mine field in the easiest possible way. > >>> > >>>It would also be nice to add some logic to the meta-functions so that > >>>which array package gets used is configurable. We did something like > >>>that for the matplotlib plotting software at the Python level with > >>>the "numerix" layer, an idea I think we copied from Chaco. 
The kind > >>>of dispatch I think might be good to support configurability looks like > >>>this:

> >>>PyObject *
> >>>whatsThis(PyObject *dummy, PyObject *args)
> >>>{
> >>>    PyObject *result, *what = NULL;
> >>>    if (!PyArg_ParseTuple(args, "O", &what))
> >>>        return 0;
> >>>    switch(PyArray_Which(what)) {
> >>>    USE_NUMERIC:
> >>>        result = Numeric_whatsThis(what); break;
> >>>    USE_NUMARRAY:
> >>>        result = Numarray_whatsThis(what); break;
> >>>    USE_SEQUENCE:
> >>>        result = Sequence_whatsThis(what); break;
> >>>    }
> >>>    Py_INCREF(Py_None);
> >>>    return Py_None;
> >>>}

> >>>In the above, I'm picturing a separate .c file for Numeric_whatsThis > >>>and for Numarray_whatsThis. It would be nice to streamline that to one > >>>.c and a process which somehow (simply) produces both functions. > >>> > >>>Or, ideally, the above would be done more like this:

> >>>PyObject *
> >>>whatsThis(PyObject *dummy, PyObject *args)
> >>>{
> >>>    PyObject *result, *what = NULL;
> >>>    if (!PyArg_ParseTuple(args, "O", &what))
> >>>        return 0;
> >>>    switch(Numerix_Which(what)) {
> >>>    USE_NUMERIX:
> >>>        result = Numerix_whatsThis(what); break;
> >>>    USE_SEQUENCE:
> >>>        result = Sequence_whatsThis(what); break;
> >>>    }
> >>>    Py_INCREF(Py_None);
> >>>    return Py_None;
> >>>}

> >>>Here, a common Numerix implementation supports both numarray and Numeric > >>>from a single simple .c. The extension module would do "#include > >>>numerix/arrayobject.h" and "import_numerix()" and otherwise just call > >>>PyArray_* functions. > >>> > >>>The current stumbling block is that numarray is not binary compatible > >>>with Numeric... so numerix in C falls apart. I haven't analyzed > >>>every symbol and struct to see if it is really feasible... but it > >>>seems like it is *almost* feasible, at least for typical usage.
> >>> > >>>So, in a nutshell, I think the dual implementation support you > >>>demoed is important and we should work up an example and kick it > >>>around to make sure it's the best way we can think of doing it. > >>>Then we should add a section to the PEP describing dual support as well. > >>> > >>> > >>I would never apply numarray code to Numeric arrays and the inverse. It > >>looks dangerous and I do not know if it is possible. The first thing > >>coming to mind is that numarray and Numeric arrays refer to different type > >>objects (this is what my pep module uses to differentiate them). So, even > >>if numarray and Numeric are binary compatible, any 'alien' code referring > >>to the 'Python-standard part' of the type objects may lead to surprises. A > >>PEP proposing hacks will raise eyebrows at least. > >> > >>Secondly, most people use Numeric *or* numarray and not both. > >> > >>So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out > >>(NINO) Of course, Numeric or numarray output can be a user option if NINO > >>does not apply. (explicit safe conversion between Numeric and numarray is > >>possible if really needed). > >> > >>I'll try to flesh out the demo with real functions in the way you indicated > >>(going as far as I consider safe). > >> > >>The problem of coding the Numeric (or numarray) functions in more than > >>a single source file has also been addressed. > >> > >>It may take 2 weeks because I am off to a conference next week. > >> > >>Regards -- Gerard > >> > >> > > > >Hi all, > >first, I would like to state that I don't understand much of this discussion; > >so the only comment I wanted to make is that IF this were possible, to make > >(C/C++) code that can live with both Numeric and numarray, then I think it > >would be used more and more - think: transition phase !! (e.g.
someone could > >start making the FFTW part of scipy numarray friendly without having to > >switch everything at once [hint ;-)] ) > > > >These were just my 2 cents. > >Cheers, > >Sebastian Haase > > > > > I feel lower on the understanding tree with respect to what is being > proposed in the draft PEP, but would still like to offer my 2 cents > worth. I get the feeling that numarray is being bent out of shape > to fit Numeric. > What we are discussing are methods to make it possible to import Numeric and numarray in the same extension module. This can be done by separating the colliding APIs of Numeric and numarray in separate *.c files. To achieve this, no changes to Numeric and numarray itself are necessary. In fact, this can be done by the author of the C-extension himself, but since it is not obvious, we are discussing the best methods and would like to provide the necessary glue code. It will make life easier for extension writers and facilitate the transition to numarray. Try to look at the problem from the other side: I am using Numeric (since my life depends on SciPy) but have written an extension that can also import numarray (hoping to get more users). I will never use the methods proposed in the draft PEP, because it excludes importing Numeric. > > It was my understanding that Numeric had certain weaknesses which made > it unacceptable as a Python component and that numarray was intended > to provide the same or better functionality within a pythonic framework. > > numarray has not achieved the expected performance level to date, > but progress is being made and I believe that, for larger arrays, > numarray has been shown to be superior to Numeric - please > correct me if I'm wrong here. > I think you are correct. I don't know why the __init__ has disappeared, but I don't think it is because of the PEP and certainly not because of the thread.
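The two-constructor protocol being discussed can be illustrated in plain Python (a generic sketch, not numarray's actual implementation, which does the equivalent work in C):

```python
class Point:
    def __new__(cls, *args, **kwds):
        # __new__ allocates and returns the bare "shell" of an instance...
        self = object.__new__(cls)
        self.initialized = False
        return self

    def __init__(self, x, y):
        # ...and __init__ then fills it in.
        self.x, self.y = x, y
        self.initialized = True
```

Calling `Point.__new__(Point)` directly yields an uninitialized shell, exactly the "not very useful" state shown later in this thread for `NumArray.__new__`.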
> > The shock came for me when Todd Miller said: > > <> > I looked at this some, and while INCREFing __dict__ maybe the right > idea, I forgot that there *is no* Python NumArray.__init__ anymore. > > Wasn't it the intent of numarray to work towards the full use of the > Python class structure to provide the benefits which it offers? > > The Python class has two constructors and one destructor. > > The constructors are __init__ and __new__, the latter only provides > the shell of an instance which later has to be initialized. In > version 0.9, which I use, there is no __new__, but there is a new > function which has a functionality similar to that intended for > __new__. Thus, with this change, numarray appears to be moving > further away from being pythonic. > Gerard From jmiller at stsci.edu Thu Jul 1 13:46:07 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 1 13:46:07 2004 Subject: [Numpy-discussion] Numarray header PEP In-Reply-To: <40E46CD3.9090802@sympatico.ca> References: <1088451653.3744.200.camel@localhost.localdomain> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <200407010904.25498.haase@msg.ucsf.edu> <40E46CD3.9090802@sympatico.ca> Message-ID: <1088714723.14402.114.camel@halloween.stsci.edu> On Thu, 2004-07-01 at 15:58, Colin J. Williams wrote: > Sebastian Haase wrote: > > >On Wednesday 30 June 2004 11:33 pm, gerard.vermeulen at grenoble.cnrs.fr wrote: > > > > > >>On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote > >> > >> > >> > >>>So... you use the "meta" code to provide package specific ordinary > >>>(not-macro-fied) functions to keep the different versions of the > >>>Present() and isArray() macros from conflicting. > >>> > >>>It would be nice to have a standard approach for using the same > >>>"extension enhancement code" for both numarray and Numeric. 
The PEP > >>>should really be expanded to provide an example of dual support for one > >>>complete and real function, guts and all, so people can see the process > >>>end-to-end; Something like a simple arrayprint. That process needs > >>>to be refined to remove as much tedium and duplication of effort as > >>>possible. The idea is to make it as close to providing one > >>>implementation to support both array packages as possible. I think it's > >>>important to illustrate how to partition the extension module into > >>>separate compilation units which correctly navigate the dual > >>>implementation mine field in the easiest possible way. > >>> > >>>It would also be nice to add some logic to the meta-functions so that > >>>which array package gets used is configurable. We did something like > >>>that for the matplotlib plotting software at the Python level with > >>>the "numerix" layer, an idea I think we copied from Chaco. The kind > >>>of dispatch I think might be good to support configurability looks like > >>>this: > >>> > >>>PyObject * > >>>whatsThis(PyObject *dummy, PyObject *args) > >>>{ > >>> PyObject *result, *what = NULL; > >>> if (!PyArg_ParseTuple(args, "O", &what)) > >>> return 0; > >>> switch(PyArray_Which(what)) { > >>> USE_NUMERIC: > >>> result = Numeric_whatsThis(what); break; > >>> USE_NUMARRAY: > >>> result = Numarray_whatsThis(what); break; > >>> USE_SEQUENCE: > >>> result = Sequence_whatsThis(what); break; > >>> } > >>> Py_INCREF(Py_None); > >>> return Py_None; > >>>} > >>> > >>>In the above, I'm picturing a separate .c file for Numeric_whatsThis > >>>and for Numarray_whatsThis. It would be nice to streamline that to one > >>>.c and a process which somehow (simply) produces both functions. 
> >>> > >>>Or, ideally, the above would be done more like this: > >>> > >>>PyObject * > >>>whatsThis(PyObject *dummy, PyObject *args) > >>>{ > >>> PyObject *result, *what = NULL; > >>> if (!PyArg_ParseTuple(args, "O", &what)) > >>> return 0; > >>> switch(Numerix_Which(what)) { > >>> USE_NUMERIX: > >>> result = Numerix_whatsThis(what); break; > >>> USE_SEQUENCE: > >>> result = Sequence_whatsThis(what); break; > >>> } > >>> Py_INCREF(Py_None); > >>> return Py_None; > >>>} > >>> > >>>Here, a common Numerix implementation supports both numarray and Numeric > >>>from a single simple .c. The extension module would do "#include > >>>numerix/arrayobject.h" and "import_numerix()" and otherwise just call > >>>PyArray_* functions. > >>> > >>>The current stumbling block is that numarray is not binary compatible > >>>with Numeric... so numerix in C falls apart. I haven't analyzed > >>>every symbol and struct to see if it is really feasible... but it > >>>seems like it is *almost* feasible, at least for typical usage. > >>> > >>>So, in a nutshell, I think the dual implementation support you > >>>demoed is important and we should work up an example and kick it > >>>around to make sure it's the best way we can think of doing it. > >>>Then we should add a section to the PEP describing dual support as well. > >>> > >>> > >>I would never apply numarray code to Numeric arrays and the inverse. It > >>looks dangerous and I do not know if it is possible. The first thing > >>coming to mind is that numarray and Numeric arrays refer to different type > >>objects (this is what my pep module uses to differentiate them). So, even > >>if numarray and Numeric are binary compatible, any 'alien' code referring > >>the the 'Python-standard part' of the type objects may lead to surprises. A > >>PEP proposing hacks will raise eyebrows at least. > >> > >>Secondly, most people use Numeric *or* numarray and not both. 
> >> > >>So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out > >>(NINO) Of course, Numeric or numarray output can be a user option if NINO > >>does not apply. (explicit safe conversion between Numeric and numarray is > >>possible if really needed). > >> > >>I'll try to flesh out the demo with real functions in the way you indicated > >>(going as far as I consider safe). > >> > >>The problem of coding the Numeric (or numarray) functions in more than > >>a single source file has also been addressed. > >> > >>It may take 2 weeks because I am off to a conference next week. > >> > >>Regards -- Gerard > >> > >> > > > >Hi all, > >first, I would like to state that I don't understand much of this discussion; > >so the only comment I wanted to make is that IF this were possible, to make > >(C/C++) code that can live with both Numeric and numarray, then I think it > >would be used more and more - think: transition phase !! (e.g. someone could > >start making the FFTW part of scipy numarray friendly without having to > >switch everything at once [hint ;-)] ) > > > >These were just my 2 cents. > >Cheers, > >Sebastian Haase > > > > > I feel lower on the understanding tree with respect to what is being > proposed in the draft PEP, but would still like to offer my 2 cents > worth. I get the feeling that numarray is being bent out of shape to > fit Numeric. Yes and no. The numarray team has over time realized the importance of backward compatibility with the dominant array package, Numeric. A lot of people use Numeric now. We're trying to make it as easy as possible to use numarray. > It was my understanding that Numeric had certain weaknesses which made it > unacceptable as a Python component and that numarray was intended to > provide the same or better functionality within a pythonic framework. My understanding is that until there is a consensus on an array package, neither numarray nor Numeric is going into the Python core.
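The C-level dispatch sketched earlier in the thread has a straightforward Python-level analogue (hypothetical code; numpy stands in for one array backend, and the real numerix layer in matplotlib instead selected its backend once at import time):

```python
import numpy as np  # stand-in for one array backend

def whats_this(obj):
    """Python-level analogue of the PyArray_Which()-style dispatch:
    route an object to backend-specific code by its kind."""
    if isinstance(obj, np.ndarray):
        return "array"
    try:
        len(obj)            # cheap generic-sequence check
        return "sequence"
    except TypeError:
        return "other"
```

The per-kind branches would then call into the backend-specific compilation units, as in Todd's C example.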
> numarray has not achieved the expected performance level to date, but > progress is being made and I believe that, for larger arrays, numarray > has been shown to be superior to Numeric - please correct me if I'm > wrong here. I think that's a fair summary. > > The shock came for me when Todd Miller said: > <> > I looked at this some, and while INCREFing __dict__ may be the right > idea, I forgot that there *is no* Python NumArray.__init__ anymore. > > Wasn't it the intent of numarray to work towards the full use of the > Python class structure to provide the benefits which it offers? > Ack. I wasn't trying to start a panic. The __init__ still exists, as does __new__, they're just in C. Sorry if I was unclear. > The Python class has two constructors and one destructor. We're mostly on the same page. > The constructors are __init__ and __new__, the latter only provides the > shell of an instance which later has to be initialized. In version 0.9, > which I use, there is no __new__, It's there, but it's not very useful:

>>> import numarray
>>> numarray.NumArray.__new__
>>> a = numarray.NumArray.__new__(numarray.NumArray)
>>> a.info()
class:
shape: ()
strides: ()
byteoffset: 0
bytestride: 0
itemsize: 0
aligned: 1
contiguous: 1
data: None
byteorder: little
byteswap: 0
type: Any

I don't, however, recommend doing this. > but there is a new function which has > a functionality similar to that intended for __new__. Thus, with this > change, numarray appears to be moving further away from being pythonic. Nope. I'm talking about moving toward better speed with no change in functionality at the Python level. I also think maybe we've gotten list threads crossed here: the "Numarray header PEP" thread is independent of (but admittedly related to) the "Speeding up wxPython/numarray" thread. The Numarray header PEP is about making it easy for packages to write C extensions which *optionally* support numarray (and now Numeric as well).
One aspect of the PEP is getting headers included in the Python core so that extensions can be compiled even when numarray is not installed. The other aspect will be illustrating a good technique for supporting both numarray and Numeric, optionally and with choice, at the same time. Such an extension would still run where there is numarray, Numeric, both, or none installed. Gerard V. has already done some integration of numarray and Numeric with PyQwt so he has a few good ideas on how to do the "good technique" aspect of the PEP. The Speeding up wxPython/numarray thread is about improving the performance of a 50000 point wxPython drawlines which is 10x slower with numarray than Numeric. Tim H. and Chris B. have nailed this down (mostly) to the numarray sequence protocol and destructor, __del__. Regards, Todd From perry at stsci.edu Thu Jul 1 13:57:02 2004 From: perry at stsci.edu (Perry Greenfield) Date: Thu Jul 1 13:57:02 2004 Subject: [Numpy-discussion] Numarray header PEP In-Reply-To: <40E46CD3.9090802@sympatico.ca> Message-ID: Colin J. Williams wrote: > I feel lower on the understanding tree with respect to what is being > proposed in the draft PEP, but would still like to offer my 2 cents > worth. I get the feeling that numarray is being bent out of shape to > fit Numeric. > Todd and Gerard address this point well. > It was my understanding that Numeric had certain weaknesses which made it > unacceptable as a Python component and that numarray was intended to > provide the same or better functionality within a pythonic framework. > Let me reiterate what our motivations were. We wanted to use an array package for our software, and Numeric had enough shortcomings that we needed some changes in behavior (e.g., type coercion for scalars), changes in performance (particularly with regard to memory usage), and enhancements in capabilities (e.g., memory mapping, record arrays, etc.).
It was the opinion of some (Paul Dubois, for example) that a rewrite was in order in any case since the code was not that maintainable (not everyone felt this way, though at the time that wasn't as clear). At the same time there was some hope that Numeric could be accepted into the standard Python distribution. That's something we thought would be good (but wasn't the highest priority for us) and I've come to believe that perhaps a better solution with regard to that is what this PEP is trying to address. In any case Guido made it clear that he would not accept Numeric in its (then) current form. That it be written mostly in Python was something suggested by Guido, and we started off that way, mainly because it would get us going much faster than writing it all in C. We definitely understood that it would also have the consequence of making small array performance worse. We said as much when we started; it wasn't as clear as it is now that many users objected to a factor of a few slower performance (as it turned out, a mostly Python-based implementation was more than an order of magnitude slower for small arrays). > numarray has not achieved the expected performance level to date, but > progress is being made and I believe that, for larger arrays, numarray > has been shown to be superior to Numeric - please correct me if I'm > wrong here. > We never expected numarray to ever reach the performance level for small arrays that Numeric has. If it were within a factor of two I would be thrilled (it's more like a factor of 3 or 4 currently for simple ufuncs). I still don't think it ever will be as fast for small arrays. The focus all along was on handling large arrays, which I think it does quite well, both with regard to memory and speed. Yes, there are some functions and operations that may be much slower. Mainly they need to be called out so they can be improved. Generally we only notice performance issues that affect our software.
Others need to point out remaining large discrepancies. I'm still of the opinion that if small array performance is really important, a very different approach should be used, with a completely different implementation. I would think that improvements of an order of magnitude over what Numeric does now are possible. But since that isn't important to us (STScI), don't expect us to work on that :-) > The shock came for me when Todd Miller said: > > <> > I looked at this some, and while INCREFing __dict__ may be the right > idea, I forgot that there *is no* Python NumArray.__init__ anymore. > > Wasn't it the intent of numarray to work towards the full use of the > Python class structure to provide the benefits which it offers? > > The Python class has two constructors and one destructor. > > The constructors are __init__ and __new__, the latter only provides the > shell of an instance which later has to be initialized. In version 0.9, > which I use, there is no __new__, but there is a new function which has > a functionality similar to that intended for __new__. Thus, with this > change, numarray appears to be moving further away from being pythonic. > I'll agree that optimization is driving the underlying implementation to one that is more complex and that is the drawback (no surprise there). There's Pythonic in use and Pythonic in implementation. We are certainly receptive to better ideas for the implementation, but I doubt that a heavily Python-based implementation is ever going to be competitive for small arrays (unless something like psyco becomes universal, but I think there are a whole mess of problems to be solved for that kind of approach to work well generically).
Perry From perry at stsci.edu Thu Jul 1 15:01:04 2004 From: perry at stsci.edu (Perry Greenfield) Date: Thu Jul 1 15:01:04 2004 Subject: [Numpy-discussion] numarray and SMP In-Reply-To: Message-ID: Christopher T King wrote: > > (I originally posted this in comp.lang.python and was redirected here) > > In a quest to speed up numarray computations, I tried writing a 'threaded > array' class for use on SMP systems that would distribute its workload > across the processors. I hit a snag when I found out that since > the Python > interpreter is not reentrant, this effectively disables parallel > processing in Python. I've come up with two solutions to this problem, > both involving numarray's C functions that perform the actual vector > operations: > > 1) Surround the C vector operations with Py_BEGIN_ALLOW_THREADS and > Py_END_ALLOW_THREADS, thus allowing the vector operations (which don't > access Python structures) to run in parallel with the interpreter. > Python glue code would take care of threading and locking. > > 2) Move the parallelization into the C vector functions themselves. This > would likely get poorer performance (a chain of vector operations > couldn't be combined into one threaded operation). > > I'd much rather do #1, but will playing around with the interpreter state > like that cause any problems? > I don't think so, but it raises a number of questions that I ask just below. > Update from original posting: > > I've partially implemented method #1 for Float64s. Running on four 2.4GHz > Xeons (possibly two with hyperthreading?), I get about a 30% speedup while > dividing 10 million Float64s, but a small (<10%) slowdown doing addition > or multiplication. The operation was repeated 100 times, with the threads > created outside of the loop (i.e. the threads weren't recreated for each > iteration). Is there really that much overhead in Python? I can post the > code I'm using and the numarray patch if it's requested. 
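The Python-side glue of method #1 can be sketched as follows (a hypothetical sketch: numpy and the stdlib threading module stand in for numarray and the patched C ufuncs, and an actual speedup still depends on the C loops releasing the GIL as in the patch being discussed; correctness does not):

```python
import threading
import numpy as np  # stand-in for numarray in this sketch

def threaded_binary_op(op, a, b, out, nthreads=4):
    """Apply a binary ufunc across nthreads threads by slicing axis 0."""
    bounds = [i * a.shape[0] // nthreads for i in range(nthreads + 1)]

    def worker(lo, hi):
        op(a[lo:hi], b[lo:hi], out[lo:hi])   # ufuncs accept an output array

    threads = [threading.Thread(target=worker, args=(bounds[i], bounds[i + 1]))
               for i in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

The threads are created per call here; King's test reuses them across iterations, which is why per-iteration thread overhead matters in his timings.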
> Questions and comments: 1) I suppose you did this for generated ufunc code? (ideally one would put this in the codegenerator stuff but for the purposes of testing it would be fine). I guess we would like to see how you actually changed the code fragment (you can email me or Todd Miller directly if you wish) 2) How much improvement you would see depends on many details. But if you were doing this for 10 million element arrays, I'm surprised you saw such a small improvement (30% for 4 processors isn't worth the trouble it would seem). So seeing the actual test code would be helpful. If the array operation you are doing for numarray aren't simple (that's a specialized use of the word; by that I mean if the arrays are not the same type, aren't contiguous, aren't aligned, or aren't of proper byte-order) then there are a number of other issues that may slow it down quite a bit (and there are ways of improving these for parallel processing). 3) I don't speak as an expert on threading or parallel processors, but I believe so long as you don't call any Python API functions (either directly or indirectly) between the global interpreter lock release and reacquisition, you should be fine. The vector ufunc code in numarray should satisfy this fine. Perry Greenfield From squirrel at WPI.EDU Fri Jul 2 06:37:20 2004 From: squirrel at WPI.EDU (Christopher T King) Date: Fri Jul 2 06:37:20 2004 Subject: [Numpy-discussion] numarray and SMP In-Reply-To: Message-ID: On Thu, 1 Jul 2004, Perry Greenfield wrote: > 1) I suppose you did this for generated ufunc code? (ideally one > would put this in the codegenerator stuff but for the purposes > of testing it would be fine). I guess we would like to see > how you actually changed the code fragment (you can email > me or Todd Miller directly if you wish) Yep, I didn't know it was automatically generated :P > 2) How much improvement you would see depends on many details. 
> But if you were doing this for 10 million element arrays, I'm > surprised you saw such a small improvement (30% for 4 processors > isn't worth the trouble it would seem). So seeing the actual > test code would be helpful. If the array operations you are doing > for numarray aren't simple (that's a specialized use of the word; > by that I mean if the arrays are not the same type, aren't > contiguous, aren't aligned, or aren't of proper byte-order) > then there are a number of other issues that may slow it down > quite a bit (and there are ways of improving these for > parallel processing). I've been careful not to use anything to cause discontiguities in the arrays, and to keep them all the same type (Float64 in this case). See my next post for the code I'm using. From haase at msg.ucsf.edu Fri Jul 2 08:28:01 2004 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri Jul 2 08:28:01 2004 Subject: [Numpy-discussion] bug in numarray.maximum.reduce ? In-Reply-To: <200406291705.55454.haase@msg.ucsf.edu> References: <200406291705.55454.haase@msg.ucsf.edu> Message-ID: <200407020827.05407.haase@msg.ucsf.edu> On Tuesday 29 June 2004 05:05 pm, Sebastian Haase wrote:

> Hi,
>
> Is this a bug?:
> >>> # (import numarray as na ; 'd' is a 3 dimensional array)
> >>> d.type()
> Float32
> >>> d[80, 136, 122]
> 80.3997039795
> >>> na.maximum.reduce(d[:,136, 122])
> 85.8426361084
> >>> na.maximum.reduce(d)[136, 122]
> 37.3658103943
> >>> na.maximum.reduce(d,0)[136, 122]
> 37.3658103943
> >>> na.maximum.reduce(d,1)[136, 122]
> Traceback (most recent call last):
> File "", line 1, in ?
> IndexError: Index out of range
>
> I was using na.maximum.reduce(d) to get a "pixelwise" maximum along Z
> (axis 0). But as seen above it does not get it right.
I then tried to > reproduce > > this with some simple arrays, but here it works just fine: > >>> a = na.arange(4*4*4) > >>> a.shape=(4,4,4) > >>> na.maximum.reduce(a) > > [[48 49 50 51] > [52 53 54 55] > [56 57 58 59] > [60 61 62 63]] > > >>> a = na.arange(4*4*4).astype(na.Float32) > >>> a.shape=(4,4,4) > >>> na.maximum.reduce(a) > > [[ 48. 49. 50. 51.] > [ 52. 53. 54. 55.] > [ 56. 57. 58. 59.] > [ 60. 61. 62. 63.]] > > > Any hint ? > > Regards, > Sebastian Haase Hi again, I think the reason that no one responded to this is that it just sounds too unbelievable ... Sorry for the missing piece of information, but 'd' is actually a memmapped array ! >>> d.info() class: shape: (80, 150, 150) strides: (90000, 600, 4) byteoffset: 0 bytestride: 4 itemsize: 4 aligned: 1 contiguous: 1 data: byteorder: big byteswap: 1 type: Float32 >>> dd = d.copy() >>> na.maximum.reduce(dd[:,136, 122]) 85.8426361084 >>> na.maximum.reduce(dd)[136, 122] 85.8426361084 >>> Apparently we are using memmap so frequently now that I didn't even think about that - which is good news for everyone, because it means that it works (mostly). I just see that 'byteorder' is 'big' - I'm running this on an Intel Linux PC. Could this be the problem? Please, some comments! Thanks, Sebastian From jmiller at stsci.edu Fri Jul 2 09:03:08 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jul 2 09:03:08 2004 Subject: [Numpy-discussion] bug in numarray.maximum.reduce ?
In-Reply-To: <200407020827.05407.haase@msg.ucsf.edu> References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> Message-ID: <1088784157.26482.14.camel@halloween.stsci.edu> On Fri, 2004-07-02 at 11:27, Sebastian Haase wrote: > On Tuesday 29 June 2004 05:05 pm, Sebastian Haase wrote: > > Hi, > > > > Is this a bug?: > > >>> # (import numarray as na ; 'd' is a 3 dimensional array) > > >>> d.type() > > > > Float32 > > > > >>> d[80, 136, 122] > > > > 80.3997039795 > > > > >>> na.maximum.reduce(d[:,136, 122]) > > > > 85.8426361084 > > > > >>> na.maximum.reduce(d) [136, 122] > > > > 37.3658103943 > > > > >>> na.maximum.reduce(d,0)[136, 122] > > > > 37.3658103943 > > > > >>> na.maximum.reduce(d,1)[136, 122] > > > > Traceback (most recent call last): > > File "", line 1, in ? > > IndexError: Index out of range > > > > I was using na.maximum.reduce(d) to get a "pixelwise" maximum along Z > > (axis 0). But as seen above it does not get it right. I then tried to > > reproduce > > > > this with some simple arrays, but here it works just fine: > > >>> a = na.arange(4*4*4) > > >>> a.shape=(4,4,4) > > >>> na.maximum.reduce(a) > > > > [[48 49 50 51] > > [52 53 54 55] > > [56 57 58 59] > > [60 61 62 63]] > > > > >>> a = na.arange(4*4*4).astype(na.Float32) > > >>> a.shape=(4,4,4) > > >>> na.maximum.reduce(a) > > > > [[ 48. 49. 50. 51.] > > [ 52. 53. 54. 55.] > > [ 56. 57. 58. 59.] > > [ 60. 61. 62. 63.]] > > > > > > Any hint ? > > > > Regards, > > Sebastian Haase > > Hi again, > I think the reason that no one responded to this is that it just sounds to > unbelievable ... This just slipped through the cracks for me. > Sorry for the missing piece of information, but 'd' is actually a memmapped > array ! 
> >>> d.info() > class: > shape: (80, 150, 150) > strides: (90000, 600, 4) > byteoffset: 0 > bytestride: 4 > itemsize: 4 > aligned: 1 > contiguous: 1 > data: > byteorder: big > byteswap: 1 > type: Float32 > >>> dd = d.copy() > >>> na.maximum.reduce(dd[:,136, 122]) > 85.8426361084 > >>> na.maximum.reduce(dd)[136, 122] > 85.8426361084 > >>> > > Apparently we are using memmap so frequently now that I didn't even think > about that - which is good news for everyone, because it means that it works > (mostly). > > I just see that 'byteorder' is 'big' - I'm running this on an Intel Linux PC. > Could this be the problem? I think byteorder is a good guess at this point. What version of Python and numarray are you using? Regards, Todd From haase at msg.ucsf.edu Fri Jul 2 10:46:01 2004 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri Jul 2 10:46:01 2004 Subject: [Numpy-discussion] bug in numarray.maximum.reduce ? In-Reply-To: <1088784157.26482.14.camel@halloween.stsci.edu> References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> <1088784157.26482.14.camel@halloween.stsci.edu> Message-ID: <200407021045.00866.haase@msg.ucsf.edu> On Friday 02 July 2004 09:02 am, Todd Miller wrote: > On Fri, 2004-07-02 at 11:27, Sebastian Haase wrote: > > On Tuesday 29 June 2004 05:05 pm, Sebastian Haase wrote: > > > Hi, > > > > > > Is this a bug?: > > > >>> # (import numarray as na ; 'd' is a 3 dimensional array) > > > >>> d.type() > > > > > > Float32 > > > > > > >>> d[80, 136, 122] > > > > > > 80.3997039795 > > > > > > >>> na.maximum.reduce(d[:,136, 122]) > > > > > > 85.8426361084 > > > > > > >>> na.maximum.reduce(d) [136, 122] > > > > > > 37.3658103943 > > > > > > >>> na.maximum.reduce(d,0)[136, 122] > > > > > > 37.3658103943 > > > > > > >>> na.maximum.reduce(d,1)[136, 122] > > > > > > Traceback (most recent call last): > > > File "", line 1, in ? 
> > > IndexError: Index out of range > > > > > > I was using na.maximum.reduce(d) to get a "pixelwise" maximum along Z > > > (axis 0). But as seen above it does not get it right. I then tried to > > > reproduce > > > > > > this with some simple arrays, but here it works just fine: > > > >>> a = na.arange(4*4*4) > > > >>> a.shape=(4,4,4) > > > >>> na.maximum.reduce(a) > > > > > > [[48 49 50 51] > > > [52 53 54 55] > > > [56 57 58 59] > > > [60 61 62 63]] > > > > > > >>> a = na.arange(4*4*4).astype(na.Float32) > > > >>> a.shape=(4,4,4) > > > >>> na.maximum.reduce(a) > > > > > > [[ 48. 49. 50. 51.] > > > [ 52. 53. 54. 55.] > > > [ 56. 57. 58. 59.] > > > [ 60. 61. 62. 63.]] > > > > > > > > > Any hint ? > > > > > > Regards, > > > Sebastian Haase > > > > Hi again, > > I think the reason that no one responded to this is that it just sounds > > to unbelievable ... > > This just slipped through the cracks for me. > > > Sorry for the missing piece of information, but 'd' is actually a > > memmapped array ! > > > > >>> d.info() > > > > class: > > shape: (80, 150, 150) > > strides: (90000, 600, 4) > > byteoffset: 0 > > bytestride: 4 > > itemsize: 4 > > aligned: 1 > > contiguous: 1 > > data: > > byteorder: big > > byteswap: 1 > > type: Float32 > > > > >>> dd = d.copy() > > >>> na.maximum.reduce(dd[:,136, 122]) > > > > 85.8426361084 > > > > >>> na.maximum.reduce(dd)[136, 122] > > > > 85.8426361084 > > > > > > Apparently we are using memmap so frequently now that I didn't even think > > about that - which is good news for everyone, because it means that it > > works (mostly). > > > > I just see that 'byteorder' is 'big' - I'm running this on an Intel Linux > > PC. Could this be the problem? > > I think byteorder is a good guess at this point. What version of Python > and numarray are you using? Python 2.2.1 (#1, Feb 28 2004, 00:52:10) [GCC 2.95.4 20011002 (Debian prerelease)] on linux2 numarray 0.9 - from CVS on 2004-05-13. 
Regards, Sebastian Haase From jmiller at stsci.edu Fri Jul 2 12:34:09 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jul 2 12:34:09 2004 Subject: [Numpy-discussion] bug in numarray.maximum.reduce ? In-Reply-To: <200407021045.00866.haase@msg.ucsf.edu> References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> <1088784157.26482.14.camel@halloween.stsci.edu> <200407021045.00866.haase@msg.ucsf.edu> Message-ID: <1088796821.5974.15.camel@halloween.stsci.edu> On Fri, 2004-07-02 at 13:45, Sebastian Haase wrote: > On Friday 02 July 2004 09:02 am, Todd Miller wrote: > > On Fri, 2004-07-02 at 11:27, Sebastian Haase wrote: > > > On Tuesday 29 June 2004 05:05 pm, Sebastian Haase wrote: > > > > Hi, > > > > > > > > Is this a bug?: > > > > >>> # (import numarray as na ; 'd' is a 3 dimensional array) > > > > >>> d.type() > > > > > > > > Float32 > > > > > > > > >>> d[80, 136, 122] > > > > > > > > 80.3997039795 > > > > > > > > >>> na.maximum.reduce(d[:,136, 122]) > > > > > > > > 85.8426361084 > > > > > > > > >>> na.maximum.reduce(d) [136, 122] > > > > > > > > 37.3658103943 > > > > > > > > >>> na.maximum.reduce(d,0)[136, 122] > > > > > > > > 37.3658103943 > > > > > > > > >>> na.maximum.reduce(d,1)[136, 122] > > > > > > > > Traceback (most recent call last): > > > > File "", line 1, in ? > > > > IndexError: Index out of range > > > > > > > > I was using na.maximum.reduce(d) to get a "pixelwise" maximum along Z > > > > (axis 0). But as seen above it does not get it right. I then tried to > > > > reproduce > > > > > > > > this with some simple arrays, but here it works just fine: > > > > >>> a = na.arange(4*4*4) > > > > >>> a.shape=(4,4,4) > > > > >>> na.maximum.reduce(a) > > > > > > > > [[48 49 50 51] > > > > [52 53 54 55] > > > > [56 57 58 59] > > > > [60 61 62 63]] > > > > > > > > >>> a = na.arange(4*4*4).astype(na.Float32) > > > > >>> a.shape=(4,4,4) > > > > >>> na.maximum.reduce(a) > > > > > > > > [[ 48. 49. 50. 51.] > > > > [ 52. 53. 54. 
55.] > > > > [ 56. 57. 58. 59.] > > > > [ 60. 61. 62. 63.]] > > > > > > > > > > > > Any hint ? > > > > > > > > Regards, > > > > Sebastian Haase > > > > > > Hi again, > > > I think the reason that no one responded to this is that it just sounds > > > to unbelievable ... > > > > This just slipped through the cracks for me. > > > > > Sorry for the missing piece of information, but 'd' is actually a > > > memmapped array ! > > > > > > >>> d.info() > > > > > > class: > > > shape: (80, 150, 150) > > > strides: (90000, 600, 4) > > > byteoffset: 0 > > > bytestride: 4 > > > itemsize: 4 > > > aligned: 1 > > > contiguous: 1 > > > data: > > > byteorder: big > > > byteswap: 1 > > > type: Float32 > > > > > > >>> dd = d.copy() > > > >>> na.maximum.reduce(dd[:,136, 122]) > > > > > > 85.8426361084 > > > > > > >>> na.maximum.reduce(dd)[136, 122] > > > > > > 85.8426361084 > > > > > > > > > Apparently we are using memmap so frequently now that I didn't even think > > > about that - which is good news for everyone, because it means that it > > > works (mostly). > > > > > > I just see that 'byteorder' is 'big' - I'm running this on an Intel Linux > > > PC. Could this be the problem? > > > > I think byteorder is a good guess at this point. What version of Python > > and numarray are you using? > > Python 2.2.1 (#1, Feb 28 2004, 00:52:10) > [GCC 2.95.4 20011002 (Debian prerelease)] on linux2 > > numarray 0.9 - from CVS on 2004-05-13. > > Regards, > Sebastian Haase Hi Sebastian, I logged this on SF as a bug but won't get to it until next week after numarray-1.0 comes out. Regards, Todd From jmiller at stsci.edu Fri Jul 2 14:06:13 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jul 2 14:06:13 2004 Subject: [Numpy-discussion] ANN: numarray-1.0 released Message-ID: <1088802348.5974.28.camel@halloween.stsci.edu> Release Notes for numarray-1.0 Numarray is an array processing package designed to efficiently manipulate large multi-dimensional arrays. 
Numarray is modeled after Numeric and features c-code generated from python template scripts, the capacity to operate directly on arrays in files, and improved type promotions. I. ENHANCEMENTS 1. User added ufuncs There's a setup.py file in numarray-1.0/Examples/ufunc which demonstrates how a numarray user can define their own universal functions of one or two parameters. Ever wanted to write your own bessel() function for use on arrays? Now you can. Your ufunc can use exactly the same machinery as add(). 2. Ports of Numeric functions A bunch of Numeric functions were ported to numarray in the new libnumeric module. To get these import from numarray.numeric. Most notable among these are put, putmask, take, argmin, and argmax. Also added were sort, argsort, concatenate, repeat and resize. These are independent ports/implementations in C done for the purpose of best Numeric compatibility and small array performance. The numarray versions, which handle additional cases, still exist and are the default in numarray proper. 3. Faster matrix multiply The setup for numarray's matrix multiply was moved into C-code. This makes it faster for small matrices. 4. The numarray "header PEP" A PEP has been started for the inclusion of numarray (and possibly Numeric) C headers into the Python core. The PEP will demonstrate how to provide optional support for arrays (the end-user may or may not have numarray installed and the extension will still work). It may also (eventually) demonstrate how to build extensions which support both numarray and Numeric. Thus, the PEP is seeking to make it possible to distribute extensions which will still compile when numarray (or either) is not present in a user's Python installation, which will work when numarray (or either) is not installed, and which will improve performance when either is installed. The PEP is now in numarray-1.0/Doc/header_pep.txt in docutils format.
We want feedback and consensus before we submit to python-dev so please consider reading it and commenting. For the PEP, the C-API has been partitioned into two parts: a relatively simple Numeric compatible part and the numarray native part. This broke source and binary compatibility with numarray-0.9. See CAUTIONS below for more information. 5. Changes to the manual There are now brief sections on numarray.mlab and numarray.objects in the manual. The discussion of the C-API has been updated. II. CAUTIONS 1. The numarray-1.0 C-API is neither completely source level nor binary compatible with numarray-0.9. First, this means that some 3rd party extensions will no longer compile without errors. Second, this means that binary packages built against numarray-0.9 will fail, probably disastrously, using numarray-1.0. Don't install numarray-1.0 until you are ready to recompile or replace your extensions with numarray-1.0 binaries because 0.9 binaries will not work. In order to support the header PEP, the numarray C-API was partitioned into two parts: Numeric compatible and numarray extensions. You can use the Numeric compatible API (the PyArray_* functions) by including arrayobject.h and calling import_array() in your module init function. You can use the extended API (the NA_* functions) by including libnumarray.h and calling import_libnumarray() in your init function. Because of the partitioning, all numarray extensions must be recompiled to work with 1.0. Extensions using *both* APIs must include both files in order to compile, and must do both imports in order to run. Both APIs share a common PyArrayObject struct. 2. numarray extension writers should note that the documented use of PyArray_INCREF and PyArray_XDECREF (in numarray) was found to be incompatible with Numeric and these functions have therefore been removed from the supported API and will now result in errors. 3. The numarray.objects.ObjectArray parameter order was changed. 4.
The undocumented API function PyArray_DescrFromTypeObj was removed from the Numeric compatible API because it is not provided by Numeric. III. BUGS FIXED / CLOSED See http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse for more details. 979834 convolve2d parameter order issues 979775 ObjectArray parameter order 979712 No exception for invalid axis 979702 too many slices fails silently 979123 A[n:n] = x no longer works 979028 matrixmultiply precision 976951 Unpickled numarray types unsable? 977472 CharArray concatenate 970356 bug in accumulate contiguity status 969162 object array bug/ambiguity 963921 bitwise_not over Bool type fails 963706 _reduce_out: problem with otype 942804 numarray C-API include file 932438 suggest moving mlab up a level 932436 mlab docs missing 857628 numarray allclose returns int 839401 Argmax's behavior has changed for ties 817348 a+=1j # Not converted to complex 957089 PyArray_FromObject dim check broken 923046 numarray.objects incompatibility 897854 Type conflict when embedding on OS X 793421 PyArray_INCREF / PyArray_XDECREF deprecated 735479 Build failure on Cygwin 1.3.22 (very current install). 870660 Numarray: CFLAGS build problem 874198 numarray.random_array.random() broken? 874207 not-so random numbers in numarray.random_array 829662 Downcast from Float64 to UInt8 anomaly 867073 numarray diagonal bug? 806705 a tale of two rank-0's 863155 Zero size numarray breaks for loop 922157 argmax returns integer in some cases 934514 suggest nelements -> size 953294 choose bug 955314 strings.num2char bug? 
955336 searchsorted has strange behaviour 955409 MaskedArray problems 953567 Add read-write requirement to NA_InputArray 952705 records striding for > 1D arrays 944690 many numarray array methods not documented 915015 numarray/Numeric incompatabilities 949358 UsesOpPriority unexpected behavior 944678 incorrect help for "size" func/method 888430 NA_NewArray() creates array with wrong endianess 922798 The document Announce.txt is out of date 947080 numarray.image.median bugs 922796 Manual has some dated MA info 931384 What does True mean in a mask? 931379 numeric.ma called MA in manual 933842 Bool arrays don't allow bool assignment 935588 problem parsing argument "nbyte" in callStrideConvCFunc() 936162 problem parsing "nbytes" argument in copyToString() 937680 Error in Lib/numerictypes.py ? 936539 array([cmplx_array, int_array]) fails 936541 a[...,1] += 0 crashes interpreter. 940826 Ufunct operator don't work 935882 take for character arrays? 933783 numarray, _ufuncmodule.c: problem setting buffersize 930014 fromstring typecode param still broken 929841 searchsorted type coercion 924841 numarray.objects rank-0 results 925253 numarray.objects __str__ and __repr__ 913782 Minor error in chapter 12: NUM_ or not? 
889591 wrong header file for C extensions 925073 API manual comments 924854 take() errors 925754 arange() with large argument crashes interpreter 926246 ufunc reduction crash 902153 can't compile under RH9/gcc 3.2.2 916876 searchsorted/histogram broken in versions 0.8 and 0.9 920470 numarray arange() problem 915736 numarray-0.9: Doc/CHANGES not up to date WHERE ----------- Numarray-1.0 windows executable installers, source code, and manual are here: http://sourceforge.net/project/showfiles.php?group_id=1369 Numarray is hosted by Source Forge in the same project which hosts Numeric: http://sourceforge.net/projects/numpy/ The web page for Numarray information is at: http://stsdas.stsci.edu/numarray/index.html Trackers for Numarray Bugs, Feature Requests, Support, and Patches are at the Source Forge project for NumPy at: http://sourceforge.net/tracker/?group_id=1369 REQUIREMENTS ------------------------------ numarray-1.0 requires Python 2.2.2 or greater. AUTHORS, LICENSE ------------------------------ Numarray was written by Perry Greenfield, Rick White, Todd Miller, JC Hsu, Paul Barrett, Phil Hodge at the Space Telescope Science Institute. We'd like to acknowledge the assistance of Francesc Alted, Paul Dubois, Sebastian Haase, Tim Hochberg, Nadav Horesh, Edward C. Jones, Eric Jones, Jochen Küpper, Travis Oliphant, Pearu Peterson, Peter Verveer, Colin Williams, and everyone else who has contributed with comments, bug reports, or patches. Numarray is made available under a BSD-style License. See LICENSE.txt in the source distribution for details. -- Todd Miller jmiller at stsci.edu From paustin at eos.ubc.ca Sat Jul 3 10:11:03 2004 From: paustin at eos.ubc.ca (Philip Austin) Date: Sat Jul 3 10:11:03 2004 Subject: [Numpy-discussion] Bug in numarray.typecode()?
In-Reply-To: <1088796821.5974.15.camel@halloween.stsci.edu> References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> <1088784157.26482.14.camel@halloween.stsci.edu> <200407021045.00866.haase@msg.ucsf.edu> <1088796821.5974.15.camel@halloween.stsci.edu> Message-ID: <16614.59532.288486.645869@gull.eos.ubc.ca> I'm in the process of switching to numarray, but I still need typecode(). I notice that, although it's discouraged, the typecode ids have been extended to all new numarray types described in table 4.1 (p. 19) of the manual, except UInt64. That is, the following script: import numarray as Na print "Numarray version: ",Na.__version__ print Na.array([1],'Int8').typecode() print Na.array([1],'UInt8').typecode() print Na.array([1],'Int16').typecode() print Na.array([1],'UInt16').typecode() print Na.array([1],'Int32').typecode() print Na.array([1],'UInt32').typecode() print Na.array([1],'Float32').typecode() print Na.array([1],'Float64').typecode() print Na.array([1],'Complex32').typecode() print Na.array([1],'Complex64').typecode() print Na.array([1],'Bool').typecode() print Na.array([1],'UInt64').typecode() prints: Numarray version: 1.0 1 b s w l u f d F D 1 Traceback (most recent call last): File "", line 14, in ? File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 1092, in typecode return _nt.typecode[self._type] KeyError: UInt64 Should this print 'U'? Regards, Phil Austin From curzio.basso at unibas.ch Tue Jul 6 02:42:06 2004 From: curzio.basso at unibas.ch (Curzio Basso) Date: Tue Jul 6 02:42:06 2004 Subject: [Numpy-discussion] inconsistencies between docs and C headers? Message-ID: <40EA73C9.7070604@unibas.ch> Hi all, can someone explain me why in the docs functions like NA_NewArray() return a PyObject*, while in the headers they return a PyArrayObject*? Is it just the documentation which is slow to catch up with the development? Or am i missing something? 
thanks, curzio From jmiller at stsci.edu Tue Jul 6 06:35:11 2004 From: jmiller at stsci.edu (Todd Miller) Date: Tue Jul 6 06:35:11 2004 Subject: [Numpy-discussion] Bug in numarray.typecode()? In-Reply-To: <16614.59532.288486.645869@gull.eos.ubc.ca> References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> <1088784157.26482.14.camel@halloween.stsci.edu> <200407021045.00866.haase@msg.ucsf.edu> <1088796821.5974.15.camel@halloween.stsci.edu> <16614.59532.288486.645869@gull.eos.ubc.ca> Message-ID: <1089120859.25460.3.camel@halloween.stsci.edu> On Sat, 2004-07-03 at 13:10, Philip Austin wrote: > I'm in the process of switching to numarray, but I still > need typecode(). I notice that, although it's discouraged, > the typecode ids have been extended to all new numarray > types described in table 4.1 (p. 19) of the manual, except UInt64. > That is, the following script: > > import numarray as Na > print "Numarray version: ",Na.__version__ > print Na.array([1],'Int8').typecode() > print Na.array([1],'UInt8').typecode() > print Na.array([1],'Int16').typecode() > print Na.array([1],'UInt16').typecode() > print Na.array([1],'Int32').typecode() > print Na.array([1],'UInt32').typecode() > print Na.array([1],'Float32').typecode() > print Na.array([1],'Float64').typecode() > print Na.array([1],'Complex32').typecode() > print Na.array([1],'Complex64').typecode() > print Na.array([1],'Bool').typecode() > print Na.array([1],'UInt64').typecode() > > prints: > > Numarray version: 1.0 > 1 > b > s > w > l > u > f > d > F > D > 1 > Traceback (most recent call last): > File "", line 14, in ? > File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 1092, in typecode > return _nt.typecode[self._type] > KeyError: UInt64 > > Should this print 'U'? I think it could, but I wouldn't go so far as to say it should. typecode() is there for backward compatibility with Numeric. 
Since 'U' doesn't work for Numeric, I see no point in adding it to numarray. I'm not sure it would hurt anything other than create the illusion that something which works on numarray will also work on Numeric. If anyone has a good reason to add it, please speak up. Regards, Todd From jmiller at stsci.edu Tue Jul 6 06:58:09 2004 From: jmiller at stsci.edu (Todd Miller) Date: Tue Jul 6 06:58:09 2004 Subject: [Numpy-discussion] inconsistencies between docs and C headers? In-Reply-To: <40EA73C9.7070604@unibas.ch> References: <40EA73C9.7070604@unibas.ch> Message-ID: <1089122261.25460.41.camel@halloween.stsci.edu> On Tue, 2004-07-06 at 05:41, Curzio Basso wrote: > Hi all, > can someone explain me why in the docs functions like NA_NewArray() > return a PyObject*, while in the headers they return a PyArrayObject*? > Is it just the documentation which is slow to catch up with the > development? Yes, it's a bona fide inconsistency. It's not great, but it's fairly harmless since a PyArrayObject is a PyObject. From paustin at eos.ubc.ca Tue Jul 6 09:31:05 2004 From: paustin at eos.ubc.ca (Philip Austin) Date: Tue Jul 6 09:31:05 2004 Subject: [Numpy-discussion] Bug in numarray.typecode()? In-Reply-To: <1089120859.25460.3.camel@halloween.stsci.edu> References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> <1088784157.26482.14.camel@halloween.stsci.edu> <200407021045.00866.haase@msg.ucsf.edu> <1088796821.5974.15.camel@halloween.stsci.edu> <16614.59532.288486.645869@gull.eos.ubc.ca> <1089120859.25460.3.camel@halloween.stsci.edu> Message-ID: <16618.54200.934079.44467@gull.eos.ubc.ca> Todd Miller writes: > > > > Should this print 'U'? > > I think it could, but I wouldn't go so far as to say it should. > typecode() is there for backward compatibility with Numeric. Since 'U' > doesn't work for Numeric, I see no point in adding it to numarray. 
I'm > not sure it would hurt anything other than create the illusion that > something which works on numarray will also work on Numeric. > > If anyone has a good reason to add it, please speak up. > I don't necessarily need typecode, but I couldn't find the inverse of a = array([10], type = 'UInt8') (p. 19) in the manual. That is, I need a method that returns the string representation of a numarray type in a single call (as opposed to the two-step repr(array.type()). This is for code that uses the Boost C++ bindings to numarray. These bindings work via callbacks to python (which eliminates the need to link to the numarray or numeric api). Currently I use typecode() to get an index into a map of types when I need to check that the type of a passed argument is correct: void check_type(boost::python::numeric::array arr, string expected_type){ string actual_type = arr.typecode(); if (actual_type != expected_type) { std::ostringstream stream; stream << "expected Numeric type " << kindstrings[expected_type] << ", found Numeric type " << kindstrings[actual_type] << std::ends; PyErr_SetString(PyExc_TypeError, stream.str().c_str()); throw_error_already_set(); } return; } Unless I'm missing something, without typecode I need a second interpreter call to repr, or I need to import numarray and load all the types into storage for a type object comparison. It's not a showstopper, but since I check every argument in every call, I'd like to avoid this unless absolutely necessary. Regards, Phil From jmiller at stsci.edu Tue Jul 6 11:40:08 2004 From: jmiller at stsci.edu (Todd Miller) Date: Tue Jul 6 11:40:08 2004 Subject: [Numpy-discussion] Missing header_pep.txt Message-ID: <1089139173.26741.2.camel@halloween.stsci.edu> Somehow header_pep.txt didn't make it into the numarray-1.0 source tar-ball. It's now in CVS and also attached. Regards, Todd -------------- next part -------------- An embedded message was scrubbed... 
From: unknown sender Subject: no subject Date: no date Size: 38 URL: From jmiller at stsci.edu Tue Jul 6 10:15:27 2004 From: jmiller at stsci.edu (Todd Miller) Date: 06 Jul 2004 10:15:27 -0400 Subject: ANN: numarray-1.0 released In-Reply-To: <40C2E65B0000343B@cpfe4.be.tisc.dk> References: <40C2E65B0000343B@cpfe4.be.tisc.dk> Message-ID: <1089123327.25460.57.camel@halloween.stsci.edu> On Tue, 2004-07-06 at 02:59, jjm at tiscali.dk wrote: > > The PEP is now in > > numarray-1.0/Doc/header_pep.txt in docutils format. We want feedback > > and consensus before we submit to python-dev so please consider > > reading it and commenting. > > I can't find header_pep.txt! It is not in numarray-1.0.tar.gz. Oops, you're right. I attached it. Apparently I forgot to add it to CVS. Todd -------------- next part -------------- PEP: XXX Title: numerical array headers Version: $Revision: 1.3 $ Last-Modified: $Date: 2002/08/30 04:11:20 $ Author: Todd Miller , Perry Greenfield Discussions-To: numpy-discussion at lists.sf.net Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 02-Jun-2004 Python-Version: 2.4 Post-History: 30-Aug-2002 Abstract ======== We propose the inclusion of three numarray header files within the CPython distribution to facilitate use of numarray array objects as an optional data format for 3rd party modules. The PEP illustrates a simple technique by which a 3rd party extension may support numarrays as input or output values if numarray is installed, and yet the 3rd party extension does not require numarray to be installed to be built. Nothing needs to be changed in the setup.py or makefile for installing with or without numarray, and a subsequent installation of numarray will allow its use without rebuilding the 3rd party extension. Specification ============= This PEP applies only to the CPython platform and only to numarray.
Analogous PEPs could be written for Jython and Python.NET and Numeric, but what is discussed here is a speed optimization that is tightly coupled to CPython and numarray. Three header files to support the numarray C-API should be included in the CPython distribution within a numarray subdirectory of the Python include directory: * numarray/arraybase.h * numarray/libnumeric.h * numarray/arrayobject.h The files are shown prefixed with "numarray" to leave the door open for doing similar PEPs with other packages, such as Numeric. If a plethora of such header contributions is anticipated, a further refinement would be to locate the headers under something like "third_party/numarray". In order to provide enhanced performance for array objects, an extension writer would start by including the numarray C-API in addition to any other Python headers: :: #include "numarray/arrayobject.h" Not shown in this PEP are the API calls which operate on numarrays. These are documented in the numarray manual. What is shown here are two calls which are guaranteed to be safe even when numarray is not installed: * PyArray_Present() * PyArray_isArray() In an extension function that wants to access the numarray API, a test needs to be performed to determine if the API functions are safely callable: :: PyObject * some_array_returning_function(PyObject *m, PyObject *args) { int param; PyObject *result; if (!PyArg_ParseTuple(args, "i", &param)) return NULL; if (PyArray_Present()) { result = numarray_returning_function(param); } else { result = list_returning_function(param); } return result; } Within **numarray_returning_function**, a subset of the numarray C-API (the Numeric compatible API) is available for use so it is possible to create and return numarrays. Within **list_returning_function**, only the standard Python C-API can be used because numarray is assumed to be unavailable in that particular Python installation.
In an extension function that wants to accept numarrays as inputs and provide improved performance over the Python sequence protocol, an additional convenience function exists which diverts arrays to specialized code when numarray is present and the input is an array: :: PyObject * some_array_accepting_function(PyObject *m, PyObject *args) { PyObject *sequence, *result; if (!PyArg_ParseTuple(args, "O", &sequence)) return NULL; if (PyArray_isArray(sequence)) { result = numarray_input_function(sequence); } else { result = sequence_input_function(sequence); } return result; } During module initialization, a numarray enhanced extension must call **import_array()**, a macro which imports numarray and assigns a value to a static API pointer: PyArray_API. Since the API pointer starts with the value NULL and remains so if the numarray import fails, the API pointer serves as a flag that indicates that numarray was successfully imported whenever it is non-NULL. :: static void initfoo(void) { PyObject *m = Py_InitModule3( "foo", _foo_functions, _foo__doc__); if (m == NULL) return; import_array(); } **PyArray_Present()** indicates that numarray was successfully imported. It is defined in terms of the API function pointer as: :: #define PyArray_Present() (PyArray_API != NULL) **PyArray_isArray(s)** indicates that numarray was successfully imported and the given parameter is a numarray instance. It is defined as: :: #define PyArray_isArray(s) (PyArray_Present() && PyArray_Check(s)) Motivation ========== The use of numeric arrays as an interchange format is eminently sensible for many kinds of modules. For example, image, graphics, and audio modules all can accept or generate large amounts of numerical data that could easily use the numarray format.
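The conditional-support pattern sketched above in C has a straightforward Python-level analogue (similar in spirit to the "numerix" layer mentioned earlier in this thread). The following is a hypothetical sketch, not part of the PEP: the names present(), is_array(), and double() are illustrative only, and it assumes numarray exposes a NumArray class, which the numarray manual documents.

```python
# Hypothetical Python-level model of PyArray_Present()/PyArray_isArray():
# detect numarray once at import time and fall back to plain sequences
# whenever it is absent or the input is not an array.
try:
    import numarray  # may or may not be installed on the user's system
    _HAVE_NUMARRAY = True
except ImportError:
    _HAVE_NUMARRAY = False

def present():
    """Python analogue of PyArray_Present()."""
    return _HAVE_NUMARRAY

def is_array(obj):
    """Python analogue of PyArray_isArray(s): installed AND an array."""
    return _HAVE_NUMARRAY and isinstance(obj, numarray.NumArray)

def double(seq):
    """Use the array fast path when possible, else a list fallback."""
    if is_array(seq):
        return seq * 2            # elementwise multiply on the array
    return [x * 2 for x in seq]   # plain-Python fallback path
```

Nothing here needs numarray at install time; a later installation of numarray simply flips _HAVE_NUMARRAY to True, which is exactly the property the PEP wants for C extensions.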
But since numarray is not part of the standard distribution, some authors of 3rd party extensions may be reluctant to add a dependency on another 3rd party extension that isn't absolutely essential for its use, fearing that the extra installation requirements will put off users. Yet not allowing easy interchange with numarray introduces annoyances that need not be present. Normally, in the absence of an explicit ability to generate or use numarray objects, one must write conversion utilities to convert from the data representation used to that for numarray. This typically involves excess copying of data (usually from internal to string to numarray). In cases where the 3rd party package uses buffer objects, the data may not need copying at all. Either many users will have to develop their own conversion routines, or numarray will have to include adapters for many other 3rd party packages. Since numarray is used by many projects, it makes more sense to put the conversion logic on the other side of the fence. There is a clear need for a mechanism that allows 3rd party software to use numarray objects if they are available, without requiring numarray's presence to build and install properly.

Rationale
=========

One solution is to make numarray part of the standard distribution. That may be a good long-term solution, but at the moment the numeric community is in a transition period between the Numeric and numarray packages, which may take years to complete. It is not likely that numarray will be considered for adoption until the transition is complete. Numarray is also a large package, and there is legitimate concern about its inclusion as regards the long-term commitment to support. We can solve that problem by making a few include files part of the Python Standard Distribution and demonstrating how extension writers can write code that uses numarray conditionally.
The API submitted in this PEP is the subset of the numarray API which is most source compatible with Numeric. The headers consist of two handwritten files (arraybase.h and arrayobject.h) and one generated file (libnumeric.h).

arraybase.h contains typedefs and enumerations which are important both to the API presented here and to the larger numarray specific API. arrayobject.h glues together arraybase and libnumeric and is needed for Numeric compatibility. libnumeric.h consists of macros generated from a template and a list of function prototypes. The macros themselves are somewhat intricate in order to provide the compile time checking effect of function prototypes.

Further, the interface takes two forms: one form is used to compile numarray and defines static function prototypes. The other form is used to compile extensions which use the API and defines macros which execute function calls through pointers which are found in a table located using a single public API pointer. These macros also test the value of the API pointer in order to deliver a fatal error should a developer forget to initialize by calling import_array().

The interface chosen here is the subset of numarray most useful for porting existing Numeric code or creating new extensions which can be compiled for either numarray or Numeric. There are a number of other numarray API functions which are omitted here for the sake of simplicity. By choosing to support only the Numeric compatible subset of the numarray C-API, concerns about interface stability are minimized because the Numeric API is well established. However, it should be made clear that the numarray API subset proposed here is source compatible, not binary compatible, with Numeric.
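The "call through a pointer table, guarded by the public API pointer" mechanism can be mimicked in a few lines of Python. Everything here is an invented toy model — _api stands in for PyArray_API and a dict stands in for the function table — but it shows how an uninitialized pointer is turned into a loud error rather than a crash:

```python
# Toy model of the deferred-pointer-table mechanism described above.
_api = None  # plays the role of the static PyArray_API pointer

def import_array():
    """Locate the function table (here: just a dict of callables)."""
    global _api
    _api = {"add": lambda a, b: a + b}  # stand-in for the real table

def api_call(name, *args):
    """Macro analogue: fail loudly if import_array() was never called."""
    if _api is None:
        raise RuntimeError("import_array() was not called at module init")
    return _api[name](*args)
```

Before import_array() runs, any api_call() raises immediately; afterwards, calls are dispatched through the table.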
Resources
=========

* numarray/arraybase.h (http://cvs.sourceforge.net/viewcvs.py/numpy/numarray/Include/numarray/arraybase.h)
* numarray/libnumeric.h (http://cvs.sourceforge.net/viewcvs.py/numpy/numarray/Include/numarray/libnumeric.h)
* numarray/arrayobject.h (http://cvs.sourceforge.net/viewcvs.py/numpy/numarray/Include/numarray/arrayobject.h)
* numarray-1.0 manual PDF
* numarray-1.0 source distribution
* numarray website at STSCI (http://www.stsci.edu/resources/software_hardware/numarray)
* example numarray enhanced extension

References
==========

.. [1] PEP 1, PEP Purpose and Guidelines, Warsaw, Hylton (http://www.python.org/peps/pep-0001.html)
.. [2] PEP 9, Sample Plaintext PEP Template, Warsaw (http://www.python.org/peps/pep-0009.html)

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:

From paustin at eos.ubc.ca Tue Jul 6 16:09:02 2004
From: paustin at eos.ubc.ca (Philip Austin)
Date: Tue Jul 6 16:09:02 2004
Subject: [Numpy-discussion] non-intuitive behaviour for isbyteswapped()?
In-Reply-To: <16614.59532.288486.645869@gull.eos.ubc.ca>
References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> <1088784157.26482.14.camel@halloween.stsci.edu> <200407021045.00866.haase@msg.ucsf.edu> <1088796821.5974.15.camel@halloween.stsci.edu> <16614.59532.288486.645869@gull.eos.ubc.ca>
Message-ID: <16619.12490.596884.579782@gull.eos.ubc.ca>

With numarray 1.0 and Mandrake 10 i686 I get the following:

>>> y=N.array([1,1,2,1],type="Float64")
>>> y
array([ 1.,  1.,  2.,  1.])
>>> y.byteswap()
>>> y
array([  3.03865194e-319,   3.03865194e-319,   3.16202013e-322,
         3.03865194e-319])
>>> y.isbyteswapped()
0

Should this be 1?
Thanks, Phil

From paustin at eos.ubc.ca Tue Jul 6 18:43:49 2004
From: paustin at eos.ubc.ca (Philip Austin)
Date: Tue Jul 6 18:43:49 2004
Subject: [Numpy-discussion] optional arguments to the array constructor
Message-ID: <16619.21771.686179.152410@gull.eos.ubc.ca>

(for numpy v1.0 on Mandrake 10 i686)

As noted on p. 25 the array constructor takes up to 5 optional arguments

array(sequence=None, type=None, shape=None, copy=1, savespace=0, typecode=None)

(and raises an exception if both type and typecode are set).

Is there any way to make an alias (copy=0) of an array without passing keyword values? That is, specifying the copy keyword alone works:

test=N.array((1., 3), "Float64", shape=(2,), copy=1, savespace=0)
a=N.array(test, copy=0)
a[1]=999
print test
>>> [   1.  999.]

But when intervening keywords are specified copy won't toggle:

test=N.array((1., 3))
a=N.array(sequence=test, type="Float64", shape=(2,), copy=0)
a[1]=999.
print test
>>> [ 1.  3.]

Which is also the behaviour I see when I drop the keywords:

test=N.array((1., 3))
a=N.array(test, "Float64", (2,), 0)
a[1]=999.
print test
>>> [ 1.  3.]

an additional puzzle is that adding the savespace parameter raises the following exception:

>>> a=N.array(test, "Float64", (2,), 0,0)
Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 312, in array
    type = getTypeObject(sequence, type, typecode)
  File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 256, in getTypeObject
    rtype = _typeFromTypeAndTypecode(type, typecode)
  File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 243, in _typeFromTypeAndTypecode
    raise ValueError("Can't define both 'type' and 'typecode' for an array.")
ValueError: Can't define both 'type' and 'typecode' for an array.
Thanks for any insights -- Phil

From jmiller at stsci.edu Wed Jul 7 07:58:05 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jul 7 07:58:05 2004
Subject: [Numpy-discussion] non-intuitive behaviour for isbyteswapped()?
In-Reply-To: <16619.12490.596884.579782@gull.eos.ubc.ca>
References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> <1088784157.26482.14.camel@halloween.stsci.edu> <200407021045.00866.haase@msg.ucsf.edu> <1088796821.5974.15.camel@halloween.stsci.edu> <16614.59532.288486.645869@gull.eos.ubc.ca> <16619.12490.596884.579782@gull.eos.ubc.ca>
Message-ID: <1089212251.29456.212.camel@halloween.stsci.edu>

On Tue, 2004-07-06 at 19:07, Philip Austin wrote:
> With numarray 1.0 and Mandrake 10 i686 I get the following:
>
> >>> y=N.array([1,1,2,1],type="Float64")
> >>> y
> array([ 1.,  1.,  2.,  1.])
> >>> y.byteswap()
> >>> y
> array([  3.03865194e-319,   3.03865194e-319,   3.16202013e-322,
>          3.03865194e-319])
> >>> y.isbyteswapped()
> 0
>
> Should this be 1?

The behavior of byteswap() has been controversial in the past, at one time implementing exactly the behavior I think you expected. Without giving any guarantee for the future, here's how things work now: byteswap() just swaps the bytes. There's a related method, togglebyteorder(), which inverts the sense of the byteorder:

>>> y.byteswap()
>>> y.togglebyteorder()
>>> y.isbyteswapped()
1

The ability to munge bytes and change the sense of byteorder independently is definitely needed... but you're certainly not the first one to ask this question.
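For readers puzzled by the values in Philip's session: swapping the eight bytes of a float64 without updating the byte-order flag makes 1.0 reinterpret as a tiny denormal, which the standard struct module reproduces exactly. This is a stdlib illustration of the effect, not numarray's implementation:

```python
import struct

def swap_float64_bytes(x):
    """Reverse the 8 bytes of a float64 and reinterpret them in the
    original byte order -- what byteswap() alone does to each element."""
    big = struct.pack(">d", x)          # the bytes in big-endian order
    return struct.unpack("<d", big)[0]  # reread them as little-endian

swapped = swap_float64_bytes(1.0)   # the ~3.03865194e-319 denormal above
restored = swap_float64_bytes(swapped)  # swapping twice is the identity
```

Swapping the bytes alone changes the value; only flipping the recorded byte order as well (togglebyteorder() in numarray) makes the array read back as 1.0 again.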
There is also (Numeric compatible) byteswapped(), which both swaps and changes sense, but it creates a copy rather than operating in place:

>>> x = y.byteswapped()
>>> (x is not y) and (x._data is not y._data)
1

Regards, Todd

From jmiller at stsci.edu Wed Jul 7 08:13:05 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jul 7 08:13:05 2004
Subject: [Numpy-discussion] optional arguments to the array constructor
In-Reply-To: <16619.21771.686179.152410@gull.eos.ubc.ca>
References: <16619.21771.686179.152410@gull.eos.ubc.ca>
Message-ID: <1089213153.29456.229.camel@halloween.stsci.edu>

On Tue, 2004-07-06 at 21:42, Philip Austin wrote:
> (for numpy v1.0 on Mandrake 10 i686)

My guess is you're talking about numarray here. Please be charitable if I'm talking out of turn... I tend to see everything as a numarray issue.

> As noted on p. 25 the array constructor takes up to 5 optional arguments
>
> array(sequence=None, type=None, shape=None, copy=1, savespace=0, typecode=None)
> (and raises an exception if both type and typecode are set).
>
> Is there any way to make an alias (copy=0) of an array without passing
> keyword values?

In numarray, all you have to do to get an alias is:

>>> b = a.view()

It's an alias because:

>>> b._data is a._data
True

> That is, specifying the copy keyword alone works:
>
> test=N.array((1., 3), "Float64", shape=(2,), copy=1, savespace=0)
> a=N.array(test, copy=0)
> a[1]=999
> print test
>
> >>> [   1.  999.]
>
> But when intervening keywords are specified copy won't toggle:
>
> test=N.array((1., 3))
> a=N.array(sequence=test, type="Float64", shape=(2,), copy=0)
> a[1]=999.
> print test
> >>> [ 1.  3.]
>
> Which is also the behaviour I see when I drop the keywords:
>
> test=N.array((1., 3))
> a=N.array(test, "Float64", (2,), 0)
> a[1]=999.
> print test
> >>> [ 1.  3.]
>
> an additional puzzle is that adding the savespace parameter raises
> the following exception:
>
> >>> a=N.array(test, "Float64", (2,), 0,0)
> Traceback (most recent call last):
>   File "", line 1, in ?
>   File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 312, in array
>     type = getTypeObject(sequence, type, typecode)
>   File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 256, in getTypeObject
>     rtype = _typeFromTypeAndTypecode(type, typecode)
>   File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 243, in _typeFromTypeAndTypecode
>     raise ValueError("Can't define both 'type' and 'typecode' for an array.")
> ValueError: Can't define both 'type' and 'typecode' for an array.

All this looks like a documentation problem. The numarray array() signature has been tortured by Numeric backward compatibility, so there has been more flux in it than you would expect. Anyway, the manual is out of date. Here's the current signature from the code:

def array(sequence=None, typecode=None, copy=1, savespace=0,
          type=None, shape=None):

Sorry about the confusion, Todd

From paustin at eos.ubc.ca Wed Jul 7 11:26:11 2004
From: paustin at eos.ubc.ca (Philip Austin)
Date: Wed Jul 7 11:26:11 2004
Subject: [Numpy-discussion] optional arguments to the array constructor
In-Reply-To: <1089213153.29456.229.camel@halloween.stsci.edu>
References: <16619.21771.686179.152410@gull.eos.ubc.ca> <1089213153.29456.229.camel@halloween.stsci.edu>
Message-ID: <16620.16395.603789.28730@gull.eos.ubc.ca>

Todd Miller writes:
> On Tue, 2004-07-06 at 21:42, Philip Austin wrote:
> > (for numpy v1.0 on Mandrake 10 i686)
>
> My guess is you're talking about numarray here. Please be charitable if
> I'm talking out of turn... I tend to see everything as a numarray
> issue.

Right -- I'm still working through the boost test suite for numarray, which is failing a couple of tests that passed (around numarray v0.3).

> All this looks like a documentation problem.
> The numarray array()
> signature has been tortured by Numeric backward compatibility, so there
> has been more flux in it than you would expect. Anyway, the manual is
> out of date. Here's the current signature from the code:
>
> def array(sequence=None, typecode=None, copy=1, savespace=0,
>           type=None, shape=None):

Actually, it seems to be a difference in the way that Numeric and numarray treat the copy flag when a typecode is specified. In Numeric, if no change in type is requested and copy=0, then the constructor goes ahead and produces a view:

import Numeric as nc
test=nc.array([1,2,3],'i')
a=nc.array(test,'i',0)
a[0]=99
print test
>> [99 2 3]

but makes a copy if a cast is required:

test=nc.array([1,2,3],'i')
a=nc.array(test,'F',0)
a[0]=99
print test
>>> [1 2 3]

Looking at numarraycore.py line 305 I see that:

if type is None and typecode is None:
    if copy:
        a = sequence.copy()
    else:
        a = sequence

i.e. numarray skips the check for a type match and ignores the copy flag, even if the type is preserved:

import numarray as ny
test=ny.array([1,2,3],'i')
a=ny.array(test,'i',0)
a._data is test._data
>>> False

It looks like there might have been a comment about this in the docstring, but it got clipped at some point?:

array() constructs a NumArray by calling NumArray, one of its factory functions (fromstring, fromfile, fromlist), or by making a copy of an existing array.
If copy=0, array() will create a new array only if sequence specifies the contents or storage for the array Thanks, Phil From jmiller at stsci.edu Wed Jul 7 12:47:02 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Jul 7 12:47:02 2004 Subject: [Numpy-discussion] optional arguments to the array constructor In-Reply-To: <16620.16395.603789.28730@gull.eos.ubc.ca> References: <16619.21771.686179.152410@gull.eos.ubc.ca> <1089213153.29456.229.camel@halloween.stsci.edu> <16620.16395.603789.28730@gull.eos.ubc.ca> Message-ID: <1089229573.29456.544.camel@halloween.stsci.edu> On Wed, 2004-07-07 at 14:25, Philip Austin wrote: > Todd Miller writes: > > On Tue, 2004-07-06 at 21:42, Philip Austin wrote: > > > (for numpy v1.0 on Mandrake 10 i686) > > > > My guess is you're talking about numarray here. Please be charitable if > > I'm talking out of turn... I tend to see everything as a numarray > > issue. > > Right -- I'm still working through the boost test suite for numarray, which is > failing a couple of tests that passed (around numarray v0.3). > > > All this looks like a documentation problem. The numarray array() > > signature has been tortured by Numeric backward compatibility, so there > > has been more flux in it than you would expect. Anyway, the manual is > > out of date. Here's the current signature from the code: > > > > def array(sequence=None, typecode=None, copy=1, savespace=0, > > type=None, shape=None): > > > > Actually, it seems to be a difference in the way that numeric and > numarray treat the copy flag when typecode is specified. 
In numeric,
> if no change in type is requested and copy=0, then the constructor
> goes ahead and produces a view:
>
> import Numeric as nc
> test=nc.array([1,2,3],'i')
> a=nc.array(test,'i',0)
> a[0]=99
> print test
> >> [99 2 3]
>
> but makes a copy if a cast is required:
>
> test=nc.array([1,2,3],'i')
> a=nc.array(test,'F',0)
> a[0]=99
> print test
> >>> [1 2 3]
>
> Looking at numarraycore.py line 305 I see that:
>
> if type is None and typecode is None:
>     if copy:
>         a = sequence.copy()
>     else:
>         a = sequence
>
> i.e. numarray skips the check for a type match and ignores
> the copy flag, even if the type is preserved:
>
> import numarray as ny
> test=ny.array([1,2,3],'i')
> a=ny.array(test,'i',0)
> a._data is test._data
> >>> False

OK, I think I see what you're after and agree that it's a bug. Here's how I'll change the behavior:

>>> import numarray
>>> a = numarray.arange(10)
>>> b = numarray.array(a, copy=0)
>>> a is b
True
>>> b = numarray.array(a, copy=1)
>>> a is b
False

One possible point of note is that array() doesn't return views for copy=0; neither does Numeric; both return the original sequence.

Regards, Todd

From paustin at eos.ubc.ca Wed Jul 7 13:15:04 2004
From: paustin at eos.ubc.ca (Philip Austin)
Date: Wed Jul 7 13:15:04 2004
Subject: [Numpy-discussion] optional arguments to the array constructor
In-Reply-To: <1089229573.29456.544.camel@halloween.stsci.edu>
References: <16619.21771.686179.152410@gull.eos.ubc.ca> <1089213153.29456.229.camel@halloween.stsci.edu> <16620.16395.603789.28730@gull.eos.ubc.ca> <1089229573.29456.544.camel@halloween.stsci.edu>
Message-ID: <16620.22921.791432.143944@gull.eos.ubc.ca>

Todd Miller writes:
> > OK, I think I see what you're after and agree that it's a bug.
Here's
> how I'll change the behavior:
>
> >>> import numarray
> >>> a = numarray.arange(10)
> >>> b = numarray.array(a, copy=0)
> >>> a is b
> True
> >>> b = numarray.array(a, copy=1)
> >>> a is b
> False

Just to be clear -- the above is the current numarray v1.0 behavior (at least on my machine). Numeric compatibility would additionally require that

import numarray
a = numarray.arange(10)
theTypeCode=repr(a.type())
b = numarray.array(a, theTypeCode, copy=0)
print a is b
b = numarray.array(a, copy=1)
print a is b

produce

True
False

While currently it produces

True
True

Having said this, I can work around this difference -- so either a note in the documentation or just removing the copy flag from numarray.array would also be ok. -- Thanks, Phil

From paustin at eos.ubc.ca Wed Jul 7 13:17:03 2004
From: paustin at eos.ubc.ca (Philip Austin)
Date: Wed Jul 7 13:17:03 2004
Subject: [Numpy-discussion] Re: Correction -- optional arguments to the array constructor
In-Reply-To: <1089229573.29456.544.camel@halloween.stsci.edu>
References: <16619.21771.686179.152410@gull.eos.ubc.ca> <1089213153.29456.229.camel@halloween.stsci.edu> <16620.16395.603789.28730@gull.eos.ubc.ca> <1089229573.29456.544.camel@halloween.stsci.edu>
Message-ID: <16620.23066.506262.410021@gull.eos.ubc.ca>

Oops, note the change below at --->

Todd Miller writes:
> > OK, I think I see what you're after and agree that it's a bug. Here's
> how I'll change the behavior:
>
> >>> import numarray
> >>> a = numarray.arange(10)
> >>> b = numarray.array(a, copy=0)
> >>> a is b
> True
> >>> b = numarray.array(a, copy=1)
> >>> a is b
> False

Just to be clear -- the above is the current numarray v1.0 behavior (at least on my machine).
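The Numeric-style rule under discussion — copy=0 returns the input itself when no cast is needed, and copies only when one is — takes just a few lines to express. The sketch below uses the standard library array module and an invented helper name (as_array); it illustrates the requested semantics only, not numarray's actual code:

```python
from array import array

def as_array(sequence, typecode=None, copy=True):
    """Sketch of the Numeric copy rule: alias the input when it is
    already an array of the requested type, otherwise make a new one."""
    if isinstance(sequence, array) and (typecode is None
                                        or sequence.typecode == typecode):
        # No cast needed: honor copy=False by returning the input itself.
        return array(sequence.typecode, sequence) if copy else sequence
    # Not an array, or a cast was requested: a copy is unavoidable.
    return array(typecode if typecode is not None else "d", sequence)
```

With these semantics, as_array(a, a.typecode, copy=False) is an alias for a, while requesting a different typecode always yields a fresh array — the two-case behavior Philip describes for Numeric.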
Numeric compatibility would additionally require that

import numarray
a = numarray.arange(10)
theTypeCode=repr(a.type())
b = numarray.array(a, theTypeCode, copy=0)
print a is b
b = numarray.array(a, copy=1)
print a is b

produce

True
False

While currently it produces

--->
False
False

Having said this, I can work around this difference -- so either a note in the documentation or just removing the copy flag from numarray.array would also be ok. -- Thanks, Phil

From wlanger at bigpond.net.au Thu Jul 8 10:29:01 2004
From: wlanger at bigpond.net.au (Wendy Langer)
Date: Thu Jul 8 10:29:01 2004
Subject: [Numpy-discussion] "buffer not aligned on 8 byte boundary" errors when running numarray.testall.test()
Message-ID:

Hi there all :)

I am having trouble with my installation of numarray. :(

I am a python newbie and a numarray extreme-newbie, so it could be that I don't yet have the first clue what I am doing. ;)

Python 2.3.3 (#51, Feb 16 2004, 04:07:52) [MSC v.1200 32 bit (Intel)] on win32
numarray 1.0

The Python I am using is the one that comes with the "Enthought" version (www.enthought.com), a distro specifically designed to be useful for scientists, so it comes with numerical stuff and scipy and chaco and things like that preinstalled.

I used the windows binary installer. However it came with Numeric and not numarray, so I installed numarray "by hand". This seemed to go ok, and it seems that there is no problem having both Numeric and numarray in the same installation, since they have (obviously) different names (still getting used to this whole modules and namespaces &c &c).

At the bottom of this email I have pasted an example of what it was I was trying to do, and the error messages that the interpreter gave me - but before anyone bothers reading them in any detail, the essential error seems to be as follows:

error: multiply_Float64_scalar_vector: buffer not aligned on 8 byte boundary.
I have no idea what this means, but I do recall that when I ran the numarray.testall.test() procedure after first completing my installation a couple of days ago, it reported a *lot* of problems, many of which sounded quite similar to this. I hoped for the best and thought that perhaps I had "run the test wrong"(!) since numarray seemed to be working ok, and I had investigated many of the examples in chapters 3 and 4 of the user manual without any obvious problems (chapter 3 = "high level overview" and chapter 4 = "array basics").

I decided at the time to leave well enough alone until I actually came across odd or mysterious behaviour ...however that time has come all-too-soon...

The procedure I am using to run the test is as described on page 11 of the excellent user's manual (release 0.8 at http://www.pfdubois.com/numpy/numarray.pdf):

---------------------------------------------
Testing your Installation

Once you have installed numarray, test it with:

C:\numarray> python
Python 2.2.2 (#18, Dec 30 2002, 02:26:03) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import numarray.testall as testall
>>> testall.test()
numeric: (0, 1115)
records: (0, 48)
strings: (0, 166)
objects: (0, 72)
memmap: (0, 75)

Each line in the above output indicates that 0 of X tests failed. X grows steadily with each release, so the numbers shown above may not be current.
------------------------------------------------------------------------

Anyway, when I ran this, instead of the nice, comforting output above, I got about a million(!) errors and then a final count of 320 failures. This number is not always constant - I recall the first time I ran it it was 209. [I just ran it again and this time it was 324...it all has a rather disturbing air of semi-randomness...]
So below is the (heavily snipped) output from the testall.test() run, and below that is the code where I first noticed a possibly similar error, and below *that* is the output of that code, with the highly suspicious error.... Any suggestions greatly appreciated! I can give you more info about the setup on my computer and so on if you need :)

wendy langer

======================================================================
==================================== IDLE 1.0.2 ==== No Subprocess ====
>>> import numarray.testall as testall
>>> testall.test()
*****************************************************************
Failure in example: x+y
from line #50 of first pass
Exception raised:
Traceback (most recent call last):
  File "C:\PYTHON23\lib\doctest.py", line 442, in _run_examples_inner
    compileflags, 1) in globs
  File "", line 1, in ?
  File "C:\PYTHON23\Lib\site-packages\numarray\numarraycore.py", line 733, in __add__
    return ufunc.add(self, operand)
error: Int32asFloat64: buffer not aligned on 8 byte boundary.
*****************************************************************
Failure in example: x[:] = 0.1
from line #72 of first pass
Exception raised:
Traceback (most recent call last):
  File "C:\PYTHON23\lib\doctest.py", line 442, in _run_examples_inner
    compileflags, 1) in globs
  File "", line 1, in ?
error: Float64asBool: buffer not aligned on 8 byte boundary.
*****************************************************************
Failure in example: y
from line #74 of first pass
Expected:
array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])
Got:
array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])
*****************************************************************
Failure in example: x + z
from line #141 of first pass
Exception raised:
Traceback (most recent call last):
  File "C:\PYTHON23\lib\doctest.py", line 442, in _run_examples_inner
    compileflags, 1) in globs
  File "", line 1, in ?
  File "C:\PYTHON23\Lib\site-packages\numarray\numarraycore.py", line 733, in __add__
    return ufunc.add(self, operand)
error: Int32asFloat64: buffer not aligned on 8 byte boundary.
*****************************************************************
*****************************************************************
Failure in example: a2dma = average(a2dm, axis=1)
from line #812 of numarray.ma.dtest
Exception raised:
Traceback (most recent call last):
  File "C:\PYTHON23\lib\doctest.py", line 442, in _run_examples_inner
    compileflags, 1) in globs
  File "", line 1, in ?
  File "C:\PYTHON23\Lib\site-packages\numarray\ma\MA.py", line 1686, in average
    w = Numeric.choose(mask, (1.0, 0.0))
  File "C:\PYTHON23\Lib\site-packages\numarray\ufunc.py", line 1666, in choose
    return _choose(selector, population, outarr, clipmode)
  File "C:\PYTHON23\Lib\site-packages\numarray\ufunc.py", line 1573, in __call__
    result = self._doit(computation_mode, woutarr, cfunc, ufargs, 0)
  File "C:\PYTHON23\Lib\site-packages\numarray\ufunc.py", line 1558, in _doit
    blockingparameters)
error: choose8bytes: buffer not aligned on 8 byte boundary.
*****************************************************************
Failure in example: alltest(a2dma, [1.5, 4.0])
from line #813 of numarray.ma.dtest
Exception raised:
Traceback (most recent call last):
  File "C:\PYTHON23\lib\doctest.py", line 442, in _run_examples_inner
    compileflags, 1) in globs
  File "", line 1, in ?
NameError: name 'a2dma' is not defined
*****************************************************************
1 items had failures:
  320 of 671 in numarray.ma.dtest
***Test Failed*** 320 failures.
numarray.ma: (320, 671)

=========================================================================

import numarray

class anXmatrix:
    def __init__(self, stepsize = 3):
        self.stepsize = stepsize
        self.populate_matrix()

    def describe(self):
        print "I am a ", self.__class__
        print "my stepsize is", self.stepsize
        print "my matrix is: \n"
        print self.matrix

    def populate_matrix(self):
        def xvalues(i,j):
            return self.stepsize*j
        mx = numarray.fromfunction(xvalues, (4,4))
        self.matrix = mx

if __name__ == '__main__':
    print " "
    print "Making anXmatrix..."
    r = anXmatrix(stepsize = 5)
    r.describe()
    r = anXmatrix(stepsize = 0.02)
    r.describe()

============================================================================

Making anXmatrix...
I am a  __main__.anXmatrix
my stepsize is 5
my matrix is:

[[ 0  5 10 15]
 [ 0  5 10 15]
 [ 0  5 10 15]
 [ 0  5 10 15]]
Traceback (most recent call last):
  File "C:\Python23\Lib\site-packages\WendyStuff\wendycode\propagatorstuff\core_objects\domain_objects.py", line 97, in ?
    r = anXmatrix(stepsize = 0.02)
  File "C:\Python23\Lib\site-packages\WendyStuff\wendycode\propagatorstuff\core_objects\domain_objects.py", line 72, in __init__
    self.populate_matrix()
  File "C:\Python23\Lib\site-packages\WendyStuff\wendycode\propagatorstuff\core_objects\domain_objects.py", line 86, in populate_matrix
    mx = numarray.fromfunction(xvalues, (4,4))
  File "C:\PYTHON23\Lib\site-packages\numarray\generic.py", line 1094, in fromfunction
    return apply(function, tuple(indices(dimensions)))
  File "C:\Python23\Lib\site-packages\WendyStuff\wendycode\propagatorstuff\core_objects\domain_objects.py", line 84, in xvalues
    return self.stepsize*j
  File "C:\PYTHON23\Lib\site-packages\numarray\numarraycore.py", line 772, in __rmul__
    r = ufunc.multiply(operand, self)
error: multiply_Float64_scalar_vector: buffer not aligned on 8 byte boundary.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
============================================================================

"You see, wire telegraph is a kind of a very, very long cat. You pull his tail in New York and his head is meowing in Los Angeles. Do you understand this? And radio operates exactly the same way: you send signals here, they receive them there. The only difference is that there is no cat." Albert Einstein

From Chris.Barker at noaa.gov Thu Jul 8 10:58:07 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Thu Jul 8 10:58:07 2004
Subject: [Numpy-discussion] How to read data from text files fast?
In-Reply-To: <40E473A9.5040109@colorado.edu>
References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <40E470D9.8060603@noaa.gov> <40E473A9.5040109@colorado.edu>
Message-ID: <40ED8A6D.5050505@noaa.gov>

Thanks to Fernando Perez and Travis Oliphant for pointing me to:

> scipy.io.read_array

In testing, I've found that it's very slow (for my needs), though quite nifty in other ways, so I'm sure I'll find a use for it in the future.

Travis Oliphant wrote:
> Alternatively, we could move some of the Python code in read_array to
> C to improve the speed.

That was beyond me, so I wrote a very simple module in C that does what I want, and it is very much faster than read_array or a straight Python version. It has two functions:

FileScan(file)
    """
    Reads all the values in the rest of the ascii file, and produces a
    Numeric vector full of Floats (C doubles). All text in the file that
    is not part of a floating point number is skipped over.
    """

FileScanN(file, N)
    """
    Reads N values in the ascii file, and produces a Numeric vector of
    length N full of Floats (C doubles). Raises an exception if there
    are fewer than N numbers in the file. All text in the file that is
    not part of a floating point number is skipped over. After reading N
    numbers, the file is left before the next non-whitespace character
    in the file. This will often leave the file at the start of the next
    line, after scanning a line full of numbers.
    """

I implemented them separately, 'cause I wasn't sure how to deal with optional arguments in a C function. They could easily have been wrapped in a Python function if you wanted one interface. FileScan was much more complex, as I had to deal with all the dynamic memory allocation. I probably took a more complex approach to this than I had to, but it was an exercise for me, being a newbie at C. I also decided not to specify a shape for the resulting array, always returning a rank-1 array, as that made the code easier, and you can always set A.shape afterward. This could be put in a Python wrapper as well. It has the obvious limitation that it only does doubles. I'd like to add longs as well, but probably won't have a need for anything else. The way memory is these days, it seems just as easy to read the long ones, and convert afterward if you want.

Here is a quick benchmark (see below) run with a file that is 63,000 lines, with two comma-delimited numbers on each line. Run on a 1GHz P4 under Linux:

Reading with read_array
it took 16.351712 seconds to read the file with read_array
Reading with Standard Python methods
it took 2.832078 seconds to read the file with standard Python methods
Reading with FileScan
it took 0.444431 seconds to read the file with FileScan
Reading with FileScanN
it took 0.407875 seconds to read the file with FileScanN

As you can see, read_array is painfully slow for this kind of thing, straight Python is OK, and FileScan is pretty darn fast. I've enclosed the C code and setup.py, if anyone wants to take a look, and use it, or give suggestions or bug fixes or whatever, that would be great.
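As a footnote for readers without a C compiler: the "skip anything that is not part of a floating point number" interface described above is easy to approximate in pure Python with a regular expression. This is only a sketch of the described behavior (operating on a string rather than an open file, and returning a list rather than a Numeric vector), not Chris's C module:

```python
import re

# Matches ordinary decimal and exponent-notation floats, e.g. -1.5, 3e2.
_FLOAT_RE = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

def file_scan(text):
    """Return every float found in the text, skipping all other
    characters (a rough stand-in for the C FileScan above)."""
    return [float(m) for m in _FLOAT_RE.findall(text)]

def file_scan_n(text, n):
    """Like file_scan, but return exactly n values or raise ValueError,
    mirroring FileScanN's too-few-numbers exception."""
    values = file_scan(text)
    if len(values) < n:
        raise ValueError("fewer than %d numbers in input" % n)
    return values[:n]
```

It would be far slower than the C version, of course — the point of the C module is precisely to avoid per-value Python overhead.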
In particular, I don't think I've structured the code very well, and there could be a memory leak, which I have not tested carefully for. Tested only on Linux with Python 2.3.3, Numeric 23.1. If someone wants to port it to numarray, that would be great too. -Chris

The benchmark:

def test6():
    """ Testing various IO options """
    from scipy.io import array_import
    filename = "JunkBig.txt"
    file = open(filename)
    print "Reading with read_array"
    start = time.time()
    A = array_import.read_array(file, ",")
    print "it took %f seconds to read the file with read_array"%(time.time() - start)
    file.close()

    file = open(filename)
    print "Reading with Standard Python methods"
    start = time.time()
    A = []
    for line in file:
        A.append( map( float, line.strip().split(",") ) )
    A = array(A)
    print "it took %f seconds to read the file with standard Python methods"%(time.time() - start)
    file.close()

    file = open(filename)
    print "Reading with FileScan"
    start = time.time()
    A = FileScanner.FileScan(file)
    A.shape = (-1,2)
    print "it took %f seconds to read the file with FileScan"%(time.time() - start)
    file.close()

    file = open(filename)
    print "Reading with FileScanN"
    start = time.time()
    A = FileScanner.FileScanN(file, product(A.shape) )
    A.shape = (-1,2)
    print "it took %f seconds to read the file with FileScanN"%(time.time() - start)

-- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: FileScan_module.c URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed...
Name: setup.py URL: From jmiller at stsci.edu Thu Jul 8 12:05:02 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 8 12:05:02 2004 Subject: [Numpy-discussion] "buffer not aligned on 8 byte boundary" errors when running numarray.testall.test() In-Reply-To: References: Message-ID: <1089313446.2639.55.camel@halloween.stsci.edu> On Thu, 2004-07-08 at 13:28, Wendy Langer wrote: > Hi there all :) > > I am having trouble with my installation of numarray. :( > > I am a python newbie and a numarray extreme-newbie, so it could be that I > don't yet have the first clue what I am doing. ;) > > > > Python 2.3.3 (#51, Feb 16 2004, 04:07:52) [MSC v.1200 32 bit (Intel)] on > win32 > numarray 1.0 > > > The Python I am using is the one that comes with the "Enthought" version > (www.enthought.com), a distro specifically designed to be useful for > scientists, so it comes with numerical stuff and scipy and chaco and things > like that preinstalled. > > I used the windows binary installer. However it came with Numeric and not > numarray, so I installed numarray "by hand". This seemed to go ok, and it > seems that there is no problem having both Numeric and numarray in the same > installation, since they have (obviously) different names (still getting > used to this whole modules and namespaces &c &c) I don't normally use SciPy, but I normally have both numarray and Numeric installed so there's no inherent conflict there. > At the bottom of this email I have pasted an example of what it was I was > trying to do, and the error messages that the interpreter gave me - but > before anyone bothers reading them in any detail, the essential error seems > to be as follows: > > error: multiply_Float64_scalar_vector: buffer not aligned on 8 byte > boundary. This is a low level exception triggered by a misaligned data buffer. It's low level so it's impossible to tell what the real problem is without more information. 
> I have no idea what this means, but I do recall that when I ran the > numarray.testall.test() procedure after first completing my installation a > couple of days ago, it reported a *lot* of problems, many of which sounded > quite similar to this. That sounds pretty bad. Here's roughly how it should look these days: % python >>> import numarray.testall as testall >>> testall.test() numarray: ((0, 1165), (0, 1165)) numarray.records: (0, 48) numarray.strings: (0, 176) numarray.memmap: (0, 82) numarray.objects: (0, 105) numarray.memorytest: (0, 16) numarray.examples.convolve: ((0, 20), (0, 20), (0, 20), (0, 20)) numarray.convolve: (0, 52) numarray.fft: (0, 75) numarray.linear_algebra: ((0, 46), (0, 51)) numarray.image: (0, 27) numarray.nd_image: (0, 390) numarray.random_array: (0, 53) numarray.ma: (0, 671) The tuple results for your test should all have leading zeros as above. The number of tests varies from release to release. > I hoped for the best and thought that perhaps I had "run the test wrong"(!) > since numarray seemed to be working ok, and I had investigated many of the > examples in chapters 3 and 4 of the user manual withour any obvious problems > (chapter 3 = "high level overview" and chapter 4 = "array basics") > > I decided at the time to leave well enough alone until I actually came > across odd or mysterious behaviour ...however that time has come > all-too-soon... > > > > > The procedure I am using to run the test is as described on page 11 of the > excellent user's manual (release 0.8 at > http://www.pfdubois.com/numpy/numarray.pdf): There's an updated manual here: http://prdownloads.sourceforge.net/numpy/numarray-1.0.pdf?download > -- > Testing your Installation > Once you have installed numarray, test it with: > C:\numarray> python > Python 2.2.2 (#18, Dec 30 2002, 02:26:03) [MSC 32 bit (Intel)] on win32 > Type "copyright", "credits" or "license" for more information. 
> >>> import numarray.testall as testall > >>> testall.test() > numeric: (0, 1115) > records: (0, 48) > strings: (0, 166) > objects: (0, 72) > memmap: (0, 75) > Each line in the above output indicates that 0 of X tests failed. X grows > steadily with each release, so the numbers > shown above may not be current. > -- > > Anyway, when I ran this, instead of the nice, comforting output above, I > got about a million(!) errors and then a final count of 320 failures. This > number is not always constant - I recall the first time I ran it it was 209. > [I just ran it again and this time it was 324...it all has a rather > disturbing air of semi-randomness...] > > > So below is the (heavily snipped) output from the testall.test() run, and > below that is the code where I first noticed a possibly similar error, and > below *that* is the output of that code, with the highly suspicous > error.... > > > Any suggestions greatly appreciated! If you've ever had numarray installed before, go to your site-packages directory and delete numarray as well as any numarray.pth. Then reinstall numarray-1.0. Also, just do: >>> import numarray >>> numarray and see what kind of path is involved getting to the numarray module. > I can give you more info about the setup on my computer and so on if you > need :) I think you already included everything important; the exact variant of Windows you're using might be helpful; I'm not aware of any problems there though. It looks like you're on a well supported platform. I just tested pretty much the same configuration on Windows 2000 Pro, with Python-2.3.4, and it worked fine even with SciPy-0.3. > wendy langer > > > ====================================================================== > > There's something hugely wrong with your test output. I've never seen anything like it other than during development. 
> > > =========================================================================
> >
> > import numarray
> >
> > class anXmatrix:
> >     def __init__(self, stepsize = 3):
> >         self.stepsize = stepsize
> >         self.populate_matrix()
> >
> >     def describe(self):
> >         print "I am a ", self.__class__
> >         print "my stepsize is", self.stepsize
> >         print "my matrix is: \n"
> >         print self.matrix
> >
> >     def populate_matrix(self):
> >         def xvalues(i,j):
> >             return self.stepsize*j
> >         mx = numarray.fromfunction(xvalues, (4,4))
> >         self.matrix = mx
> >
> > if __name__ == '__main__':
> >     print " "
> >     print "Making anXmatrix..."
> >     r = anXmatrix(stepsize = 5)
> >     r.describe()
> >     r = anXmatrix(stepsize = 0.02)
> >     r.describe()
> >
> > ============================================================================

Here's what I get when I run your code, windows or linux:

Making anXmatrix...
I am a  __main__.anXmatrix
my stepsize is 5
my matrix is:

[[ 0  5 10 15]
 [ 0  5 10 15]
 [ 0  5 10 15]
 [ 0  5 10 15]]
I am a  __main__.anXmatrix
my stepsize is 0.02
my matrix is:

[[ 0.    0.02  0.04  0.06]
 [ 0.    0.02  0.04  0.06]
 [ 0.    0.02  0.04  0.06]
 [ 0.    0.02  0.04  0.06]]

Regards, Todd

From Fernando.Perez at colorado.edu Thu Jul 8 12:25:07 2004 From: Fernando.Perez at colorado.edu (Fernando.Perez at colorado.edu) Date: Thu Jul 8 12:25:07 2004 Subject: [Numpy-discussion] How to read data from text files fast?
In-Reply-To: <40ED8A6D.5050505@noaa.gov> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <40E470D9.8060603@noaa.gov> <40E473A9.5040109@colorado.edu> <40ED8A6D.5050505@noaa.gov> Message-ID: <1089314664.40ed9f68e1db5@webmail.colorado.edu> Quoting Chris Barker : > Thanks to Fernando Perez and Travis Oliphant for pointing me to: > > > scipy.io.read_array > > In testing, I've found that it's very slow (for my needs), though quite > nifty in other ways, so I'm sure I'll find a use for it in the future. Just a quick note Travis sent to me privately: he suggested using io.numpyio.fread instead of Numeric.fromstring() for speed reasons. I don't know if it will help in your case, I just mention it in case it helps. Cheers, F From Chris.Barker at noaa.gov Thu Jul 8 12:41:06 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jul 8 12:41:06 2004 Subject: [Numpy-discussion] How to read data from text files fast? In-Reply-To: <1089314664.40ed9f68e1db5@webmail.colorado.edu> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <40E470D9.8060603@noaa.gov> <40E473A9.5040109@colorado.edu> <40ED8A6D.5050505@noaa.gov> <1089314664.40ed9f68e1db5@webmail.colorado.edu> Message-ID: <40EDA2A8.9030300@noaa.gov> Fernando.Perez at colorado.edu wrote: \> Just a quick note Travis sent to me privately: he suggested using > io.numpyio.fread instead of Numeric.fromstring() for speed reasons. I don't > know if it will help in your case, I just mention it in case it helps. 
Thanks, but those are for binary files, which I have to do sometimes, so I'll keep it in mind. However, my problem at hand is text files, and my solution is working nicely, though I'd love a pair of more experienced eyes on the code.... -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From Chris.Barker at noaa.gov Thu Jul 8 13:50:03 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jul 8 13:50:03 2004 Subject: [Numpy-discussion] How to read data from text files fast? In-Reply-To: <004c01c46524$ab808090$ebeca782@stsci.edu> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <40E470D9.8060603@noaa.gov> <40E473A9.5040109@colorado.edu> <40ED8A6D.5050505@noaa.gov> <1089314664.40ed9f68e1db5@webmail.colorado.edu> <40EDA2A8.9030300@noaa.gov> <004c01c46524$ab808090$ebeca782@stsci.edu> Message-ID: <40EDB2BD.4080809@noaa.gov> Todd Miller wrote:

> I looked this over to see how hard it would be to port to numarray. At
> first glance, it looks easy. I didn't really read it closely enough to
> pick up bugs, but what I saw looks good. One thing I did notice was a
> calloc of temporary data space. That seemed like a possible waste: can't
> you just preallocate the array and read your data directly into it?

The short answer is that I'm not very smart! The longer answer is that this is because at first I misunderstood what PyArray_FromDimsAndData was for. For ScanFileN, I'll re-do it as you suggest. For ScanFile, it is unknown at the beginning how big the final array is, and I did a scheme that would allocate the memory as it went, in reasonable sized chunks.
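The grow-in-chunks scheme can be sketched in Python to show the shape of the logic. This is a toy model only, an assumption on my part; the real code is C, allocating fixed-size blocks and making one final copy into a Numeric array:

```python
def scan_growing(f, blocksize=1024):
    """Collect doubles into fixed-size chunks, then join them at the end.
    Mirrors the C strategy: start a new block whenever the current one
    fills up, and make one final copy into a right-sized result."""
    chunks, current = [], []
    for tok in f.read().split():
        try:
            val = float(tok)
        except ValueError:
            continue                # skip non-numeric text
        current.append(val)
        if len(current) == blocksize:
            chunks.append(current)  # block is full, start another
            current = []
    if current:
        chunks.append(current)
    # the "full copy" Chris mentions: join the blocks into one flat list
    return [v for chunk in chunks for v in chunk]
```

The final join is the double copy being discussed: until the temporary blocks are freed, the data exists twice.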
However, this does require a full copy, which is a problem. Since posting, I thought of a MUCH easier scheme:

scan the file, without storing the data, to see how many numbers there are.
rewind the file
allocate the Array
Read the data.

This requires scanning the file twice, which would cost, but would be easier, and prevent an unnecessary copy of the data. I hope I'll get a chance to try it out and see what the performance is like. In the meantime, anyone else have any thoughts? By the way, does it matter whether I use malloc or calloc? I can't really tell the difference from K&R.

> This is
> probably a very minor speed issue, but might be a significant storage issue
> as people are starting to max out 32-bit systems.

yup. This is all pointless if it's not a lot of data, after all. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From Chris.Barker at noaa.gov Thu Jul 8 16:21:16 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jul 8 16:21:16 2004 Subject: [Numpy-discussion] How to read data from text files fast? In-Reply-To: <40EDB2BD.4080809@noaa.gov> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <40E470D9.8060603@noaa.gov> <40E473A9.5040109@colorado.edu> <40ED8A6D.5050505@noaa.gov> <1089314664.40ed9f68e1db5@webmail.colorado.edu> <40EDA2A8.9030300@noaa.gov> <004c01c46524$ab808090$ebeca782@stsci.edu> <40EDB2BD.4080809@noaa.gov> Message-ID: <40EDD64A.1060508@noaa.gov> Chris Barker wrote:

>> can't
>> you just preallocate the array and read your data directly into it?
>
> The short answer is that I'm not very smart!
The longer answer is that

> this is because at first I misunderstood what PyArray_FromDimsAndData
> was for. For ScanFileN, I'll re-do it as you suggest.

I've re-done it. Now I don't double allocate storage for ScanFileN. There was no noticeable difference in performance, but why use memory you don't have to? For ScanFile, it is unknown at the beginning how big the final array is, so I now have two versions. One is what I had before: it allocates memory in blocks of some Buffersize as it reads the file (now set to 1024 elements). Once it's all read in, it creates an appropriate size PyArray, and copies the data to it. This results in a double copy of all the data until the temporary memory is freed. I now also have a ScanFile2, which scans the whole file first, then creates a PyArray, and re-reads the file to fill it up. This version takes about twice as long, confirming my expectation that the time to allocate and copy data is tiny compared to reading and parsing the file. Here's a simple benchmark:

Reading with Standard Python methods
(62936, 2)
it took 2.824013 seconds to read the file with standard Python methods
Reading with FileScan
(62936, 2)
it took 0.400936 seconds to read the file with FileScan
Reading with FileScan2
(62936, 2)
it took 0.752649 seconds to read the file with FileScan2
Reading with FileScanN
(62936, 2)
it took 0.441714 seconds to read the file with FileScanN

So it takes twice as long to count the numbers first, but it's still three times as fast as just doing all this with Python. However, I usually don't think it's worth all this effort for a 3 times speed up, and I tend to make copies of my arrays all over the place with NumPy anyway, so I'm inclined to stick with the first method. Also, if you are really that tight on memory, you could always read it in chunks with ScanFileN. Any feedback anyone wants to give is very welcome. -Chris -- Christopher Barker, Ph.D.
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: FileScan_module.c URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: setup.py URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: TestFileScan.py URL:

From falted at pytables.org Fri Jul 9 03:55:03 2004 From: falted at pytables.org (Francesc Alted) Date: Fri Jul 9 03:55:03 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion Message-ID: <200407091254.06579.falted@pytables.org> Hi, Since Perry said not too long ago that the numarray crew would ask for suggestions for RecArray improvements, I'm going to suggest a couple. I find quite inconvenient the .tolist() method when applied to RecArray objects as it is now:

>>> r[2:4]
array(
[(3, 33.0, 'c'),
(4, 44.0, 'd')],
formats=['1UInt8', '1Float32', '1a1'],
shape=2,
names=['c1', 'c2', 'c3'])
>>> r[2:4].tolist()
[<numarray.records.Record instance at ...>, <numarray.records.Record instance at ...>]

The suggested behaviour would be:

>>> r[2:4].tolist()
[(3, 33.0, 'c'), (4, 44.0, 'd')]

Another thing is that an element of a recarray would be returned as a tuple instead of as a records.Record object:

>>> r[2]
<numarray.records.Record instance at ...>

The suggested behaviour would be:

>>> r[2]
(3, 33.0, 'c')

I think the latter would be consistent with the convention that a __getitem__(int) of a NumArray object returns a Python type instead of a rank-0 array. In the same way, a __getitem__(int) of a RecArray should return a Python type (a tuple in this case). Below is the code that I use right now to simulate this behaviour, but it would be nice if this code were included in the numarray.records module.
def tolist(arr):
    """Converts a RecArray or Record to a list of rows"""
    outlist = []
    if isinstance(arr, records.Record):
        for i in range(arr.array._nfields):
            outlist.append(arr.array.field(i)[arr.row])
        outlist = tuple(outlist)  # return a tuple for records
    elif isinstance(arr, records.RecArray):
        for j in range(arr.nelements()):
            tmplist = []
            for i in range(arr._nfields):
                tmplist.append(arr.field(i)[j])
            outlist.append(tuple(tmplist))
    return outlist

Cheers, -- Francesc Alted

From thomas_karlsson_569 at hotmail.com Fri Jul 9 08:02:44 2004 From: thomas_karlsson_569 at hotmail.com (Thomas Karlsson) Date: Fri Jul 9 08:02:44 2004 Subject: [Numpy-discussion] Numpy compiling error... Help! Message-ID: Hi, I'm trying to compile/install numpy on a RH9 machine. When doing so I run into problems. I give the command: python setup.py install and get a long answer, with this error at the end:

gcc -shared build/temp.linux-i686-2.2/lapack_litemodule.o -L/usr/lib/atlas -llapack -lcblas -lf77blas -latlas -lg2c -o build/lib.linux-i686-2.2/lapack_lite.so
/usr/bin/ld: cannot find -llapack
collect2: ld returned 1 exit status
error: command 'gcc' failed with exit status 1

Does anyone know what I've done wrong? I've spent a lot of time on this and really need help now... Regards Thomas

From Chris.Barker at noaa.gov Fri Jul 9 09:44:12 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Fri Jul 9 09:44:12 2004 Subject: [Numpy-discussion] How to read data from text files fast? In-Reply-To: <3afee4a2.5cf5a1c3.8234000@expms6.cites.uiuc.edu> References: <3afee4a2.5cf5a1c3.8234000@expms6.cites.uiuc.edu> Message-ID: <40EECAB8.3050900@noaa.gov> Bruce, Thanks for your feedback.
Bruce Southey wrote:

> While I am not really following your thread, I just wanted to comment that the
> Python Cookbook (at least the printed version) has some ways to count lines in a
> file - assuming that the number of lines provides the size.

The number of lines does not necessarily provide the size. In the general case, it doesn't at all. My whole goal here is the general case: being able to read a bunch of numbers out of any format of text file. This can be used as part of a parser for many file formats. If I were shooting for just one format, this would be easier, but not general purpose. Now that I have this, I can write a number of file format parsers in Python with improved performance and easier syntax.

> Under Unix (but not windows), ...

I am aiming for a portable solution.

> Alternatively if sufficient memory is available, storing the file in memory
> (during the counting of elements) should always be faster than reading it a
> second time from the hard disk.

The primary reason to scan the file ahead of time to count the elements is to save the memory of duplicate copies of data. The other reason is to make memory management easier, but since I've already solved that problem, I'm done. thanks, -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From perry at stsci.edu Mon Jul 12 14:15:01 2004 From: perry at stsci.edu (Perry Greenfield) Date: Mon Jul 12 14:15:01 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: <200407091254.06579.falted@pytables.org> Message-ID: Francesc Alted wrote: > > As Perry said not too long ago that numarray crew would ask for > suggestions > for RecArray improvements, I'm going to suggest a couple.
> > I find quite inconvenient the .tolist() method when applied to RecArray > objects as it is now: > > >>> r[2:4] > array( > [(3, 33.0, 'c'), > (4, 44.0, 'd')], > formats=['1UInt8', '1Float32', '1a1'], > shape=2, > names=['c1', 'c2', 'c3']) > >>> r[2:4].tolist() > [, > ] > > > The suggested behaviour would be: > > >>> r[2:4].tolist() > [(3, 33.0, 'c'),(4, 44.0, 'd')] > > Another thing is that an element of recarray would be returned as a tuple > instead as a records.Record object: > > >>> r[2] > > > The suggested behaviour would be: > > >>> r[2] > (3, 33.0, 'c') > > I think the latter would be consistent with the convention that a > __getitem__(int) of a NumArray object returns a python type instead of a > rank-0 array. In the same way, a __getitem__(int) of a RecArray should > return a a python type (a tuple in this case). > These are good examples of where improvements are needed (we are also looking at how best to handle multidimensional arrays and should have a proposal this week). What I'm wondering about is what a single element of a record array should be. Returning a tuple has an undeniable simplicity to it. On the other hand, we've been using recarrays that allow naming the various columns (which we refer to as "fields"). If one can refer to fields of a recarray, shouldn't one be able to refer to a field (by name) of one of it's elements? Or are you proposing that basic recarrays not have that sort of capability (something added by a subclass)? Perry From rowen at u.washington.edu Mon Jul 12 16:09:00 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Mon Jul 12 16:09:00 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: References: Message-ID: At 5:14 PM -0400 2004-07-12, Perry Greenfield wrote: >What I'm wondering about is what a single element of a record array >should be. Returning a tuple has an undeniable simplicity to it. 
>On the other hand, we've been using recarrays that allow naming the
>various columns (which we refer to as "fields"). If one can refer
>to fields of a recarray, shouldn't one be able to refer to a field
>(by name) of one of its elements? Or are you proposing that basic
>recarrays not have that sort of capability (something added by a
>subclass)?

In my opinion, a single item of a record array should be a RecordItem object that is a dictionary that keeps items in field order. Thus:

- use the standard dictionary interface to deal with values by name (except that the keys are always in the correct order).
- one can also get and set all the data at once as a tuple. This is NOT a standard dictionary interface, but is essential. Functions such as getvalues(), setvalues(dataTuple) should do it.

Adopting the full dictionary interface means one gets a standard, mature and fairly complete set of features. ALSO, a RecordItem object can then be used wherever a dictionary object is needed. I suspect it's also useful to have named field access: RecordItem.fieldname, but am a bit reluctant to suggest so many different ways of getting to the data. I assume it will continue to be easy to get all data for a field by naming the appropriate field. That's a really nice feature. It would be even better if a masked array could be used, but I have no idea how hard this would be. Which brings up a side issue: any hope of integrating masked arrays into numarray, such that they could be used wherever a numarray array could be used? Areas where I particularly find myself needing them include nd_image filtering and writing C extensions. -- Russell P.S. I submitted several feature requests and bug reports for records on sourceforge months ago. I hope they'll not be overlooked during the review process.
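Russell's RecordItem idea can be sketched as a small class. The names RecordItem, getvalues() and setvalues() come from his proposal; everything else here is an illustrative assumption, not numarray code:

```python
class RecordItem:
    """Dict-like view of one row of a record array: field names map to
    values, keys stay in field order, and the whole row can be read or
    written at once as a tuple."""

    def __init__(self, names, values):
        self._names = list(names)              # fixed field order
        self._data = dict(zip(self._names, values))

    def __getitem__(self, name):
        return self._data[name]

    def __setitem__(self, name, value):
        if name not in self._data:
            raise KeyError(name)               # fields are fixed, no new keys
        self._data[name] = value

    def keys(self):
        return list(self._names)               # always in field order

    def getvalues(self):
        return tuple(self._data[n] for n in self._names)

    def setvalues(self, values):
        for n, v in zip(self._names, values):
            self._data[n] = v
```

The point of keeping keys in field order is that getvalues() round-trips cleanly with the underlying row tuple.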
From falted at pytables.org Tue Jul 13 01:30:55 2004 From: falted at pytables.org (Francesc Alted) Date: Tue Jul 13 01:30:55 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: References: Message-ID: <200407131028.04791.falted@pytables.org> On Monday 12 July 2004 23:14, Perry Greenfield wrote:

> What I'm wondering about is what a single element of a record array
> should be. Returning a tuple has an undeniable simplicity to it.

Yeah, this is why I'm strongly biased toward this possibility.

> On the other hand, we've been using recarrays that allow naming the
> various columns (which we refer to as "fields"). If one can refer
> to fields of a recarray, shouldn't one be able to refer to a field
> (by name) of one of its elements? Or are you proposing that basic
> recarrays not have that sort of capability (something added by a
> subclass)?

Well, I'm not sure about that. But just in case most people would like to access records by field as well as by index, I would advocate for the possibility that the Record instances behave as similarly as possible to a tuple (or dictionary?). That includes creating appropriate __str__() *and* __repr__() methods as well as a __getitem__() that supports both field names and indices. I'm not sure about whether providing a __getattr__() method would be ok, but for the sake of simplicity, and in order to have (preferably) only one way to do things, I would say no.
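A Record that "behaves as similarly as possible to a tuple" while still honouring field names could look roughly like this. This is only an illustration of the proposal being discussed, not the numarray.records implementation; the class body is an assumption:

```python
class Record:
    """One row of a record array: prints and indexes like a tuple,
    but fields can also be fetched by name."""

    def __init__(self, names, values):
        self._names = tuple(names)
        self._values = tuple(values)

    def __getitem__(self, key):
        if isinstance(key, str):                     # field name
            return self._values[self._names.index(key)]
        return self._values[key]                     # plain index or slice

    def __repr__(self):                              # look like a tuple
        return repr(self._values)

    __str__ = __repr__
```

With this, r[2]["c1"] and r[2][0] both work, which is exactly the dual access being debated in the thread.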
Regards, -- Francesc Alted

From falted at pytables.org Tue Jul 13 02:07:00 2004 From: falted at pytables.org (Francesc Alted) Date: Tue Jul 13 02:07:00 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: <200407131028.04791.falted@pytables.org> References: <200407131028.04791.falted@pytables.org> Message-ID: <200407131106.19557.falted@pytables.org> On Tuesday 13 July 2004 10:28, Francesc Alted wrote:

> On Monday 12 July 2004 23:14, Perry Greenfield wrote:
> > What I'm wondering about is what a single element of a record array
> > should be. Returning a tuple has an undeniable simplicity to it.
>
> Yeah, this is why I'm strongly biased toward this possibility.
>
> > On the other hand, we've been using recarrays that allow naming the
> > various columns (which we refer to as "fields"). If one can refer
> > to fields of a recarray, shouldn't one be able to refer to a field
> > (by name) of one of its elements? Or are you proposing that basic
> > recarrays not have that sort of capability (something added by a
> > subclass)?
>
> Well, I'm not sure about that. But just in case most people would like to
> access records by field as well as by index, I would advocate for the
> possibility that the Record instances behave as similarly as possible to
> a tuple (or dictionary?). That includes creating appropriate __str__() *and*
> __repr__() methods as well as a __getitem__() that supports both field names
> and indices. I'm not sure about whether providing a __getattr__() method
> would be ok, but for the sake of simplicity, and in order to have (preferably)
> only one way to do things, I would say no.

I've been thinking that one way to make returning a tuple for a single element of a RecArray compatible with still being able to retrieve a field by name is to play with RecArray.__getitem__ and let it support key names in addition to indices.
This is better seen with an example. Right now, one can say:

>>> r=records.array([(1,"asds", 24.),(2,"pwdw", 48.)], "1i4,1a4,1f8")
>>> r._fields["c1"]
array([1, 2])
>>> r._fields["c1"][1]
2

What I propose is to be able to say:

>>> r["c1"]
array([1, 2])
>>> r["c1"][1]
2

Which would replace the notation:

>>> r[1]["c1"]
2

which was recently suggested. I.e. the suggestion is to realize RecArrays as a collection of columns, as well as a collection of rows. -- Francesc Alted

From falted at pytables.org Tue Jul 13 02:13:03 2004 From: falted at pytables.org (Francesc Alted) Date: Tue Jul 13 02:13:03 2004 Subject: [Numpy-discussion] PyTables 0.8.1 released Message-ID: <200407131112.15345.falted@pytables.org> PyTables is a hierarchical database package designed to efficiently manage very large amounts of data. PyTables is built on top of the HDF5 library and the numarray package. It features an object-oriented interface that, combined with natural naming and C-code generated from Pyrex sources, makes it a fast, yet extremely easy-to-use tool for interactively saving and retrieving different kinds of datasets. It also provides flexible indexed access on disk to anywhere in the data. The primary purpose of this release is to incorporate updates related to the newly released numarray 1.0. I've taken the opportunity to backport some improvements added in PyTables 0.9 (in alpha stage) as well as to fix the known problems.

Improvements:

- The logic for computing the buffer sizes has been revamped. As a consequence, the performance of writing/reading tables with large record sizes has improved by a factor of ten or more, now exceeding 70 MB/s for writing and 130 MB/s for reading (using compression).
- The maximum record size for tables has been raised to 512 KB (before it was 8 KB, due to some internal limitations).
- Documentation has been improved in many minor details.
As a result of a fix in the underlying documentation system (tbook), chapters now start at odd pages, instead of even. So those of you who want to print double sided will probably have better luck now when aligning pages ;). Another one is that the HTML documentation has improved its look as well.

Bug Fixes:

- Indexing of Arrays with list or tuple flavors (#968131). When retrieving single elements from an array with 'List' or 'Tuple' flavors, an error occurred. This has been corrected and now you can retrieve fileh.root.array[2] without problems for 'List' or 'Tuple' flavored (E, VL)Arrays.
- Iterators on Arrays with list or tuple flavors fail (#968132). When using iterators with Array objects with 'List' or 'Tuple' flavors, an error occurred. This has been corrected.
- Last Index (-1) of Arrays doesn't work (#968149). When accessing the last element in an Array using the notation -1, an empty list (or tuple or array) was returned instead of the proper value. This happened in general with all negative indices. Fixed.
- Table.read(flavor="List") should return pure lists (#972534). However, it used to return a pointer to numarray.records.Record instances, as in:

>>> fileh.root.table.read(1,2,flavor="List")
[<numarray.records.Record instance at ...>]
>>> fileh.root.table.read(1,3,flavor="List")
[<numarray.records.Record instance at ...>, <numarray.records.Record instance at ...>]

Now the next records are returned:

>>> fileh.root.table.read(1,2, flavor="List")
[(' ', 1, 1.0)]
>>> fileh.root.table.read(1,3, flavor="List")
[(' ', 1, 1.0), (' ', 2, 2.0)]

In addition, when reading a single row of a table, a numarray.records.Record pointer was returned:

>>> fileh.root.table[1]
<numarray.records.Record instance at ...>

Now, it returns a tuple:

>>> fileh.root.table[1]
(' ', 1, 1.0)

Which I think is more consistent, and more Pythonic.

- Copy of leaves fails... (#973370). Attempting to copy leaves (Table or Array with different flavors) on top of themselves caused an internal error in PyTables. This has been corrected by silently avoiding the copy and returning the original Leaf as a result.
Minor changes:

- When assigning a value to a non-existing field in a table row, a KeyError is now raised, instead of the AttributeError that was issued before. I think this is more consistent with the type of error.

- Tests have been improved so as to pass the whole suite when compiled in 64-bit mode on a Linux/PowerPC machine (namely a dual-G5 Powermac running a 64-bit 2.6.4 Linux kernel and the preview YDL distribution for G5, with a 64-bit GCC toolchain). Thanks to Ciro Cattuto for testing and reporting the modifications that were needed.

Where can PyTables be applied?
------------------------------

PyTables is not designed to work as a relational database competitor, but rather as a teammate. If you want to work with large datasets of multidimensional data (for example, for multidimensional analysis), or just provide a categorized structure for some portions of your cluttered RDBS, then give PyTables a try. It works well for storing data from data acquisition systems (DAS), simulation software, network data monitoring systems (for example, traffic measurements of IP packets on routers), very large XML files, or for creating a centralized repository for system logs, to name only a few possible uses.

What is a table?
----------------

A table is defined as a collection of records whose values are stored in fixed-length fields. All records have the same structure and all values in each field have the same data type. The terms "fixed-length" and "strict data types" may seem quite strange requirements for a language like Python that supports dynamic data types, but they serve a useful function if the goal is to save very large quantities of data (such as that generated by many scientific applications) in an efficient manner that reduces demand on CPU time and I/O resources.

What is HDF5?
-------------

For those people who know nothing about HDF5, it is a general purpose library and file format for storing scientific data made at NCSA.
HDF5 can store two primary objects: datasets and groups. A dataset is essentially a multidimensional array of data elements, and a group is a structure for organizing objects in an HDF5 file. Using these two basic constructs, one can create and store almost any kind of scientific data structure, such as images, arrays of vectors, and structured and unstructured grids. You can also mix and match them in HDF5 files according to your needs.

Platforms
---------

I'm using Linux (Intel 32-bit) as the main development platform, but PyTables should be easy to compile/install on many other UNIX machines. This package has also passed all the tests on an UltraSparc platform with Solaris 7 and Solaris 8. It also compiles and passes all the tests on an SGI Origin2000 with MIPS R12000 processors, with the MIPSPro compiler and running IRIX 6.5. It also runs fine on Linux 64-bit platforms, like an AMD Opteron running SuSE Linux Enterprise Server or a PowerPC G5 with Linux 2.6.x in 64-bit mode. It has also been tested on MacOS X (10.2, but it should work on newer versions as well). Regarding Windows platforms, PyTables has been tested with Windows 2000 and Windows XP (using the Microsoft Visual C compiler), but it should work with other flavors as well.

An example?
-----------

For online code examples, have a look at

http://pytables.sourceforge.net/html/tut/tutorial1-1.html

and, for the newly introduced Variable Length Arrays:

http://pytables.sourceforge.net/html/tut/vlarray2.html

Web site
--------

Go to the PyTables web site for more details:

http://pytables.sourceforge.net/

Share your experience
---------------------

Let me know of any bugs, suggestions, gripes, kudos, etc. you may have.

Enjoy!
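The fixed-length record model described under "What is a table?" above can be sketched with nothing but Python's standard-library struct module. The three-field layout below is hypothetical, chosen only for illustration, and is not PyTables' actual on-disk format:

```python
import struct

# Hypothetical fixed-length record: a 4-byte int id, an 8-byte name
# field, and an 8-byte float value (little-endian, no padding).
record_fmt = "<i8sd"
record_size = struct.calcsize(record_fmt)     # every record is 20 bytes

rows = [(1, b"asds", 24.0), (2, b"pwdw", 48.0)]
buf = b"".join(struct.pack(record_fmt, i, n, v) for i, n, v in rows)
assert len(buf) == 2 * record_size

# Because every record has the same size, row k can be read directly
# at offset k * record_size without parsing the rows before it --
# the property that makes fixed-length tables cheap to index on disk.
row = struct.unpack_from(record_fmt, buf, 1 * record_size)
assert row[0] == 2 and row[2] == 48.0
```

Strings shorter than the field width are zero-padded by struct.pack, so every record stays exactly record_size bytes.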
-- Francesc Alted

From jmiller at stsci.edu Tue Jul 13 10:42:04 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Jul 13 10:42:04 2004
Subject: [Numpy-discussion] numarray-1.0 Bug Alert
Message-ID: <1089740511.9509.372.camel@halloween.stsci.edu>

Overview

There is a bug in numarray's Numeric compatible C-API. The bug has been latent for a long time, since numarray-0.3 was released roughly two years ago. It is serious because it results in wrong answers for certain extension functions fed a certain class of arrays.

What's affected

The bug affects numarray's add-on packages or third party extension functions which use the Numeric compatibility C-API. Generally, this means C code that was either ported from Numeric or was written with both Numeric and numarray in mind. This includes the add-on packages numarray.linear_algebra, numarray.fft, numarray.random_array, and numarray.mlab. More recently, it includes the ports of core Numeric functions to numarray.numeric. Because numarray.ma uses numarray.numeric, the bug also affects numarray.ma. Finally, for numarray-1.0 this bug affects the functions numarray.argmin and numarray.argmax; these should be the only two functions in core numarray which are affected.

Detailed Bug Description

The bug is exposed by calling an extension function (written using the Numeric compatible C-API) with an array that has a non-zero _byteoffset attribute. Arrays with non-zero _byteoffset are typically created as a result of partially indexing higher dimensional arrays or slicing arrays. Partially indexing or slicing an array generally results in a sub-array, a view which often refers to an interior region of the original array buffer. Because numarray's PyArrayObject does not currently include its ->byteoffset in its ->data pointer as the Numeric compatibility API assumes it does, an extension function sees the base region of the original array rather than the region belonging to the sub-array.
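The view-with-byte-offset situation described above can be sketched with modern NumPy, used here purely as a stand-in for numarray (numarray itself is long unmaintained): partial indexing yields a view whose data pointer sits at a non-zero byte offset into the base buffer, which is exactly the offset the Numeric compatibility layer was not accounting for.

```python
import numpy as np

# Sketch (NumPy as a stand-in for numarray): partially indexing an
# array yields a view whose data pointer points into the interior of
# the original buffer rather than at its start.
a = np.arange(64, dtype=np.int32)
a.shape = (8, 8)          # reshape in place; a still owns its buffer

sub = a[2:]               # partial index -> view into a's buffer
assert sub.base is a      # shares memory with the original array

# The view's data pointer starts 2 rows into a's buffer:
# a byte offset of 2 rows * 8 columns * 4 bytes = 64 bytes.
off = sub.__array_interface__['data'][0] - a.__array_interface__['data'][0]
assert off == 2 * 8 * a.itemsize == 64
```

An extension that reads from the base pointer while ignoring this offset would see the first rows of `a` instead of the rows belonging to `sub`, which is the wrong-answer failure mode described above.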
Immediate User Workaround

A simple user-level workaround for people who need to use the affected packages and functions today is one like the following:

def make_safe_for_numeric_api(a):
    a = numarray.asarray(a)
    if a._byteoffset != 0:
        return a.copy()
    else:
        return a

The array inputs to an affected extension function need to be wrapped with calls to make_safe_for_numeric_api(). Since this is intrusive and a real fix should be released in the near future, this approach is not recommended.

Long Term Fix

The real fix for the bug appears to be to redefine the semantics of numarray's PyArrayObject ->data pointer to include ->byteoffset, altering the C-API. This should make most existing Numeric compatible extension functions work without modification or recompilation, but will necessitate the re-compilation of some extension functions written using the native numarray API approaches (the NA_* functions and macros). This recompilation will be required because key macros will change, most notably NA_OFFSETDATA. This fix is not the only possible one, and other suggestions are welcome, but changing the semantics of ->data appears to be the best way to facilitate numarray/Numeric interoperability. By doing this fix, numarray operates more like Numeric, so fewer changes need to be made in the future to perform ports of Numeric code to numarray.

Impact of Proposed Fix

Regrettably, the proposed fix will break binary compatibility for clients of the numarray-1.0 native C-API. So, extensions built using the numarray native C-API will need to be rebuilt for numarray-1.1. Extensions that have made direct access to PyArrayObject's ->data and require the original offsetless meaning will also need to change code for numarray-1.1. This is something we *really* wanted to avoid... it just isn't going to happen this time.

The Plan

The current plan is to fix the Numeric compatible API by changing the semantics of ->data and release numarray-1.1 relatively soon, hopefully within 2 weeks.
I'm sorry for any inconvenience this has caused numarray users.

Regards,
Todd Miller

From zingale at ucolick.org Tue Jul 13 12:54:02 2004
From: zingale at ucolick.org (Mike Zingale)
Date: Tue Jul 13 12:54:02 2004
Subject: [Numpy-discussion] differencing numarray arrays.
Message-ID: 

Hi, I am trying to efficiently compute a difference of two 2-d flux arrays, as arises quite commonly in finite-difference/finite-volume methods. Ex:

a = arange(64)
a.shape = (8,8)

I want to create a new array, b, of shape such that

b[i,j] = a[i,j] - a[i-1,j]

for 1 <= i < 8
    0 <= i < 8

I can obviously do this through loops, but this is quite slow. In IDL, which is often compared to numarray/python, this is simple to do with the shift() function, but I cannot find an efficient way to do it with numarray arrays. I tried defining a list

i = range(8)
im1[1:9] = im1[1:9] - 1

and indexing with im1, but this does not work. Any suggestions? For large arrays, this simple differencing in python is very expensive when using loops.

Thanks,

Mike

------------------------------------------------------------------------------
Michael Zingale
UCO/Lick Observatory
UCSC
Santa Cruz, CA 95064

phone: (831) 459-5246
fax: (831) 459-5265
e-mail: zingale at ucolick.org
web: http://www.ucolick.org/~zingale

``Don't worry head, the computer will do our thinking now'' -- Homer

From tim.hochberg at cox.net Tue Jul 13 12:59:00 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Tue Jul 13 12:59:00 2004
Subject: [Numpy-discussion] differencing numarray arrays.
In-Reply-To: 
References: 
Message-ID: <40F43EC4.70903@cox.net>

Mike Zingale wrote:

>Hi, I am trying to efficiently compute a difference of two 2-d flux
>arrays, as arises quite commonly in finite-difference/finite-volume
>methods. Ex:
>
>a = arange(64)
>a.shape = (8,8)
>
>I want to create a new array, b, of shape such that
>
>b[i,j] = a[i,j] - a[i-1,j]
>
>for 1 <= i < 8
> 0 <= i < 8
>

That's supposed to be a j in the second eq., right?
If I understand you right, what you want is:

b = a[1:] - a[:-1]

-tim

>I can obviously do this through loops, but this is quite slow. In IDL,
>which is often compared to numarray/python, this is simple to do with the
>shift() function, but I cannot find an efficient way to do it with
>numarray arrays.
>
>I tried defining a list
>
>i = range(8)
>im1[1:9] = im1[1:9] - 1
>
>and indexing with im1, but this does not work.
>
>Any suggestions? For large array, this simple differencing in python is
>very expensive when using loops.
>
>Thanks,
>
>Mike
>
>------------------------------------------------------------------------------
>Michael Zingale
>UCO/Lick Observatory
>UCSC
>Santa Cruz, CA 95064
>
>phone: (831) 459-5246
>fax: (831) 459-5265
>e-mail: zingale at ucolick.org
>web: http://www.ucolick.org/~zingale
>
>``Don't worry head, the computer will do our thinking now'' -- Homer
>
>
>-------------------------------------------------------
>This SF.Net email sponsored by Black Hat Briefings & Training.
>Attend Black Hat Briefings & Training, Las Vegas July 24-29 -
>digital self defense, top technical experts, no vendor pitches,
>unmatched networking opportunities. Visit www.blackhat.com
>_______________________________________________
>Numpy-discussion mailing list
>Numpy-discussion at lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>

From rkern at ucsd.edu Tue Jul 13 13:01:04 2004
From: rkern at ucsd.edu (Robert Kern)
Date: Tue Jul 13 13:01:04 2004
Subject: [Numpy-discussion] differencing numarray arrays.
In-Reply-To: 
References: 
Message-ID: <40F43F65.9040208@ucsd.edu>

Mike Zingale wrote:
> Hi, I am trying to efficiently compute a difference of two 2-d flux
> arrays, as arises quite commonly in finite-difference/finite-volume
> methods.
Ex:
>
> a = arange(64)
> a.shape = (8,8)
>
> I want to create a new array, b, of shape such that
>
> b[i,j] = a[i,j] - a[i-1,j]
>
> for 1 <= i < 8
> 0 <= i < 8

Try

b = a[1:] - a[:-1]

--
Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

From zingale at ucolick.org Tue Jul 13 13:42:02 2004
From: zingale at ucolick.org (Mike Zingale)
Date: Tue Jul 13 13:42:02 2004
Subject: [Numpy-discussion] differencing numarray arrays.
In-Reply-To: <40F44766.9010009@pfdubois.com>
References: <40F44766.9010009@pfdubois.com>
Message-ID: 

thanks, all these responses helped. I guess I was still a little unclear with the slicing abilities in numarray.

Mike

On Tue, 13 Jul 2004, Paul Dubois wrote:

> Two of the responses to your question, while correct, might have seemed
> mysterious to a beginner.
>
> a[1:] - a[:-1]
>
> is actually shorthand for:
>
> a[1:, :] - a[:-1, :]
>
> Or to be even more explicit:
>
> n = 8
> a[1:n, 0:n] - a[0:(n-1), 0:n]
>
> If you had wanted the difference in the second index, you have to use
> the more explicit forms.

From rowen at u.washington.edu Tue Jul 13 17:11:49 2004
From: rowen at u.washington.edu (Russell E Owen)
Date: Tue Jul 13 17:11:49 2004
Subject: [Numpy-discussion] differencing numarray arrays.
In-Reply-To: 
References: <40F44766.9010009@pfdubois.com>
Message-ID: 

At 1:41 PM -0700 2004-07-13, Mike Zingale wrote:
>thanks, all these responses helped. I guess I was still a little
>unclear with the slicing abilities in numarray...

Also note that there is a shift function: numarray.nd_image.shift

In your case I suspect slicing is better, but there are times when one really does want to shift the data (e.g. when one wants the resulting array to be the same shape as the original).
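The idioms from this thread can be sketched with modern NumPy as a stand-in for numarray (the slicing semantics are the same); numpy.diff and numpy.roll are conveniences of the stand-in, not numarray functions:

```python
import numpy as np

a = np.arange(64)
a.shape = (8, 8)

# b[i, j] = a[i, j] - a[i - 1, j] for 1 <= i < 8: result has shape (7, 8)
b = a[1:] - a[:-1]            # shorthand for a[1:, :] - a[:-1, :]
assert b.shape == (7, 8)
assert (b == 8).all()         # consecutive rows of this particular a differ by 8

# np.diff performs the same first difference along a chosen axis.
assert (np.diff(a, axis=0) == b).all()

# np.roll shifts with wraparound and keeps the original shape --
# the closest analogue of the IDL-style shift() mentioned above.
c = a - np.roll(a, 1, axis=0)
assert c.shape == (8, 8)
```

Note that the sliced difference is one row shorter than the input, while the roll-based version keeps the full shape at the price of a wrapped-around first row.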
-- Russell

From kyeser at earthlink.net Tue Jul 13 19:35:39 2004
From: kyeser at earthlink.net (Hee-Seng Kye)
Date: Tue Jul 13 19:35:39 2004
Subject: [Numpy-discussion] a 'for' loop within another 'for' loop?
Message-ID: 

Hi. I wrote a program to calculate sums of every possible combinations of two indices of a list. The main body of the program looks something like this:

r = [0,2,5,6,8]
l = []

for x in range(0, len(r)):
    for y in range(0, len(r)):
        k = r[x]+r[y]
        l.append(k)
print l

1. I've heard that it's not a good idea to have a 'for' loop within another 'for' loop, and I was wondering if there is a more efficient way to do this.

2. Does anyone know if there is a built-in function or module that would do the above task in NumPy or Numarray (or even in Python)?

I would really appreciate it if anyone could let me know. Thanks for your help!

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/enriched
Size: 715 bytes
Desc: not available
URL: 

From focke at slac.stanford.edu Tue Jul 13 22:02:08 2004
From: focke at slac.stanford.edu (Warren Focke)
Date: Tue Jul 13 22:02:08 2004
Subject: [Numpy-discussion] a 'for' loop within another 'for' loop?
In-Reply-To: 
References: 
Message-ID: 

l = Numeric.add.outer(r, r).flat

oughta do the trick. Should work for numarray, too.

On Tue, 13 Jul 2004, Hee-Seng Kye wrote:

> Hi. I wrote a program to calculate sums of every possible combinations
> of two indices of a list. The main body of the program looks something
> like this:
>
> r = [0,2,5,6,8]
> l = []
>
> for x in range(0, len(r)):
>     for y in range(0, len(r)):
>         k = r[x]+r[y]
>         l.append(k)
> print l
>
> 1. I've heard that it's not a good idea to have a 'for' loop within
> another 'for' loop, and I was wondering if there is a more efficient
> way to do this.
>
> 2. Does anyone know if there is a built-in function or module that
> would do the above task in NumPy or Numarray (or even in Python)?
> > I would really appreciate it if anyone could let me know. > > Thanks for your help! From eric at enthought.com Tue Jul 13 22:09:01 2004 From: eric at enthought.com (eric jones) Date: Tue Jul 13 22:09:01 2004 Subject: [Numpy-discussion] ANN: Reminder -- SciPy 04 is coming up Message-ID: <40F4BF9E.8060103@enthought.com> Hey folks, Just a reminder that SciPy 04 is coming up. More information is here: http://www.scipy.org/wikis/scipy04 About the Conference and Keynote Speaker --------------------------------------------- The 1st annual *SciPy Conference* will be held this year at Caltech, September 2-3, 2004. As some of you may know, we've experienced great participation in two SciPy "Workshops" (with ~70 attendees in both 2002 and 2003) and this year we're graduating to a "conference." With the prestige of a conference comes the responsibility of a keynote address. This year, Jim Hugunin has answered the call and will be speaking to kickoff the meeting on Thursday September 2nd. Jim is the creator of Numeric Python, Jython, and co-designer of AspectJ. Jim is currently working on IronPython--a fast implementation of Python for .NET and Mono. Presenters ----------- We still have room for a few more standard talks, and there is plenty of room for lightning talks. Because of this, we are extending the abstract deadline until July 23rd. Please send your abstract to abstracts at scipy.org. Travis Oliphant is organizing the presentations this year. (Thanks!) Once accepted, papers and/or presentation slides are acceptable and are due by August 20, 2004. Registration ------------- Early registration ($100.00) has been extended to July 23rd. Follow the links off of the main conference site: http://www.scipy.org/wikis/scipy04 After July 23rd, registration will be $150.00. Registration includes breakfast and lunch Thursday & Friday and a very nice dinner Thursday night. Please register as soon as possible as it will help us in planning for food, room sizes, etc. 
Sprints -------- As of now, we really haven't had much of a call for coding sprints for the 3 days prior to SciPy 04. Below is the original announcement about sprints. If you would like to suggest a topic and see if others are interested, please send a message to the list. Otherwise, we'll forgo the sprints session this year. We're also planning three days of informal "Coding Sprints" prior to the conference -- August 30 to September 1, 2004. Conference registration is not required to participate in the sprints. Please email the list, however, if you plan to attend. Topics for these sprints will be determined via the mailing lists as well, so please submit any suggestions for topics to the scipy-user list: list signup: http://www.scipy.org/mailinglists/ list address: scipy-user at scipy.org thanks, eric From kyeser at earthlink.net Tue Jul 13 23:30:13 2004 From: kyeser at earthlink.net (Hee-Seng Kye) Date: Tue Jul 13 23:30:13 2004 Subject: [Numpy-discussion] a 'for' loop within another 'for' loop? In-Reply-To: References: Message-ID: <34CF38C4-D55F-11D8-8504-000393479EE8@earthlink.net> Thank you so much. It works beautifully! On Jul 14, 2004, at 1:01 AM, Warren Focke wrote: > l = Numeric.add.outer(r, r).flat > oughta do the trick. Should work for numarray, too. > > On Tue, 13 Jul 2004, Hee-Seng Kye wrote: > >> Hi. I wrote a program to calculate sums of every possible >> combinations >> of two indices of a list. The main body of the program looks >> something >> like this: >> >> r = [0,2,5,6,8] >> l = [] >> >> for x in range(0, len(r)): >> for y in range(0, len(r)): >> k = r[x]+r[y] >> l.append(k) >> print l >> >> 1. I've heard that it's not a good idea to have a 'for' loop within >> another 'for' loop, and I was wondering if there is a more efficient >> way to do this. >> >> 2. Does anyone know if there is a built-in function or module that >> would do the above task in NumPy or Numarray (or even in Python)? 
>> >> I would really appreciate it if anyone could let me know.
>> >> Thanks for your help!

From falted at pytables.org Wed Jul 14 02:37:06 2004
From: falted at pytables.org (Francesc Alted)
Date: Wed Jul 14 02:37:06 2004
Subject: [Numpy-discussion] numarray-1.0 Bug Alert
In-Reply-To: <1089740511.9509.372.camel@halloween.stsci.edu>
References: <1089740511.9509.372.camel@halloween.stsci.edu>
Message-ID: <200407141136.09436.falted@pytables.org>

On Tuesday 13 July 2004 19:41, Todd Miller wrote:
> The real fix for the bug appears to be to redefine the semantics of
> numarray's PyArrayObject ->data pointer to include ->byteoffset,
> altering the C-API.

Oh well, I'm afraid that I'll be affected by that :(. Just to understand that fully, you mean that real data for an array will start in the future at narr->data, instead of narr->data+narr->byteoffset as it does now?
-- Francesc Alted

From jmiller at stsci.edu Wed Jul 14 04:38:09 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jul 14 04:38:09 2004
Subject: [Numpy-discussion] numarray-1.0 Bug Alert
In-Reply-To: <200407141136.09436.falted@pytables.org>
References: <1089740511.9509.372.camel@halloween.stsci.edu> <200407141136.09436.falted@pytables.org>
Message-ID: <1089805021.3741.62.camel@localhost.localdomain>

On Wed, 2004-07-14 at 05:36, Francesc Alted wrote:
> On Tuesday 13 July 2004 19:41, Todd Miller wrote:
> > The real fix for the bug appears to be to redefine the semantics of
> > numarray's PyArrayObject ->data pointer to include ->byteoffset,
> > altering the C-API.
>
> Oh well, I'm afraid that I'll be affected by that :(. Just to understand
> that fully, you mean that real data for an array will start in the future at
> narr->data, instead of narr->data+narr->byteoffset as it does now?

That is the current plan. I was thinking developers could just replace the new narr->data with (narr->data - narr->byteoffset) if needed. I'm assuming the planned changes will cost at most a few edits and package redistribution, which I understand is still a major pain in the neck; let me know if the cost is higher than that for some reason.

Regards,
Todd

From paul at pfdubois.com Wed Jul 14 05:57:07 2004
From: paul at pfdubois.com (Paul F. Dubois)
Date: Wed Jul 14 05:57:07 2004
Subject: [Numpy-discussion] a 'for' loop within another 'for' loop?
In-Reply-To: 
References: 
Message-ID: <40F52D8B.9050601@pfdubois.com>

>>> add.reduce(take(r,indices([len(r),len(r)]))).flat
array([ 0, 2, 5, 6, 8, 2, 4, 7, 8, 10, 5, 7, 10, 11, 13, 6, 8, 11, 12, 14, 8, 10, 13, 14, 16])

Always like a good challenge in the morning. God, it is like the old rush of writing APL.

Hee-Seng Kye wrote:
> Hi. I wrote a program to calculate sums of every possible combinations
> of two indices of a list.
The main body of the program looks something > like this: > > r = [0,2,5,6,8] > l = [] > > for x in range(0, len(r)): > for y in range(0, len(r)): > k = r[x]+r[y] > l.append(k) > print l > > 1. I've heard that it's not a good idea to have a 'for' loop within > another 'for' loop, and I was wondering if there is a more efficient way > to do this. > > 2. Does anyone know if there is a built-in function or module that would > do the above task in NumPy or Numarray (or even in Python)? > > I would really appreciate it if anyone could let me know. > > Thanks for your help! From Sebastien.deMentendeHorne at electrabel.com Wed Jul 14 08:41:09 2004 From: Sebastien.deMentendeHorne at electrabel.com (Sebastien.deMentendeHorne at electrabel.com) Date: Wed Jul 14 08:41:09 2004 Subject: [Numpy-discussion] a 'for' loop within another 'for' loop? Message-ID: <035965348644D511A38C00508BF7EAEB145CAF2A@seacex03.eib.electrabel.be> I could not resist to propose an other solution: r = array([0,2,5,6,8]) l = (r[:,NewAxis] + r[NewAxis,:]).flat -----Original Message----- From: Hee-Seng Kye [mailto:kyeser at earthlink.net] Sent: mercredi 14 juillet 2004 4:22 To: numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] a 'for' loop within another 'for' loop? Hi. I wrote a program to calculate sums of every possible combinations of two indices of a list. The main body of the program looks something like this: r = [0,2,5,6,8] l = [] for x in range(0, len(r)): for y in range(0, len(r)): k = r[x]+r[y] l.append(k) print l 1. I've heard that it's not a good idea to have a 'for' loop within another 'for' loop, and I was wondering if there is a more efficient way to do this. 2. Does anyone know if there is a built-in function or module that would do the above task in NumPy or Numarray (or even in Python)? I would really appreciate it if anyone could let me know. Thanks for your help! -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rowen at u.washington.edu Wed Jul 14 08:48:07 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Wed Jul 14 08:48:07 2004 Subject: [Numpy-discussion] How to median filter a masked array? Message-ID: I want to 3x3 median filter a masked array (2-d array of ints -- an astronomical image), where the masked data and points off the edge are excluded from the local median calculation. Any suggestions for how to do this efficiently? I suspect I have to write it in C, which is an unpleasant prospect. I tried using NaN for points to mask out, but the median filter seems to handle those as "infinity", or something equally inappropriate. In a related vein, has Python come along far enough that it would be reasonable to add support for NaN to numarray -- in the sense that statistics calculations, filters, etc. could be convinced to ignore NaNs? Obviously this support would be contingent on compiling python with IEEE floating point support, but I suspect that's the default on most platforms these days. -- Russell From jdhunter at ace.bsd.uchicago.edu Wed Jul 14 09:51:12 2004 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Wed Jul 14 09:51:12 2004 Subject: [Numpy-discussion] ANN matplotlib-0.60.2: python graphs and charts Message-ID: matplotlib is a 2D plotting library for python. You can use matplotlib interactively from a python shell or IDE, or embed it in GUI applications (WX, GTK, and Tkinter). matplotlib supports many plot types: line plots, bar charts, log plots, images, pseudocolor plots, legends, date plots, finance charts and more. What's new since matplotlib 0.50 This is the first wide release in 5 months and there has been a tremendous amount of development since then, with new backends, many optimizations, new plotting types, new backends and enhanced text support. See http://matplotlib.sourceforge.net/whats_new.html for details. 
* Todd Miller's tkinter backend (tkagg) with good support for interactive plotting using the standard python shell, ipython or others. matplotlib now runs on windows out of the box with python + numeric/numarray

* Full Numeric / numarray integration with Todd Miller's numerix module. Prebuilt installers for numeric and numarray on win32. Others, please set your numerix settings before building matplotlib, as described on http://matplotlib.sourceforge.net/faq.html#NUMARRAY

* Mathtext: you can write TeX style math expressions anywhere in your figure. http://matplotlib.sourceforge.net/screenshots.html#mathtext_demo.

* Images - figure and axes images with optional interpolated resampling, alpha blending of multiple images, and more with the imshow and figimage commands. Interactive control of colormaps, intensity scaling and colorbars - http://matplotlib.sourceforge.net/screenshots.html#layer_images

* Text: freetype2 support, newline separated strings with arbitrary rotations, Paul Barrett's cross platform font manager. http://matplotlib.sourceforge.net/screenshots.html#align_text

* Jared Wahlstrand's SVG backend (alpha)

* Support for popular financial plot types - http://matplotlib.sourceforge.net/screenshots.html#finance_work2

* Many optimizations and extension code to remove performance bottlenecks. pcolors and scatters are an order of magnitude faster.

* GTKAgg, WXAgg, TkAgg backends for http://antigrain.com (agg) rendering in the GUI canvas. Now all the major GUIs (WX, GTK, Tk) can be used with a common (agg) renderer.

* Many new examples and demos - see http://matplotlib.sf.net/examples or download the src distribution and look in the examples dir.

Documentation and downloads available at http://matplotlib.sourceforge.net.

John Hunter

From verveer at embl-heidelberg.de Wed Jul 14 10:39:59 2004
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Wed Jul 14 10:39:59 2004
Subject: [Numpy-discussion] How to median filter a masked array?
In-Reply-To: References: Message-ID: <1122AA7E-D5B4-11D8-8510-000A95C92C8E@embl-heidelberg.de> On 14 Jul 2004, at 17:47, Russell E Owen wrote: > I want to 3x3 median filter a masked array (2-d array of ints -- an > astronomical image), where the masked data and points off the edge are > excluded from the local median calculation. Any suggestions for how to > do this efficiently? I don't think that you can do it very efficiently right now with the functions that are available in numarray. > I suspect I have to write it in C, which is an unpleasant prospect. Yes, that is unpleasant, trust me :-) However, in version 1.0 of numarray in the nd_image package, I have added some support for writing filter functions. The generic_filter() function iterates over the array and applies a user-defined filter function at each element. The user-defined function can be written in python or in C, and is called at each element with the values within the filter-footprint as an argument. You would write a function that finds the median of these values, excluding the NaNs (or whatever value that flags the mask.) I would suggest to prototype this function in python and move that to C as soon as it works to your satisfaction. See the numarray manual for more details. Cheers, Peter From rowen at u.washington.edu Wed Jul 14 10:44:39 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Wed Jul 14 10:44:39 2004 Subject: [Numpy-discussion] How to median filter a masked array? In-Reply-To: <40F56462.2030000@pfdubois.com> References: <40F56462.2030000@pfdubois.com> Message-ID: At 9:50 AM -0700 2004-07-14, Paul F. Dubois wrote: >The median filter is prepared to take an argument of a numarray >array but ignorant of and unprepared to deal with masked values. >Using the __array__ trick, both Numeric.MA and numarray.ma would >'know' this and therefore replace the missing values in the filter's >argument with the 'fill value' for that type -- a big number in the >case of real arrays. 
You could explicitly choose that value (say
>using the overall median of the data m) by passing x.filled(m)
>rather than x to the filter.
>
>If there is no such value, you probably do have to do it in C. If
>you wrote it in C, how would you treat missing elements? BTW it
>wouldn't be that hard; just pass both the array and its mask as
>separate elements to a C routine and use SWIG to hook it up.

I already have routines that handle masked data in C to create radial profiles from 2-d integer data (since I could not figure out how to do that in numarray). I chose to pass the mask as a separate array, since I could not find any C interface for numarray.ma and since NaN made no sense for integer data. That code was pretty straightforward.

I wish I could have found a simple way to support multiple array types. I thought using C++ with templates would be the ticket, but absent any examples and after looking through the numarray code, I gave up and took the easy way out. (I didn't use SWIG, though, I just hand-coded everything. Maybe that was a mistake.) I confess that makes me worry about the underpinnings of numarray. It seems an obvious candidate to be written in C++ with templates. I hate to think what the developers have to go through, instead.

In any case, writing a median filter is a bigger deal than taking a radial profile, and since one already existed I thought I'd ask.

>I doubt NaN would help you here; you'd still have to figure out what
>to do in those places. Numeric did not have support for NaN because
>there were portability problems. Probably still are. And you still
>are stuck in a lot of cases anyway.

Well, NaN isn't very general in any case, since it's meaningless for integer data. So maybe that's a red herring. (Though if NaN had worked to mask data I would cheerfully have converted my images to floats to take advantage of it!). What's really wanted is a more unified approach to masked data.
I suppose it's pie in the sky, but I sure wish most of the numarray functions took an optional mask array (or accepted a numarray.ma object -- nice for the user, but probably too painful for words under the hood). I don't think there are major issues with what to do with masked data. Simply ignoring it works in most cases, e.g. mean, std dev, sum, max... In some cases one needs the new mask as output (e.g. matrix multiply). Filtering is a bit subtle: can masked data be treated the same as data off the edge? I hope so, but I'm not sure. Anyway, I am grateful for what we do have. Without Numeric or numarray I would have to write all my image processing code in a different language. -- Russell From gazzar at email.com Wed Jul 14 21:00:03 2004 From: gazzar at email.com (Gary Ruben) Date: Wed Jul 14 21:00:03 2004 Subject: [Numpy-discussion] sum() and mean() broken? Message-ID: <20040715035046.C8BFE1535C5@ws3-1.us4.outblaze.com> I'm getting tracebacks on even the most basic sum() and mean() calls in numarray 1.0 under Windows. Apologies if this has already been reported. Gary >>> from numarray import * >>> arange(10).sum() Traceback (most recent call last): File "", line 1, in -toplevel- arange(10).sum() File "C:\APPS\PYTHON23\Lib\site-packages\numarray\numarraycore.py", line 1106, in sum return ufunc.add.reduce(ufunc.add.areduce(self, type=type).flat, type=type) error: Int32asInt64: buffer not aligned on 8 byte boundary. -- _______________________________________________ Talk More, Pay Less with Net2Phone Direct(R), up to 1500 minutes free! http://www.net2phone.com/cgi-bin/link.cgi?143 From jmiller at stsci.edu Thu Jul 15 06:18:04 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 15 06:18:04 2004 Subject: [Numpy-discussion] sum() and mean() broken?
In-Reply-To: <20040715035046.C8BFE1535C5@ws3-1.us4.outblaze.com> References: <20040715035046.C8BFE1535C5@ws3-1.us4.outblaze.com> Message-ID: <1089897432.2637.34.camel@halloween.stsci.edu> numarray-1.0 is known to have problems with Windows-98, etc. (My guess is any pre-NT Windows). I haven't seen any problems with Windows XP or Windows 2000 Pro. Which Windows variant are you running? Does the numarray selftest pass? It should look something like: >>> import numarray.testall as testall >>> testall.test() numarray: ((0, 1178), (0, 1178)) numarray.records: (0, 48) numarray.strings: (0, 176) numarray.memmap: (0, 82) numarray.objects: (0, 105) numarray.memorytest: (0, 16) numarray.examples.convolve: ((0, 20), (0, 20), (0, 20), (0, 20)) numarray.convolve: (0, 52) numarray.fft: (0, 75) numarray.linear_algebra: ((0, 46), (0, 51)) numarray.image: (0, 27) numarray.nd_image: (0, 390) numarray.random_array: (0, 53) numarray.ma: (0, 671) On Wed, 2004-07-14 at 23:50, Gary Ruben wrote: > I'm getting tracebacks on even the most basic sum() and mean() calls in numarray 1.0 under Windows. Apologies if this has already been reported. > Gary > > >>> from numarray import * > >>> arange(10).sum() > > Traceback (most recent call last): > File "", line 1, in -toplevel- > arange(10).sum() > File "C:\APPS\PYTHON23\Lib\site-packages\numarray\numarraycore.py", line 1106, in sum > return ufunc.add.reduce(ufunc.add.areduce(self, type=type).flat, type=type) > error: Int32asInt64: buffer not aligned on 8 byte boundary. -- From mathieu.gontier at fft.be Thu Jul 15 06:29:04 2004 From: mathieu.gontier at fft.be (Mathieu Gontier) Date: Thu Jul 15 06:29:04 2004 Subject: [Numpy-discussion] static void** libnumarray_API Message-ID: <200407151528.16261.mathieu.gontier@fft.be> Hello, I am developing FEM bindings from a C++ code to Python with Numarray. So, I have the following problem.
In the distribution file 'libnumarray.h', the variable 'libnumarray_API' is defined as a static variable (because the symbol NO_IMPORT is not defined). So, I understand that all the examples are implemented in a single file. But, in my project, I must separate header files and source files in order to solve other problems (like cyclic includes). So, I have two different source files which use numarray: - the file containing the 'init' function, which calls the function 'import_libnumarray()' (which initializes 'libnumarray_API') - a file containing implementations, more precisely an implementation calling numarray functionalities: with its 'static' state, this 'libnumarray_API' is NULL... I tried to compile NumArray with the symbol 'NO_IMPORT' (see libnumarray.h) in order to have an extern variable. But with this symbol, numarray can no longer be imported into the Python environment. So, does someone have a solution that allows using the NumArray API across separate header/source files? Thanks, Mathieu Gontier From curzio.basso at unibas.ch Thu Jul 15 07:22:01 2004 From: curzio.basso at unibas.ch (Curzio Basso) Date: Thu Jul 15 07:22:01 2004 Subject: [Numpy-discussion] NA.dot transposing in place Message-ID: <40F692CC.3000103@unibas.ch> Hi all.
I wonder if anyone noticed the following behaviour (new in 1.0) of the dot/matrixmultiply functions: >>> alpha = NA.arange(10, shape = (10,1)) >>> beta = NA.arange(10, shape = (10,1)) >>> NA.dot(alpha, alpha) array([[285]]) >>> alpha.shape # here it looks like it's doing the transpose in place (1, 10) >>> NA.dot(beta, alpha) array([[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18], [ 0, 3, 6, 9, 12, 15, 18, 21, 24, 27], [ 0, 4, 8, 12, 16, 20, 24, 28, 32, 36], [ 0, 5, 10, 15, 20, 25, 30, 35, 40, 45], [ 0, 6, 12, 18, 24, 30, 36, 42, 48, 54], [ 0, 7, 14, 21, 28, 35, 42, 49, 56, 63], [ 0, 8, 16, 24, 32, 40, 48, 56, 64, 72], [ 0, 9, 18, 27, 36, 45, 54, 63, 72, 81]]) >>> alpha.shape, beta.shape # but not the second time ((1, 10), (10, 1)) ------------------------------------------------- Can someone explain me what's going on? thanks, curzio From jmiller at stsci.edu Thu Jul 15 07:36:11 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 15 07:36:11 2004 Subject: [Numpy-discussion] static void** libnumarray_API In-Reply-To: <200407151528.16261.mathieu.gontier@fft.be> References: <200407151528.16261.mathieu.gontier@fft.be> Message-ID: <1089902141.2637.61.camel@halloween.stsci.edu> On Thu, 2004-07-15 at 09:28, Mathieu Gontier wrote: > Hello, > > I am developping FEM bendings from a C++ code to Python with Numarray. > So, I have the following problem. > > In the distribution file 'libnumarray.h', the variable 'libnumarray_API' is > defined as a static variable (because of the symbol NO_IMPORT is not > defined). > > Then, I understand that all the examples are implemented in a unique file. > > But, in my project, I must edit header files and source files in order to > solve other problems (like cycle includes). 
So, I have two different source > files which use numarray : > - the file containing the 'init' function which call the function > 'import_libnumarray()' (which initialize 'libnumarray_API') > - a file containing implementations, more precisely an implementation calling > numarray functionnalities: with is 'static' state, this 'libnumarray_API' is > NULL... > > I tried to compile NumArray with the symbol 'NO_IMPORT' (see libnumarray.h) in > order to have an extern variable. But this symbol doesn't allow to import > numarray in the python environment. > > So, does someone have a solution allowing to use NumArray API with > header/source files ? The good news is that the 1.0 headers, at least, work. I intended to capture this form of multi-compilation-unit module in the numpy_compat example... but didn't. I think there are two "tricks" missing in the example. In *a* module of the several modules you're linking together, do the following:

#define NO_IMPORT 1  /* This prevents the definition of the static version of the API var.
                        The extern won't conflict with the real definition below. */
#include "libnumarray.h"

void **libnumarray_API;  /* This defines the missing API var for *all* your compilation units */

This variable will be assigned the API pointer by the import_libnumarray() call. I fixed the numpy_compat example to demonstrate this in CVS, but it has a Numeric flavor. The same principles apply to libnumarray. Note that for numarray-1.0 you must include/import both the Numeric-compatible and native numarray APIs separately if you use both. Regards, Todd From gazzar at email.com Thu Jul 15 07:37:01 2004 From: gazzar at email.com (Gary Ruben) Date: Thu Jul 15 07:37:01 2004 Subject: [Numpy-discussion] sum() and mean() broken? Message-ID: <20040715143500.2CD321CE306@ws3-6.us4.outblaze.com> Thanks Todd, It's under Win98 as you suspected and the selftest definitely doesn't pass. Are you planning on supporting Win98? If so, I'll revert to numarray 0.9.
Otherwise, I'll just use Numeric for this task and restrict playing with numarray 1.0 to my Win2k laptop. thanks, Gary From jmiller at stsci.edu Thu Jul 15 07:38:00 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 15 07:38:00 2004 Subject: [Numpy-discussion] NA.dot transposing in place In-Reply-To: <40F692CC.3000103@unibas.ch> References: <40F692CC.3000103@unibas.ch> Message-ID: <1089902251.2637.64.camel@halloween.stsci.edu> On Thu, 2004-07-15 at 10:21, Curzio Basso wrote: > Hi all. > > I wonder if anyone noticed the following behaviour (new in 1.0) of the > dot/matrixmultiply functions: > > >>> alpha = NA.arange(10, shape = (10,1)) > > >>> beta = NA.arange(10, shape = (10,1)) > > >>> NA.dot(alpha, alpha) > array([[285]]) > > >>> alpha.shape # here it looks like it's doing the transpose in place > (1, 10) > > >>> NA.dot(beta, alpha) > array([[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], > [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], > [ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18], > [ 0, 3, 6, 9, 12, 15, 18, 21, 24, 27], > [ 0, 4, 8, 12, 16, 20, 24, 28, 32, 36], > [ 0, 5, 10, 15, 20, 25, 30, 35, 40, 45], > [ 0, 6, 12, 18, 24, 30, 36, 42, 48, 54], > [ 0, 7, 14, 21, 28, 35, 42, 49, 56, 63], > [ 0, 8, 16, 24, 32, 40, 48, 56, 64, 72], > [ 0, 9, 18, 27, 36, 45, 54, 63, 72, 81]]) > > >>> alpha.shape, beta.shape # but not the second time > ((1, 10), (10, 1)) > > ------------------------------------------------- > > Can someone explain me what's going on? It's a bug introduced in numarray-1.0. It'll be fixed for 1.1 in a couple weeks. Regards, Todd From jmiller at stsci.edu Thu Jul 15 07:49:14 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 15 07:49:14 2004 Subject: [Numpy-discussion] sum() and mean() broken?
In-Reply-To: <20040715143500.2CD321CE306@ws3-6.us4.outblaze.com> References: <20040715143500.2CD321CE306@ws3-6.us4.outblaze.com> Message-ID: <1089902892.2637.75.camel@halloween.stsci.edu> On Thu, 2004-07-15 at 10:35, Gary Ruben wrote: > Thanks Todd, > It's under Win98 as you suspected and the selftest definitely doesn't pass. > Are you planning on supporting Win98? I'm planning to debug this particular problem because I'm concerned that it's just latent in the newer windows variants. To the degree that Win98 is "free" under the umbrella of win32, it will continue to be supported. An ongoing issue will likely be that Win98 testing doesn't get done on a regular basis... just as problems are reported. Regards, Todd From curzio.basso at unibas.ch Thu Jul 15 07:51:01 2004 From: curzio.basso at unibas.ch (Curzio Basso) Date: Thu Jul 15 07:51:01 2004 Subject: [Numpy-discussion] NA.dot transposing in place In-Reply-To: <1089902251.2637.64.camel@halloween.stsci.edu> References: <40F692CC.3000103@unibas.ch> <1089902251.2637.64.camel@halloween.stsci.edu> Message-ID: <40F6999C.2050101@unibas.ch> Todd Miller wrote: > It's a bug introduced in numarray-1.0. It'll be fixed for 1.1 in a > couple weeks. Ah, ok. Is it related with the bug announced a couple of days ago? From jmiller at stsci.edu Thu Jul 15 08:14:10 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 15 08:14:10 2004 Subject: [Numpy-discussion] NA.dot transposing in place In-Reply-To: <40F6999C.2050101@unibas.ch> References: <40F692CC.3000103@unibas.ch> <1089902251.2637.64.camel@halloween.stsci.edu> <40F6999C.2050101@unibas.ch> Message-ID: <1089904417.2637.147.camel@halloween.stsci.edu> On Thu, 2004-07-15 at 10:50, Curzio Basso wrote: > Todd Miller wrote: > > > It's a bug introduced in numarray-1.0. It'll be fixed for 1.1 in a > > couple weeks. > > Ah, ok. Is it related with the bug announced a couple of days ago? Only peripherally. 
The Numeric compatibility layer problem was discovered as a result of porting a bunch of Numeric functions to numarray... ports done to try to get better small array speed. Similarly, the setup for matrixmultiply was moved into C for numarray-1.0... to try to get better small array speed. numarray-1.0 is disappointingly buggy, but the interest generated by the 1.0 moniker is making the open source model work well so I think 1.1 will be much more solid as a result of strong user feedback. So, thanks for the report. Regards, Todd From cjw at sympatico.ca Thu Jul 15 08:22:07 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Thu Jul 15 08:22:07 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: <200407131106.19557.falted@pytables.org> References: <200407131028.04791.falted@pytables.org> <200407131106.19557.falted@pytables.org> Message-ID: <40F6A106.6020606@sympatico.ca> Francesc Alted wrote: >A Dimarts 13 Juliol 2004 10:28, Francesc Alted va escriure: > > >>A Dilluns 12 Juliol 2004 23:14, Perry Greenfield va escriure: >> >> >>>What I'm wondering about is what a single element of a record array >>>should be. Returning a tuple has an undeniable simplicity to it. >>> >>> >>Yeah, this why I'm strongly biased toward this possibility. >> >> >> >>>On the other hand, we've been using recarrays that allow naming the >>>various columns (which we refer to as "fields"). If one can refer >>>to fields of a recarray, shouldn't one be able to refer to a field >>>(by name) of one of it's elements? Or are you proposing that basic >>>recarrays not have that sort of capability (something added by a >>>subclass)? >>> >>> >>Well, I'm not sure about that. But just in case most of people would like to >>access records by field as well as by index, I would advocate for the >>possibility that the Record instances would behave as similar as possible as >>a tuple (or dictionary?). 
That includes creating appropriate __str__() *and* >>__repr__() methods as well as __getitem__() that supports both name fields >>and indices. I'm not sure about whether providing an __getattr__() method >>would be ok, but for the sake of simplicity and in order to have (preferably) >>only one way to do things, I would say no. >> >> > >I've been thinking that one way to make a single element of a RecArray >return a tuple, while still being able to retrieve a field by name, is to >play with RecArray.__getitem__ and let it support key names in addition to >indices. This would be better seen as an example: > >Right now, one can say: > > > >>>>r=records.array([(1,"asds", 24.),(2,"pwdw", 48.)], "1i4,1a4,1f8") >>>>r._fields["c1"] >>>> >>>> >array([1, 2]) > > >>>>r._fields["c1"][1] >>>> >>>> >2 > >What I propose is to be able to say: > > > >>>>r["c1"] >>>> >>>> >array([1, 2]) > > >>>>r["c1"][1] >>>> I would suggest going a step beyond this, so that one can have r.c1[1], see the script below. I have not explored the assignment of a value to r.c1[1], but it seems to be achievable. If changes along this line are acceptable, it is suggested that fields be renamed cols, or some such, to indicate its wider impact. Colin W. >>>> >>>> >2 > >Which would replace the notation: > > > >>>>r[1]["c1"] >>>> >>>> >2 > >which was recently suggested. > >I.e. the suggestion is to realize RecArrays as a collection of columns, >as well as a collection of rows. > >

# tRecord.py to explore RecArray
import numarray.records as _rec
import sys
#
class Rec1(_rec.RecArray):
    def __new__(cls, buffer, formats, shape=0, names=None, byteoffset=0,
                bytestride=None, byteorder=sys.byteorder, aligned=0):
        # This calls RecArray.__init__ - reason unclear.
        # Why can't the instance be fully created by RecArray.__init__?
        return _rec.RecArray.__new__(cls, buffer, formats=formats, shape=shape,
                                     names=names, byteorder=byteorder, aligned=aligned)

    def __init__(self, buffer, formats, shape=0, names=None, byteoffset=0,
                 bytestride=None, byteorder=sys.byteorder, aligned=0):
        arr = _rec.array(buffer, formats=formats, shape=shape, names=names,
                         byteorder=byteorder, aligned=aligned)
        self.__setstate__(arr.__getstate__())

    def __getattr__(self, name):
        # We reach here if the attribute does not belong to the basic Rec1 set
        return self._fields[name]

    def __getattribute__(self, name):
        return _rec.RecArray.__getattribute__(self, name)

    def __repr__(self):
        return self.__class__.__name__ + _rec.RecArray.__repr__(self)[8:]

    def __setattr__(self, name, value):
        return _rec.RecArray.__setattr__(self, name, value)

    def __str__(self):
        return self.__class__.__name__ + _rec.RecArray.__str__(self)[8:]

if __name__ == '__main__':
    # Francesc Alted 13-Jul-04 05:06
    r = _rec.array([(1,"asds", 24.),(2,"pwdw", 48.)], "1i4,1a4,1f8")
    print r._fields["c1"]
    print r._fields["c1"][1]
    r1 = Rec1([(1,"asds", 24.),(2,"pwdw", 48.)], "1i4,1a4,1f8")
    print r1._fields["c1"]
    print r1._fields["c1"][1]
    # r1.zz= 99 # acceptable
    print r1.c1
    print r1.c1[1]
    try:
        x = r1.ugh
    except:
        print 'ugh not recognized as an attribute'

'''
The above delivers:
[1 2]
2
[1 2]
2
[1 2]
2
ugh not recognized as an attribute
'''

From falted at pytables.org Thu Jul 15 09:12:08 2004 From: falted at pytables.org (Francesc Alted) Date: Thu Jul 15 09:12:08 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: <40F6A106.6020606@sympatico.ca> References: <200407131106.19557.falted@pytables.org> <40F6A106.6020606@sympatico.ca> Message-ID: <200407151811.20359.falted@pytables.org> A Dijous 15 Juliol 2004 17:21, Colin J. Williams va escriure: > >What I propose is to be able to say: > >>>>r["c1"][1] > I would suggest going a step beyond this, so that one can have r.c1[1], > see the script below. Yeah.
I've implemented something similar to access column elements for pytables Table objects. However, the problem in this case is that there are already attributes that "pollute" the column namespace, so that a column named "size" collides with the size() method. I came up with a solution by adding a new "cols" attribute to the Table object that is an instance of a simple class named Cols with no attributes that can pollute the namespace (except some starting with "__" or "_v_"). Then, it is just a matter of providing functionality to access the different columns. In that case, when a reference to a column is made, another object (an instance of the Column class) is returned. This Column object is basically an accessor to column values with __getitem__() and __setitem__() methods. That might sound complicated, but it is not. I'm attaching part of the relevant code below. I personally like that solution in the context of pytables because it extends the "natural naming" convention quite naturally. A similar approach could be applied to RecArray objects as well, although numarray might (and probably does) have other usage conventions. > I have not explored the assignment of a value to r.c1[1], but it seems > to be achievable. In the scheme I've just proposed, the following should be feasible: value = r.cols.c1[1] r.cols.c1[1] = value -- Francesc Alted -----------------------------------------------------------------

class Cols(object):
    """This is a container for columns in a table

    It provides methods to get Column objects that give access to the
    data in the column.

    Like with Group instances and AttributeSet instances, the natural
    naming is used, i.e. you can access the columns on a table as if
    they were normal Cols attributes.

    Instance variables:
        _v_table -- The parent table instance
        _v_colnames -- List with all column names

    Methods:
        __getitem__(colname)
    """

    def __init__(self, table):
        """Create the container to keep the column information.

        table -- The parent table
        """
        self.__dict__["_v_table"] = table
        self.__dict__["_v_colnames"] = table.colnames
        # Put the column in the local dictionary
        for name in table.colnames:
            self.__dict__[name] = Column(table, name)

    def __len__(self):
        return self._v_table.nrows

    def __getitem__(self, name):
        """Get the column named "name" as an item."""
        if not isinstance(name, types.StringType):
            raise TypeError, \
"Only strings are allowed as keys of a Cols instance. You passed object: %s" % name
        # If attribute does not exist, return None
        if not name in self._v_colnames:
            raise AttributeError, \
"Column name '%s' does not exist in table:\n'%s'" % (name, str(self._v_table))
        return self.__dict__[name]

    def __str__(self):
        """The string representation for this object."""
        # The pathname
        pathname = self._v_table._v_pathname
        # Get this class name
        classname = self.__class__.__name__
        # The number of columns
        ncols = len(self._v_colnames)
        return "%s.cols (%s), %s columns" % (pathname, classname, ncols)

    def __repr__(self):
        """A detailed string representation for this object."""
        out = str(self) + "\n"
        for name in self._v_colnames:
            # Get this class name
            classname = getattr(self, name).__class__.__name__
            # The shape for this column
            shape = self._v_table.colshapes[name]
            # The type
            tcol = self._v_table.coltypes[name]
            if shape == 1:
                shape = (1,)
            out += "  %s (%s%s, %s)" % (name, classname, shape, tcol) + "\n"
        return out


class Column(object):
    """This is an accessor for the actual data in a table column

    Instance variables:
        table -- The parent table instance
        name -- The name of the associated column

    Methods:
        __getitem__(key)
    """

    def __init__(self, table, name):
        """Create the container to keep the column information.

        table -- The parent table instance
        name -- The name of the column that is associated with this object
        """
        self.table = table
        self.name = name
        # Check whether an index exists or not
        iname = "_i_"+table.name+"_"+name
        self.index = None
        if iname in table._v_parent._v_indices:
            self.index = Index(where=self, name=iname,
                               expectedrows=table._v_expectedrows)
        else:
            self.index = None

    def __getitem__(self, key):
        """Returns a column element or slice

        It takes different actions depending on the type of the "key"
        parameter: If "key" is an integer, the corresponding element in
        the column is returned as a NumArray/CharArray, or a scalar
        object, depending on its shape. If "key" is a slice, the row
        slice determined by this slice is returned as a NumArray or
        CharArray object (whatever is appropriate).
        """
        if isinstance(key, types.IntType):
            if key < 0:
                # To support negative values
                key += self.table.nrows
            (start, stop, step) = processRange(self.table.nrows, key, key+1, 1)
            return self.table._read(start, stop, step, self.name, None)[0]
        elif isinstance(key, types.SliceType):
            (start, stop, step) = processRange(self.table.nrows, key.start,
                                               key.stop, key.step)
            return self.table._read(start, stop, step, self.name, None)
        else:
            raise TypeError, "'%s' key type is not valid in this context" % \
                  (key)

    def __str__(self):
        """The string representation for this object."""
        # The pathname
        pathname = self.table._v_pathname
        # Get this class name
        classname = self.__class__.__name__
        # The shape for this column
        shape = self.table.colshapes[self.name]
        if shape == 1:
            shape = (1,)
        # The type
        tcol = self.table.coltypes[self.name]
        return "%s.cols.%s (%s%s, %s)" % (pathname, self.name, classname, shape, tcol)

    def __repr__(self):
        """A detailed string representation for this object."""
        return str(self)

From perry at stsci.edu Thu Jul 15 10:39:06 2004 From: perry at stsci.edu (Perry Greenfield) Date: Thu Jul 15 10:39:06 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To:
<200407151811.20359.falted@pytables.org> Message-ID: Francesc Alted wrote: > A Dijous 15 Juliol 2004 17:21, Colin J. Williams va escriure: > > >What I propose is to be able to say: > > >>>>r["c1"][1] > > I would suggest going a step beyond this, so that one can have r.c1[1], > > see the script below. > > Yeah. I've implemented something similar to access column elements for > pytables Table objects. However, the problem in this case is that > there are > already attributes that "pollute" the column namespace, so that a column > named "size" collides with the size() method. > The idea of mapping field names to attributes occurs to everyone quickly, but for the reasons Francesc gives (as well as another I'll mention) we were reluctant to implement it. The other reason is that it would be nice to allow field names that are not legal attributes (e.g., that include spaces or other illegal attribute characters). There are potentially people with data in databases or other similar formats that would like to map field names exactly. Certainly one can still use the attribute approach and not support all field names (or column, or col...), but it does introduce another glitch in the user interface when it works only for a subset of legal names.
> > I personally like that solution in the context of pytables because it > extends the "natural naming" convention quite naturally. A > similar approach > could be applied to RecArray objects as well, although numarray might (and > probably do) have other usage conventions. > > > I have not explored the assignment of a value to r.c1.[1], but it seems > > to be achievable. > > in the schema I've just proposed the next should be feasible: > > value = r.cols.c1[1] > r.cols.c1[1] = value > This solution avoids name collisions but doesn't handle the other problem. This is worth considering, but I thought I'd hear comments about the other issue before deciding (there is also the "more than one way" issue; but this guideline seems to bend quite often to pragmatic concerns). We're still chewing on all the other issues and plan to start floating some proposals, rationales and questions before long. Perry From falted at pytables.org Thu Jul 15 11:21:10 2004 From: falted at pytables.org (Francesc Alted) Date: Thu Jul 15 11:21:10 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: References: Message-ID: <200407152020.00873.falted@pytables.org> A Dijous 15 Juliol 2004 19:37, Perry Greenfield va escriure: > formats that would like to map field name exactly. Well certainly > one can still use the attribute approach and not support all field > names (or column, or col...) it does introduce another glitch in > the user interface when it works only for a subset of legal names. Yep. I forgot that issue. My particular workaround was to provide an optional trMap dictionary at Table (in our case, RecArray) creation time to map those original names that are not valid Python names to valid ones.
That would read something like: >>> r=records.array([(1,"as")], "1i4,1a2", names=["c 1", "c2"], trMap={"c1": "c 1"}) This would indicate that the "c 1" column, which is not a valid Python name (it has a space in the middle), can be accessed using the string "c1", which is a valid Python identifier. That way, r.cols.c1 would access column "c 1". And although I must admit that this solution is not very elegant, it makes it possible to cope with situations where the column names are not valid Python names. -- Francesc Alted From cjw at sympatico.ca Thu Jul 15 17:22:42 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Thu Jul 15 17:22:42 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: References: Message-ID: <40F71F9C.9040008@sympatico.ca> Perry Greenfield wrote: >Francesc Alted wrote: > > >>A Dijous 15 Juliol 2004 17:21, Colin J. Williams va escriure: >> >> >>>>What I propose is to be able to say: >>>> >>>> >>>>>>>r["c1"][1] >>>>>>> >>>>>>> >>>I would suggest going a step beyond this, so that one can have r.c1[1], >>>see the script below. >>> >>> >>Yeah. I've implemented something similar to access column elements for >>pytables Table objects. However, the problem in this case is that >>there are >>already attributes that "pollute" the column namespace, so that a column >>named "size" collides with the size() method. >> >> >> >The idea of mapping field names to attributes occurs to everyone >quickly, but for the reasons Francesc gives (as well as another I'll >mention) we were reluctant to implement it. The other reason is that >it would be nice to allow field names that are not legal attributes >(e.g., that include spaces or other illegal attribute characters). >There are potentially people with data in databases or other similar >formats that would like to map field name exactly. Well certainly >one can still use the attribute approach and not support all field >names (or column, or col...)
it does introduce another glitch in >the user interface when it works only for a subset of legal names. > > It would, I suggest, not be unduly restrictive to bar the existing attribute names but, if that's not acceptable, Francesc has suggested the.col workaround, although I would prefer to avoid the added clutter. Incidentally, there is no current protection against wiping out an existing method: [Dbg]>>> r1.size= 0 [Dbg]>>> r1.size 0 [Dbg]>>> > > >>I came up with a solution by adding a new "cols" attribute to the Table >>object that is an instance of a simple class named Cols with no attributes >>that can pollute the namespace (except some starting by "__" or "_v_"). >>Then, it is just a matter of provide functionality to access the different >>columns. In that case, when a reference of a column is made, >>another object >>(instance of Column class) is returned. This Column object is basically an >>accessor to column values with a __getitem__() and __setitem__() methods. >>That might sound complicated, but it is not. I'm attaching part of the >>relevant code below. >> >>I personally like that solution in the context of pytables because it >>extends the "natural naming" convention quite naturally. A >>similar approach >>could be applied to RecArray objects as well, although numarray might (and >>probably do) have other usage conventions. >> >> >> >>>I have not explored the assignment of a value to r.c1.[1], but it seems >>>to be achievable. >>> >>> >>in the schema I've just proposed the next should be feasible: >> >>value = r.cols.c1[1] >>r.cols.c1[1] = value >> >> >> >This solution avoids name collisions but doesn't handle the other >problem. This is worth considering, but I thought I'd hear comments >about the other issue before deciding it (there is also the >"more than one way" issue as well; but this guideline seems to bend >quite often to pragmatic concerns). 
> To allow for multi-word column names, assignment could replace a space by an underscore and, in retrieval, the reverse could be done - ie. underscore would be banned for a column name. Colin W. > >We're still chewing on all the other issues and plan to start floating >some proposals, rationales and questions before long. > >Perry > > > > From falted at pytables.org Fri Jul 16 02:12:11 2004 From: falted at pytables.org (Francesc Alted) Date: Fri Jul 16 02:12:11 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: <40F71F9C.9040008@sympatico.ca> References: <40F71F9C.9040008@sympatico.ca> Message-ID: <200407161111.41626.falted@pytables.org> A Divendres 16 Juliol 2004 02:21, Colin J. Williams va escriure: > To allow for multi-word column names, assignment could replace a space > by an underscore > and, in retrieval, the reverse could be done - ie. underscore would be > banned for a column name. That's not so easy. What about other chars like '/&%@$()' that cannot be part of python names? Finding a biunivocal map between them and allowed chars would be difficult (if possible at all). Besides, the resulting colnames might become a real mess. Regards, -- Francesc Alted From cjw at sympatico.ca Fri Jul 16 05:41:12 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Fri Jul 16 05:41:12 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: <200407161111.41626.falted@pytables.org> References: <40F71F9C.9040008@sympatico.ca> <200407161111.41626.falted@pytables.org> Message-ID: <40F7CBC6.2030607@sympatico.ca> Francesc Alted wrote: >A Divendres 16 Juliol 2004 02:21, Colin J. Williams va escriure: > > >>To allow for multi-word column names, assignment could replace a space >>by an underscore >>and, in retrieval, the reverse could be done - ie. underscore would be >>banned for a column name. >> >> > >That's not so easy. What about other chars like '/&%@$()' that cannot be >part of python names? 
Finding a biunivocal map between them and allowed >chars would be difficult (if possible at all). Besides, the resulting >colnames might become a real mess. > >Regards, > > Yes, if the objective is to include special characters or facilitate multi-lingual column names (and it probably should be), then my suggestion is quite inadequate. Perhaps there could be a simple name -> column number mapping in place of _names. References to a column, or a field in a record, could then be through this dictionary. Basic access to data in a record would be by position number, rather than name, but the dictionary would facilitate access by name. Data could be referenced either through the column name: r1.c2[1] or through the record r1[1].c2, with the possibility that the index is multi-dimensional in either case. Colin W. From rowen at u.washington.edu Fri Jul 16 10:55:23 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Fri Jul 16 10:55:23 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: <200407161111.41626.falted@pytables.org> References: <40F71F9C.9040008@sympatico.ca> <200407161111.41626.falted@pytables.org> Message-ID: >A Divendres 16 Juliol 2004 02:21, Colin J. Williams va escriure: >> To allow for multi-word column names, assignment could replace a space >> by an underscore >> and, in retrieval, the reverse could be done - ie. underscore would be >> banned for a column name. > >That's not so easy. What about other chars like '/&%@$()' that cannot be >part of python names? Finding a biunivocal map between them and allowed >chars would be difficult (if possible at all). Besides, the resulting >colnames might become a real mess. Personally, I think the idea of allowing access to fields via attributes is fatally flawed.
The problems raised (non-obvious mapping between field names with special characters and allowable attribute names and also the collision with existing instance variable and method names) clearly show it would be forced and non-pythonic. The obvious solution seems to be some combination of the dict interface (an ordered dict that keeps its keys in original field order) and the list interface. My personal leaning is: - Offer most of the dict methods, including __get/setitem__, keys, values and all iterators, but NOT setdefault, popitem or anything else that adds or deletes a field. - Offer the list version of __get/setitem__, as well, but NONE of list's methods. - Make the default iterator iterate over values, not keys (field names), i.e. have the item act like a list, not a dict, when used as an iterator. In other words, the following all work (where item is one element of a numarray.record array):

item[0] = 10 # set value of field 0 to 10
x = item[0:5] # get value of fields 0 through 4
item[:] = list of replacement values
item["afield"] = 10
"%(afield)s" % item

the methods iterkeys, itervalues, iteritems, keys, values, has_key all work; the method update might work, but it's an error to add new fields -- Russell P.S. Folks are welcome to use my ordered dictionary implementation RO.Alg.OrderedDictionary, which is part of the RO package. It is fully standalone (despite its location in my hierarchy) and is used in production code. From barrett at stsci.edu Fri Jul 16 11:49:01 2004 From: barrett at stsci.edu (Paul Barrett) Date: Fri Jul 16 11:49:01 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: References: <40F71F9C.9040008@sympatico.ca> <200407161111.41626.falted@pytables.org> Message-ID: <40F822E0.5010406@stsci.edu> Russell E Owen wrote: >> A Divendres 16 Juliol 2004 02:21, Colin J.
Williams va escriure: >> >>> To allow for multi-word column names, assignment could replace a space >>> by an underscore >>> and, in retrieval, the reverse could be done - ie. underscore would be >>> banned for a column name. >> >> >> That's not so easy. What about other chars like '/&%@$()' that cannot be >> part of python names? Finding a biunivocal map between them and allowed >> chars would be difficult (if possible at all). Besides, the resulting >> colnames might become a real mess. > > > Personally, I think the idea of allowing access to fields via > attributes is fatally flawed. The problems raised (non-obvious mapping > between field names with special characters and allowable attribute > names and also the collision with existing instance variable and > method names) clearly show it would be forced and non-pythonic. +1 It also makes it difficult to do the following:

a = item[:10, ('age', 'surname', 'firstname')]

where field (or column) 1 is 'firstname', field 2 is 'surname', and field 10 is 'age'. -- Paul -- Paul Barrett, PhD Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Branch FAX: 410-338-4767 Baltimore, MD 21218 From jmiller at stsci.edu Fri Jul 16 12:43:02 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jul 16 12:43:02 2004 Subject: [Numpy-discussion] I move your "Bugs" reports... Message-ID: <1090006936.7264.66.camel@halloween.stsci.edu> Not infrequently even very experienced numarray contributors file bug reports in the numpy "Bugs" tracker. Because numpy is a shared SF project with both Numeric and numarray, numarray bugs are actually tracked in the "Numarray Bugs" tracker, here: http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse "Numarray Bugs" can also be found through the "Tracker" link at the top of any numpy SF web page. So, don't worry, your painstaking reports are not getting deleted, they're getting relocated to a place where *only* numarray bugs live.
There's probably a better way to do this, but until I find it or someone tells me about it, I thought I should tell everyone what's going on. Thanks to everybody who takes the time to fill out bug reports to make numarray better... Regards, Todd From hsu at stsci.edu Fri Jul 16 13:19:00 2004 From: hsu at stsci.edu (Jin-chung Hsu) Date: Fri Jul 16 13:19:00 2004 Subject: [Numpy-discussion] multidimensional record arrays Message-ID: <200407162018.ANW09710@donner.stsci.edu> There have been a number of questions and suggestions about how the record array facility in numarray could be improved. We've been talking about these internally and thought it would be useful to air some proposals along with discussions of the rationale behind each proposal, as well as discussions of drawbacks and some remaining open questions. Rather than do this in one long message, we will do this in pieces. The first addresses how to improve handling multidimensional record arrays. These will not discuss how or when we implement the proposed enhancements or changes. We first want to come to some consensus (or lacking that, decision) about what the target should be. ********************************************************* Proposal for records module enhancement, to handle record arrays of dimension (rank) higher than 1. Background: The current records module in numarray doesn't handle record arrays of dimension higher than one well. Even though most of the infrastructure for higher dimensionality is already in place, the current implementation for the record arrays was based on the implicit assumption that record arrays are 1-D. This limitation is reflected in the areas of input user interface, indexing, and output. The indexing and output are more straightforward to modify, so I'll discuss them first. Although it is possible to create a multi-dimensional record array, indexing does not work properly for 2 or more dimensions.
For example, for a 2-D record array r, r[i,j] does not give the correct result (but r[i][j] does). This will be fixed. At present, a user cannot print record arrays higher than 1-D. This will also be fixed, as well as incorporating some numarray features (e.g., printing only the beginning and end of an array for large arrays--as is done for numarrays now). Input Interface: There are currently several different ways to construct the record array using the array() function. These include setting the buffer argument to:

(1) None
(2) File object
(3) String object or appropriate buffer object (i.e., binary data)
(4) a list of records (in the form of sequences), for example: [(1,'abc', 2.3), (2,'xyz', 2.4)]
(5) a list of numarrays/chararrays for each field (e.g., effectively 'zipping' the arrays into records)

The first three types of input are very general and can be used to generate multi-dimensional record arrays in the current implementation. All these options need to specify the "shape" argument. The input options that do not work for multi-dimensional record arrays now are the last two. Option 4 (sequence of 'records'): If a user has a multi-dimensional record array and one or more fields are themselves multidimensional arrays, using this option is potentially confusing, since there can be ambiguity regarding what part of a nested sequence structure is the structure of the record array and what should be considered part of the record, since record elements themselves may be arrays.
(Some of the same issues arise for object arrays.) As an example:

--> r=rec.array([([1,2],[3,4]),([11,12],[13,14])])

could be interpreted as a 1-D record array, where each cell is a (num)array:

RecArray[ (array([1, 2]), array([3, 4])), (array([11, 12]), array([13, 14])) ]

or a 2-D record array, where each cell is just a number:

RecArray( [[(1, 2), (3, 4)], [(11, 12), (13, 14)]])

Thus we propose a new argument "rank" (following the convention used in object arrays) to specify the dimensionality of the output record array. In the first example above rank is 1, and in the second example rank=2. If rank is set to None, the highest possible rank will be assumed (in this example, 2). We propose to eventually generalize that to accept any sequence object for the array structure (though there will be the same requirement that exists for other arrays that the nested sequences be of the same type). As would be expected, strings are not permitted as the enclosing sequence. In this future implementation the record 'item' itself must be either:

1) A tuple
2) A subclass of tuple
3) A Record object (this may be taken care of by 2 if we make Record a subclass of tuple; this will be discussed in a subsequent proposal)

This requirement allows distinguishing the sequence of records from Option 5 below. For tuples (or tuple-derived elements), the items of the tuple must be one of the following: basic data types such as int, float, boolean, or string; a numarray or chararray; or an object that can be converted to a numarray or chararray. Option 5 (List of Arrays): Using a list of arrays to construct an N-D record array should be easier than using the previous option. The input syntax is simply:

[array1, array2, array3,...]

The shape of the record array will be determined from the shape of the input arrays as described below. All the user needs to do is to construct the arrays in the list.
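For the rank=None default, the record shape for a list of arrays is the longest leading shape common to all of the inputs: the shortest input shape, which must be a prefix of every other shape. A sketch of that inference as I read the proposal (infer_record_shape is a hypothetical name, not numarray code):

```python
def infer_record_shape(shapes):
    """Default record-array shape for a list of input arrays: the shortest
    input shape, which must be a leading prefix of every other shape;
    otherwise the 'slowest' axes do not match and we raise."""
    shortest = tuple(min(shapes, key=len))
    for s in shapes:
        if tuple(s[:len(shortest)]) != shortest:
            raise ValueError("slowest axes do not match: %r vs %r"
                             % (tuple(s), shortest))
    return shortest
```

With input shapes (2,3,4,5), (2,3,4) and (2,3) this yields (2, 3); with (3,4,5) and (4,5) it raises, because (4, 5) is not a leading prefix of (3, 4, 5).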
There is, similar to option 4, a possible ambiguity: if all the arrays are of the shape, say, (2,3), then the user may intend a 1-D record array of 2 rows where each cell is an array of shape (3,), or a 2-D record array of shape (2,3) where each cell is a single number or string. Thus, the user must either explicitly specify the "shape" or "rank". We propose the following behavior via examples:

Example 1: given:

array1.shape=(2,3,4,5)
array2.shape=(2,3,4)
array3.shape=(2,3)

Rank can only be specified as rank=1 (the record array's shape will then be (2,)) or rank=2 (the record array's shape will then be (2,3)). For rank=None the record shape will be (2,3), i.e. the "highest common denominator": each cell in the first field will be an array of shape (4,5), each cell in the second field will be an array of shape (4,), and each cell in the 3rd field will be a single number or a string. If "shape" is specified, it will take precedence over "rank" and its allowed value in this example will be either 2 or (2,3).

Example 2:

array1.shape=(3,4,5)
array2.shape=(4,5)

This will raise an exception because the 'slowest' axes do not match.

*********

For both the sequence-of-records and list-of-arrays input options, we propose the default value for "rank" be None (current default is 1). This gives consistent behavior with object arrays but does change the current behavior. Also, for both cases, specifying a shape inconsistent with the supplied data will raise an exception. From cjw at sympatico.ca Fri Jul 16 19:46:09 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Fri Jul 16 19:46:09 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: <40F822E0.5010406@stsci.edu> References: <40F71F9C.9040008@sympatico.ca> <200407161111.41626.falted@pytables.org> <40F822E0.5010406@stsci.edu> Message-ID: <40F892B2.7090706@sympatico.ca> Paul Barrett wrote: > Russell E Owen wrote: >>> A Divendres 16 Juliol 2004 02:21, Colin J.
Williams va escriure: >>> >>>> To allow for multi-word column names, assignment could replace a >>>> space >>>> by an underscore >>>> and, in retrieval, the reverse could be done - ie. underscore >>>> would be >>>> banned for a column name. >>> >>> >>> >>> That's not so easy. What about other chars like '/&%@$()' that >>> cannot be >>> part of python names? Finding a biunivocal map between them and allowed >>> chars would be difficult (if possible at all). Besides, the resulting >>> colnames might become a real mess. >> >> >> >> Personally, I think the idea of allowing access to fields via >> attributes is fatally flawed. The problems raised (non-obvious >> mapping between field names with special characters and allowable >> attribute names and also the collision with existing instance >> variable and method names) clearly show it would be forced and >> non-pythonic. > > > +1 Paul, Below, I've appended my response to Francesc's 08:36 message, it was copied to the list but does not appear in the archive. > > It also make it difficult to do the following: > > a = item[:10, ('age', 'surname', 'firstname')] > > where field (or column) 1 is 'firstname, field 2 is 'surname', and > field 10 is 'age'. > > -- Paul Could you clarify what you have in mind here please? Is this a proposed extension to records.py, as it exists in version 1.0? Colin W. ------------------------------------------------------------------------ Yes, if the objective is to include special characters or facilitate multi-lingual columns names and it probably should be, then my suggestion is quite inadequate. Perhaps there could be a simple name -> column number mapping in place of _names. References to a column, or a field in a record, could then be through this dictionary. Basic access to data in a record would be by position number, rather than name, but the dictionary would facilitate access by name. 
Data could be referenced either through the column name: r1.c2[1] or through the record r1[1].c2, with the possibility that the index is multi-dimensional in either case. Colin W. From gerard.vermeulen at grenoble.cnrs.fr Sun Jul 18 14:25:10 2004 From: gerard.vermeulen at grenoble.cnrs.fr (gerard.vermeulen at grenoble.cnrs.fr) Date: Sun Jul 18 14:25:10 2004 Subject: [Numpy-discussion] Follow-up Numarray header PEP In-Reply-To: <1088632459.7526.213.camel@halloween.stsci.edu> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> Message-ID: <20040718212443.M21561@grenoble.cnrs.fr> Hi Todd, This is a follow-up on the 'header pep' discussion. The attachment numnum-0.1.tar.gz contains the sources for the extension modules pep and numnum. At least on my systems, both modules behave as described in the 'numarray header PEP' when the extension modules implementing the C-API are not present (a situation not foreseen by the macros import_array() of Numeric and especially numarray). IMO, my solution is 'bona fide', but requires further testing. The pep module shows how to handle the colliding C-APIs of the Numeric and numarray extension modules and how to implement automagical conversion between Numeric and numarray arrays. For a technical reason explained in the README, the hard work of doing the conversion between Numeric and numarray arrays has been delegated to the numnum module. The numnum module is useful when one needs to convert from one array type to the other to use an extension module which only exists for the other type (eg. combining numarray's image processing extensions with pygame's Numeric interface): Python 2.3+ (#1, Jan 7 2004, 09:17:35) [GCC 3.3.1 (SuSE Linux)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numnum; import Numeric as np; import numarray as na
>>> np1 = np.array([[1, 2], [3, 4]]); na1 = numnum.toNA(np1)
>>> na2 = na.array([[1, 2, 3], [4, 5, 6]]); np2 = numnum.toNP(na2)
>>> print type(np1); np1; type(np2); np2
array([[1, 2],
       [3, 4]])
array([[1, 2, 3],
       [4, 5, 6]],'i')
>>> print type(na1); na1; type(na2); na2
array([[1, 2],
       [3, 4]])
array([[1, 2, 3],
       [4, 5, 6]])
>>>

The pep module shows how to implement array processing functions which use the Numeric, numarray or Sequence C-API:

static PyObject *
wysiwyg(PyObject *dummy, PyObject *args)
{
    PyObject *seq1, *seq2;
    PyObject *result;

    if (!PyArg_ParseTuple(args, "OO", &seq1, &seq2))
        return NULL;

    switch(API) {
    case NumericAPI:
    {
        PyObject *np1 = NN_API->toNP(seq1);
        PyObject *np2 = NN_API->toNP(seq2);
        result = np_wysiwyg(np1, np2);
        Py_XDECREF(np1);
        Py_XDECREF(np2);
        break;
    }
    case NumarrayAPI:
    {
        PyObject *na1 = NN_API->toNA(seq1);
        PyObject *na2 = NN_API->toNA(seq2);
        result = na_wysiwyg(na1, na2);
        Py_XDECREF(na1);
        Py_XDECREF(na2);
        break;
    }
    case SequenceAPI:
        result = seq_wysiwyg(seq1, seq2);
        break;
    default:
        PyErr_SetString(PyExc_RuntimeError, "Should never happen");
        return 0;
    }

    return result;
}

See the README for an example session using the pep module showing that it is possible to pass a mix of Numeric and numarray arrays to pep.wysiwyg().

Notes:

- it is straightforward to adapt pep and numnum so that the conversion functions are linked into pep instead of imported.

- numnum is still 'proof of concept'. I am thinking about methods to make those techniques safer if the numarray (and Numeric?) header files never make it into the Python headers (or to make it safer to use those techniques with Python < 2.4). In particular it would be helpful if the numerical C-APIs export an API version number, similar to the versioning scheme of shared libraries -- see the libtool->versioning info pages.
I am considering three possibilities to release a more polished version of numnum (3rd party extension writers may prefer to link rather than import numnum's functionality): 1. release it from PyQwt's project page 2. register an independent numnum project at SourceForge 3. hand numnum over to the Numerical Python project (frees me from worrying about API changes). Regards -- Gerard Vermeulen -------------- next part -------------- A non-text attachment was scrubbed... Name: numnum-0.1.tar.gz Type: application/gzip Size: 12851 bytes Desc: not available URL: From jmiller at stsci.edu Tue Jul 20 05:49:04 2004 From: jmiller at stsci.edu (Todd Miller) Date: Tue Jul 20 05:49:04 2004 Subject: [Numpy-discussion] Follow-up Numarray header PEP In-Reply-To: <20040718212443.M21561@grenoble.cnrs.fr> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040718212443.M21561@grenoble.cnrs.fr> Message-ID: <1090327693.3749.257.camel@localhost.localdomain> On Sun, 2004-07-18 at 17:24, gerard.vermeulen at grenoble.cnrs.fr wrote: > Hi Todd, > > This is a follow-up on the 'header pep' discussion. Great! I was afraid you were going to disappear back into the ether. Sorry I didn't respond to this yesterday... I saw it but accidentally marked it as "read" and then forgot about it as the day went on. > The attachment numnum-0.1.tar.gz contains the sources for the > extension modules pep and numnum. At least on my systems, both > modules behave as described in the 'numarray header PEP' when the > extension modules implementing the C-API are not present (a situation > not foreseen by the macros import_array() of Numeric and especially > numarray). For numarray, this was *definitely* foreseen at some point, so I'm wondering what doesn't work now... 
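Gerard's modules degrade gracefully when the C-API extension modules are absent; at the Python level the analogous move is to try each array package in turn and fall back, which is roughly what matplotlib's "numerix" layer does. A rough illustration (pick_backend is my name for a hypothetical helper, not part of any of these packages):

```python
import importlib

def pick_backend(names):
    """Return the first importable module from names, or None if none of
    them is installed (graceful degradation instead of an ImportError at
    import time)."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None

# e.g. pick_backend(["numarray", "Numeric"]) on a machine with neither
# package installed returns None rather than blowing up on import.
```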
> IMO, my solution is 'bona fide', but requires further > testing. I'll look it over today or tomorrow and comment more then. > The pep module shows how to handle the colliding C-APIs of the Numeric > and numarray extension modules and how to implement automagical > conversion between Numeric and numarray arrays. Nice; the conversion code sounds like a good addition to me. > For a technical reason explained in the README, the hard work of doing > the conversion between Numeric and numarray arrays has been delegated > to the numnum module. The numnum module is useful when one needs to > convert from one array type to the other to use an extension module > which only exists for the other type (eg. combining numarray's image > processing extensions with pygame's Numeric interface): > > Python 2.3+ (#1, Jan 7 2004, 09:17:35) > [GCC 3.3.1 (SuSE Linux)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import numnum; import Numeric as np; import numarray as na > >>> np1 = np.array([[1, 2], [3, 4]]); na1 = numnum.toNA(np1) > >>> na2 = na.array([[1, 2, 3], [4, 5, 6]]); np2 = numnum.toNP(na2) > >>> print type(np1); np1; type(np2); np2 > > array([[1, 2], > [3, 4]]) > > array([[1, 2, 3], > [4, 5, 6]],'i') > >>> print type(na1); na1; type(na2); na2 > > array([[1, 2], > [3, 4]]) > > array([[1, 2, 3], > [4, 5, 6]]) > >>> > > The pep module shows how to implement array processing functions which > use the Numeric, numarray or Sequence C-API: > > static PyObject * > wysiwyg(PyObject *dummy, PyObject *args) > { > PyObject *seq1, *seq2; > PyObject *result; > > if (!PyArg_ParseTuple(args, "OO", &seq1, &seq2)) > return NULL; > > switch(API) { We'll definitely need to cover API in the PEP. There is a design choice here which needs to be discussed some and any resulting consensus documented. I haven't looked at the attachment yet. 
> case NumericAPI: > { > PyObject *np1 = NN_API->toNP(seq1); > PyObject *np2 = NN_API->toNP(seq2); > result = np_wysiwyg(np1, np2); > Py_XDECREF(np1); > Py_XDECREF(np2); > break; > } > case NumarrayAPI: > { > PyObject *na1 = NN_API->toNA(seq1); > PyObject *na2 = NN_API->toNA(seq2); > result = na_wysiwyg(na1, na2); > Py_XDECREF(na1); > Py_XDECREF(na2); > break; > } > case SequenceAPI: > result = seq_wysiwyg(seq1, seq2); > break; > default: > PyErr_SetString(PyExc_RuntimeError, "Should never happen"); > return 0; > } > > return result; > } > > See the README for an example session using the pep module showing that > it is possible pass a mix of Numeric and numarray arrays to pep.wysiwyg(). > > Notes: > > - it is straightforward to adapt pep and numnum so that the conversion > functions are linked into pep instead of imported. > > - numnum is still 'proof of concept'. I am thinking about methods to > make those techniques safer if the numarray (and Numeric?) header > files make it never into the Python headers (or make it safer to > use those techniques with Python < 2.4). In particular it would > be helpful if the numerical C-APIs export an API version number, > similar to the versioning scheme of shared libraries -- see the > libtool->versioning info pages. I've thought about this a few times; there's certainly a need for it in numarray anyway... and I'm always one release too late. Thanks for the tip on libtool->versioning. > I am considering three possibilities to release a more polished > version of numnum (3rd party extension writers may prefer to link > rather than import numnum's functionality): > > 1. release it from PyQwt's project page > 2. register an independent numnum project at SourceForge > 3. hand numnum over to the Numerical Python project (frees me from > worrying about API changes). > > Regards -- Gerard Vermeulen (3) sounds best to me, for the same reason that numarray is a part of the numpy project and because numnum is a Numeric/numarray tool. 
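For reference, the libtool versioning rule mentioned above boils down to exporting a pair (current, age): a library supports every interface number from current - age up to current, and a client built against interface N keeps working as long as N falls in that window. A small sketch of what the check could look like for a numerical C-API version number (hypothetical names; neither numarray nor Numeric exports this today):

```python
def api_compatible(provided, required):
    """provided: the (current, age) pair a C-API module would export.
    required: the interface number a client extension was built against.
    Compatible iff current - age <= required <= current (the libtool rule)."""
    current, age = provided
    return current - age <= required <= current
```

Under this rule, bumping current while resetting age to 0 deliberately breaks old clients, while bumping current and age together keeps them working.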
There is a small issue of sub-project organization (seperate bug tracking, etc.), but I figure if SF can handle Python, it can handle Numeric, numarray, and probably a number of other packages as well. Something like numnum should not be a problem and so to promote it, it would be good to keep it where people can find it without having to look too hard. For now, I'm again marking your post as "unread" and will revisit it later this week. In the meantime, thanks very much for your efforts with numnum and the PEP. Regards, Todd From perry at stsci.edu Tue Jul 20 09:05:02 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jul 20 09:05:02 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story In-Reply-To: Message-ID: We now turn to the behavior of Records. We'll note that many of the current proposals had been considered in the past but not implemented with more of a 'wait and see' attitude towards what was really necessary and a desire to prevent too many ways of doing the same thing without seeing that there was a real call for them. This proposal deals with the behavior of record array 'items', i.e., what we call Record objects now. The primary issues that have been raised with regard to Record behavior are summarized as follows: 1) Items should be tuples instead of Records 2) Items should be objects, but present tuple and/or dictionary consistent behavior. 3) Field (or column) names should be accessible as Record (and record array) attributes. Issue 1: Should record array items be tuples instead of Records? Francesc Alted made this suggestion recently. Essentially the argument is that tuples are a natural way of representing records. Unfortunately, tuples do not provide a means of accessing fields of a record by name, but only by number. For this reason alone, tuples don't appear to be adequate. Francesc proposed allowing dictionary-like indexing to record arrays to facilitate the field access to tuple entries by name. 
However, it seems that if rarr is a record array, both rarr['column 1'][2] and rarr[2]['column 1'] should work, not just the former. So the short answer is "No". It should be noted that using tuples will force another change in current behavior. Note that the current Record objects are actually views into the record array. Changing the value within a record object changes the record array. Use of tuples won't allow that, since tuples are not mutable. Whole records must be changed in their entirety if single elements of record arrays were set by and returned from tuples. But his comments (as well as those of others) do point out a number of problems with the current implementation that could be improved, and making the Record object support tuple behaviors is quite reasonable. Hence:

Issue 2: Should record array items present tuple and/or dictionary compatible behaviors?

The short answer is, yes, we do agree that they should. This includes many of the proposals made, including:

1) supporting all Tuple capabilities with the following differences:

a) fields are mutable (unlike tuple items) so long as the assigned value is coerceable to the expected type. For example, the current methods of doing so are:

>>> cell = oneRec.field(1)
>>> oneRec.setfield(1, newValue)

This proposal would allow:

>>> cell = oneRec[1]
>>> oneRec[1] = newValue

b) slice assignments are permitted so long as they don't change the size of the record (i.e., no insertion of extra items) and the items can be assigned as permitted for (a).
E.g., oneCell[2:4] = (3, 'abc')

c) __str__ will result in a display looking like that for tuples, __repr__ will show a Record constructor:

>>> print oneRec # as is currently implemented
(1.1, 2, 'abc', 3)
>>> oneRec
Record((1.1, 2, 'abc', 3), formats=['1Float32', '1Int16', '1a3', '1Int32'], names=['abc', 'c2', 'xyz', 'c4'])

(note that how best to handle formats is still being thought about)

2) supporting all Dictionary capabilities with the following differences:

a) keys and items are ordered.
b) keys are restricted to being integers or strings only
c) new keys cannot be dynamically added or deleted as for dictionaries
d) no support for any other dictionary capabilities that can change the number or names of items
e) __str__ will not show a result looking like a dictionary (see 1c)
f) values must meet the Record object's required type (or be coerceable to it)

For example, the current:

>>> cell = oneRec.field('c2')
>>> oneRec.setfield('c2', newValue)

And the proposed added indexing capability:

>>> cell = oneRec['c2']
>>> oneRec['c2'] = newValue

Issue 3: Field (or column) names should be accessible as Record (and record array) attributes. As much as the attribute approach has appeal for simple usage, the problems of name collisions and mismatches between acceptable field names and attribute names strike us, as they do Russell Owen, as being very problematic. The technique of using a special attribute as Francesc suggests (in his case, cols) that contains the field name attributes solves the name collision problem, but not the legality issue; particularly with regard to illegal characters, it's hard to imagine easily remembered mappings between legal attribute representations and the actual field names. We are inclined to try to pass (for now anyway) on mapping fields to attributes in any way.
It seems to us that indexing by name should be convenient enough, as well as fully flexible to really satisfy all needs (and it is needed in any case, since attributes are a clumsy way to use field access when using a variable to specify the field -- yes, one can use getattr(), but it's clumsy).

*******************************************

Record array behavior changes:

1) It will be possible to assign any sequence to a record array item, so long as the sequence contains the right number of fields and each item of the sequence can be coerced to what the record array expects for the corresponding field of the record (addressing numarray feature request 928473 by Russell Owen). I.e.,

>>> recArr[1] = (2, 3.2, 'xyz', 3)

2) One may assign a record to a record array so long as the record matches the record format of the record array (current behavior).

3) Easier construction and initialization of recarrays with default field values (as requested in numarray bug report 928479).

4) Support for lists of field names and formats as detailed in numarray bug report 928488.

5) Field name indexing for record arrays. It will be possible to index record arrays with a field name, i.e., if the index is a string, then what will be returned is a numarray/chararray for that column. (Note that it won't be possible to index record arrays by field number for obvious reasons.) I.e. currently

>>> col = recArr.field('doc')

Can also be

>>> col = recArr['abc']

But the current

>>> col = recArr.field(1)

Cannot become

>>> col = recArr[1]

On the other hand, it will not be permitted to mix a field index with an array index in the same brackets, e.g., rarr[10, 'column 2'] will not be supported. Allowing indexing to have two different interpretations is a bit worrying. But if record array items may be indexed in this manner, it seems natural to permit the same indexing for the record array.
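The either-or dispatch described here (a string index selects a field, anything else is ordinary array indexing, and mixing the two in one subscript is rejected) amounts to a type check in __getitem__. A toy model with a plain list-of-tuples store (RecArraySketch is hypothetical, not numarray code):

```python
class RecArraySketch:
    """Toy record array: field names plus a list of row tuples."""
    def __init__(self, names, rows):
        self._names = list(names)            # field names, in order
        self._rows = [tuple(r) for r in rows]
    def __getitem__(self, key):
        if isinstance(key, str):             # field name -> whole column
            i = self._names.index(key)
            return [row[i] for row in self._rows]
        if isinstance(key, tuple) and any(isinstance(k, str) for k in key):
            # e.g. rarr[10, 'column 2'] -- mixed indexing is not supported
            raise TypeError("cannot mix field names with array indices")
        return self._rows[key]               # ordinary (row) indexing
```

So r['c1'] returns a column, r[1] returns a row, and r[1, 'c1'] is rejected.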
Mixing the two kinds of indexing in one index seems of limited usefulness in the first place and it makes inheriting the existing indexing machinery for NDArrays more complicated (any efficiency gains in avoiding the intermediate object creation by using two separate index operations will likely be offset by the slowness of handling much more complicated mixed indices). Perhaps someone can argue for why mixing field indices with array indices is important, but for now we will prohibit this mode of indexing. This does point to a possible enhancement for the field indexing, namely being able to provide the equivalent of index arrays (e.g., a list of field names) to generate a new record array with a subset of fields. Are there any other issues that should be addressed for improving record arrays? From rowen at u.washington.edu Tue Jul 20 10:15:05 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Tue Jul 20 10:15:05 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story In-Reply-To: References: Message-ID: At 12:04 PM -0400 2004-07-20, Perry Greenfield wrote: >...(a detailed summary of proposed changes to numarray record arrays) +1 on all of it with one exception noted below. This sounds like a first-rate overhaul and is much appreciated. Will it be possible, when creating a new records array, to specify types of a record array as a list of normal numarray types? Currently one has to specify the types as a "formats" string, which is nonstandard. I'm unhappy about one proposal: >... >Record array behavior changes: >... >5) Field name indexing for record arrays. It will be possible to index >record arrays with a field name, i.e., if the index is a string, then what >will be returned is a numarray/chararray for that column. (Note that it >won't be possible to index record arrays by field number for obvious >reasons). > >I.e. 
Currently
>
>>>> col = recArr.field('abc')
>
>Can also be
>
>>>> col = recArr['abc']
>
>But the current
>
>>>> col = recArr.field(1)
>
>Cannot become
>
>>>> col = recArr[1]

I think recarray[field name] is too easily confused with recarray[index] and is unnecessary. I suggest one of two solutions:

- Do nothing. Make users use field(field name or index)

or

- Allow access to the fields via an indexable entity. Simplest for the user would be to use "field" itself:

  recArr.field[1]
  recArr.field["abc"]

(i.e. field becomes an object that can be called or can be accessed via __getitem__.) This could easily support index arrays (a topic you brought up and one that sounds appealing to me):

  recArr.field[index array]

and it might even be practical to support:

  recArr.field[sequence of field indices and/or names]

e.g.

  recArr.field[(ind 1, field name 2, ind 3...)]

You asked about other issues. One that comes to mind is record arrays of record arrays. Should they be allowed? My gut reaction is yes if it's not too hard. Folks always seem to find a use for generality if it's offered. On the other hand, if it's hard, it's not worth the effort. If they are allowed, users are going to want some efficient way to get to a particular field (i.e. in one call, even if the field is several recArrays deep). That could get messy.

Thanks for a great posting. The improvements to record arrays sound first-rate.

-- Russell

From hsu at stsci.edu Wed Jul 21 11:53:40 2004 From: hsu at stsci.edu (Jin-chung Hsu) Date: Wed Jul 21 11:53:40 2004 Subject: [Numpy-discussion] formats in record array Message-ID: <200407211850.AOO09987@donner.stsci.edu>

> From: Russell E Owen
> Subject: Re: [Numpy-discussion] Proposed record array behavior: the rest
> of the story
>
> Will it be possible, when creating a new records array, to specify
> types of a record array as a list of normal numarray types? Currently
> one has to specify the types as a "formats" string, which is
> nonstandard.
In theory it is easy to do that except you can't specify cell arrays, i.e. how do you specify the equivalent of:

formats=['3Int16', '(4,5)Float32']

with the numarray type instances?

JC Hsu

From rlw at stsci.edu Wed Jul 21 12:23:07 2004 From: rlw at stsci.edu (Rick White) Date: Wed Jul 21 12:23:07 2004 Subject: [Numpy-discussion] formats in record array In-Reply-To: <200407211850.AOO09987@donner.stsci.edu> Message-ID: On Wed, 21 Jul 2004, Jin-chung Hsu wrote:

> > From: Russell E Owen
> > Subject: Re: [Numpy-discussion] Proposed record array behavior: the rest
> > of the story
> >
> > Will it be possible, when creating a new records array, to specify
> > types of a record array as a list of normal numarray types? Currently
> > one has to specify the types as a "formats" string, which is
> > nonstandard.
>
> In theory it is easy to do that except you can't specify cell arrays, i.e.
> how do you specify the equivalent of:
>
> formats=['3Int16', '(4,5)Float32']
>
> with the numarray type instances?
>
> JC Hsu

Well, how about one (or both) of these:

formats = 3*(Int16,), 4*(5*(Float32,),)
formats = (3,Int16), ((4,5), Float32)

From kyeser at earthlink.net Wed Jul 21 18:19:07 2004 From: kyeser at earthlink.net (Hee-Seng Kye) Date: Wed Jul 21 18:19:07 2004 Subject: [Numpy-discussion] Is there a better way to do this? Message-ID: <16A7C641-DB7D-11D8-A37A-000393479EE8@earthlink.net> My question is not directly related to NumPy, but since many people here deal with numbers, I was wondering if I could get some help; it would be even better if there is a NumPy (or Numarray) function that takes care of what I want! I'm trying to write a program that computes six-digit numbers, in which the left digit is always smaller than its following digit (i.e., it's always ascending).
The best I could do was to have many embedded 'for' statements:

c = 1
for p0 in range(0, 7):
    for p1 in range(1, 12):
        for p2 in range(2, 12):
            for p3 in range(3, 12):
                for p4 in range(4, 12):
                    for p5 in range(5, 12):
                        if p0 < p1 < p2 < p3 < p4 < p5:
                            print repr(c).rjust(3), "\t",
                            print "%X %X %X %X %X %X" % (p0, p1, p2, p3, p4, p5)
                            c += 1
print "...Done"

This works, except that it's very slow. I need to get it up to nine-digit numbers, in which case it's significantly slow. I was wondering if there is a more efficient way to do this. I would highly appreciate it if anyone could help. Many thanks. -Kye

From jcollins_boulder at earthlink.net Wed Jul 21 18:49:10 2004 From: jcollins_boulder at earthlink.net (Jeffery D. Collins) Date: Wed Jul 21 18:49:10 2004 Subject: [Numpy-discussion] Is there a better way to do this? In-Reply-To: <16A7C641-DB7D-11D8-A37A-000393479EE8@earthlink.net> References: <16A7C641-DB7D-11D8-A37A-000393479EE8@earthlink.net> Message-ID: <40FF1D11.8090606@earthlink.net> Hee-Seng Kye wrote:

> My question is not directly related to NumPy, but since many people
> here deal with numbers, I was wondering if I could get some help; it
> would be even better if there is a NumPy (or Numarray) function that
> takes care of what I want!
>
> I'm trying to write a program that computes six-digit numbers, in
> which the left digit is always smaller than its following digit (i.e.,
> it's always ascending). The best I could do was to have many embedded
> 'for' statements:
>
> c = 1
> for p0 in range(0, 7):
>     for p1 in range(1, 12):
>         for p2 in range(2, 12):
>             for p3 in range(3, 12):
>                 for p4 in range(4, 12):
>                     for p5 in range(5, 12):
>                         if p0 < p1 < p2 < p3 < p4 < p5:
>                             print repr(c).rjust(3), "\t",
>                             print "%X %X %X %X %X %X" % (p0, p1, p2, p3, p4, p5)
>                             c += 1
> print "...Done"
>
> This works, except that it's very slow. I need to get it up to
> nine-digit numbers, in which case it's significantly slow. I was
> wondering if there is a more efficient way to do this.
>
> I would highly appreciate it if anyone could help.

This appears to give the same results and is significantly faster.

def vers1():
    c = 1
    for p0 in range(0, 7):
        for p1 in range(p0+1, 12):
            for p2 in range(p1+1, 12):
                for p3 in range(p2+1, 12):
                    for p4 in range(p3+1, 12):
                        for p5 in range(p4+1, 12):
                            print repr(c).rjust(3), "\t",
                            print "%X %X %X %X %X %X" % (p0, p1, p2, p3, p4, p5)
                            c += 1
    print "...Done"

> Many thanks.
>
> -Kye

-- Jeff

From rlw at stsci.edu Wed Jul 21 22:03:03 2004 From: rlw at stsci.edu (Rick White) Date: Wed Jul 21 22:03:03 2004 Subject: [Numpy-discussion] Is there a better way to do this? In-Reply-To: <16A7C641-DB7D-11D8-A37A-000393479EE8@earthlink.net> Message-ID: On Wed, 21 Jul 2004, Hee-Seng Kye wrote:

> I'm trying to write a program that computes six-digit numbers, in which
> the left digit is always smaller than its following digit (i.e., it's
> always ascending).

Here's another version that is a little faster still:

def f3():
    c = 1
    for p0 in range(0, 7):
        for p1 in range(p0+1, 8):
            for p2 in range(p1+1, 9):
                for p3 in range(p2+1, 10):
                    for p4 in range(p3+1, 11):
                        for p5 in range(p4+1, 12):
                            print repr(c).rjust(3), "\t",
                            print "%X %X %X %X %X %X" % (p0, p1, p2, p3, p4, p5)
                            c += 1
    print "...Done"

This is plenty fast even for 9-digit numbers. In fact it gets a little faster for larger numbers of digits. This problem is completely equivalent to the problem of finding all combinations of 6 numbers chosen from the digits 0..11. If you sort the digits of each combination in ascending order, you get your numbers. So if you search for something like "Python permutations combinations" you can find other algorithms that work. Here's a recursive version:

def f4(n, digits=range(12)):
    if n == 0:
        return [[]]
    rv = []
    for i in range(len(digits)):
        for cc in f4(n-1, digits[i+1:]):
            rv.append([digits[i]] + cc)
    return rv

That returns a list of all the number sets having n digits. It's slower than the loop version but is more general.
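Rick's observation that the problem is just "combinations of 6 digits chosen from 0..11" means today's readers can lean on the standard library: itertools.combinations (added to Python well after this thread, in 2.6) yields exactly these ascending digit sequences. A sketch:

```python
from itertools import combinations

def ascending_numbers(ndigits, base=12):
    # Each combination of ndigits distinct digits from 0..base-1 comes out
    # in ascending order, so these are exactly the sequences wanted here.
    return list(combinations(range(base), ndigits))

six = ascending_numbers(6)    # 924 sequences: C(12, 6)
nine = ascending_numbers(9)   # 220 sequences: C(12, 9), fewer, so it's fast
```

Because combinations are emitted with their elements already sorted, no filtering step like `if p0 < p1 < ... < p5` is needed at all.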
There are fast C versions of this sort of thing out there, I think. Rick White

From falted at pytables.org Thu Jul 22 02:47:27 2004 From: falted at pytables.org (Francesc Alted) Date: Thu Jul 22 02:47:27 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story In-Reply-To: References: Message-ID: <200407221146.41319.falted@pytables.org> Hi,

I agree that the numarray team's overhaul of RecArray access modes is very good, and I agree with most of it.

On Tuesday 20 July 2004 19:14, Russell E Owen wrote:

> I think recarray[field name] is too easily confused with
> recarray[index] and is unnecessary.

Yeah, maybe you are right.

> I suggest one of two solutions:
> - Do nothing. Make users use field(field name or index)
> or
> - Allow access to the fields via an indexable entity. Simplest for
> the user would be to use "field" itself:
>   recArr.field[1]
>   recArr.field["abc"]
> (i.e. field becomes an object that can be called or can be accessed
> via __getitem__)

I prefer the second one. Although I know that you don't like the __getattr__ method, the field object can be used to host one. The main advantage I see in having such a __getattr__ method is that I'm very used to pressing TAB twice in the Python console with its completion capabilities activated. It would be a very nice way of interactively discovering the fields of a RecArray object. I don't know whether this feature is used a lot or not out there, but for me it is just great. I understand, however, that having to include a map to support non-valid Python names for field names can be quite inconvenient.

Regards,

-- Francesc Alted

From cjw at sympatico.ca Thu Jul 22 05:22:01 2004 From: cjw at sympatico.ca (Colin J.
Williams) Date: Thu Jul 22 05:22:01 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story In-Reply-To: <200407221146.41319.falted@pytables.org> References: <200407221146.41319.falted@pytables.org> Message-ID: <40FFB132.10103@sympatico.ca> Francesc Alted wrote:

>Hi,
>
>I agree that the numarray team's overhaul of RecArray access modes is very good
>and I agree with most of it.
>
>On Tuesday 20 July 2004 19:14, Russell E Owen wrote:
>
>>I think recarray[field name] is too easily confused with
>>recarray[index] and is unnecessary.
>
>Yeah, maybe you are right.
>
>>I suggest one of two solutions:
>>- Do nothing. Make users use field(field name or index)
>>or
>>- Allow access to the fields via an indexable entity. Simplest for
>>the user would be to use "field" itself:
>>  recArr.field[1]
>>  recArr.field["abc"]
>>(i.e. field becomes an object that can be called or can be accessed
>>via __getitem__)
>
>I prefer the second one. Although I know that you don't like the __getattr__
>method, the field object can be used to host one. The main advantage I see in
>having such a __getattr__ method is that I'm very used to pressing TAB twice in
>the Python console with its completion capabilities activated. It would be a
>very nice way of interactively discovering the fields of a RecArray object.
>I don't know whether this feature is used a lot or not out there, but for me
>it is just great. I understand, however, that having to include a map to
>support non-valid Python names for field names can be quite inconvenient.
>
>Regards,

Perry's issue 3: perhaps there is a need to separate the name or identifier of a column in a RecArray, or of a field in a Record, from its label. The labels, for display purposes, would default to the column names. The column names would default, as at present, to the Cn form. I like the use of attributes for the column names; it avoids the problem Russell Owen mentioned above.
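The TAB-completion behaviour Francesc describes, and the name-collision worry, can be sketched with a tiny wrapper class. This is a hypothetical illustration in modern Python, not numarray's implementation; the Record class, its _fields attribute, and the sample field names are all invented:

```python
class Record:
    """Hypothetical sketch of attribute-style field access."""

    def __init__(self, fields):
        self._fields = dict(fields)   # field name -> value

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so real
        # attributes and methods always win over field names -- one way
        # to handle the "danger of conflict" with existing attributes.
        try:
            return self._fields[name]
        except KeyError:
            raise AttributeError(name)

    def __dir__(self):
        # Exposing field names here is what lets TAB completion
        # discover them interactively.
        return sorted(self._fields)

rec = Record({'name': 'Ada', 'age': 36})
```

Fields whose names are not valid Python identifiers (Francesc's caveat) remain reachable only through a mapping interface such as rec._fields['column 2'] in this sketch.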
Suppose we have a simple RecArray with the fields "name" and "age"; it's much simpler to write rec.name or rec.age than rec["name"] or rec["age"]. The problems with the use of attributes, which must be Python names, are (1) they cannot contain accented or special characters (e.g. @, &, *, etc.) and (2) there is a danger of conflict with existing properties or attributes. My guess is that the special characters would be required primarily for display purposes. Thus, the label could meet that need. The danger of conflict could be addressed by raising an exception. There remains a possible problem where identifiers are passed on from some other system, perhaps a database. Thus, the primary identifier of a row in a RecArray would be an integer index and that of a column or field would be a standard Python identifier. Although, at times, it would be useful to be able to index the individual fields (or columns) as part of the usual indexing scheme. Thus rec[2, 3, 4] could identify a record, and rec[2, 3, 4].age or rec[2, 3, 4, 5] could identify the sixth field in that record.

The use of attributes raises the possibility that one could have nested records. For example, suppose one has an address record:

addressRecord
    streetNumber
    streetName
    postalCode
    ...

There could then be a personal record:

personRecord
    ...
    officeAddress
    homeAddress
    ...

One could address a component as rec.homeAddress.postalCode.

Finally, there was mention, earlier in the discussion, of facilitating the indexing of a RecArray. I hope that some way will be found to do this.

Colin W.

From kyeser at earthlink.net Thu Jul 22 13:24:06 2004 From: kyeser at earthlink.net (Hee-Seng Kye) Date: Thu Jul 22 13:24:06 2004 Subject: [Numpy-discussion] Is there a better way to do this? In-Reply-To: References: Message-ID: Thanks a lot everyone for suggestions.
On my slow machine (667 MHz), inefficient programs run even slower, and when I expand the program to calculate 9-digit numbers, there is almost a 2-minute difference! Thanks again. Best, Kye

From sag at hydrosphere.com Thu Jul 22 15:34:11 2004 From: sag at hydrosphere.com (sag at hydrosphere.com) Date: Thu Jul 22 15:34:11 2004 Subject: [Numpy-discussion] Unpickling python 2.2 UserArray objs in python 2.3 Message-ID: <40FFF0A2.26467.FBF2E27@localhost> I have a large bunch of objects that subclass UserArray from Numeric 22. These objects were created and pickled in binary mode in Python 2.2 and stored in a mysql database on Red Hat 8. Using Python 2.2, I can easily retrieve and unpickle the objects. I have just upgraded the system to Fedora Core 2, which supplies Python 2.3.3. After much hassle, I have been able to compile Numeric 1.0 (ver 23) and have tried to unpickle these objects. Now, I get a failure in the loads call. The code is:

import cPickle
obj = cPickle.loads(str(blob))

When this is called, the Python interpreter (via IDLE) goes into a loop in the UserArray __getattr__ function (line 198):

    return getattr(self.array,attr)

>> File "/usr/lib/python2.3/site-packages/Numeric/UserArray.py", line 198, in __getattr__
>>     return getattr(self.array,attr)

No other error is reported, just a stack full of these lines. It seems that at this point, UserArray doesn't know that it has an 'array' attr. This worked just fine in Python 2.2. Has something changed in Python 2.3's cPickle functions, or in how Numeric 23 handles pickle/unpickle, that would make my Python 2.2 blobs unusable in Python 2.3? Is there a solution for this, other than remaking my blobs (not an option - there are literally millions of them), or must I figure out how to access Python 2.2 for this code? So far as I can tell, the string I get back is exactly the same for both versions. Any help you can give me would be appreciated.
Thanks sue giller

From kyeser at earthlink.net Fri Jul 23 07:31:07 2004 From: kyeser at earthlink.net (Hee-Seng Kye) Date: Fri Jul 23 07:31:07 2004 Subject: [Numpy-discussion] A bit long, but would appreciate anyone's help, if time permits! Message-ID: Hi. Like my previous post, my question is not directly related to Numpy, but I couldn't help posting it since many people here deal with numbers. I have a question that requires a bit of explanation. I would highly appreciate it if anyone could read this and offer any suggestions, whenever time permits.

I'm trying to write a program that 1) gives all possible rotations of an ordered list, 2) chooses the ordering that has the smallest difference from first to last element of the rotation, and 3) continues to compare the difference from first to second-to-last element, and so on, if there was a tie in step 2.

The following is the output of a function I wrote. The first 6 lines are all possible rotations of [0,1,3,6,7,10], and this takes care of step 1 mentioned above. The last line provides the differences (mod 12). If the last line were denoted as r, r[0] lists the differences from first to last element of each rotation (p0 through p5), r[1] the differences from first to second-to-last element, and so on.

>>> from normal import normal
>>> normal([0,1,3,6,7,10])
[0, 1, 3, 6, 7, 10]  #p0
[1, 3, 6, 7, 10, 0]  #p1
[3, 6, 7, 10, 0, 1]  #p2
[6, 7, 10, 0, 1, 3]  #p3
[7, 10, 0, 1, 3, 6]  #p4
[10, 0, 1, 3, 6, 7]  #p5

[[10, 11, 10, 9, 11, 9], [7, 9, 9, 7, 8, 8], [6, 6, 7, 6, 6, 5], [3, 5, 4, 4, 5, 3], [1, 2, 3, 1, 3, 2]]  #r

Here is my question. I'm having trouble realizing step 2 (and 3, if necessary). In the above case, the smallest number in r[0] is 9, which is present in both r[0][3] and r[0][5]. This means that p3 and p5 and only p3 and p5 need to be further compared.
r[1][3] is 7, and r[1][5] is 8, so the comparison ends here, and the final result I'm looking for is p3, [6,7,10,0,1,3] (the final 'n' value for 'pn' corresponds to the final 'y' value for 'r[x][y]').

How would I find the smallest values of a list r[0], take only those values (r[0][3] and r[0][5]) for further comparison (r[1][3] and r[1][5]), and finally print a p3?

Thanks again for reading this. If there is anything unclear, please let me know.

Best,
Kye

My code begins here:

#normal.py
def normal(s):
    s.sort()
    r = []
    q = []
    v = []

    for x in range(0, len(s)):
        k = s[x:]+s[0:x]
        r.append(k)

    for y in range(0, len(s)):
        print r[y], '\t'
        d = []
        for yy in range(len(s)-1, 0, -1):
            w = (r[y][yy]-r[y][0])%12
            d.append(w)
        q.append(d)

    for z in range(0, len(s)-1):
        d = []
        for zz in range(0, len(s)):
            w = q[zz][z]
            d.append(w)
        v.append(d)
    print '\n', v

From sag at hydrosphere.com Fri Jul 23 10:09:11 2004 From: sag at hydrosphere.com (sag at hydrosphere.com) Date: Fri Jul 23 10:09:11 2004 Subject: [Numpy-discussion] re: Unpickling python 2.2 userArray objs in python 2.3 Message-ID: <4100F5DD.17007.13BB9C82@localhost> I have further information on my problem of unpickling an object that is based on the Numeric.UserArray class. I can recreate the endless getattr loop with the following code, which is a small subsection of my class:

data = Numeric.ones(31, savespace=1)
ua = UserArray(data)
blob = cPickle.dumps(ua)
obj = cPickle.loads(blob)   <-- fails here

If you pickle the data obj, everything works. This code works in Python 2.2. Is this a bug? Is it fixable?
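Sue's endless-loop symptom is a classic __getattr__ pitfall: during unpickling the instance is created without calling __init__, so when pickle probes the new object for methods like __setstate__, a delegating __getattr__ looks up self.array, which doesn't exist yet, which re-enters __getattr__, forever. A minimal sketch of the usual guard, in modern Python with an invented class name (not the actual Numeric UserArray code):

```python
import pickle

class DelegatingArray:
    """Hypothetical stand-in for a UserArray-like wrapper."""

    def __init__(self, array):
        self.array = array

    def __getattr__(self, attr):
        # During unpickling, self.array may not exist yet; delegating
        # unconditionally would recurse until the stack overflows.
        if attr == 'array':
            raise AttributeError(attr)
        return getattr(self.array, attr)

obj = DelegatingArray([1, 2, 3])
restored = pickle.loads(pickle.dumps(obj))   # round-trips with the guard
```

Without the `if attr == 'array'` guard, the round trip can recurse on Python versions where pickle has to probe the instance itself for __setstate__; with it, the failed lookup raises AttributeError and pickle falls back to its default state handling.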
sue From jmiller at stsci.edu Fri Jul 23 10:30:15 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jul 23 10:30:15 2004 Subject: [Numpy-discussion] Follow-up Numarray header PEP In-Reply-To: <20040718212443.M21561@grenoble.cnrs.fr> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040718212443.M21561@grenoble.cnrs.fr> Message-ID: <1090603727.7138.33.camel@halloween.stsci.edu> Hi Gerard, I finally got to your numnum stuff today... awesome work! You've got lots of good suggestions. Here are some comments: 1. Thanks for catching the early return problem with numarray's import_array(). It's not just bad, it's wrong. It'll be fixed for 1.1. 2. That said, I think expanding the macros in-line in numnum is a mistake. It seems to me that "import_array(); PyErr_Clear();" or something like it ought to be enough... after numarray-1.1 anyway. 3. I think there's a problem in numnum.toNP() because of numarray's array "behavior" issues. A test needs to be done to ensure that the incoming array is not byteswapped or misaligned; if it is, the easy fix is to make a numarray copy of the array before copying it to Numeric. 4. Kudos for the LP64 stuff. numconfig is a thorn in the side of the PEP, so I'll put your techniques into numarray for 1.1. HAS_FLOAT128 is not currently used, so it might be time to ditch it. Anyway, thanks! 5. PyArray_Present() and isArray() are superfluous *now*. I was planning to add them to Numeric. 6. The LGPL may be a problem for us and is probably an issue if we ever try to get numnum into the Python distribution. It would be better to release numnum under the modified BSD license, same as numarray. 7. Your API struct was very clean. Eventually I'll regenerate numarray like that. 8. 
I logged your comments and bug reports on Source Forge and eventually they'll get fixed. A to Z the numnum/pep code is beautiful. Next stop, header PEP update. Regards, Todd On Sun, 2004-07-18 at 17:24, gerard.vermeulen at grenoble.cnrs.fr wrote: > Hi Todd, > > This is a follow-up on the 'header pep' discussion. > > The attachment numnum-0.1.tar.gz contains the sources for the > extension modules pep and numnum. At least on my systems, both > modules behave as described in the 'numarray header PEP' when the > extension modules implementing the C-API are not present (a situation > not foreseen by the macros import_array() of Numeric and especially > numarray). IMO, my solution is 'bona fide', but requires further > testing. > > The pep module shows how to handle the colliding C-APIs of the Numeric > and numarray extension modules and how to implement automagical > conversion between Numeric and numarray arrays. > > For a technical reason explained in the README, the hard work of doing > the conversion between Numeric and numarray arrays has been delegated > to the numnum module. The numnum module is useful when one needs to > convert from one array type to the other to use an extension module > which only exists for the other type (eg. combining numarray's image > processing extensions with pygame's Numeric interface): > > Python 2.3+ (#1, Jan 7 2004, 09:17:35) > [GCC 3.3.1 (SuSE Linux)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. 
> >>> import numnum; import Numeric as np; import numarray as na
> >>> np1 = np.array([[1, 2], [3, 4]]); na1 = numnum.toNA(np1)
> >>> na2 = na.array([[1, 2, 3], [4, 5, 6]]); np2 = numnum.toNP(na2)
> >>> print type(np1); np1; type(np2); np2
>
> array([[1, 2],
>        [3, 4]])
>
> array([[1, 2, 3],
>        [4, 5, 6]],'i')
> >>> print type(na1); na1; type(na2); na2
>
> array([[1, 2],
>        [3, 4]])
>
> array([[1, 2, 3],
>        [4, 5, 6]])
> >>>
>
> The pep module shows how to implement array processing functions which
> use the Numeric, numarray or Sequence C-API:
>
> static PyObject *
> wysiwyg(PyObject *dummy, PyObject *args)
> {
>     PyObject *seq1, *seq2;
>     PyObject *result;
>
>     if (!PyArg_ParseTuple(args, "OO", &seq1, &seq2))
>         return NULL;
>
>     switch(API) {
>     case NumericAPI:
>     {
>         PyObject *np1 = NN_API->toNP(seq1);
>         PyObject *np2 = NN_API->toNP(seq2);
>         result = np_wysiwyg(np1, np2);
>         Py_XDECREF(np1);
>         Py_XDECREF(np2);
>         break;
>     }
>     case NumarrayAPI:
>     {
>         PyObject *na1 = NN_API->toNA(seq1);
>         PyObject *na2 = NN_API->toNA(seq2);
>         result = na_wysiwyg(na1, na2);
>         Py_XDECREF(na1);
>         Py_XDECREF(na2);
>         break;
>     }
>     case SequenceAPI:
>         result = seq_wysiwyg(seq1, seq2);
>         break;
>     default:
>         PyErr_SetString(PyExc_RuntimeError, "Should never happen");
>         return 0;
>     }
>
>     return result;
> }
>
> See the README for an example session using the pep module showing that
> it is possible to pass a mix of Numeric and numarray arrays to pep.wysiwyg().
>
> Notes:
>
> - it is straightforward to adapt pep and numnum so that the conversion
>   functions are linked into pep instead of imported.
>
> - numnum is still 'proof of concept'. I am thinking about methods to
>   make those techniques safer if the numarray (and Numeric?) header
>   files never make it into the Python headers (or make it safer to
>   use those techniques with Python < 2.4).
In particular it would > be helpful if the numerical C-APIs export an API version number, > similar to the versioning scheme of shared libraries -- see the > libtool->versioning info pages. > > I am considering three possibilities to release a more polished > version of numnum (3rd party extension writers may prefer to link > rather than import numnum's functionality): > > 1. release it from PyQwt's project page > 2. register an independent numnum project at SourceForge > 3. hand numnum over to the Numerical Python project (frees me from > worrying about API changes). > > > Regards -- Gerard Vermeulen -- From eric at enthought.com Fri Jul 23 10:56:07 2004 From: eric at enthought.com (eric jones) Date: Fri Jul 23 10:56:07 2004 Subject: [Numpy-discussion] ANN: SciPy04 -- Last day for abstracts and early registration! Message-ID: <4101510B.9050005@enthought.com> Hey Group, Just a reminder that this is the last day to submit abstracts for SciPy04. It is also the last day for early registration. More information is here: http://www.scipy.org/wikis/scipy04 About the Conference and Keynote Speaker --------------------------------------------- The 1st annual *SciPy Conference* will be held this year at Caltech, September 2-3, 2004. As some of you may know, we've experienced great participation in two SciPy "Workshops" (with ~70 attendees in both 2002 and 2003) and this year we're graduating to a "conference." With the prestige of a conference comes the responsibility of a keynote address. This year, Jim Hugunin has answered the call and will be speaking to kickoff the meeting on Thursday September 2nd. Jim is the creator of Numeric Python, Jython, and co-designer of AspectJ. Jim is currently working on IronPython--a fast implementation of Python for .NET and Mono. Presenters ----------- We still have room for a few more standard talks, and there is plenty of room for lightning talks. Because of this, we are extending the abstract deadline until July 23rd. 
Please send your abstract to abstracts at scipy.org. Travis Oliphant is organizing the presentations this year. (Thanks!) Once accepted, papers and/or presentation slides are acceptable and are due by August 20, 2004.

Registration
-------------
Early registration ($100.00) has been extended to July 23rd. Follow the links off of the main conference site: http://www.scipy.org/wikis/scipy04 After July 23rd, registration will be $150.00. Registration includes breakfast and lunch Thursday & Friday and a very nice dinner Thursday night. Please register as soon as possible as it will help us in planning for food, room sizes, etc.

Sprints
--------
As of now, we really haven't had much of a call for coding sprints for the 3 days prior to SciPy 04. Below is the original announcement about sprints. If you would like to suggest a topic and see if others are interested, please send a message to the list. Otherwise, we'll forgo the sprints session this year. We're also planning three days of informal "Coding Sprints" prior to the conference -- August 30 to September 1, 2004. Conference registration is not required to participate in the sprints. Please email the list, however, if you plan to attend. Topics for these sprints will be determined via the mailing lists as well, so please submit any suggestions for topics to the scipy-user list: list signup: http://www.scipy.org/mailinglists/ list address: scipy-user at scipy.org

thanks, eric

From cjw at sympatico.ca Sat Jul 24 07:18:04 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Sat Jul 24 07:18:04 2004 Subject: [Numpy-discussion] A bit long, but would appreciate anyone's help, if time permits! In-Reply-To: References: Message-ID: <41026F91.3090706@sympatico.ca> Hee-Seng Kye wrote:

> Hi. Like my previous post, my question is not directly related to Numpy,

True, but numarray can be of help.

> but I couldn't help posting it since many people here deal with
> numbers. I have a question that requires a bit of explanation.
I > would highly appreciate it if anyone could read this and offer any > suggestions, whenever time permits. > > I'm trying to write a program that 1) gives all possible rotations of > an ordered list, 2) chooses the ordering that has the smallest > difference from first to last element of the rotation, and 3) > continues to compare the difference from first to second-to-last > element, and so on, if there was a tie in step 2. > > The following is the output of a function I wrote. The first 6 lines > are all possible rotations of [0,1,3,6,7,10], and this takes care of > step 1 mentioned above. The last line provides the differences (mod > 12). If the last line were denoted as r, r[0] lists the differences > from first to last element of each rotation (p0 through p5), r[1] the > differences from first to second-to-last element, and so on. > > >>> from normal import normal > >>> normal([0,1,3,6,7,10]) > [0, 1, 3, 6, 7, 10] #p0 > [1, 3, 6, 7, 10, 0] #p1 > [3, 6, 7, 10, 0, 1] #p2 > [6, 7, 10, 0, 1, 3] #p3 > [7, 10, 0, 1, 3, 6] #p4 > [10, 0, 1, 3, 6, 7] #p5 > > [[10, 11, 10, 9, 11, 9], [7, 9, 9, 7, 8, 8], [6, 6, 7, 6, 6, 5], [3, > 5, 4, 4, 5, 3], [1, 2, 3, 1, 3, 2]] #r > > Here is my question. I'm having trouble realizing step 2 (and 3, if > necessary). In the above case, the smallest number in r[0] is 9, > which is present in both r[0][3] and r[0][5]. This means that p3 and > p5 and only p3 and p5 need to be further compared. r[1][3] is 7, and > r[1][5] is 8, so the comparison ends here, and the final result I'm > looking for is p3, [6,7,10,0,1,3] (the final 'n' value for 'pn' > corresponds to the final 'y' value for 'r[x][y]'). > > How would I find the smallest values of a list r[0], take only those > values (r[0][3] and r[0][5]) for further comparison (r[1][3] and > r[1][5]), and finally print a p3? > > Thanks again for reading this. If there is anything unclear, please > let me know. 
>
> Best,
> Kye
>
> My code begins here:
[snip]

The following reproduces your result, but I'm not sure that it does what you want to do.

Best wishes.

Colin W.

# Kye.py
#normal.py
def normal(s):
    s.sort()
    r = []
    q = []
    v = []

    for x in range(0, len(s)):
        k = s[x:]+s[0:x]
        r.append(k)

    for y in range(0, len(s)):
        print r[y], '\t'
        d = []
        for yy in range(len(s)-1, 0, -1):
            w = (r[y][yy]-r[y][0])%12
            d.append(w)
        q.append(d)

    for z in range(0, len(s)-1):
        d = []
        for zz in range(0, len(s)):
            w = q[zz][z]
            d.append(w)
        v.append(d)
    print '\n', v

def findMinima(i, lst):
    global diff
    print 'lst:', lst, 'i:', i
    res = []
    dataRow = diff[i].take(lst)
    fnd = dataRow.argmin()
    val = val0 = dataRow[fnd]
    while val == val0:
        fndRes = lst[fnd]    # This will become the result iff no duplicate found
        res.append(fnd)
        dataRow[fnd] = 100
        fnd = dataRow.argmin()
        val0 = dataRow[fnd]
    if len(res) == 1:
        return fndRes
    else:
        ret = findMinima(i-1, res)
        return ret

def normal1(s):
    import numarray.numarraycore as _num
    import numarray.numerictypes as _nt
    global diff
    s = _num.array(s)
    s.sort()
    rl = len(s)
    r = _num.zeros(shape=(rl, rl), type=_nt.Int)
    for i in range(rl):
        r[i, 0:rl-i] = s[i:]
        if i:
            r[i, rl-i:] = s[0:i]
    subtr = r[0].repeat(5, 1).resize(6, 5)
    subtr.transpose()
    neg = r[1:] < subtr
    diff = r[1:] - subtr + 12 * neg
    return 'The selected rotation is:', r[findMinima(diff._shape[0]-1, range(diff._shape[1]))]

if __name__ == '__main__':
    print normal1([0,1,3,6,7,10])

> #normal.py
> def normal(s):
>     s.sort()
>     r = []
>     q = []
>     v = []
>
>     for x in range(0, len(s)):
>         k = s[x:]+s[0:x]
>         r.append(k)
>
>     for y in range(0, len(s)):
>         print r[y], '\t'
>         d = []
>         for yy in range(len(s)-1, 0, -1):
>             w = (r[y][yy]-r[y][0])%12
>             d.append(w)
>         q.append(d)
>
>     for z in range(0, len(s)-1):
>         d = []
>         for zz in range(0, len(s)):
>             w = q[zz][z]
>             d.append(w)
>         v.append(d)
>     print '\n', v
>
> -------------------------------------------------------
> This SF.Net email is sponsored by BEA Weblogic Workshop
> FREE Java Enterprise J2EE developer tools!
> Get your free copy of BEA WebLogic Workshop 8.1 today. > http://ads.osdn.com/?ad_id=4721&alloc_id=10040&op=click > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From riiuwjjnivge at yahoo.com Sat Jul 24 08:38:04 2004 From: riiuwjjnivge at yahoo.com (riiuwjjnivge at yahoo.com) Date: Sat Jul 24 08:38:04 2004 Subject: [Numpy-discussion] Hot Stock Newsflash, ARMM expecting Mass|ve M0nday Ga1ns R753KT98 Message-ID: <249974lbl4oi11j$1so1q6g39$95a678wba@airmen.yahoo.com> E.fficiency Technologies, Inc.'s New Centrif.ugal Chiller Efficiency and Management Tool Can He.lp S.ave Industry Bi.llions in Energy C.osts ARMM lau.nch n.ew s.ervice (EffHVAC) D.ont miss this g.reat inves.tment issue! ARMM is another ho.t public tr.aded comp.any that is set to so.ar on Monday, July 26th.. BIG PR camp.aign sta.rting on 26th of July for ARMM - S.t0ck will e.xpl0de - Just read the news --------------------- P.rice on Friday: 10Cents In our o.pinion N.ext 3 days p.otential p.rice: 35Cents In our o.pinion N.ext 10 days p.otential p.rice: 45Cents --------------------- G.et on B.oard with ARMM and e.njoy some i.ncredible p.rofits in the n.ext 3-10 days_!_! ALL T.ECHNICAL I.NDICATORS SAY - B.U.Y ARMM @ up to 35cents! Significant short term t.rading p.rofits in ARMM are being p.redicted, great n.ews a.lready issued by the c.ompany and big PR c.ampaign on the way in the n.ext few days. C.OMPANY P.ROFILE --------------> American Resource Management, Inc., through its w.holly-owned s.ubsidiary, E.fficiency T.echnologies, Inc. ("EffTec") is a Tulsa, Oklahoma based c.ompany d.edicated to developing energy efficiency m.onitoring programs for c.ommercial/i.ndustrial HVAC systems principally made up of c.entrifugal chillers and boilers. 
From kyeser at earthlink.net Sun Jul 25 04:25:14 2004 From: kyeser at earthlink.net (Hee-Seng Kye) Date: Sun Jul 25 04:25:14 2004 Subject: [Numpy-discussion] Permutation in Numpy Message-ID: <3DC9B4D2-DE2D-11D8-A7E1-000393479EE8@earthlink.net>

#perm.py
def perm(k):
    # Compute the list of all permutations of k
    if len(k) <= 1:
        return [k]
    r = []
    for i in range(len(k)):
        s = k[:i] + k[i+1:]
        p = perm(s)
        for x in p:
            r.append(k[i:i+1] + x)
    return r

Does anyone know if there is a built-in function in Numpy (or Numarray) that does the above task faster (computes the list of all permutations of a list, k)? Or is there a way to make the above function run faster using Numpy?

I'm asking because I need to create a very large list which contains all permutations of range(12), in which case there would be 12! permutations. I created a file test.py:

#!/usr/bin/env python
from perm import perm
print perm(range(12))

And ran the program:

$ ./test.py >> list.txt

The program ran for about 90 minutes and was still running on my machine (667 MHz PowerPC G4, 512 MB SDRAM) until I quit the process as I was getting nervous (and impatient). I would highly appreciate anyone's suggestions.
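An aside on the question above (my addition, not from the thread): the bottleneck is less the per-permutation work than materializing all 12! = 479,001,600 lists at once, which exhausts memory long before the loop finishes. A generator version of the same recursion yields permutations one at a time in constant memory; later Python versions ship this logic in C as itertools.permutations.

```python
import math

def iter_perm(k):
    # Same recursion as perm(), but yields each permutation lazily
    # instead of accumulating all of them in one list.
    if len(k) <= 1:
        yield list(k)
        return
    for i in range(len(k)):
        rest = k[:i] + k[i+1:]
        for tail in iter_perm(rest):
            yield k[i:i+1] + tail

print(math.factorial(12))                         # 479001600 permutations in total
print(sum(1 for _ in iter_perm(list(range(6)))))  # 720 for a small case
```

Writing each permutation to a file as it is produced (rather than printing one giant list) keeps memory flat, though iterating 4.79e8 items will still take a long time on any hardware.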
Many thanks,
Kye

From gerard.vermeulen at grenoble.cnrs.fr Sun Jul 25 22:49:12 2004 From: gerard.vermeulen at grenoble.cnrs.fr (gerard.vermeulen at grenoble.cnrs.fr) Date: Sun Jul 25 22:49:12 2004 Subject: [Numpy-discussion] Follow-up Numarray header PEP In-Reply-To: <1090603727.7138.33.camel@halloween.stsci.edu> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040718212443.M21561@grenoble.cnrs.fr> <1090603727.7138.33.camel@halloween.stsci.edu> Message-ID: <20040726050416.M83815@grenoble.cnrs.fr>

Hi Todd,

Attached is a new version of numnum (including 'topbot', an alternative implementation of numnum). The README contains some additional comments with respect to numarray and Numeric (new comments are preceded by '+', old comments by '-'). There were still some other bugs in numnum, too.

On 23 Jul 2004 13:28:47 -0400, Todd Miller wrote
> I finally got to your numnum stuff today... awesome work! You've got
> lots of good suggestions. Here are some comments:
>
> 1. Thanks for catching the early return problem with numarray's
> import_array(). It's not just bad, it's wrong. It'll be fixed for 1.1.
>
> 2. That said, I think expanding the macros in-line in numnum is a
> mistake. It seems to me that "import_array(); PyErr_Clear();" or
> something like it ought to be enough... after numarray-1.1 anyway.

Indeed, but I am spoiled by C++ and was falling back on gcc -E for debugging.

> 3. I think there's a problem in numnum.toNP() because of numarray's
> array "behavior" issues. A test needs to be done to ensure that the
> incoming array is not byteswapped or misaligned; if it is, the easy
> fix is to make a numarray copy of the array before copying it to Numeric.

Done, but what would be the best function to do this?
And the documentation could insist a little more on the possibility of ill-behaved arrays (see README).

> 4. Kudos for the LP64 stuff. numconfig is a thorn in the side of the
> PEP, so I'll put your techniques into numarray for 1.1.
> HAS_FLOAT128 is not currently used, so it might be time to ditch
> it. Anyway, thanks!

There is a difference between the PEP header files and internal numarray usage. I find in my CVS working copy:

[packer at slow numarray]$ grep HAS_FLOAT */*
Src/_ndarraymodule.c:#if HAS_FLOAT128

and

[packer at slow numarray]$ grep HAS_UINT64 */*
Src/buffer.ch:    #if HAS_UINT64
Src/buffer.ch:    #if HAS_UINT64
Src/buffer.ch:    #if HAS_UINT64
Src/buffer.ch:    #if HAS_UINT64
Src/buffer.ch:    #if HAS_UINT64
Src/libnumarraymodule.c:    #if HAS_UINT64
Src/libnumarraymodule.c:    #if HAS_UINT64
Src/libnumarraymodule.c:    #if HAS_UINT64
Src/libnumarraymodule.c:    #if HAS_UINT64
Src/libnumarraymodule.c:    #if HAS_UINT64

but that is not true for the header files (more important for the PEP):

[packer at slow Include]$ grep HAS_UINT64 */*
[packer at slow Include]$ grep HAS_FLOAT128 */*
numarray/arraybase.h:#if HAS_FLOAT128

> 5. PyArray_Present() and isArray() are superfluous *now*. I was
> planning to add them to Numeric.
>
> 6. The LGPL may be a problem for us and is probably an issue if we ever
> try to get numnum into the Python distribution. It would be better
> to release numnum under the modified BSD license, same as numarray.

Done, with certain regrets because I believe in (L)GPL. The minutes of the last board meeting of the PSF tipped the scale ( http://www.python.org/psf/records/board/minutes-2004-06-18.html ).

What remains to be done is showing how to add numnum's functionality to a 3rd party extension by linking numnum's object files to the extension instead of importing numnum's C-API (numnum should not become another dependency).

Gerard

> 7. Your API struct was very clean. Eventually I'll regenerate numarray
> like that.
>
> 8.
I logged your comments and bug reports on Source Forge and eventually
> they'll get fixed.
>
> A to Z the numnum/pep code is beautiful. Next stop, header PEP update.
>
> Regards,
> Todd
>
> On Sun, 2004-07-18 at 17:24, gerard.vermeulen at grenoble.cnrs.fr wrote:
> > Hi Todd,
> >
> > This is a follow-up on the 'header pep' discussion.
> >
> > The attachment numnum-0.1.tar.gz contains the sources for the
> > extension modules pep and numnum. At least on my systems, both
> > modules behave as described in the 'numarray header PEP' when the
> > extension modules implementing the C-API are not present (a situation
> > not foreseen by the macros import_array() of Numeric and especially
> > numarray). IMO, my solution is 'bona fide', but requires further
> > testing.
> >
> > The pep module shows how to handle the colliding C-APIs of the Numeric
> > and numarray extension modules and how to implement automagical
> > conversion between Numeric and numarray arrays.
> >
> > For a technical reason explained in the README, the hard work of doing
> > the conversion between Numeric and numarray arrays has been delegated
> > to the numnum module. The numnum module is useful when one needs to
> > convert from one array type to the other to use an extension module
> > which only exists for the other type (eg. combining numarray's image
> > processing extensions with pygame's Numeric interface):
> >
> > Python 2.3+ (#1, Jan 7 2004, 09:17:35)
> > [GCC 3.3.1 (SuSE Linux)] on linux2
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> import numnum; import Numeric as np; import numarray as na
> > >>> np1 = np.array([[1, 2], [3, 4]]); na1 = numnum.toNA(np1)
> > >>> na2 = na.array([[1, 2, 3], [4, 5, 6]]); np2 = numnum.toNP(na2)
> > >>> print type(np1); np1; type(np2); np2
> >
> > array([[1, 2],
> >        [3, 4]])
> >
> > array([[1, 2, 3],
> >        [4, 5, 6]],'i')
> > >>> print type(na1); na1; type(na2); na2
> >
> > array([[1, 2],
> >        [3, 4]])
> >
> > array([[1, 2, 3],
> >        [4, 5, 6]])
> > >>>
> >
> > The pep module shows how to implement array processing functions which
> > use the Numeric, numarray or Sequence C-API:
> >
> > static PyObject *
> > wysiwyg(PyObject *dummy, PyObject *args)
> > {
> >     PyObject *seq1, *seq2;
> >     PyObject *result;
> >
> >     if (!PyArg_ParseTuple(args, "OO", &seq1, &seq2))
> >         return NULL;
> >
> >     switch(API) {
> >     case NumericAPI:
> >     {
> >         PyObject *np1 = NN_API->toNP(seq1);
> >         PyObject *np2 = NN_API->toNP(seq2);
> >         result = np_wysiwyg(np1, np2);
> >         Py_XDECREF(np1);
> >         Py_XDECREF(np2);
> >         break;
> >     }
> >     case NumarrayAPI:
> >     {
> >         PyObject *na1 = NN_API->toNA(seq1);
> >         PyObject *na2 = NN_API->toNA(seq2);
> >         result = na_wysiwyg(na1, na2);
> >         Py_XDECREF(na1);
> >         Py_XDECREF(na2);
> >         break;
> >     }
> >     case SequenceAPI:
> >         result = seq_wysiwyg(seq1, seq2);
> >         break;
> >     default:
> >         PyErr_SetString(PyExc_RuntimeError, "Should never happen");
> >         return 0;
> >     }
> >
> >     return result;
> > }
> >
> > See the README for an example session using the pep module showing that
> > it is possible to pass a mix of Numeric and numarray arrays to pep.wysiwyg().
> >
> > Notes:
> >
> > - it is straightforward to adapt pep and numnum so that the conversion
> >   functions are linked into pep instead of imported.
> >
> > - numnum is still 'proof of concept'. I am thinking about methods to
> >   make those techniques safer if the numarray (and Numeric?) header
> >   files never make it into the Python headers (or make it safer to
> >   use those techniques with Python < 2.4).
In particular it would > > be helpful if the numerical C-APIs export an API version number, > > similar to the versioning scheme of shared libraries -- see the > > libtool->versioning info pages. > > > > I am considering three possibilities to release a more polished > > version of numnum (3rd party extension writers may prefer to link > > rather than import numnum's functionality): > > > > 1. release it from PyQwt's project page > > 2. register an independent numnum project at SourceForge > > 3. hand numnum over to the Numerical Python project (frees me from > > worrying about API changes). > > > > > > Regards -- Gerard Vermeulen > > -- -- Open WebMail Project (http://openwebmail.org) -------------- next part -------------- A non-text attachment was scrubbed... Name: numnum-0.2.tar.gz Type: application/gzip Size: 19729 bytes Desc: not available URL: From perry at stsci.edu Mon Jul 26 08:44:06 2004 From: perry at stsci.edu (Perry Greenfield) Date: Mon Jul 26 08:44:06 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <40FFB132.10103@sympatico.ca> Message-ID: I'll try to see if I can address all the comments raised (please let me know if I missed something).

1) Russell Owen asked that indexing by field name not be permitted for record arrays and at least one other agreed. Since it is easier to add something like this later rather than take it away, I'll go along with that. So while it will be possible to index a Record by field name, it won't be for record arrays.

2) Russell asked if it would be possible to specify the types of the fields using numarray/chararray type objects. Yes, it will. We will adopt Rick White's 2nd suggestion for handling fields that themselves are arrays, i.e.,

formats = (3,Int16), ((4,5), Float32)

for a 1-d Int16 cell of shape (3,) and a 2-d Float32 cell of shape (4,5). The first suggestion ("formats = 3*(Int16,), 4*(5*(Float32,),)") will not be supported.
While it is very suggestive, it does allow for inconsistent nestings that must be checked and rejected (what if someone supplies (Int16, Int16, Float32) as one of the fields?), which complicates the code. It doesn't read as well.

3) Russell also suggested nesting record arrays. This sort of capability is not being ruled out, but there isn't a chance we can devote resources to this any time soon (can anyone else?)

4) To address the suggestions of Russell and Francesc, I'm proposing that the current "field" method now become an object (callable to retain backward compatibility) that supports:
a) indexing by name or number (just like Records)
b) name to attribute mapping (with restrictions).
So this means 3 ways to do things! As far as attribute access goes, I simply do not want to throw arbitrary attributes into the main object itself. The use of field is comparatively clean since it has no other public attributes. Aside from mapping '_' into spaces, no other illegal attribute characters will be mapped. (The identifier/label suggestion by Colin Williams has some merit, but on the whole, I think it brings more baggage than benefit.) The mapping algorithm is such that it tries to map the attribute to any field name that has either a ' ' or '_' in the place of '_' in the attribute name. While all '_' in the name will take precedence over any other match, there will be no guaranteed order for other cases (e.g., 'x_y z' vs 'x y_z' vs 'x y z'; though 'x_y_z' would be guaranteed to be selected for field.x_y_z if present). Note that the only real need to support indexing, other than consistency, is to support slices. Only slices for numerical indexing will be supported (and not initially). The callable syntax can support index arrays just as easily.
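[Editor's sketch] The mapping rule described above can be written down in a few lines of present-day Python. This is a toy reconstruction, not numarray code: the function name and the brute-force candidate search are invented here; the only behavior taken from the proposal is that the all-underscore spelling wins and other matches carry no guaranteed order.

```python
import itertools

def resolve_field(attr, field_names):
    # Each '_' in the attribute may stand for either '_' or ' ' in the
    # field name; an exact (all-underscore) match takes precedence.
    if attr in field_names:
        return attr
    positions = [i for i, c in enumerate(attr) if c == '_']
    for combo in itertools.product('_ ', repeat=len(positions)):
        candidate = list(attr)
        for pos, ch in zip(positions, combo):
            candidate[pos] = ch
        candidate = ''.join(candidate)
        if candidate in field_names:
            return candidate
    raise AttributeError("no field matches %r" % attr)

print(resolve_field('home_address', ['home address', 'id']))  # -> home address
```

For example, resolve_field('x_y_z', ['x_y_z', 'x y z']) returns 'x_y_z' because the literal spelling takes precedence; which of 'x_y z' and 'x y_z' would win otherwise is deliberately left unspecified by the proposal.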
To summarize

Rarr.field.home_address
Rarr.field['home address']
Rarr.field('home address')

Will all work for a field named "home address"

************************************************ Any comments on these changes to the proposal? Are there those that are opposed to supporting attribute access? Thanks, Perry From rowen at u.washington.edu Mon Jul 26 09:40:06 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Mon Jul 26 09:40:06 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: References: Message-ID: At 11:43 AM -0400 2004-07-26, Perry Greenfield wrote: >I'll try to see if I can address all the comments raised (please let me know >if I missed something). >...(nice proposal elided)... >Any comments on these changes to the proposal? Are there those that are >opposed to supporting attribute access? Overall this sounds great. However, I am still strongly against attribute access. Attributes are usually meant for names that are intrinsic to the design of an object, not to the user's "configuration" of the object. The name mapping proposal isn't bad (thank you for keeping it simple!), but it still feels like a kludge and it adds unnecessary clutter. Your explanation of these limitations was clear, but still, imagine putting that into the manual. It's a lot of "be careful of this" info. That's a red flag to me. Imagine all the folks who don't read carefully. Also imagine those who consider attribute access "the right way to do it" and so want to clean up the limitations. I think you'll see a steady stream of: "why can't I see my field..." "why can't you solve the collision problems" "why can't I use special character thus and so" I personally feel that when a feature is hard to document or adds strange limitations then it probably suggests a flawed design. In this case there is another mechanism that is more natural, has no funny corner cases, and is much more powerful.
Its only disadvantage is the need to type 4 extra characters. Saving 4 characters is simply not sufficient reason to add this dubious feature. Before implementing attribute access I have two suggestions (which can be taken singly or together):
- Postpone the decision until after the rest of the proposal is implemented. See if folks are happy with the mechanisms that are available. I freely confess to hoping that momentum will then kill the idea.
- Discuss it on comp.lang.py. I'd like to see it aired more widely before being adopted.
So far I've seen just a few voices for it and a few others against it. I realize it's not a democracy -- those who write the code get the final say. I also realize some folks will always want it, but that tension between simplicity and expressiveness is intrinsic to any language. If you add everything anybody wants you get a mess, and I want to avoid this mess while we still can. I hope nobody takes offense. I certainly did not mean to imply that those who wish attribute access are inferior in any way. There are features of python I wish it had that will never occur. I honestly can see the appeal of attributes; I was in favor of them myself, early on. It adds an appealing expressiveness that makes some kinds of code read more naturally. But I personally feel it has too many limitations and is unnecessary. Regards, -- Russell From falted at pytables.org Mon Jul 26 11:12:18 2004 From: falted at pytables.org (Francesc Alted) Date: Mon Jul 26 11:12:18 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: References: Message-ID: <200407262011.33067.falted@pytables.org> Hi, Perry, your last proposal sounds good to me. Just a couple of comments.
A Dilluns 26 Juliol 2004 17:43, Perry Greenfield va escriure: > 4) To address the suggestions of Russell and Francesc, I'm proposing that > the current "field" method now become an object (callable to retain backward > compatibility) that supports: > a) indexing by name or number (just like Records) > b) name to attribute mapping (with restrictions). > So that this means 3 ways to do things! As far as attribute access goes, I > simply do not want to throw arbitrary attributes into the main object > itself. The use of field is comparatively clean since it has no other > public attributes. Aside from mapping '_' into spaces, no other illegal > attribute characters will be mapped. (The identifier/label suggestion by > Colin Williams has some merit, but on the whole, I think it brings more > baggage than benefit). The mapping algorithm is such that it tries to map > the attribute to any field name that has either a ' ' or '_' in the place of > '_' in the attribute name. While all '_' in the name will take precedence > over any other match, there will be no guaranteed order for other cases > (e.g., 'x_y z' vs 'x y_z' vs 'x y z'; though 'x_y_z' would be guaranteed to > be selected for field.x_y_z if present) I guess that this mapping algorithm is weak enough to create some problems with special chars that are not supported. I'd prefer the dictionary/tuple-of-pairs mechanism in order to create a user-configured translation. I don't see the problem that Perry mentioned in an earlier message related to guaranteeing the persistence of such an object: we always have pickle, don't we? Or am I missing something? > To summarize > > Rarr.field.home_address > Rarr.field['home address'] > Rarr.field('home address') Supporting Rarr.field['home address'] and Rarr.field('home address') at the same time sounds unnecessary to me. Moreover having a Rarr.field('home_address')[32] (for example) looks a bit strange, and I think Rarr.field['home_address'][32] would be better.
But I repeat, this is my personal feeling. I know that dropping support of __call__() in field will make the change backward incompatible, but perhaps now is a good time to define a better interface to the RecArray object. Another possibility may be to raise a deprecation warning for such a use for a couple of releases. Regards, -- Francesc Alted From barrett at stsci.edu Mon Jul 26 11:25:09 2004 From: barrett at stsci.edu (Paul Barrett) Date: Mon Jul 26 11:25:09 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: References: Message-ID: <41054B5E.8010801@stsci.edu> Russell E Owen wrote: > At 11:43 AM -0400 2004-07-26, Perry Greenfield wrote: > >> I'll try to see if I can address all the comments raised (please let >> me know >> if I missed something). >> ...(nice proposal elided)... >> Any comments on these changes to the proposal? Are there those that are >> opposed to supporting attribute access? > > > Overall this sounds great. > > However, I am still strongly against attribute access. > > Attributes are usually meant for names that are intrinsic to the design > of an object, not to the user's "configuration" of the object. The name > mapping proposal isn't bad (thank you for keeping it simple!), but it > still feels like a kludge and it adds unnecessary clutter. > > Your explanation of these limitations was clear, but still, imagine > putting that into the manual. It's a lot of "be careful of this" info. > That's a red flag to me. Imagine all the folks who don't read carefully. > Also imagine those who consider attribute access "the right way to do > it" and so want to clean up the limitations. I think you'll see a steady > stream of: > "why can't I see my field..." > "why can't you solve the collision problems" > "why can't I use special character thus and so" > > I personally feel that when a feature is hard to document or adds > strange limitations then it probably suggests a flawed design.
> > In this case there is another mechanism that is more natural, has no > funny corner cases, and is much more powerful. Its only disadvantage is > the need to type 4 extra characters. Saving 4 characters is simply > not sufficient reason to add this dubious feature. > > Before implementing attribute access I have two suggestions (which can > be taken singly or together): > - Postpone the decision until after the rest of the proposal is > implemented. See if folks are happy with the mechanisms that are > available. I freely confess to hoping that momentum will then kill the > idea. > - Discuss it on comp.lang.py. I'd like to see it aired more widely > before being adopted. So far I've seen just a few voices for it and a > few others against it. I realize it's not a democracy -- those who write > the code get the final say. I also realize some folks will always want > it, but that tension between simplicity and expressiveness is intrinsic > to any language. If you add everything anybody wants you get a mess, and > I want to avoid this mess while we still can. > > I hope nobody takes offense. I certainly did not mean to imply that > those who wish attribute access are inferior in any way. There are > features of python I wish it had that will never occur. I honestly can > see the appeal of attributes; I was in favor of them myself, early on. > It adds an appealing expressiveness that makes some kinds of code read > more naturally. But I personally feel it has too many limitations and is > unnecessary. That pretty much sums up my opinion.
:) -- Paul -- Paul Barrett, PhD Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Branch FAX: 410-338-4767 Baltimore, MD 21218 From falted at pytables.org Mon Jul 26 11:29:19 2004 From: falted at pytables.org (Francesc Alted) Date: Mon Jul 26 11:29:19 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: References: Message-ID: <200407262028.41129.falted@pytables.org> A Dilluns 26 Juliol 2004 18:38, Russell E Owen va escriure: > In this case there is another mechanism that is more natural, has no Well, I guess that depends on what you understand as "natural". For example, for me the "natural" way is adding attributes. However, I must recognize that my point of view could be biased because this can be far more advantageous in the context of large hierarchies of objects where you must specify the complete path to go somewhere. This is typical of software for treating XML documents or any kind of hierarchical data organization system. For a relatively plain structure like RecArray I can understand that this can be regarded as unnecessary. But nevertheless, its adoption continues to sound appealing to me. Anyway, I'd be happy with any decision (regarding field attribute adoption) that is made. > I hope nobody takes offense. I certainly did not mean to imply that Not at all. Discussing is a good (the best?) way to learn more :) -- Francesc Alted From rowen at u.washington.edu Mon Jul 26 11:30:01 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Mon Jul 26 11:30:01 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <200407262011.33067.falted@pytables.org> References: <200407262011.33067.falted@pytables.org> Message-ID: At 8:11 PM +0200 2004-07-26, Francesc Alted wrote: >... >Supporting Rarr.field['home address'] and Rarr.field('home address') at the >same time sounds unnecessary to me.
Moreover having a >Rarr.field('home_address')[32] (for example) looks a bit strange, and I >think Rarr.field['home_address'][32] would be better. But I repeat, this is >my personal feeling. > >I know that dropping support of __call__() in field will make the change >backward incompatible, but perhaps now is a good time to define a better >interface to the RecArray object. Another possibility may be to raise a >deprecation warning for such a use for a couple of releases. I completely agree. -- Russell From rlw at stsci.edu Mon Jul 26 11:45:11 2004 From: rlw at stsci.edu (Rick White) Date: Mon Jul 26 11:45:11 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: Message-ID: On Mon, 26 Jul 2004, Russell E Owen wrote: > Overall this sounds great. > > However, I am still strongly against attribute access. > > [...] > > In this case there is another mechanism that is more natural, has no > funny corner cases, and is much more powerful. Its only disadvantage > is the need to type 4 extra characters. Saving 4 characters > is simply not sufficient reason to add this dubious feature. I am sympathetic with Russell's point of view on this, but I do think there is more to gain than just typing 4 additional characters. When you read code that is using the dictionary version of attributes, you also are required to read and mentally parse those 4 additional characters. There is value to having clean, easily readable code that goes well beyond saving a little extra typing. If we didn't care about that, we'd probably all be using Perl. :-) Also, I like to use tab-completion during my interactive use of Python. I know how to make that work with attributes, even dynamically created attributes like those for record arrays. And it is really nice to be able to hit tab and have it fill in a name or give a list of all the available columns.
Doing that with the string/dictionary approach could be possible, I guess, but it is a lot trickier. So I do think there are some good reasons for wanting attribute access. Whether they are strong enough to counter Russell's sensible arguments about not cluttering up the interface and documentation, I'm not sure. My personal preference would be to get rid of the mapping between blanks and underscore and to do no mapping of any kind. Then if a column has a name that maps to a legal Python variable, you can access it with an attribute, and if it doesn't then you can't. That doesn't sound particularly hard to understand or explain to me. Rick From hsu at stsci.edu Mon Jul 26 13:40:04 2004 From: hsu at stsci.edu (Jin-chung Hsu) Date: Mon Jul 26 13:40:04 2004 Subject: [Numpy-discussion] plot dense and large arrays, AGG limit? Message-ID: <200407262039.APA12769@donner.stsci.edu> One would expect the following to fill up the plot window:

>>> n=zeros(20000)
>>> n[::2]=1
>>> plot(n)

The plot "stops" a little more than half way, as if it "runs out of ink". It happens on Linux as well as Solaris, using either numarray or Numeric, and both TkAgg and GTKAgg, but not GTK. Is this due to some AGG limitation? JC Hsu From cjw at sympatico.ca Mon Jul 26 14:42:01 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Mon Jul 26 14:42:01 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: References: Message-ID: <41057A71.40707@sympatico.ca> Russell E Owen wrote: > At 11:43 AM -0400 2004-07-26, Perry Greenfield wrote: > >> I'll try to see if I can address all the comments raised (please let >> me know >> if I missed something). >> ...(nice proposal elided)... >> Any comments on these changes to the proposal? Are there those that are >> opposed to supporting attribute access? > > > Overall this sounds great. > > However, I am still strongly against attribute access.
> > Attributes are usually meant for names that are intrinsic to the > design of an object, not to the user's "configuration" of the object. Russell, I hope that you will elaborate this distinction between design and usage. On the face of it, I would have thought that the two should be closely related. > The name mapping proposal isn't bad (thank you for keeping it > simple!), but it still feels like a kludge and it adds unnecessary > clutter. > > Your explanation of these limitations was clear, but still, imagine > putting that into the manual. It's a lot of "be careful of this" info. > That's a red flag to me. Imagine all the folks who don't read > carefully. Also imagine those who consider attribute access "the right > way to do it" and so want to clean up the limitations. I think you'll > see a steady stream of: > "why can't I see my field..." > "why can't you solve the collision problems" > "why can't I use special character thus and so" > > I personally feel that when a feature is hard to document or adds > strange limitations then it probably suggests a flawed design. > > In this case there is another mechanism that is more natural, has no > funny corner cases, and is much more powerful. Its only disadvantage > is the need to type 4 extra characters. Saving 4 characters > is simply not sufficient reason to add this dubious feature. > > Before implementing attribute access I have two suggestions (which can > be taken singly or together): > - Postpone the decision until after the rest of the proposal is > implemented. See if folks are happy with the mechanisms that are > available. I freely confess to hoping that momentum will then kill the > idea. > - Discuss it on comp.lang.py. I'd like to see it aired more widely > before being adopted. So far I've seen just a few voices for it and a > few others against it. I realize it's not a democracy -- those who > write the code get the final say.
I also realize some folks will > always want it, but that tension between simplicity and expressiveness > is intrinsic to any language. If you add everything anybody wants you > get a mess, and I want to avoid this mess while we still can. There is merit to this suggestion. It would expose the proposal to other experiences. > > > I hope nobody takes offense. I certainly did not mean to imply that > those who wish attribute access are inferior in any way. There are > features of python I wish it had that will never occur. I honestly can > see the appeal of attributes; I was in favor of them myself, early on. > It adds an appealing expressiveness that makes some kind of code read > more naturally. But I personally feel it has too many limitations and > is unnecessary. > > Regards, > > -- Russell Perry Greenfield summarized:

Rarr.field.home_address
Rarr.field['home address']
Rarr.field('home address')

Will all work for a field named "home address"

This is good; it gives the desired functionality. One minor suggestion. We have Rarr.field.home_address; I believe that, in an earlier posting, someone suggested that field.home_address really identifies a column rather than a field. Suppose that home_address is field number 6 in the record. Would Rarr.field[6] be equivalent to the above? This may appear redundant, but it gives a method for selecting a group of columns, e.g. Rarr.field[6:9]. Finally, would Rarr.field.home_address.city or Rarr.field.work_address.city be legitimate? As Russell Owen pointed out, at the end of the day Perry Greenfield will use his judgement as to the best arrangement and we will all live with it. Colin W, > > ------------------------------------------------------- > This SF.Net email is sponsored by BEA Weblogic Workshop > FREE Java Enterprise J2EE developer tools! > Get your free copy of BEA WebLogic Workshop 8.1 today.
> http://ads.osdn.com/?ad_id=4721&alloc_id=10040&op=click > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From Fernando.Perez at colorado.edu Mon Jul 26 18:19:10 2004 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Mon Jul 26 18:19:10 2004 Subject: [Numpy-discussion] ANN: IPython 0.6.1 is officially out Message-ID: <4105AD66.6030002@colorado.edu> [Please forgive the cross-post, but since I know many scipy/numpy users are also ipython users, and this is a fairly significant update, I decided it was worth doing it.] Hi all, I've just officially uploaded IPython 0.6.1. Many thanks to all who contributed comments, bug reports, ideas and patches. I'd like in particular to thank Ville Vainio, who helped a lot with many of the features for pysh, and was willing to put code in front of his ideas. As always, a big Thank You goes to Enthought and the Scipy crowd for hosting ipython and all its attending support services (bug tracker, mailing lists, website and downloads, etc). The download location, as usual, is: http://ipython.scipy.org/dist A detailed NEWS file can be found here: http://ipython.scipy.org/NEWS, so I won't repeat it. I will only mention the highlights of this release compared to 0.6.0:

* BACKWARDS-INCOMPATIBLE CHANGE: Users will need to update their ipythonrc files and replace '%n' with '\D' in their prompt_in2 settings everywhere. Sorry, but there's otherwise no clean way to get all prompts to properly align. The ipythonrc shipped with IPython has been updated.

* 'pysh' profile, which allows you to use ipython as a system shell. This includes mechanisms for easily capturing shell output into python strings and lists, and for expanding python variables back to the shell. It is started, like all profiles, with 'ipython -p pysh'.
The following is a brief example of the possibilities:

planck[~/test]|3> $$a=ls *.py
planck[~/test]|4> type(a)
              <4>
planck[~/test]|5> for f in a:
               |.>     if f.startswith('e'):
               |.>         wc -l $f
               |.>
113 error.py
9 err.py
2 exit2.py
10 exit.py

You can get the necessary profile into your ~/.ipython directory by running 'ipython -upgrade', or by copying it from the IPython/UserConfig directory (ipythonrc-pysh). Note that running -upgrade will rename your existing config files to prevent clobbering them with new ones. This feature had been long requested by many users, and it's at last officially part of ipython.

* Improved the @alias mechanism. It is now based on a fast, lightweight dictionary implementation, which was a requirement for making the pysh functionality possible. A new pair of magics, @rehash and @rehashx, allow you to load ALL of your $PATH into ipython as aliases at runtime.

* New plot2 function added to the Gnuplot support module, to plot dictionaries and lists/tuples of arrays. Also added automatic EPS generation to hardcopy().

* History is now profile-specific.

* New @bookmark magic to keep a list of directory bookmarks for quick navigation.

* New mechanism for profile-specific persistent data storage. Currently only the new @bookmark system uses it, but it can be extended to hold arbitrary picklable data in the future.

* New @system_verbose magic to view all system calls made by ipython.

* For Windows users: all this functionality now works under Windows, but some external libraries are required. Details here: http://ipython.scipy.org/doc/manual/node2.html#sub:Under-Windows

* Fix bugs with '_' conflicting with the gettext library.

* Many, many other bugfixes and minor enhancements. See the NEWS file linked above for the full details.

Enjoy, and please report any problems. Best, Fernando Perez. From cjw at sympatico.ca Tue Jul 27 11:22:27 2004 From: cjw at sympatico.ca (Colin J.
Williams) Date: Tue Jul 27 11:22:27 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: References: <41057A71.40707@sympatico.ca> Message-ID: <41069D3A.5090903@sympatico.ca> Russell E Owen wrote: > At 5:41 PM -0400 2004-07-26, Colin J. Williams wrote: > >> Russell E Owen wrote: >> >>> At 11:43 AM -0400 2004-07-26, Perry Greenfield wrote: >>> >>>> I'll try to see if I can address all the comments raised (please >>>> let me know >>>> if I missed something). >>>> ...(nice proposal elided)... >>>> Any comments on these changes to the proposal? Are there those >>>> that are >>>> opposed to supporting attribute access? >>> >>> >>> >>> Overall this sounds great. >>> >>> However, I am still strongly against attribute access. >>> >>> Attributes are usually meant for names that are intrinsic to the >>> design of an object, not to the user's "configuration" of the object. >> >> >> Russell, I hope that you will elaborate this distinction between >> design and usage. On the face of it, I would have thought that the >> two should be closely related. > > > To my mind, the design of an object describes the intended behavior of > the object: what kind of data can it deal with and what should it do > to that data. It tends to be "static" in the sense that it is not a > function of how the object is created or what data is contained in the > object. The design of the object usually drives the choice of the > attributes of the object (variables and methods). > > On the other hand, the user's "configuration" of the object is what > the user has done to make a particular instance of an object unique -- > the data the user has loaded into the object. > > I consider the particular named fields of a record array to fall into > the latter category. But it is a gray area. Somebody else might argue > that the record array constructor is an object factory, turning out > an object designed by the user.
> From that alternative perspective, > adding attributes to represent field names is perhaps more natural as > a design. > > I think the main issues are: > - Are there too many ways to address things? (I say yes) This could be true. I guess the test is whether there is a rational justification for each way. > > - Field name mapping: there is no trivial 1:1 mapping between valid > field names and valid attribute names. If one starts with the assumption that field/attribute names are compatible with Python names, then I don't see that this is a problem. The question has been raised as to whether a wider range of names should be permitted, e.g., including such characters as ~`()!???. My view is that such characters should be considered acceptable for data labels, but not for data names; i.e., they are for display, not for manipulation. > > - Nested access. Not sure about this one, but I'd like to hear more. A RecArray is made up of a number of records, each of the same length and data configuration. Each field of a record is of fixed length and type. It wouldn't be a big leap to permit another record in one of the fields. Suppose we have an address record aRec and a personnel record pRec and that rArr is an array of pRec.

aRec
    street: a30
    city: a20
    postalCode: a7

pRec
    id: i4
    firstName: a15
    lastName: a20
    homeAddress: aRec
    workAddress: aRec

Then rArr[16].homeAddress.city could give us the home city for person 16 in rArr. > > > If we do end up with attributes for field names, I really like Rick > White's suggestion of adding an attribute for a field only if the > field name is already a valid attribute name. That neatly avoids the > collision issue and is simple to document. > > -- Russell Best wishes, Colin W.
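[Editor's sketch] The nested access discussed above (rArr[16].homeAddress.city), combined with Rick White's identifier-only rule, can be mocked up with plain classes. This is a toy model only: the Record class, its field() method, and the sample data are all invented here, not numarray API.

```python
class Record:
    """Fields live in a dict; valid-identifier names double as attributes."""
    def __init__(self, fields):
        self._fields = dict(fields)

    def field(self, name):
        # Works for any field name, including ones containing spaces.
        return self._fields[name]

    def __getattr__(self, name):
        # Attribute access can only ever be attempted with a legal Python
        # name, so no name-mapping scheme is needed (Rick White's rule);
        # fields like 'home phone' must go through field().
        try:
            return self._fields[name]
        except KeyError:
            raise AttributeError(name)

home = Record({'street': '12 Main St', 'city': 'Baltimore', 'postalCode': '21218'})
person = Record({'id': 16, 'homeAddress': home, 'home phone': '555-0100'})

print(person.homeAddress.city)      # nested attribute access -> Baltimore
print(person.field('home phone'))   # non-identifier field name -> 555-0100
```

The design point it illustrates: with attributes restricted to names that are already legal identifiers, there are no collisions to resolve and nothing extra to document, while field() remains the general escape hatch.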
> > From falted at pytables.org Tue Jul 27 11:48:00 2004 From: falted at pytables.org (Francesc Alted) Date: Tue Jul 27 11:48:00 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <41069D3A.5090903@sympatico.ca> References: <41069D3A.5090903@sympatico.ca> Message-ID: <200407272046.52761.falted@pytables.org> A Dimarts 27 Juliol 2004 20:21, Colin J. Williams va escriure: > If one starts with the assumption that field/attribute names are > compatible with Python names, then I don't see that this is a problem. > The question has been raised as to whether a wider range of names should > be permitted e.g.. including such characters as ~`()!???. My view is > that such characters should be considered acceptable for data labels, > but not for data names. i.e. they are for display, not for manipulation. I finally was able to see your point. You mean that naming a field with a non-python identifier would be forbidden, and provide another attribute (like 'title', for example) in case the user wants to add some kind of data label. Kind of: records.array([...], names=["c1","c2","c3"], titles=["F one","time&dime","??"]) and have a new attribute called "titles" that keeps this info. Well, I think that would be a very nice solution IMO. -- Francesc Alted From gerard.vermeulen at grenoble.cnrs.fr Tue Jul 27 13:05:06 2004 From: gerard.vermeulen at grenoble.cnrs.fr (gerard.vermeulen at grenoble.cnrs.fr) Date: Tue Jul 27 13:05:06 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <200407272046.52761.falted@pytables.org> References: <41069D3A.5090903@sympatico.ca> <200407272046.52761.falted@pytables.org> Message-ID: <20040727191434.M48392@grenoble.cnrs.fr> On Tue, 27 Jul 2004 20:46:52 +0200, Francesc Alted wrote > A Dimarts 27 Juliol 2004 20:21, Colin J. 
Williams va escriure: > > If one starts with the assumption that field/attribute names are > > compatible with Python names, then I don't see that this is a problem. > > The question has been raised as to whether a wider range of names should > > be permitted e.g.. including such characters as ~`()!???. My view is > > that such characters should be considered acceptable for data labels, > > but not for data names. i.e. they are for display, not for manipulation. > > I finally was able to see your point. You mean that naming a field > with a non-python identifier would be forbidden, and provide another > attribute > (like 'title', for example) in case the user wants to add some kind > of data label. Kind of: > > records.array([...], names=["c1","c2","c3"], titles=["F one", > "time&dime","??"]) > > and have a new attribute called "titles" that keeps this info. > > Well, I think that would be a very nice solution IMO. > I agree with Rick, Colin and Francesc on this point: symbolic names are important and I like the commandline completion too. However, I have another concern: Introducing recordArray["column"] as an alternative for recordArray.field("column") breaks a symmetry between for instance 1-d record arrays and 2-d normal arrays. (the symmetry is strongly suggested by their representation: a record array prints almost as a list of tuples and a 2-d normal array almost as a list of lists). Indexing a column of a 2-d normal array is done by normalArray[:, column], so why not recArray[:, "column"] ? It removes the ambiguity between indexing with integers and with strings. Also, leaving the indices in 'natural' order becomes especially important when one envisages (record) arrays containing (record) arrays containing .... 
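[Editor's note: Gerard's recArray[:, "column"] idea boils down to dispatching on the index tuple. A toy sketch, with a list of dicts standing in for a record array — illustrative only, not a numarray patch:]

```python
class ToyRecArray:
    """Toy record array: index as t[row] for a record, or t[rows, "field"] for a column."""
    def __init__(self, records):
        self._recs = list(records)

    def __getitem__(self, key):
        if isinstance(key, tuple):           # t[rows, "field"] form
            rows, field = key
            if isinstance(rows, slice):      # whole-column (or sliced-column) access
                return [r[field] for r in self._recs[rows]]
            return self._recs[rows][field]   # single cell
        return self._recs[key]               # plain t[row]

t = ToyRecArray([{"c1": 1, "c2": "a"}, {"c1": 2, "c2": "b"}])
print(t[:, "c1"])    # column access, like recArray[:, "column"]
print(t[1, "c2"])    # single cell, like recArray[32, "column"]
```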
I understand that this seems to open the door to recArray[32, "column"], but if it is really not feasible to mix integers and strings (or attribute names) as indices, I prefer to use recordArray.column[32] and/or recordArray[32].column rather than recordArray["column"][32]. Even indexing with integers only seems more natural to me than e.g. recordArray["column"][32], since I can always do: column = 7 recordArray[32, column] Regards -- Gerard From rowen at u.washington.edu Tue Jul 27 13:44:02 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Tue Jul 27 13:44:02 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <41057A71.40707@sympatico.ca> References: <41057A71.40707@sympatico.ca> Message-ID: At 5:41 PM -0400 2004-07-26, Colin J. Williams wrote: >Russell E Owen wrote: > >> At 11:43 AM -0400 2004-07-26, Perry Greenfield wrote: >> >>> I'll try to see if I can address all the comments raised (please >>>let me know >>> if I missed something). >>> ...(nice proposal elided)... >>> Any comments on these changes to the proposal? Are there those that are >>> opposed to supporting attribute access? >> >> >> Overall this sounds great. >> >> However, I am still strongly against attribute access. >> >> Attributes are usually meant for names that are intrinsic to the >>design of an object, not to the user's "configuration" of the >>object. > >Russell, I hope that you will elaborate this distinction between >design and usage. On the face of it, I would have though that the >two should be closely related. To my mind, the design of an object describes the intended behavior of the object: what kind of data can it deal with and what should it do to that data. It tends to be "static" in the sense that it is not a function of how the object is created or what data is contained in the object. The design of the object usually drives the choice of the attributes of the object (variables and methods).
On the other hand, the user's "configuration" of the object is what the user has done to make a particular instance of an object unique -- the data the user has loaded into the object. I consider the particular named fields of a record array to fall into the latter category. But it is a gray area. Somebody else might argue that the record array constructor is an object factory, turning out an object designed by the user. From that alternative perspective, adding attributes to represent field names is perhaps more natural as a design. I think the main issues are: - Are there too many ways to address things? (I say yes) - Field name mapping: there is no trivial 1:1 mapping between valid field names and valid attribute names. - Nested access. Not sure about this one, but I'd like to hear more. If we do end up with attributes for field names, I really like Rick White's suggestion of adding an attribute for a field only if the field name is already a valid attribute name. That neatly avoids the collision issue and is simple to document. -- Russell From falted at pytables.org Wed Jul 28 03:01:23 2004 From: falted at pytables.org (Francesc Alted) Date: Wed Jul 28 03:01:23 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <20040727191434.M48392@grenoble.cnrs.fr> References: <200407272046.52761.falted@pytables.org> <20040727191434.M48392@grenoble.cnrs.fr> Message-ID: <200407281200.41748.falted@pytables.org> A Dimarts 27 Juliol 2004 22:04, gerard.vermeulen at grenoble.cnrs.fr va escriure: > Introducing recordArray["column"] as an alternative for > recordArray.field("column") breaks a symmetry between for instance 1-d > record arrays and 2-d normal arrays. (the symmetry is strongly suggested > by their representation: a record array prints almost as a list of tuples > and a 2-d normal array almost as a list of lists).
> > Indexing a column of a 2-d normal array is done by normalArray[:, column], > so why not recArray[:, "column"] ? Well, I must recognize that this has its beauty (by revealing the symmetry that you mentioned). However, mixing integers and strings in indices can be, in my opinion, rather confusing for most people. Besides, I guess that the implementation wouldn't be easy. > I prefer to use > > recordArray.column[32] > > and/or > > recordArray[32].column > > rather than recordArray["column"][32]. I would prefer: recordArray.fields.column[32] or recordArray.cols.column[32] (note the use of the plural in fields and cols, which I think is more consistent with its functionality) The problem with: recordArray[32].fields.column is that I don't see it as natural and besides, completion capabilities would be broken after the [] parenthesis. Anyway, as Russell suggested, I don't like recordArray["column"][32], because it would be unnecessary (you can get the same result using recordArray[column_idx][32]). Although I recognize that a recordArray.cols["column"][32] would not hurt my eyes so much. This is because although indices continue to mix ints and strings, the difference is that ".cols" is placed first, giving a new (and unmistakable) meaning to the "column" index.
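[Editor's note: Francesc's recordArray.cols.column[32] spelling amounts to a small accessor object hung off the array. A hypothetical sketch of how such a `cols` attribute could work — illustrative only:]

```python
class Cols:
    """Expose each field both as cols["name"] and, via __getattr__, as cols.name."""
    def __init__(self, data):
        self._data = data                    # field name -> column values

    def __getitem__(self, name):             # cols["column"]
        return self._data[name]

    def __getattr__(self, name):             # cols.column (tab-completable)
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name)

class ToyRecArray:
    def __init__(self, data):
        self.cols = Cols(data)

r = ToyRecArray({"column": list(range(100))})
print(r.cols.column[32])      # attribute style
print(r.cols["column"][32])   # item style, same underlying data
```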
Cheers, -- Francesc Alted From gerard.vermeulen at grenoble.cnrs.fr Wed Jul 28 07:00:11 2004 From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen) Date: Wed Jul 28 07:00:11 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <200407281200.41748.falted@pytables.org> References: <200407272046.52761.falted@pytables.org> <20040727191434.M48392@grenoble.cnrs.fr> <200407281200.41748.falted@pytables.org> Message-ID: <20040728155908.28cc135e.gerard.vermeulen@grenoble.cnrs.fr> On Wed, 28 Jul 2004 12:00:40 +0200 Francesc Alted wrote: > A Dimarts 27 Juliol 2004 22:04, gerard.vermeulen at grenoble.cnrs.fr va escriure: > > Introducing recordArray["column"] as an alternative for > > recordArray.field("column") breaks a symmetry between for instance 1-d > > record arrays and 2-d normal arrays. (the symmetry is strongly suggested > > by their representation: a record array prints almost as a list of tuples > > and a 2-d normal array almost as a list of lists). > > > > Indexing a column of a 2-d normal array is done by normalArray[:, column], > > so why not recArray[:, "column"] ? > > Well, I must recognize that this has its beauty (by revealing the simmetry > that you mentioned). However, mixing integer and strings on indices can > be, in my opinion, rather confusing for most people. Then, I guess that > the implementation wouldn't be easy. > > > I prefer to use > > > > recordArray.column[32] > > > > and/or > > > > recordArray[32].column > > > > rather than recordArray["column"][32]. > > I would prefer better: > > recordArray.fields.column[32] > > or > > recordArray.cols.column[32] > > (note the use of the plural in fields and cols, which I think is more > consistent about its functionality) > > The problem with: > > recordArray[32].fields.column > > is that I don't see it as natural and besides, completion capabilities > would be broken after the [] parenthesis. > Two points: 1. 
This is true for vanilla Python but not for IPython-0.6.2: packer at zombie:~> ipython Python 2.3+ (#1, Jan 7 2004, 09:17:35) Type "copyright", "credits" or "license" for more information. IPython 0.6.2 -- An enhanced Interactive Python. ? -> Introduction to IPython's features. @magic -> Information about IPython's 'magic' @ functions. help -> Python's own help system. object? -> Details about 'object'. ?object also works, ?? prints more. In [1]: d = {'Francesc': 0} In [2]: d['Francesc'].__a d['Francesc'].__abs__ d['Francesc'].__add__ d['Francesc'].__and__ In [2]: d['Francesc'].__a You see, the completion mechanism of ipython recognizes d['Francesc'] as an integer. 2. If one accepts that a "field_name" can be used as an attribute, one must be able to say: record.field_name ( == record.field("field_name") ) and (since recordArray[32] returns a record) also: recordArray[32].field_name and not recordArray[32].cols.field_name (sorry, I abhor this) > > Anyway, as Russell suggested, I don't like recordArray["column"][32], > because it would be unnecessary (you can get same result using > recordArray[column_idx][32]). > Thank you for this little slip, you mean recordArray["column"][32] is recordArray[32][column_idx], isn't it? > > Although I recognize that a recordArray.cols["column"][32] would not hurt > my eyes so much. This is because although indices continues to mix ints > and strings, the difference is that ".cols" is placed first, giving a new > (and unmistakable) meaning to the "column" index. > I am just worried that future generalization of indexing will be impossible if the meaning of an indexing operation ("get row" or "get column or field") depends on whether an index is a string or an integer: IMO the meaning should depend on the position in the index list. The example has been chosen to show that I don't mind indexing by strings at all.
If I see array[13, 'ab', 31, 'ba'], I know that 'ab' and 'ba' index record fields as long as the indices are in 'normal' order. Nevertheless, I am aware that Utopia may be hard to implement efficiently, but this reflects my mental picture of nested (record) arrays. (ipython in Utopia would allow me to figure out array[13].ab[31].ba by tab completion and I would translate this to array[13, 'ab', 31, 'ba'] for efficiency in a real program) I think that we agree that recordArray.cols["column"] is better than recordArray["column"], but I don't see why recordArray.cols["column"] is better than the original recordArray.field("column"). Cheers -- Gerard PS: after reading the above, there may be a case to accept only indexing which can be read from left to right, so recordArray[32].field_name is OK, but recordArray.field_name[32] is not. From falted at pytables.org Wed Jul 28 11:16:12 2004 From: falted at pytables.org (Francesc Alted) Date: Wed Jul 28 11:16:12 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <20040728155908.28cc135e.gerard.vermeulen@grenoble.cnrs.fr> References: <200407281200.41748.falted@pytables.org> <20040728155908.28cc135e.gerard.vermeulen@grenoble.cnrs.fr> Message-ID: <200407282015.48875.falted@pytables.org> A Dimecres 28 Juliol 2004 15:59, Gerard Vermeulen va escriure: > Two points: > > 1. This is true for vanilla Python but not for IPython-0.6.2: > You see, the completion mechanism of ipython recognizes d['Francesc'] as an > integer. Ok. That's nice. IPython is more powerful than I realized :) > 2.
If one accepts that a "field_name" can be used as an attribute, > one must be able to say: > > record.field_name ( == record.field("field_name") ) > > and (since recordArray[32] returns a record) also: > > recordArray[32].field_name > > and not > > recordArray[32].cols.field_name (sorry, I abhor this) Mmm, maybe you are suggesting that the records.Record class have all its methods start with a reserved prefix (like "_" or better, "_v_" for attrs and "_f_" for methods), and forbidding field names that start with these prefixes, so that no collision problems with field names would occur? Well, in such a case adopting this convention for records.Record objects would be far more feasible than doing the same for records.RecArray objects just because the former has very few attrs and methods. I think it's a good idea overall. > > Anyway, as Russell suggested, I don't like recordArray["column"][32], > > because it would be unnecessary (you can get same result using > > recordArray[column_idx][32]). > > > > Thank you for this little slip, you mean recordArray["column"][32] is > recordArray[32][column_idx], isn't it? Uh, my bad. I was (badly) trying to express the same as Russell Owen in a message dated 20th July: """ I think recarray[field name] is too easily confused with recarray[index] and is unnecessary. """ > I think that we agree that recordArray.cols["column"] is better than > recordArray["column"], but I don't see why recordArray.cols["column"] is > better than the original recordArray.field("column"). Good question. Me neither. Are you proposing just keeping recordArray.cols.column as the only way to access columns? > PS: after reading the above, there may be a case to accept only indexing > which can be read from left to right, so > recordArray[32].field_name is OK, but recordArray.field_name[32] is not. Sorry, I don't see the point here (it is most probably my fault given the hours I'm writing this :(. Could you elaborate on that?
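[Editor's note: Gerard's left-to-right reading of mixed indices (his array[13, 'ab', 31, 'ba'] example, quoted above) can be sketched with a small resolver over nested toy data, where integers pick rows and strings pick fields. Illustrative only, not a real implementation:]

```python
def resolve(obj, indices):
    """Walk a mixed index tuple left to right: ints select rows, strings select fields."""
    for idx in indices:
        obj = obj[idx]          # lists handle int indices, dicts handle field names
    return obj

# nested toy data: 20 records whose 'ab' field holds 40 sub-records with a 'ba' field
data = [{"ab": [{"ba": (r, s)} for s in range(40)]} for r in range(20)]

print(resolve(data, (13, "ab", 31, "ba")))   # row 13, field 'ab', row 31, field 'ba'
```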
Cheers, -- Francesc Alted From perry at stsci.edu Wed Jul 28 15:02:04 2004 From: perry at stsci.edu (Perry Greenfield) Date: Wed Jul 28 15:02:04 2004 Subject: FW: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: Message-ID: I guess I've seen enough discussion to try to refine the last delta into what is the last (or next to last) version: So here are the changes to the last updated proposal: 1) I originally intended to narrow attribute access to strictly legal names as Rick White suggested but something got into me to try to handle spaces. I agree with Rick on this. I see that as a very simple rule to remember and don't see it as confusing to allow this. 2) Attribute access still won't be permitted directly on record arrays or records. I'm very much in agreement with Francesc that "fields" is more suggestive than "field" as the name of the record and record array attribute that permits both indexing and attribute access by name. The use of the field method will remain, but will eventually be deprecated. As to other names, namely cols, I'll stick with fields since it started with that usage, and because field is a more appropriate term when dealing with multidimensional record arrays (columns is much more suggestive of simple tables). Non-changes: 3) It will not be possible to index record arrays by column name. So Rarr["column 1"] will not be permitted, but Rarr.fields["column 1"] will. Nor will Rarr[32, "column 1"] be permitted. 4) As for optional labels (for display purposes) I'd like to hold off. I would like to have only one way to associate a name with a field and until it is clearer what extra record array functionality would be associated with labels, I'd rather not include them. Even then, I'm not sure I want to see too much more dragged in (e.g., units, display formats, etc.) These sorts of things may be more appropriate for a subclass.
I realize that no single person will be happy with these choices, but they seem to me to be the best compromise without unduly complicating things, restricting future enhancements, and being too hard to implement. Has anything fallen through the cracks? So what follows is an updated version of what I last sent out: ****************************************************************** 1) Russell Owen asked that indexing by field name not be permitted for record arrays and at least one other agreed. Since it is easier to add something like this later rather than take it away, I'll go along with that. So while it will be possible to index a Record by field name, it won't be for record arrays. 2) Russell asked if it would be possible to specify the types of the fields using numarray/chararray type objects. Yes, it will. We will adopt Rick White's 2nd suggestion for handling fields that themselves are arrays, I.e., formats = (3,Int16), ((4,5), Float32) For a 1-d Int16 cell of shape (3,) and a 2-d Float32 cell of shape (4,5) The first suggestion ("formats = 3*(Int16,), 4*(5*(Float32,),)") will not be supported. While it is very suggestive, it does allow for inconsistent nestings that must be checked and rejected (what if someone supplies (Int16, Int16, Float32) as one of the fields?) which complicates the code. It doesn't read as well. 3) Russell also suggested nesting record arrays. This sort of capability is not being ruled out, but there isn't a chance we can devote resources to this any time soon (can anyone else?) 4) To address the suggestions of Russell and Francesc, I'm proposing that a new attribute "fields" be added that allows: a) indexing by name or number (just like Records) b) names as attributes so long as the name is allowable as a legal attribute. No attempt will be made to map names that are not legal attribute strings into a different attribute name. The field method will remain and be eventually deprecated.
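[Editor's note: a hypothetical sketch of the `fields` accessor behavior Perry describes in point 4 — indexable by field name or number, with an attribute created only when the field name is a legal Python identifier. This is illustrative, not the actual numarray implementation.]

```python
import keyword

class Fields:
    def __init__(self, names, columns):
        self._names = list(names)                  # field order, for numeric indexing
        self._cols = dict(zip(self._names, columns))
        for name in self._names:                   # attribute only for legal identifiers
            if name.isidentifier() and not keyword.iskeyword(name):
                setattr(self, name, self._cols[name])

    def __getitem__(self, key):
        if isinstance(key, int):                   # Rarr.fields[0]
            key = self._names[key]
        return self._cols[key]                     # Rarr.fields["home address"]

f = Fields(["intensity", "home address"],
           [[1.0, 2.0], ["12 Oak St", "1 King St"]])
print(f.intensity)          # legal name: attribute access works
print(f["home address"])    # any name: item access works
print(f[0])                 # numeric indexing, same as f["intensity"]
```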
Note that the only real need to support indexing other than consistency is to support slices. Only slices for numerical indexing will be supported (and not initially). The callable syntax can support index arrays just as easily. To summarize: Rarr.fields['home address'] and Rarr.field('home address') will both work for a field named "home address", but this field cannot be specified as an attribute of Rarr.fields. If there is a field named "intensity" then Rarr.fields.intensity will be permitted. From cookedm at physics.mcmaster.ca Wed Jul 28 16:06:03 2004 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Wed Jul 28 16:06:03 2004 Subject: [Numpy-discussion] Permutation in Numpy In-Reply-To: <3DC9B4D2-DE2D-11D8-A7E1-000393479EE8@earthlink.net> References: <3DC9B4D2-DE2D-11D8-A7E1-000393479EE8@earthlink.net> Message-ID: <20040728230558.GA28651@arbutus.physics.mcmaster.ca> On Sun, Jul 25, 2004 at 07:24:49AM -0400, Hee-Seng Kye wrote: > #perm.py > def perm(k): > # Compute the list of all permutations of k > if len(k) <= 1: > return [k] > r = [] > for i in range(len(k)): > s = k[:i] + k[i+1:] > p = perm(s) > for x in p: > r.append(k[i:i+1] + x) > return r > > Does anyone know if there is a built-in function in Numpy (or Numarray) > that does the above task faster (computes the list of all permutations > of a list, k)? Or is there a way to make the above function run faster > using Numpy? > > I'm asking because I need to create a very large list which contains > all permutations of range(12), in which case there would be 12! > permutations. I created a file test.py: Do you really need a *list* of all those permutations? Think about it: 12! is about 0.5 billion, which is about as much RAM as your machine has. Each permutation is going to be a list taking 20 bytes of overhead plus 4 bytes per entry, so 68 bytes per permutation. You need 32 GB of RAM to store that. You probably want to just be able to access them in order, so a generator is a better bet.
That way, you're only storing the current permutation instead of all of them. Something like def perm(k): k = tuple(k) lk = len(k) if lk <= 1: yield k else: for i in range(lk): s = k[:i] + k[i+1:] t = (k[i],) for x in perm(s): yield t + x Then: for p in perm(range(12)): print p (I'm using tuples instead of lists as that gives better performance here.) For n = 9, your code takes 9.4 s on my machine. The above takes 3 s, and will scale with n (n=12 should take 3s * 10*11*12= 1.1 h). Your original code won't scale with n, as more and more time will be taken up reallocating the list of permutations. We can get fancier and unroll it a bit more: def perm(k): k = tuple(k) lk = len(k) if lk <= 1: yield k elif lk == 2: yield k yield (k[1], k[0]) elif lk == 3: k0, k1, k2 = k yield k yield (k0, k2, k1) yield (k1, k0, k2) yield (k1, k2, k0) yield (k2, k0, k1) yield (k2, k1, k0) else: for i in range(lk): s = k[:i] + k[i+1:] t = (k[i],) for x in perm(s): yield t + x This takes 1.3 s for n = 9 on my machine. Hope this helps. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From kyeser at earthlink.net Wed Jul 28 17:18:46 2004 From: kyeser at earthlink.net (Hee-Seng Kye) Date: Wed Jul 28 17:18:46 2004 Subject: [Numpy-discussion] Permutation in Numpy In-Reply-To: <20040728230558.GA28651@arbutus.physics.mcmaster.ca> References: <3DC9B4D2-DE2D-11D8-A7E1-000393479EE8@earthlink.net> <20040728230558.GA28651@arbutus.physics.mcmaster.ca> Message-ID: <7B005A28-E0F4-11D8-A333-000393479EE8@earthlink.net> Thank you so much for your suggestion! You are right that I only need to access permutations of 12 in order, so your suggestion of using a generator is perfect. In fact, I only need to access the first half of permutations of 12 that begin with 0 (12! / 12 / 2, about 20 million), so the last code you offered would really speed things up. Thanks again.
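[Editor's note, from a later vantage point: Python 2.6 added itertools.permutations, which is exactly this kind of lazy generator implemented in C, yielding tuples in lexicographic order; with it, neither hand-rolled version is needed. A short sketch:]

```python
from itertools import permutations
from math import factorial

# lazy like David's generator: permutations are produced one at a time, not stored
count = sum(1 for _ in permutations(range(8)))
print(count == factorial(8))        # 8! tuples were yielded

# Kye's restricted case -- permutations of range(12) that begin with 0 --
# can be had by fixing the leading 0 and permuting the remaining elements:
with_leading_zero = ((0,) + p for p in permutations(range(1, 12)))  # 11! tuples, lazily
print(next(with_leading_zero))      # (0, 1, ..., 11) comes first
```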
Best, Kye On Jul 28, 2004, at 7:05 PM, David M. Cooke wrote: > On Sun, Jul 25, 2004 at 07:24:49AM -0400, Hee-Seng Kye wrote: >> #perm.py >> def perm(k): >> # Compute the list of all permutations of k >> if len(k) <= 1: >> return [k] >> r = [] >> for i in range(len(k)): >> s = k[:i] + k[i+1:] >> p = perm(s) >> for x in p: >> r.append(k[i:i+1] + x) >> return r >> >> Does anyone know if there is a built-in function in Numpy (or >> Numarray) >> that does the above task faster (computes the list of all permutations >> of a list, k)? Or is there a way to make the above function run >> faster >> using Numpy? >> >> I'm asking because I need to create a very large list which contains >> all permutations of range(12), in which case there would be 12! >> permutations. I created a file test.py: > > Do you really need a *list* of all those permutations? Think about it: > 12! is about 0.5 billion, which is about as much RAM as your machine > has. Each permutation is going to be a list taking 20 bytes of overhead > plus 4 bytes per entry, so 68 bytes per permutation. You need 32 GB of > RAM to store that. > > You probably want to just be able to access them in order, so a > generator is a better bet. That way, you're only storing the current > permutation instead of all of them. Something like > > def perm(k): > k = tuple(k) > lk = len(k) > if lk <= 1: > yield k > else: > for i in range(lk): > s = k[:i] + k[i+1:] > t = (k[i],) > for x in perm(s): > yield t + x > > Then: > > for p in perm(range(12): > print p > > (I'm using tuples instead of lists as that gives a better performance > here.) > > For n = 9, your code takes 9.4 s on my machine. The above take 3 s, and > will scale with n (n=12 should take 3s * 10*11*12= 1.1 h). Your > original > code won't scale with n, as more and more time will be taken up > reallocated the list of permutations. 
> > We can get fancier and unroll it a bit more: > def perm(k): > k = tuple(k) > lk = len(k) > if lk <= 1: > yield k > elif lk == 2: > yield k > yield (k[1], k[0]) > elif lk == 3: > k0, k1, k2 = k > yield k > yield (k0, k2, k1) > yield (k1, k0, k2) > yield (k1, k2, k0) > yield (k2, k0, k1) > yield (k2, k1, k0) > else: > for i in range(lk): > s = k[:i] + k[i+1:] > t = (k[i],) > for x in perm(s): > yield t + x > > This takes 1.3 s for n = 9 on my machine. > > Hope this helps. > > -- > |>|\/|< > /---------------------------------------------------------------------- > ----\ > |David M. Cooke > http://arbutus.physics.mcmaster.ca/dmc/ > |cookedm at physics.mcmaster.ca > > > ------------------------------------------------------- > This SF.Net email is sponsored by BEA Weblogic Workshop > FREE Java Enterprise J2EE developer tools! > Get your free copy of BEA WebLogic Workshop 8.1 today. > http://ads.osdn.com/?ad_id=4721&alloc_id=10040&op=click > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From falted at pytables.org Thu Jul 29 02:17:04 2004 From: falted at pytables.org (Francesc Alted) Date: Thu Jul 29 02:17:04 2004 Subject: FW: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: References: Message-ID: <200407291116.33599.falted@pytables.org> Hi Perry, Well, after the bunch of messages talking about an *apparently* silly question, I must say that I mostly agree with your last proposal. The only thing that I strongly miss is that you are not decided to include the "titles" parameter to the constructor and the respective attribute. In my opinion, this would allow to forbid declaring illegal names as field names and provide full access to all attributes in *all* the ways you proposed. I think this is another kind of metainformation than just units, display formats, etc. 
A "titles" attribute is about providing functionality, not just adding information. But, as you said, there will always be somebody not completely satisfied ;) Anyway, thanks for listening to all of us and putting some good sense into all the mess that provoked the discussion. Cheers, -- Francesc Alted From Chris.Barker at noaa.gov Thu Jul 29 12:01:05 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jul 29 12:01:05 2004 Subject: [Numpy-discussion] The value of a native Blas Message-ID: <41094891.4040103@noaa.gov> Hi all, I think this is a nifty bit of trivia. After getting my nifty Apple Dual G5, I finally got around to doing a test I had wanted to do for a while. The Numeric package uses LAPACK for the Linear Algebra stuff. For OS-X there are two binary versions available for easy install: One linked against the default, non-optimized version of BLAS (from Jack Jansen's PackMan database) One linked against the Apple-supplied vecLib as the BLAS (from Bob Ippolito's PackMan database, http://undefined.org/python/pimp/) To compare performance, I wrote a little script that generates a random matrix and vector: A, b, and solves the equation: Ax = b for x N = 1000 a = RandomArray.uniform(-1000, 1000, (N,N) ) b = RandomArray.uniform(-1000, 1000, (N,) ) start = time.clock() x = solve_linear_equations(a,b) print "It took %f seconds to solve a %iX%i system"%( time.clock()-start, N, N) And here are the results: With the non-optimized version: It took 3.410000 seconds to solve a 1000X1000 system It took 28.260000 seconds to solve a 2000X2000 system With vecLib: It took 0.360000 seconds to solve a 1000X1000 system It took 2.580000 seconds to solve a 2000X2000 system for a speed increase of over 10 times! Wow! Thanks Bob, for providing that package. I'd be interested to see similar tests on other platforms; I haven't gotten around to figuring out how to use a native BLAS on my Linux box. -Chris -- Christopher Barker, Ph.D.
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rsilva at ime.usp.br Thu Jul 29 12:38:06 2004 From: rsilva at ime.usp.br (Paulo J. S. Silva) Date: Thu Jul 29 12:38:06 2004 Subject: [Numpy-discussion] The value of a native Blas In-Reply-To: <41094891.4040103@noaa.gov> References: <41094891.4040103@noaa.gov> Message-ID: <1091129395.29646.44.camel@catirina> > I haven't > gotten around to figuring out how to use a native BLAS on my Linux > box. > On a Debian box, at least, you can install native ATLAS libraries and they come with blas and lapack. For example, if I search for atlas3 packages I find the following atlas packages available: atlas3-base atlas3-3dnow atlas3-sse atlas3-sse2 Best Paulo -- Paulo José da Silva e Silva Professor Assistente do Dep. de Ciência da Computação (Assistant Professor of the Computer Science Dept.) Universidade de São Paulo - Brazil e-mail: rsilva at ime.usp.br Web: http://www.ime.usp.br/~rsilva Teoria é o que não entendemos o suficiente para chamar de prática. (Theory is something we don't understand well enough to call practice.) From stephen.walton at csun.edu Thu Jul 29 12:57:00 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Thu Jul 29 12:57:00 2004 Subject: [Numpy-discussion] The value of a native Blas In-Reply-To: <41094891.4040103@noaa.gov> References: <41094891.4040103@noaa.gov> Message-ID: <1091130954.9805.78.camel@freyer.sfo.csun.edu> On Thu, 2004-07-29 at 11:57, Chris Barker wrote: > One linked against the Apple Supplied vec-lib as the BLAS. (From Bob > Ippolito's PackMan database (http://undefined.org/python/pimp/) Well, I'm a sucker for trying to increase performance :-). AMD's Web site recommends ATLAS as the best source for an Athlon-optimized BLAS.
I happen to have ATLAS installed, and the time for Chris Barker's test went from 4.95 seconds to 0.91 seconds on a dual-Athlon MP 2200+ system. To build numarray 1.0 with this setup, I had to modify addons.py a bit, both to use LAPACK and ATLAS and because ATLAS was built here with the Absoft Fortran compiler version 8.2 (I haven't tried g77). Is anyone interested in this? -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From perry at stsci.edu Thu Jul 29 13:01:05 2004 From: perry at stsci.edu (Perry Greenfield) Date: Thu Jul 29 13:01:05 2004 Subject: [Numpy-discussion] The value of a native Blas In-Reply-To: <1091130954.9805.78.camel@freyer.sfo.csun.edu> Message-ID: On 7/29/04 3:55 PM, "Stephen Walton" wrote: > On Thu, 2004-07-29 at 11:57, Chris Barker wrote: > >> One linked against the Apple Supplied vec-lib as the BLAS. (From Bob >> Ippolito's PackMan database (http://undefined.org/python/pimp/) > > Well, I'm a sucker for trying to increase performance :-) . AMD's Web > site recommends ATLAS as the best source for an Athlon-optimized BLAS. > I happen to have ATLAS installed, and the time for Chris Barker's test > went from 4.95 seconds to 0.91 seconds on a dual-Athlon MP 2200+ system. > > To build numarray 1.0 with this setup, I had to modify addons.py a bit, > both to use LAPACK and ATLAS and because ATLAS was built here with the > Absoft Fortran compiler version 8.2 (I haven't tried g77). Is anyone > interested in this? Well, I guess we are :-) Let us know what you had to do to get it to work. Thanks, Perry From stephen.walton at csun.edu Thu Jul 29 13:28:07 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Thu Jul 29 13:28:07 2004 Subject: [Numpy-discussion] The value of a native Blas In-Reply-To: References: Message-ID: <1091132833.9805.133.camel@freyer.sfo.csun.edu> On Thu, 2004-07-29 at 13:00, Perry Greenfield wrote: > Well, I guess we are :-) Let us know what you had to do to get it to work. 
This is so Absoft-specific that I'm not sure how much it helps others, but here goes: I built LAPACK after modifying the make.inc.LINUX file to set the compiler and linker to /opt/absoft/bin/f77 instead of to g77, and the compile flags to "-O3 -YNO_CDEC". I ran "make config" in the ATLAS directory and told the setup that /opt/absoft/bin/f77 was my Fortran compiler, then did "make install arch=", then followed the scipy.org instructions to combine LAPACK with the one from ATLAS. Finally, I applied the attached patch to addons.py in the numarray directory. Interestingly, the example program runs in 1.43 seconds on a 2.26GHz P4 with the default numarray install (as opposed to 4.95 seconds on the Athlon). I haven't built ATLAS on this platform yet to find how much of an improvement I get. I suppose something similar would work with g77, replacing the Absoft libraries with g2c, but I haven't tried it. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge -------------- next part -------------- A non-text attachment was scrubbed... Name: addons.diff Type: text/x-patch Size: 879 bytes Desc: addons.py diffs URL: From stephen.walton at csun.edu Thu Jul 29 13:38:05 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Thu Jul 29 13:38:05 2004 Subject: [Numpy-discussion] The value of a native Blas In-Reply-To: References: Message-ID: <1091133445.9805.147.camel@freyer.sfo.csun.edu> An addition to my previous post: I also had to do a "setenv USE_LAPACK" in the shell before "python setup.py build" in the numarray directory. [Admin question: I'm not seeing my own posts to this list, even though I'm supposed to according to my Sourceforge preferences.] From Chris.Barker at noaa.gov Thu Jul 29 15:01:07 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jul 29 15:01:07 2004 Subject: [Numpy-discussion] Building Numeric with a native blas ?
In-Reply-To: <1091133445.9805.147.camel@freyer.sfo.csun.edu> References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> Message-ID: <410972BD.8080903@noaa.gov> Hi all, I decided I want to try to get this working on my gentoo linux box. I started by emerging the gentoo atlas package. Now I've gone into the Numeric setup.py, and have gotten confused. These seem to be the relevant lines (unchanged from how they came with Numeric 23.3):

# delete all but the first one in this list if using your own LAPACK/BLAS
sourcelist = [os.path.join('Src', 'lapack_litemodule.c'),
# os.path.join('Src', 'blas_lite.c'),
# os.path.join('Src', 'f2c_lite.c'),
# os.path.join('Src', 'zlapack_lite.c'),
# os.path.join('Src', 'dlapack_lite.c')

That's all well and good, except that they are all deleted except the first one. And it looks like I don't want that one either.

]
# set these to use your own BLAS;
library_dirs_list = ['/usr/lib/atlas']
libraries_list = ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c']
# if you also set `use_dotblas` (see below), you'll need:
# ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c']

This also seems to be set already. I don't have a '/usr/lib/atlas', so I set:

library_dirs_list = []

All the libraries in libraries_list are in /usr/lib/

include_dirs = ['/usr/include/atlas'] # You may need to set this to find cblas.h

cblas.h is in /usr/include/, so I set this to:

include_dirs = []

Now everything compiled and installed just fine, but when I try to use it, I get:

File "/usr/lib/python2.3/site-packages/Numeric/LinearAlgebra.py", line 8, in ?
import lapack_lite
ImportError: dynamic module does not define init function (initlapack_lite)

So I tried adding

sourcelist = [os.path.join('Src', 'lapack_litemodule.c')]

back in. Now I can build and install, but get:

Traceback (most recent call last):
File "./TestBlas.py", line 4, in ?
from LinearAlgebra import *
File "/usr/lib/python2.3/site-packages/Numeric/LinearAlgebra.py", line 8, in ?
import lapack_lite
ImportError: /usr/lib/python2.3/site-packages/Numeric/lapack_lite.so: undefined symbol: dgesdd_

Now I'm stuck. -CHB -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Thu Jul 29 15:26:09 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jul 29 15:26:09 2004 Subject: [Numpy-discussion] Building Numeric with a native blas ? In-Reply-To: <410972BD.8080903@noaa.gov> References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> <410972BD.8080903@noaa.gov> Message-ID: <41097891.8080906@noaa.gov> By the way, I get these same errors when compiling with the setup.py unchanged from how it's distributed with Numeric 23.3

> Traceback (most recent call last):
> File "./TestBlas.py", line 4, in ?
> from LinearAlgebra import *
> File "/usr/lib/python2.3/site-packages/Numeric/LinearAlgebra.py", line
> 8, in ?
> import lapack_lite
> ImportError: /usr/lib/python2.3/site-packages/Numeric/lapack_lite.so:
> undefined symbol: dgesdd_

So something's weird. Stephen Walton wrote:

> one has to merge an LAPACK library built separately with the one
> generated by ATLAS to get a 'complete' LAPACK.

I'll try this, but it's odd that it didn't give an error when compiling or linking. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From stephen.walton at csun.edu Thu Jul 29 15:31:13 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Thu Jul 29 15:31:13 2004 Subject: [Numpy-discussion] Building Numeric with a native blas ?
In-Reply-To: <41097891.8080906@noaa.gov> References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> <410972BD.8080903@noaa.gov> <41097891.8080906@noaa.gov> Message-ID: <1091140216.9805.381.camel@freyer.sfo.csun.edu> On Thu, 2004-07-29 at 15:22, Chris Barker wrote: > Stephen Walton wrote: > > one has to merge an LAPACK library built separately with the one > > generated by ATLAS to get a 'complete' LAPACK. > > I'll try this, but it's odd that it didn't give an error when compiling > or linking. (I neglected to CC the list on my response to Chris, but basically wrote that changes similar to the ones I used for numarray worked in Numeric). Since Numeric and numarray are building shared libraries, undefined external references don't show up until you actually import the Python package represented by the shared libraries. I noticed this in my experiments as well. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From Chris.Barker at noaa.gov Thu Jul 29 15:41:22 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jul 29 15:41:22 2004 Subject: [Numpy-discussion] Building Numeric with a native blas ? In-Reply-To: <1091140216.9805.381.camel@freyer.sfo.csun.edu> References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> <410972BD.8080903@noaa.gov> <41097891.8080906@noaa.gov> <1091140216.9805.381.camel@freyer.sfo.csun.edu> Message-ID: <41097C0A.7090600@noaa.gov> Stephen Walton wrote: >>>one has to merge an LAPACK library built separately with the one >>>generated by ATLAS to get a 'complete' LAPACK. >> >>I'll try this, but it's odd that it didn't give an error when compiling >>or linking. OK. I did an "emerge lapack" and got lapack installed, then re-built Numeric, and now it works. What's odd is that before I installed lapack all the libs were there, including liblapack. Anyway it works, so I'm happy. One note, however: The setup.py delivered with 23.3 seems to be set up to use a native lapack by default.
Will it work on a system that doesn't have one? -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From stephen.walton at csun.edu Thu Jul 29 16:21:01 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Thu Jul 29 16:21:01 2004 Subject: [Numpy-discussion] Building Numeric with a native blas ? In-Reply-To: <41097C0A.7090600@noaa.gov> References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> <410972BD.8080903@noaa.gov> <41097891.8080906@noaa.gov> <1091140216.9805.381.camel@freyer.sfo.csun.edu> <41097C0A.7090600@noaa.gov> Message-ID: <1091143210.9805.482.camel@freyer.sfo.csun.edu> On Thu, 2004-07-29 at 15:36, Chris Barker wrote: > The setup.py delivered with 23.3 seems to be set up to use a native > lapack by default. Will it work on a system that doesn't have one? No. On my system it fails with a complaint about not finding -llapack, since my ATLAS and LAPACK libraries are in /usr/local/lib/atlas, and the 23.3 setup.py looks in /usr/lib/atlas. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From cookedm at physics.mcmaster.ca Thu Jul 29 19:53:10 2004 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Jul 29 19:53:10 2004 Subject: [Numpy-discussion] Building Numeric with a native blas ? In-Reply-To: <41097C0A.7090600@noaa.gov> References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> <410972BD.8080903@noaa.gov> <41097891.8080906@noaa.gov> <1091140216.9805.381.camel@freyer.sfo.csun.edu> <41097C0A.7090600@noaa.gov> Message-ID: <20040730025254.GA26933@arbutus.physics.mcmaster.ca> On Thu, Jul 29, 2004 at 03:36:58PM -0700, Chris Barker wrote: > Stephen Walton wrote: > >>>one has to merge an LAPACK library built separately with the one > >>>generated by ATLAS to get a 'complete' LAPACK. 
> >>
> >>I'll try this, but it's odd that it didn't give an error when compiling
> >>or linking.
>
> OK. I did an "emerge lapack" and got lapack installed, then re-built
> Numeric, and now it works. What's odd is that before I installed lapack
> all the libs were there, including liblapack. Anyway it works, so I'm happy.

Atlas might have installed a liblapack, with the (few) functions that it overrides with faster ones. It's by no means a complete LAPACK installation. Have a look at the difference in library sizes; a full LAPACK is a few megs; Atlas's routines are a few hundred K. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From Chris.Barker at noaa.gov Fri Jul 30 09:33:03 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Fri Jul 30 09:33:03 2004 Subject: [Numpy-discussion] Building Numeric with a native blas ? In-Reply-To: <20040730025254.GA26933@arbutus.physics.mcmaster.ca> References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> <410972BD.8080903@noaa.gov> <41097891.8080906@noaa.gov> <1091140216.9805.381.camel@freyer.sfo.csun.edu> <41097C0A.7090600@noaa.gov> <20040730025254.GA26933@arbutus.physics.mcmaster.ca> Message-ID: <410A7733.10408@noaa.gov> David M. Cooke wrote: > Atlas might have installed a liblapack, with the (few) functions that it > overrides with faster ones. It's by no means a complete LAPACK > installation. Have a look at the difference in library sizes; a full > LAPACK is a few megs; Atlas's routines are a few hundred K. OK, I'm really confused now. I got it working, but it seems to have virtually identical performance to the Numeric-supplied lapack-lite. I'm guessing that the LAPACK package I emerged does NOT use the atlas BLAS. If the atlas liblapack doesn't have all of lapack, how in the world are you supposed to use it? I have no idea how I would get the linker to get what it can from the atlas lapack, and the rest from another one. Has anyone done this on Gentoo? If not, how about another linux distro? I don't have to use portage for this after all. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From gerard.vermeulen at grenoble.cnrs.fr Fri Jul 30 10:01:34 2004 From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen) Date: Fri Jul 30 10:01:34 2004 Subject: [Numpy-discussion] Building Numeric with a native blas ?
In-Reply-To: <410A7733.10408@noaa.gov> References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> <410972BD.8080903@noaa.gov> <41097891.8080906@noaa.gov> <1091140216.9805.381.camel@freyer.sfo.csun.edu> <41097C0A.7090600@noaa.gov> <20040730025254.GA26933@arbutus.physics.mcmaster.ca> <410A7733.10408@noaa.gov> Message-ID: <20040730190021.67e1ffdd.gerard.vermeulen@grenoble.cnrs.fr> On Fri, 30 Jul 2004 09:28:35 -0700 "Chris Barker" wrote:

> David M. Cooke wrote:
> > Atlas might have installed a liblapack, with the (few) functions that it
> > overrides with faster ones. It's by no means a complete LAPACK
> > installation. Have a look at the difference in library sizes; a full
> > LAPACK is a few megs; Atlas's routines are a few hundred K.
>
> OK, I'm really confused now. I got it working, but it seems to have
> virtually identical performance to the Numeric-supplied lapack-lite.
>
> I'm guessing that the LAPACK package I emerged does NOT use the atlas BLAS.
>
> If the atlas liblapack doesn't have all of lapack, how in the world are
> you supposed to use it? I have no idea how I would get the linker to get
> what it can from the atlas lapack, and the rest from another one.
>
> Has anyone done this on Gentoo? If not, how about another linux distro? I
> don't have to use portage for this after all.
>

I am making my own ATLAS rpms and basically I am doing the following (starting from the ATLAS source directory, with the LAPACK unpacked inside it):

# build lapack
# Note added right now: this assumes that the LAPACK/make.inc has been patched
(cd LAPACK; make lapacklib)

# configuration: leave the blank lines in the 'here' document
# Note added right now: this is dependent on your CPU architecture
if [ "$(hostname)" == "zombie" ] ; then
make config < References: <41094891.4040103@noaa.gov> Message-ID: <1091212658.1454.724.camel@catirina> Hello, I took some time today to do some benchmarks on different uses of lapack on an Athlon Thunderbird 1.2GHz.
Here it goes:

------
Vanilla numarray:
It took 9.970000 seconds to solve a 1000x1000 system

numarray, vanilla blas and lapack:
It took 7.010000 seconds to solve a 1000x1000 system

numarray, atlas blas and vanilla lapack:
It took 1.050000 seconds to solve a 1000x1000 system

numarray, atlas blas and lapack:
It took 0.760000 seconds to solve a 1000x1000 system
------

One nice touch is that Matlab takes 1.3s to solve a system of the same size with the notation A\b. Hence numarray is actually faster than Matlab at solving linear systems :-) I know, probably there is a way to make Matlab use the faster atlas library... Paulo -- Paulo José da Silva e Silva Professor Assistente do Dep. de Ciência da Computação (Assistant Professor of the Computer Science Dept.) Universidade de São Paulo - Brazil e-mail: rsilva at ime.usp.br Web: http://www.ime.usp.br/~rsilva Teoria é o que não entendemos o (Theory is something we don't) suficiente para chamar de prática. (understand well enough to call) (practice) From Chris.Barker at noaa.gov Fri Jul 30 13:15:06 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Fri Jul 30 13:15:06 2004 Subject: [Numpy-discussion] Building Numeric with a native blas -- On Windows Message-ID: <2592d825d632.25d6322592d8@hermes.nos.noaa.gov> Hi all, just to keep this thread moving--- I'm trying to get Numeric working with a native lapack on Windows also. I know little enough about this kind of thing on Linux, and I'm really out of my depth on Windows. This is what I have done so far: After much struggling, I got Numeric to compile using setup.py, and MS Visual Studio .NET 2003 (or whatever the heck it's called!) It all seems to work fine with the included lapack-lite. I downloaded and installed the demo version of the Intel Math Kernel Library.
I set up various paths so that setup.py finds the libs, but now I get linking errors:

unresolved external symbol _dgeev_ referenced in function _lapack_lite_dgetrf

And a whole bunch of others, all corresponding to the various LAPACK calls. I am linking against Intel's mkl_c.lib, which is supposed to have everything in it. Indeed, if I look in the lib file, I find, for example:

...evx._DGEEV._dgeev._DGB ...

so it looks like they are there, but perhaps referred to with only one underscore, at the beginning, rather than one at each end. Now I'm stuck. I suppose I could use ATLAS, but it looked like it was going to take some effort to compile that with MSVC. Has anyone gotten a native BLAS working on Windows? If so, how? Thanks, Chris From gerard.vermeulen at grenoble.cnrs.fr Fri Jul 30 15:04:10 2004 From: gerard.vermeulen at grenoble.cnrs.fr (gerard.vermeulen at grenoble.cnrs.fr) Date: Fri Jul 30 15:04:10 2004 Subject: [Numpy-discussion] Building Numeric with a native blas -- On Windows
I set up various paths so that setup.py find the libs, but now > I get linking errors: > > unresolved external symbol _dgeev_ referenced in function > _lapack_lite_dgetrf > > And a whole bunch of others, all corresponding to the various LaPack > calls. > > I am linking against Intel's mkl_c.lib, which is supposed tohave > everything in it. Indeed, if I look in teh lib file, I find, for example: > > ...evx._DGEEV._dgeev._DGB ... > > so it lkooks like they are there, but perhaps referred to with only one > underscore, at the beginning, rather than one at each end. > > Now I'm stuck. > > I suppose I could use ATLAS, but it looked like it was going to take > some effort to compile that under with MSVC. > > Has anyone gotten a native BLAS working on Windows? if so, how? > In lapack_lite.c, you''ll see: #if defined(NO_APPEND_FORTRAN) lapack_lite_status__ = dgeev(&jobvl,&jobvr,&n,DDATA(a),&lda,DDATA(wr),DDATA(wi),DDATA(vl),&ldvl,DDATA(vr),&ldvr,DDATA(work),&lwork,&info); #else lapack_lite_status__ = dgeev_(&jobvl,&jobvr,&n,DDATA(a),&lda,DDATA(wr),DDATA(wi),DDATA(vl),&ldvl,DDATA(vr),&ldvr,DDATA(work),&lwork,&info); #endif So, try to define NO_APPEND_FORTRAN. If that does not work, you can try to prepend an underscore. You can also try to rip the ATLAS and supposedly ATLAS enhanced lapack libraries out of scipy and build against those (not as good as http://www.scipy.org/documentation/buildatlas4scipywin32.txt, but better than nothing). 
Gerard From falted at pytables.org Thu Jul 1 01:51:39 2004 From: falted at pytables.org (Francesc Alted) Date: Thu Jul 1 01:51:39 2004 Subject: [Numpy-discussion] Speeding up wxPython/numarray In-Reply-To: <1088632048.7526.204.camel@halloween.stsci.edu> References: <40E31B31.7040105@cox.net> <1088632048.7526.204.camel@halloween.stsci.edu> Message-ID: <200407011048.01929.falted@pytables.org> On Wednesday 30 June 2004 23:47, Todd Miller wrote: > > There were a couple of other things I tried that resulted in additional > > small speedups, but the tactics I used were too horrible to reproduce > > here. The main one of interest is that all of the calls to > > NA_updateDataPtr seem to burn some time. However, I don't have any idea > > what one could do about that. > > Francesc Alted had the same comment about NA_updateDataPtr a while ago. > I tried to optimize it then but didn't get anywhere. NA_updateDataPtr() > should be called at most once per extension function (more is > unnecessary but not harmful) but needs to be called at least once as a > consequence of the way the buffer protocol doesn't give locked > pointers. FYI I'm still refusing to call NA_updateDataPtr() in a specific part of my code that requires as much speed as possible. It works just fine from numarray 0.5 on (numarray 0.4 gave a segmentation fault on that). However, Todd already warned me about that and told me that this is unsafe. Nevertheless, I'm using the optimization for read-only purposes (i.e. they are not accessible to users) over numarray objects, and that *seems* to be safe (at least I did not have a single problem after numarray 0.5). I know that I'm walking on the cutting edge, but life is dangerous anyway ;). By the way, that optimization gives me a 70% improvement during element access to NumArray elements. It would be very nice if you can finally achieve additional performance with your recent bet :).
Good luck!, -- Francesc Alted From haase at msg.ucsf.edu Thu Jul 1 09:06:24 2004 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Thu Jul 1 09:06:24 2004 Subject: [Numpy-discussion] Numarray header PEP In-Reply-To: <20040701053355.M99698@grenoble.cnrs.fr> References: <1088451653.3744.200.camel@localhost.localdomain> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> Message-ID: <200407010904.25498.haase@msg.ucsf.edu> On Wednesday 30 June 2004 11:33 pm, gerard.vermeulen at grenoble.cnrs.fr wrote: > On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote > > > So... you use the "meta" code to provide package specific ordinary > > (not-macro-fied) functions to keep the different versions of the > > Present() and isArray() macros from conflicting. > > > > It would be nice to have a standard approach for using the same > > "extension enhancement code" for both numarray and Numeric. The PEP > > should really be expanded to provide an example of dual support for one > > complete and real function, guts and all, so people can see the process > > end-to-end; Something like a simple arrayprint. That process needs > > to be refined to remove as much tedium and duplication of effort as > > possible. The idea is to make it as close to providing one > > implementation to support both array packages as possible. I think it's > > important to illustrate how to partition the extension module into > > separate compilation units which correctly navigate the dual > > implementation mine field in the easiest possible way. > > > > It would also be nice to add some logic to the meta-functions so that > > which array package gets used is configurable. We did something like > > that for the matplotlib plotting software at the Python level with > > the "numerix" layer, an idea I think we copied from Chaco. 
The kind
> > of dispatch I think might be good to support configurability looks like
> > this:
> >
> > PyObject *
> > whatsThis(PyObject *dummy, PyObject *args)
> > {
> >     PyObject *result, *what = NULL;
> >     if (!PyArg_ParseTuple(args, "O", &what))
> >         return 0;
> >     switch(PyArray_Which(what)) {
> >     USE_NUMERIC:
> >         result = Numeric_whatsThis(what); break;
> >     USE_NUMARRAY:
> >         result = Numarray_whatsThis(what); break;
> >     USE_SEQUENCE:
> >         result = Sequence_whatsThis(what); break;
> >     }
> >     Py_INCREF(Py_None);
> >     return Py_None;
> > }
> >
> > In the above, I'm picturing a separate .c file for Numeric_whatsThis and for Numarray_whatsThis. It would be nice to streamline that to one .c and a process which somehow (simply) produces both functions.
> >
> > Or, ideally, the above would be done more like this:
> >
> > PyObject *
> > whatsThis(PyObject *dummy, PyObject *args)
> > {
> >     PyObject *result, *what = NULL;
> >     if (!PyArg_ParseTuple(args, "O", &what))
> >         return 0;
> >     switch(Numerix_Which(what)) {
> >     USE_NUMERIX:
> >         result = Numerix_whatsThis(what); break;
> >     USE_SEQUENCE:
> >         result = Sequence_whatsThis(what); break;
> >     }
> >     Py_INCREF(Py_None);
> >     return Py_None;
> > }
> >
> > Here, a common Numerix implementation supports both numarray and Numeric from a single simple .c. The extension module would do "#include numerix/arrayobject.h" and "import_numerix()" and otherwise just call PyArray_* functions.
> >
> > The current stumbling block is that numarray is not binary compatible with Numeric... so numerix in C falls apart. I haven't analyzed every symbol and struct to see if it is really feasible... but it seems like it is *almost* feasible, at least for typical usage.
> >
> > So, in a nutshell, I think the dual implementation support you demoed is important and we should work up an example and kick it around to make sure it's the best way we can think of doing it.
> > Then we should add a section to the PEP describing dual support as well.

> I would never apply numarray code to Numeric arrays and the inverse. It
> looks dangerous and I do not know if it is possible. The first thing
> coming to mind is that numarray and Numeric arrays refer to different type
> objects (this is what my pep module uses to differentiate them). So, even
> if numarray and Numeric are binary compatible, any 'alien' code referring
> to the 'Python-standard part' of the type objects may lead to surprises. A
> PEP proposing hacks will raise eyebrows at least.
>
> Secondly, most people use Numeric *or* numarray and not both.
>
> So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out
> (NINO) Of course, Numeric or numarray output can be a user option if NINO
> does not apply. (explicit safe conversion between Numeric and numarray is
> possible if really needed).
>
> I'll try to flesh out the demo with real functions in the way you indicated
> (going as far as I consider safe).
>
> The problem of coding the Numeric (or numarray) functions in more than
> a single source file has also been addressed.
>
> It may take 2 weeks because I am off to a conference next week.
>
> Regards -- Gerard

Hi all, first, I would like to state that I don't understand much of this discussion; so the only comment I wanted to make is that IF this were possible, to make (C/C++) code that can live with both Numeric and numarray, then I think it would be used more and more - think: transition phase !! (e.g. someone could start making the FFTW part of scipy numarray friendly without having to switch everything at once [hint ;-)] ) These were just my 2 cents.
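The Python-level "numerix" switching Todd mentioned earlier in the thread can be sketched as a tiny selection layer. The NUMERIX environment variable and the use of numpy as the backend here are illustrative only; matplotlib's actual numerix layer differed in detail:

```python
import importlib
import os

# choose the array backend by name at import time;
# "numpy" stands in for the Numeric/numarray choice of 2004
backend_name = os.environ.get("NUMERIX", "numpy")
nx = importlib.import_module(backend_name)

# client code uses only the nx alias, never the package name directly
a = nx.arange(10)
print(backend_name, int(a.sum()))
```

A user (or a test harness) then flips the backend by setting the environment variable before importing, which is exactly the kind of configurability the thread is after.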
Cheers, Sebastian Haase From jmiller at stsci.edu Thu Jul 1 09:44:13 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 1 09:44:13 2004 Subject: [Numpy-discussion] Numarray header PEP In-Reply-To: <20040701053355.M99698@grenoble.cnrs.fr> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> Message-ID: <1088700210.14402.17.camel@halloween.stsci.edu> On Thu, 2004-07-01 at 02:33, gerard.vermeulen at grenoble.cnrs.fr wrote: > On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote > > > > So... you use the "meta" code to provide package specific ordinary > > (not-macro-fied) functions to keep the different versions of the > > Present() and isArray() macros from conflicting. > > > > It would be nice to have a standard approach for using the same > > "extension enhancement code" for both numarray and Numeric. The PEP > > should really be expanded to provide an example of dual support for one > > complete and real function, guts and all, so people can see the process > > end-to-end; Something like a simple arrayprint. That process needs > > to be refined to remove as much tedium and duplication of effort as > > possible. The idea is to make it as close to providing one > > implementation to support both array packages as possible. I think it's > > important to illustrate how to partition the extension module into > > separate compilation units which correctly navigate the dual > > implementation mine field in the easiest possible way. > > > > It would also be nice to add some logic to the meta-functions so that > > which array package gets used is configurable. We did something like > > that for the matplotlib plotting software at the Python level with > > the "numerix" layer, an idea I think we copied from Chaco. 
The kind
> > of dispatch I think might be good to support configurability looks like
> > this:
> >
> > PyObject *
> > whatsThis(PyObject *dummy, PyObject *args)
> > {
> >     PyObject *result, *what = NULL;
> >     if (!PyArg_ParseTuple(args, "O", &what))
> >         return 0;
> >     switch(PyArray_Which(what)) {
> >     USE_NUMERIC:
> >         result = Numeric_whatsThis(what); break;
> >     USE_NUMARRAY:
> >         result = Numarray_whatsThis(what); break;
> >     USE_SEQUENCE:
> >         result = Sequence_whatsThis(what); break;
> >     }
> >     Py_INCREF(Py_None);
> >     return Py_None;
> > }
> >
> > In the above, I'm picturing a separate .c file for Numeric_whatsThis and for Numarray_whatsThis. It would be nice to streamline that to one .c and a process which somehow (simply) produces both functions.
> >
> > Or, ideally, the above would be done more like this:
> >
> > PyObject *
> > whatsThis(PyObject *dummy, PyObject *args)
> > {
> >     PyObject *result, *what = NULL;
> >     if (!PyArg_ParseTuple(args, "O", &what))
> >         return 0;
> >     switch(Numerix_Which(what)) {
> >     USE_NUMERIX:
> >         result = Numerix_whatsThis(what); break;
> >     USE_SEQUENCE:
> >         result = Sequence_whatsThis(what); break;
> >     }
> >     Py_INCREF(Py_None);
> >     return Py_None;
> > }
> >
> > Here, a common Numerix implementation supports both numarray and Numeric from a single simple .c. The extension module would do "#include numerix/arrayobject.h" and "import_numerix()" and otherwise just call PyArray_* functions.
> >
> > The current stumbling block is that numarray is not binary compatible with Numeric... so numerix in C falls apart. I haven't analyzed every symbol and struct to see if it is really feasible... but it seems like it is *almost* feasible, at least for typical usage.
> >
> > So, in a nutshell, I think the dual implementation support you demoed is important and we should work up an example and kick it around to make sure it's the best way we can think of doing it.
> > Then we should add a section to the PEP describing dual support as well.

> I would never apply numarray code to Numeric arrays and the inverse. It looks
> dangerous and I do not know if it is possible.

I think that's definitely the marching orders for now... but you gotta admit, it would be nice.

> The first thing coming
> to mind is that numarray and Numeric arrays refer to different type objects
> (this is what my pep module uses to differentiate them). So, even if
> numarray and Numeric are binary compatible, any 'alien' code referring to
> the 'Python-standard part' of the type objects may lead to surprises.
> A PEP proposing hacks will raise eyebrows at least.

I'm a little surprised it took someone to talk me out of it... I'll just concede that this was probably a bad idea.

> Secondly, most people use Numeric *or* numarray and not both.

A class of question which will arise for developers is this: "X works with Numeric, but X doesn't work with numarray." The reverse also happens occasionally. For this reason, being able to choose would be nice for developers.

> So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out (NINO)
> Of course, Numeric or numarray output can be a user option if NINO does not
> apply.

When I first heard it, I thought NINO was a good idea, with the limitation that it doesn't apply when a function produces an array without consuming any. But... there is another problem with NINO that Perry Greenfield pointed out: with multiple arguments, there can be a mix of array types. For this reason, it makes sense to be able to coerce all the inputs to a particular array package. This form might look more like:

switch(PyArray_Which()) {
case USE_NUMERIC:
    result = Numeric_doit(a1, a2, a3); break;
case USE_NUMARRAY:
    result = Numarray_doit(a1, a2, a3); break;
case USE_SEQUENCE:
    result = Sequence_doit(a1, a2, a3); break;
}

One last thing: I think it would be useful to be able to drive the code into sequence mode with arrays.
This would enable easy benchmarking of the performance improvement. > (explicit safe conversion between Numeric and numarray is possible > if really needed). > >I'll try to flesh out the demo with real functions in the way you indicated > (going as far as I consider safe). > > The problem of coding the Numeric (or numarray) functions in more than > a single source file has also been addressed. > > It may take 2 weeks because I am off to a conference next week. Excellent. See you in a couple weeks. Regards, Todd From jmiller at stsci.edu Thu Jul 1 09:59:01 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 1 09:59:01 2004 Subject: [Numpy-discussion] Speeding up wxPython/numarray In-Reply-To: <40E3462A.9080303@cox.net> References: <40E31B31.7040105@cox.net> <1088632048.7526.204.camel@halloween.stsci.edu> <40E3462A.9080303@cox.net> Message-ID: <1088701077.14402.20.camel@halloween.stsci.edu> On Wed, 2004-06-30 at 19:00, Tim Hochberg wrote: > By this do you mean the "#if PY_VERSION_HEX >= 0x02030000 " that is > wrapped around _ndarray_item? If so, I believe that it *is* getting > compiled, it's just never getting called. > > What I think is happening is that the class NumArray inherits its > sq_item from PyClassObject. In particular, I think it picks up > instance_item from Objects/classobject.c. This appears to be fairly > expensive and, I think, ends up calling tp_as_mapping->mp_subscript. > Thus, _ndarray's sq_item slot never gets called. All of this is pretty > iffy since I don't know this stuff very well and I didn't trace it all > the way through. However, it explains what I've seen thus far. > > This is why I ended up using the horrible hack. I'm resetting NumArray's > sq_item to point to _ndarray_item instead of instance_item. I believe > that access at the python level goes through mp_subscript, so it > shouldn't be affected, and only objects at the C level should notice and > they should just get the faster sq_item.
You will notice that there are > an awful lot of I thinks in the above paragraphs though... Ugh... Thanks for explaining this. > >>I then optimized _ndarray_item (code > >>at end). This halved the execution time of my arbitrary benchmark. This > >>trick may have horrible, unforeseen consequences so use at your own risk. > >> > >> > > > >Right now the sq_item hack strikes me as somewhere between completely > >unnecessary and too scary for me! Maybe if python-dev blessed it. > > > > > Yes, very scary. And it occurs to me that it will break subclasses of > NumArray if they override __getitem__. When these subclasses are > accessed from C they will see nd_array's sq_item instead of the > overridden getitem. However, I think I also know how to fix it. But > it does point out that it is very dangerous and there are probably dark > corners of which I'm unaware. Asking on Python-List or PyDev would > probably be a good idea. > > The nonscary, but painful, fix would be to rewrite NumArray in C. Non-scary to whom? > >This optimization looks good to me. > > > > > Unfortunately, I don't think the optimization to sq_item will affect > much since NumArray appears to override it with > > >>Finally I commented out the __del__ method in numarraycore. This resulted > >>in an additional speedup of 64% for a total speed up of 240%. Still not > >>close to 10x, but a large improvement. However, this is obviously not > >>viable for real use, but it's enough of a speedup that I'll try to see > >>if there's any way to move the shadow stuff back to tp_dealloc. > >> > >> > > > >FYI, the issue with tp_dealloc may have to do with which mode Python is > >compiled in, --with-pydebug, or not. One approach which seems like it > >ought to work (just thought of this!) is to add an extra reference in C > >to the NumArray instance __dict__ (from NumArray.__init__ and stashed > >via a new attribute in the PyArrayObject struct) and then DECREF it as > >the last part of the tp_dealloc.
> > > > > That sounds promising. I looked at this some, and while INCREFing __dict__ may be the right idea, I forgot that there *is no* Python NumArray.__init__ anymore. So the INCREF needs to be done in C without doing any getattrs; this seems to mean calling a private _PyObject_GetDictPtr function to get a pointer to the __dict__ slot which can be dereferenced to get the __dict__. > [SNIP] > > > > >Well, be picking out your beer. > > > > > I was only about half right, so I'm not sure I qualify... We could always reduce your wages to a 12-pack... Todd From gerard.vermeulen at grenoble.cnrs.fr Thu Jul 1 11:39:08 2004 From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen) Date: Thu Jul 1 11:39:08 2004 Subject: [Numpy-discussion] Numarray header PEP In-Reply-To: <1088700210.14402.17.camel@halloween.stsci.edu> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <1088700210.14402.17.camel@halloween.stsci.edu> Message-ID: <20040701203739.31f80e02.gerard.vermeulen@grenoble.cnrs.fr> On 01 Jul 2004 12:43:31 -0400 Todd Miller wrote: > A class of question which will arise for developers is this: "X works > with Numeric, but X doesn't work with numarray." The reverse also > happens occasionally. For this reason, being able to choose would be > nice for developers. > > > So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out (NINO) > > Of course, Numeric or numarray output can be a user option if NINO does not > > apply. > > When I first heard it, I thought NINO was a good idea, with the > limitation that it doesn't apply when a function produces an array > without consuming any. But...
there is another problem with NINO that > Perry Greenfield pointed out: with multiple arguments, there can be a > mix of array types. For this reason, it makes sense to be able to > coerce all the inputs to a particular array package. This form might > look more like: > > switch(PyArray_Which()) { > case USE_NUMERIC: > result = Numeric_doit(a1, a2, a3); break; > case USE_NUMARRAY: > result = Numarray_doit(a1, a2, a3); break; > case USE_SEQUENCE: > result = Sequence_doit(a1, a2, a3); break; > } > > One last thing: I think it would be useful to be able to drive the code > into sequence mode with arrays. This would enable easy benchmarking of > the performance improvement. > > > (explicit safe conversion between Numeric and numarray is possible > > if really needed). Yeah, when I wrote 'if really needed', I was hoping to shift the responsibility of coercion (or conversion) to the Python programmer (my lazy side telling me that it can be done in pure Python). You talked me into doing it in C :-) Regards -- Gerard From tim.hochberg at cox.net Thu Jul 1 11:52:05 2004 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Jul 1 11:52:05 2004 Subject: [Numpy-discussion] Speeding up wxPython/numarray In-Reply-To: <1088701077.14402.20.camel@halloween.stsci.edu> References: <40E31B31.7040105@cox.net> <1088632048.7526.204.camel@halloween.stsci.edu> <40E3462A.9080303@cox.net> <1088701077.14402.20.camel@halloween.stsci.edu> Message-ID: <40E45D3C.7020501@cox.net> Todd Miller wrote: >On Wed, 2004-06-30 at 19:00, Tim Hochberg wrote: > > >>>> >>>> >>>> >>>FYI, the issue with tp_dealloc may have to do with which mode Python is >>>compiled in, --with-pydebug, or not. One approach which seems like it >>>ought to work (just thought of this!) is to add an extra reference in C >>>to the NumArray instance __dict__ (from NumArray.__init__ and stashed >>>via a new attribute in the PyArrayObject struct) and then DECREF it as >>>the last part of the tp_dealloc. 
>>> >>> >>> >>> >>That sounds promising. >> >> > <> > I looked at this some, and while INCREFing __dict__ maybe the right > idea, I forgot that there *is no* Python NumArray.__init__ anymore. > > So the INCREF needs to be done in C without doing any getattrs; this > seems to mean calling a private _PyObject_GetDictPtr function to get a > pointer to the __dict__ slot which can be dereferenced to get the > __dict__. Might there be a simpler way? Since you're putting an extra attribute on the PyArrayObject structure anyway, wouldn't it be possible to just stash _shadows there instead of the reference to the dictionary? It appears that that the only time _shadows is accessed from python is in __del__. If it were instead an attribute on ndarray, the dealloc problem would go away since the responsibility for deallocing it would fall to ndarray. Since everything else accesses it from C, that shouldn't be much of a problem and should speed that stuff up as well. -tim From cjw at sympatico.ca Thu Jul 1 12:59:01 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Thu Jul 1 12:59:01 2004 Subject: [Numpy-discussion] Numarray header PEP In-Reply-To: <200407010904.25498.haase@msg.ucsf.edu> References: <1088451653.3744.200.camel@localhost.localdomain> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <200407010904.25498.haase@msg.ucsf.edu> Message-ID: <40E46CD3.9090802@sympatico.ca> Sebastian Haase wrote: >On Wednesday 30 June 2004 11:33 pm, gerard.vermeulen at grenoble.cnrs.fr wrote: > > >>On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote >> >> >> >>>So... you use the "meta" code to provide package specific ordinary >>>(not-macro-fied) functions to keep the different versions of the >>>Present() and isArray() macros from conflicting. >>> >>>It would be nice to have a standard approach for using the same >>>"extension enhancement code" for both numarray and Numeric. 
The PEP >>>should really be expanded to provide an example of dual support for one >>>complete and real function, guts and all, so people can see the process >>>end-to-end; Something like a simple arrayprint. That process needs >>>to be refined to remove as much tedium and duplication of effort as >>>possible. The idea is to make it as close to providing one >>>implementation to support both array packages as possible. I think it's >>>important to illustrate how to partition the extension module into >>>separate compilation units which correctly navigate the dual >>>implementation mine field in the easiest possible way. >>> >>>It would also be nice to add some logic to the meta-functions so that >>>which array package gets used is configurable. We did something like >>>that for the matplotlib plotting software at the Python level with >>>the "numerix" layer, an idea I think we copied from Chaco. The kind >>>of dispatch I think might be good to support configurability looks like >>>this: >>> >>>PyObject * >>>whatsThis(PyObject *dummy, PyObject *args) >>>{ >>> PyObject *result, *what = NULL; >>> if (!PyArg_ParseTuple(args, "O", &what)) >>> return 0; >>> switch(PyArray_Which(what)) { >>> USE_NUMERIC: >>> result = Numeric_whatsThis(what); break; >>> USE_NUMARRAY: >>> result = Numarray_whatsThis(what); break; >>> USE_SEQUENCE: >>> result = Sequence_whatsThis(what); break; >>> } >>> Py_INCREF(Py_None); >>> return Py_None; >>>} >>> >>>In the above, I'm picturing a separate .c file for Numeric_whatsThis >>>and for Numarray_whatsThis. It would be nice to streamline that to one >>>.c and a process which somehow (simply) produces both functions. 
>>> >>>Or, ideally, the above would be done more like this: >>> >>>PyObject * >>>whatsThis(PyObject *dummy, PyObject *args) >>>{ >>> PyObject *result, *what = NULL; >>> if (!PyArg_ParseTuple(args, "O", &what)) >>> return 0; >>> switch(Numerix_Which(what)) { >>> USE_NUMERIX: >>> result = Numerix_whatsThis(what); break; >>> USE_SEQUENCE: >>> result = Sequence_whatsThis(what); break; >>> } >>> Py_INCREF(Py_None); >>> return Py_None; >>>} >>> >>>Here, a common Numerix implementation supports both numarray and Numeric >>>from a single simple .c. The extension module would do "#include >>>numerix/arrayobject.h" and "import_numerix()" and otherwise just call >>>PyArray_* functions. >>> >>>The current stumbling block is that numarray is not binary compatible >>>with Numeric... so numerix in C falls apart. I haven't analyzed >>>every symbol and struct to see if it is really feasible... but it >>>seems like it is *almost* feasible, at least for typical usage. >>> >>>So, in a nutshell, I think the dual implementation support you >>>demoed is important and we should work up an example and kick it >>>around to make sure it's the best way we can think of doing it. >>>Then we should add a section to the PEP describing dual support as well. >>> >>> >>I would never apply numarray code to Numeric arrays and the inverse. It >>looks dangerous and I do not know if it is possible. The first thing >>coming to mind is that numarray and Numeric arrays refer to different type >>objects (this is what my pep module uses to differentiate them). So, even >>if numarray and Numeric are binary compatible, any 'alien' code referring >>the the 'Python-standard part' of the type objects may lead to surprises. A >>PEP proposing hacks will raise eyebrows at least. >> >>Secondly, most people use Numeric *or* numarray and not both. >> >>So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out >>(NINO) Of course, Numeric or numarray output can be a user option if NINO >>does not apply. 
(explicit safe conversion between Numeric and numarray is >>possible if really needed). >> >>I'll try to flesh out the demo with real functions in the way you indicated >>(going as far as I consider safe). >> >>The problem of coding the Numeric (or numarray) functions in more than >>a single source file has also been addressed. >> >>It may take 2 weeks because I am off to a conference next week. >> >>Regards -- Gerard >> >> > >Hi all, >first, I would like to state that I don't understand much of this discussion; >so the only comment I wanted to make is that IF this were possible, to make >(C/C++) code that can live with both Numeric and numarray, then I think it >would be used more and more - think: transition phase !! (e.g. someone could >start making the FFTW part of scipy numarray friendly without having to >switch everything at once [hint ;-)] ) > >These were just my 2 cents. >Cheers, >Sebastian Haase > > I feel lower on the understanding tree with respect to what is being proposed in the draft PEP, but would still like to offer my 2 cents worth. I get the feeling that numarray is being bent out of shape to fit Numeric. It was my understanding that Numeric had certain weaknesses which made it unacceptable as a Python component and that numarray was intended to provide the same or better functionality within a pythonic framework. numarray has not achieved the expected performance level to date, but progress is being made and I believe that, for larger arrays, numarray has been shown to be superior to Numeric - please correct me if I'm wrong here. The shock came for me when Todd Miller said: <> I looked at this some, and while INCREFing __dict__ may be the right idea, I forgot that there *is no* Python NumArray.__init__ anymore. Wasn't it the intent of numarray to work towards the full use of the Python class structure to provide the benefits which it offers? The Python class has two constructors and one destructor.
The constructors are __init__ and __new__, the latter only provides the shell of an instance which later has to be initialized. In version 0.9, which I use, there is no __new__, but there is a new function which has a functionality similar to that intended for __new__. Thus, with this change, numarray appears to be moving further away from being pythonic. Colin W From jmiller at stsci.edu Thu Jul 1 13:03:12 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 1 13:03:12 2004 Subject: [Numpy-discussion] Speeding up wxPython/numarray In-Reply-To: <40E45D3C.7020501@cox.net> References: <40E31B31.7040105@cox.net> <1088632048.7526.204.camel@halloween.stsci.edu> <40E3462A.9080303@cox.net> <1088701077.14402.20.camel@halloween.stsci.edu> <40E45D3C.7020501@cox.net> Message-ID: <1088712102.14402.73.camel@halloween.stsci.edu> On Thu, 2004-07-01 at 14:51, Tim Hochberg wrote: > Todd Miller wrote: > > >On Wed, 2004-06-30 at 19:00, Tim Hochberg wrote: > > > > > >>>> > >>>> > >>>> > >>>FYI, the issue with tp_dealloc may have to do with which mode Python is > >>>compiled in, --with-pydebug, or not. One approach which seems like it > >>>ought to work (just thought of this!) is to add an extra reference in C > >>>to the NumArray instance __dict__ (from NumArray.__init__ and stashed > >>>via a new attribute in the PyArrayObject struct) and then DECREF it as > >>>the last part of the tp_dealloc. > >>> > >>> > >>> > >>> > >>That sounds promising. > >> > >> > > <> > > I looked at this some, and while INCREFing __dict__ maybe the right > > idea, I forgot that there *is no* Python NumArray.__init__ anymore. > > > > So the INCREF needs to be done in C without doing any getattrs; this > > seems to mean calling a private _PyObject_GetDictPtr function to get a > > pointer to the __dict__ slot which can be dereferenced to get the > > __dict__. > > Might there be a simpler way? 
Since you're putting an extra attribute on > the PyArrayObject structure anyway, wouldn't it be possible to just > stash _shadows there instead of the reference to the dictionary? _shadows is already in the struct. The root problem (I recall) is not the loss of self->_shadows, it's the loss self->__dict__ before self can be copied onto self->_shadows. The cause of the problem appeared to me to be the tear down order of self: the NumArray part appeared to be torn down before the _numarray part, and the tp_dealloc needs to do a Python callback where a half destructed object just won't do. To really know what the problem is, I need to stick tp_dealloc back in and see what breaks. I'm pretty sure the problem was a missing instance __dict__, but my memory is quite fallable. Todd From Chris.Barker at noaa.gov Thu Jul 1 13:18:01 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jul 1 13:18:01 2004 Subject: [Numpy-discussion] How to read data from text files fast? In-Reply-To: <20040701053355.M99698@grenoble.cnrs.fr> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> Message-ID: <40E470D9.8060603@noaa.gov> Hi all, I'm looking for a way to read data from ascii text files quickly. I've found that using the standard python idioms like: data = array((M,N),Float) for in range(N): data.append(map(float,file.readline().split())) Can be pretty slow. What I'd like is something like Matlab's fscanf: data = fscanf(file, "%g", [M,N] ) I may have the syntax a little wrong, but the gist is there. What Matlab does keep recycling the format string until the desired number of elements have been read. It is quite flexible, and ends up being pretty fast. 
Has anyone written something like this for Numeric (or numarray, but I'd prefer Numeric at this point) ? I was surprised not to find something like this in SciPy, maybe I didn't look hard enough. If no one has done this, I guess I'll get started on it.... -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Fernando.Perez at colorado.edu Thu Jul 1 13:28:01 2004 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Jul 1 13:28:01 2004 Subject: [Numpy-discussion] How to read data from text files fast? In-Reply-To: <40E470D9.8060603@noaa.gov> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <40E470D9.8060603@noaa.gov> Message-ID: <40E473A9.5040109@colorado.edu> Chris Barker wrote:
> Hi all,
>
> I'm looking for a way to read data from ascii text files quickly. I've
> found that using the standard python idioms like:
>
> data = []
> for i in range(N):
>     data.append(map(float, file.readline().split()))
> data = array(data, Float)
>
> Can be pretty slow. What I'd like is something like Matlab's fscanf:
>
> data = fscanf(file, "%g", [M,N] )
>
> I may have the syntax a little wrong, but the gist is there. What Matlab
> does is keep recycling the format string until the desired number of
> elements have been read.
>
> It is quite flexible, and ends up being pretty fast.
>
> Has anyone written something like this for Numeric (or numarray, but I'd
> prefer Numeric at this point) ?
>
> I was surprised not to find something like this in SciPy, maybe I didn't
> look hard enough.

scipy.io.read_array? I haven't timed it, because it's been 'fast enough' for my needs.
For reading binary data files, I have this little utility which is basically a wrapper around Numeric.fromstring (N below is Numeric imported 'as N'). Note that it can read binary .gz files directly, a _huge_ gain for very sparse files representing 3d arrays (I can read a 400k gz file which blows up to ~60MB when unzipped in no time at all, while reading the unzipped file is very slow):

def read_bin(fname,dims,typecode,recast_type=None,offset=0,verbose=0):
    """Read in a binary data file. Does NOT check for endianness issues.

    Inputs:
      fname - can be .gz
      dims (nx1,nx2,...,nxd)
      typecode
      recast_type
      offset=0: # of bytes to skip in file *from the beginning* before data starts
    """
    # config parameters
    item_size = N.zeros(1,typecode).itemsize()  # size in bytes
    data_size = N.product(N.array(dims))*item_size
    # read in data
    if fname.endswith('.gz'):
        data_file = gzip.open(fname)
    else:
        data_file = file(fname)
    data_file.seek(offset)
    data = N.fromstring(data_file.read(data_size),typecode)
    data_file.close()
    data.shape = dims
    if verbose:
        #print 'Read',data_size/item_size,'data points. Shape:',dims
        print 'Read',N.size(data),'data points. Shape:',dims
    if recast_type is not None:
        data = data.astype(recast_type)
    return data

HTH, f From squirrel at WPI.EDU Thu Jul 1 13:37:13 2004 From: squirrel at WPI.EDU (Christopher T King) Date: Thu Jul 1 13:37:13 2004 Subject: [Numpy-discussion] numarray and SMP Message-ID: (I originally posted this in comp.lang.python and was redirected here) In a quest to speed up numarray computations, I tried writing a 'threaded array' class for use on SMP systems that would distribute its workload across the processors. I hit a snag when I found out that since the Python interpreter is not reentrant, this effectively disables parallel processing in Python.
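The 'threaded array' idea needs Python-level glue that splits the work across worker threads; the C vector core would bracket its loop with Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS so the workers can actually overlap. A stdlib-only sketch of that glue follows — the function name is hypothetical and a plain list stands in for a numarray, so here the GIL makes the threads interleave rather than truly run in parallel:

```python
import threading

def threaded_apply(func, data, nthreads=4):
    # Split `data` into nthreads contiguous chunks and apply `func`
    # elementwise, one worker thread per chunk.  This only pays off
    # when func's C implementation releases the GIL; with a pure
    # Python func the threads merely take turns.
    n = len(data)
    chunk = (n + nthreads - 1) // nthreads
    out = [None] * n
    def worker(lo, hi):
        for i in range(lo, hi):
            out[i] = func(data[i])
    threads = []
    for t in range(nthreads):
        lo, hi = t * chunk, min((t + 1) * chunk, n)
        threads.append(threading.Thread(target=worker, args=(lo, hi)))
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return out
```

Creating the threads once and reusing them across iterations, as described below for the benchmark, avoids paying thread-startup cost per operation.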
I've come up with two solutions to this problem, both involving numarray's C functions that perform the actual vector operations: 1) Surround the C vector operations with Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS, thus allowing the vector operations (which don't access Python structures) to run in parallel with the interpreter. Python glue code would take care of threading and locking. 2) Move the parallelization into the C vector functions themselves. This would likely get poorer performance (a chain of vector operations couldn't be combined into one threaded operation). I'd much rather do #1, but will playing around with the interpreter state like that cause any problems? Update from original posting: I've partially implemented method #1 for Float64s. Running on four 2.4GHz Xeons (possibly two with hyperthreading?), I get about a 30% speedup while dividing 10 million Float64s, but a small (<10%) slowdown doing addition or multiplication. The operation was repeated 100 times, with the threads created outside of the loop (i.e. the threads weren't recreated for each iteration). Is there really that much overhead in Python? I can post the code I'm using and the numarray patch if it's requested. From gerard.vermeulen at grenoble.cnrs.fr Thu Jul 1 13:40:07 2004 From: gerard.vermeulen at grenoble.cnrs.fr (gerard.vermeulen at grenoble.cnrs.fr) Date: Thu Jul 1 13:40:07 2004 Subject: [Numpy-discussion] Numarray header PEP In-Reply-To: <40E46CD3.9090802@sympatico.ca> References: <1088451653.3744.200.camel@localhost.localdomain> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <200407010904.25498.haase@msg.ucsf.edu> <40E46CD3.9090802@sympatico.ca> Message-ID: <20040701200934.M74616@grenoble.cnrs.fr> On Thu, 01 Jul 2004 15:58:11 -0400, Colin J. 
Williams wrote > Sebastian Haase wrote: > > >On Wednesday 30 June 2004 11:33 pm, gerard.vermeulen at grenoble.cnrs.fr wrote: > > > > > >>On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote > >> > >> > >> > >>>So... you use the "meta" code to provide package specific ordinary > >>>(not-macro-fied) functions to keep the different versions of the > >>>Present() and isArray() macros from conflicting. > >>> > >>>It would be nice to have a standard approach for using the same > >>>"extension enhancement code" for both numarray and Numeric. The PEP > >>>should really be expanded to provide an example of dual support for one > >>>complete and real function, guts and all, so people can see the process > >>>end-to-end; Something like a simple arrayprint. That process needs > >>>to be refined to remove as much tedium and duplication of effort as > >>>possible. The idea is to make it as close to providing one > >>>implementation to support both array packages as possible. I think it's > >>>important to illustrate how to partition the extension module into > >>>separate compilation units which correctly navigate the dual > >>>implementation mine field in the easiest possible way. > >>> > >>>It would also be nice to add some logic to the meta-functions so that > >>>which array package gets used is configurable. We did something like > >>>that for the matplotlib plotting software at the Python level with > >>>the "numerix" layer, an idea I think we copied from Chaco. 
The kind > >>>of dispatch I think might be good to support configurability looks like > >>>this: > >>> > >>>PyObject * > >>>whatsThis(PyObject *dummy, PyObject *args) > >>>{ > >>> PyObject *result, *what = NULL; > >>> if (!PyArg_ParseTuple(args, "O", &what)) > >>> return 0; > >>> switch(PyArray_Which(what)) { > >>> USE_NUMERIC: > >>> result = Numeric_whatsThis(what); break; > >>> USE_NUMARRAY: > >>> result = Numarray_whatsThis(what); break; > >>> USE_SEQUENCE: > >>> result = Sequence_whatsThis(what); break; > >>> } > >>> Py_INCREF(Py_None); > >>> return Py_None; > >>>} > >>> > >>>In the above, I'm picturing a separate .c file for Numeric_whatsThis > >>>and for Numarray_whatsThis. It would be nice to streamline that to one > >>>.c and a process which somehow (simply) produces both functions. > >>> > >>>Or, ideally, the above would be done more like this: > >>> > >>>PyObject * > >>>whatsThis(PyObject *dummy, PyObject *args) > >>>{ > >>> PyObject *result, *what = NULL; > >>> if (!PyArg_ParseTuple(args, "O", &what)) > >>> return 0; > >>> switch(Numerix_Which(what)) { > >>> USE_NUMERIX: > >>> result = Numerix_whatsThis(what); break; > >>> USE_SEQUENCE: > >>> result = Sequence_whatsThis(what); break; > >>> } > >>> Py_INCREF(Py_None); > >>> return Py_None; > >>>} > >>> > >>>Here, a common Numerix implementation supports both numarray and Numeric > >>>from a single simple .c. The extension module would do "#include > >>>numerix/arrayobject.h" and "import_numerix()" and otherwise just call > >>>PyArray_* functions. > >>> > >>>The current stumbling block is that numarray is not binary compatible > >>>with Numeric... so numerix in C falls apart. I haven't analyzed > >>>every symbol and struct to see if it is really feasible... but it > >>>seems like it is *almost* feasible, at least for typical usage. 
> >>> > >>>So, in a nutshell, I think the dual implementation support you > >>>demoed is important and we should work up an example and kick it > >>>around to make sure it's the best way we can think of doing it. > >>>Then we should add a section to the PEP describing dual support as well. > >>> > >>> > >>I would never apply numarray code to Numeric arrays and the inverse. It > >>looks dangerous and I do not know if it is possible. The first thing > >>coming to mind is that numarray and Numeric arrays refer to different type > >>objects (this is what my pep module uses to differentiate them). So, even > >>if numarray and Numeric are binary compatible, any 'alien' code referring > >>the the 'Python-standard part' of the type objects may lead to surprises. A > >>PEP proposing hacks will raise eyebrows at least. > >> > >>Secondly, most people use Numeric *or* numarray and not both. > >> > >>So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out > >>(NINO) Of course, Numeric or numarray output can be a user option if NINO > >>does not apply. (explicit safe conversion between Numeric and numarray is > >>possible if really needed). > >> > >>I'll try to flesh out the demo with real functions in the way you indicated > >>(going as far as I consider safe). > >> > >>The problem of coding the Numeric (or numarray) functions in more than > >>a single source file has also be addressed. > >> > >>It may take 2 weeks because I am off to a conference next week. > >> > >>Regards -- Gerard > >> > >> > > > >Hi all, > >first, I would like to state that I don't understand much of this discussion; > >so the only comment I wanted to make is that IF this where possible, to make > >(C/C++) code that can live with both Numeric and numarray, then I think it > >would be used more and more - think: transition phase !! (e.g. 
someone could > >start making the FFTW part of scipy numarray friendly without having to > >switch everything at one [hint ;-)] ) > > > >These where just my 2 cents. > >Cheers, > >Sebastian Haase > > > > > I feel lower on the understanding tree with respect to what is being > proposed in the draft PEP, but would still like to offer my 2 cents > worth. I get the feeling that numarray is being bent out of shape > to fit Numeric. > What we are discussing are methods to make it possible to import Numeric and numarray in the same extension module. This can be done by separating the colliding APIs of Numeric and numarray in separate *.c files. To achieve this, no changes to Numeric and numarray itself are necessary. In fact, this can be done by the author of the C-extension himself, but since it is not obvious we discuss the best methods and we like to provide the necessary glue code. It will make life easier for extension writers and facilitate the transition to numarray. Try to look at the problem from the other side: I am using Numeric (since my life depends on SciPy) but have written an extension that can also import numarray (hoping to get more users). I will never use the methods proposed in the draft PEP, because it excludes importing Numeric. > > It was my understanding that Numeric had certain weakness which made > it unacceptable as a Python component and that numarray was intended > to provide the same or better functionality within a pythonic framework. > > numarray has not achieved the expected performance level to date, > but progress is being made and I believe that, for larger arrays, > numarray has been shown to be be superior to Numeric - please > correct me if I'm wrong here. > I think you are correct. I don't know why the __init__ has disappeared, but I don't think it is because of the PEP and certainly not because of the thread. 
> > The shock came for me when Todd Miller said: > > <> > I looked at this some, and while INCREFing __dict__ maybe the right > idea, I forgot that there *is no* Python NumArray.__init__ anymore. > > Wasn't it the intent of numarray to work towards the full use of the > Python class structure to provide the benefits which it offers? > > The Python class has two constructors and one destructor. > > The constructors are __init__ and __new__, the latter only provides > the shell of an instance which later has to be initialized. In > version 0.9, which I use, there is no __new__, but there is a new > function which has a functionality similar to that intended for > __new__. Thus, with this change, numarray appears to be moving > further away from being pythonic. > Gerard From jmiller at stsci.edu Thu Jul 1 13:46:07 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 1 13:46:07 2004 Subject: [Numpy-discussion] Numarray header PEP In-Reply-To: <40E46CD3.9090802@sympatico.ca> References: <1088451653.3744.200.camel@localhost.localdomain> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <200407010904.25498.haase@msg.ucsf.edu> <40E46CD3.9090802@sympatico.ca> Message-ID: <1088714723.14402.114.camel@halloween.stsci.edu> On Thu, 2004-07-01 at 15:58, Colin J. Williams wrote: > Sebastian Haase wrote: > > >On Wednesday 30 June 2004 11:33 pm, gerard.vermeulen at grenoble.cnrs.fr wrote: > > > > > >>On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote > >> > >> > >> > >>>So... you use the "meta" code to provide package specific ordinary > >>>(not-macro-fied) functions to keep the different versions of the > >>>Present() and isArray() macros from conflicting. > >>> > >>>It would be nice to have a standard approach for using the same > >>>"extension enhancement code" for both numarray and Numeric. 
The PEP
> >>>should really be expanded to provide an example of dual support for one
> >>>complete and real function, guts and all, so people can see the process
> >>>end-to-end; something like a simple arrayprint. That process needs
> >>>to be refined to remove as much tedium and duplication of effort as
> >>>possible. The idea is to make it as close to providing one
> >>>implementation to support both array packages as possible. I think it's
> >>>important to illustrate how to partition the extension module into
> >>>separate compilation units which correctly navigate the dual
> >>>implementation minefield in the easiest possible way.
> >>>
> >>>It would also be nice to add some logic to the meta-functions so that
> >>>which array package gets used is configurable. We did something like
> >>>that for the matplotlib plotting software at the Python level with
> >>>the "numerix" layer, an idea I think we copied from Chaco. The kind
> >>>of dispatch I think might be good to support configurability looks like
> >>>this:
> >>>
> >>>PyObject *
> >>>whatsThis(PyObject *dummy, PyObject *args)
> >>>{
> >>>    PyObject *result, *what = NULL;
> >>>    if (!PyArg_ParseTuple(args, "O", &what))
> >>>        return 0;
> >>>    switch(PyArray_Which(what)) {
> >>>    case USE_NUMERIC:
> >>>        result = Numeric_whatsThis(what); break;
> >>>    case USE_NUMARRAY:
> >>>        result = Numarray_whatsThis(what); break;
> >>>    case USE_SEQUENCE:
> >>>        result = Sequence_whatsThis(what); break;
> >>>    default:
> >>>        Py_INCREF(Py_None);
> >>>        result = Py_None;
> >>>    }
> >>>    return result;
> >>>}
> >>>
> >>>In the above, I'm picturing a separate .c file for Numeric_whatsThis
> >>>and for Numarray_whatsThis. It would be nice to streamline that to one
> >>>.c and a process which somehow (simply) produces both functions.
> >>>
> >>>Or, ideally, the above would be done more like this:
> >>>
> >>>PyObject *
> >>>whatsThis(PyObject *dummy, PyObject *args)
> >>>{
> >>>    PyObject *result, *what = NULL;
> >>>    if (!PyArg_ParseTuple(args, "O", &what))
> >>>        return 0;
> >>>    switch(Numerix_Which(what)) {
> >>>    case USE_NUMERIX:
> >>>        result = Numerix_whatsThis(what); break;
> >>>    case USE_SEQUENCE:
> >>>        result = Sequence_whatsThis(what); break;
> >>>    default:
> >>>        Py_INCREF(Py_None);
> >>>        result = Py_None;
> >>>    }
> >>>    return result;
> >>>}
> >>>
> >>>Here, a common Numerix implementation supports both numarray and Numeric
> >>>from a single simple .c. The extension module would do "#include
> >>>numerix/arrayobject.h" and "import_numerix()" and otherwise just call
> >>>PyArray_* functions.
> >>>
> >>>The current stumbling block is that numarray is not binary compatible
> >>>with Numeric... so numerix in C falls apart. I haven't analyzed
> >>>every symbol and struct to see if it is really feasible... but it
> >>>seems like it is *almost* feasible, at least for typical usage.
> >>>
> >>>So, in a nutshell, I think the dual implementation support you
> >>>demoed is important and we should work up an example and kick it
> >>>around to make sure it's the best way we can think of doing it.
> >>>Then we should add a section to the PEP describing dual support as well.
> >>
> >>I would never apply numarray code to Numeric arrays or vice versa. It
> >>looks dangerous and I do not know if it is possible. The first thing
> >>coming to mind is that numarray and Numeric arrays refer to different type
> >>objects (this is what my pep module uses to differentiate them). So, even
> >>if numarray and Numeric are binary compatible, any 'alien' code referring
> >>to the 'Python-standard part' of the type objects may lead to surprises. A
> >>PEP proposing hacks will raise eyebrows at least.
> >>
> >>Secondly, most people use Numeric *or* numarray and not both.
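At the Python level, the configurable dispatch sketched above can be modeled the way matplotlib's "numerix" layer did it. The sketch below is hypothetical: the predicates, handler names, and backend stand-ins (bytearray/memoryview standing in for Numeric/numarray arrays) are invented for illustration and are not part of either package.

```python
# Hypothetical "numerix"-style dispatch layer (not actual numarray/Numeric
# API): route a call to whichever backend recognizes the input, falling
# back to generic sequence handling, like PyArray_Which() in the C sketch.

def make_dispatcher(backends, fallback):
    """backends: list of (recognizes, handler) pairs, tried in order."""
    def dispatch(obj):
        for recognizes, handler in backends:
            if recognizes(obj):
                return handler(obj)
        return fallback(obj)
    return dispatch

# Stand-ins for Numeric_whatsThis / Numarray_whatsThis / Sequence_whatsThis:
whats_this = make_dispatcher(
    backends=[
        (lambda o: isinstance(o, bytearray), lambda o: "numeric-like"),
        (lambda o: isinstance(o, memoryview), lambda o: "numarray-like"),
    ],
    fallback=lambda o: "sequence",
)

assert whats_this(bytearray(b"x")) == "numeric-like"   # backend 1 handles it
assert whats_this(memoryview(b"x")) == "numarray-like" # backend 2 handles it
assert whats_this([1, 2, 3]) == "sequence"             # generic fallback
```

Because the handler is chosen from the input's type, this shape also gives Gerard's NINO behavior for free: Numeric in => Numeric out.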
> >>
> >>So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out
> >>(NINO). Of course, Numeric or numarray output can be a user option if NINO
> >>does not apply. (Explicit safe conversion between Numeric and numarray is
> >>possible if really needed.)
> >>
> >>I'll try to flesh out the demo with real functions in the way you indicated
> >>(going as far as I consider safe).
> >>
> >>The problem of coding the Numeric (or numarray) functions in more than
> >>a single source file has also been addressed.
> >>
> >>It may take 2 weeks because I am off to a conference next week.
> >>
> >>Regards -- Gerard
> >>
> >
> >Hi all,
> >first, I would like to state that I don't understand much of this discussion;
> >so the only comment I wanted to make is that IF this were possible, to make
> >(C/C++) code that can live with both Numeric and numarray, then I think it
> >would be used more and more - think: transition phase !! (e.g. someone could
> >start making the FFTW part of scipy numarray friendly without having to
> >switch everything at once [hint ;-)] )
> >
> >These were just my 2 cents.
> >Cheers,
> >Sebastian Haase
> >
>
> I feel lower on the understanding tree with respect to what is being
> proposed in the draft PEP, but would still like to offer my 2 cents
> worth. I get the feeling that numarray is being bent out of shape to
> fit Numeric.

Yes and no. The numarray team has over time realized the importance of backward compatibility with the dominant array package, Numeric. A lot of people use Numeric now. We're trying to make it as easy as possible to use numarray.

> It was my understanding that Numeric had certain weaknesses which made it
> unacceptable as a Python component and that numarray was intended to
> provide the same or better functionality within a pythonic framework.

My understanding is that until there is a consensus on an array package, neither numarray nor Numeric is going into the Python core.
> numarray has not achieved the expected performance level to date, but
> progress is being made and I believe that, for larger arrays, numarray
> has been shown to be superior to Numeric - please correct me if I'm
> wrong here.

I think that's a fair summary.

> The shock came for me when Todd Miller said:
>
> <>
> I looked at this some, and while INCREFing __dict__ may be the right
> idea, I forgot that there *is no* Python NumArray.__init__ anymore.
>
> Wasn't it the intent of numarray to work towards the full use of the
> Python class structure to provide the benefits which it offers?

Ack. I wasn't trying to start a panic. The __init__ still exists, as does __new__; they're just in C. Sorry if I was unclear.

> The Python class has two constructors and one destructor.

We're mostly on the same page.

> The constructors are __init__ and __new__, the latter only provides the
> shell of an instance which later has to be initialized. In version 0.9,
> which I use, there is no __new__,

It's there, but it's not very useful:

>>> import numarray
>>> numarray.NumArray.__new__
>>> a = numarray.NumArray.__new__(numarray.NumArray)
>>> a.info()
class:
shape: ()
strides: ()
byteoffset: 0
bytestride: 0
itemsize: 0
aligned: 1
contiguous: 1
data: None
byteorder: little
byteswap: 0
type: Any

I don't, however, recommend doing this.

> but there is a new function which has
> functionality similar to that intended for __new__. Thus, with this
> change, numarray appears to be moving further away from being pythonic.

Nope. I'm talking about moving toward better speed with no change in functionality at the Python level. I also think maybe we've gotten list threads crossed here: the "Numarray header PEP" thread is independent of (but admittedly related to) the "Speeding up wxPython/numarray" thread. The Numarray header PEP is about making it easy for packages to write C extensions which *optionally* support numarray (and now Numeric as well).
One aspect of the PEP is getting headers included in the Python core so that extensions can be compiled even when the numarray is not installed. The other aspect will be illustrating a good technique for supporting both numarray and Numeric, optionally and with choice, at the same time. Such an extension would still run where there is numarray, Numeric, both, or none installed. Gerard V. has already done some integration of numarray and Numeric with PyQwt so he has a few good ideas on how to do the "good technique" aspect of the PEP. The Speeding up wxPython/numarray thread is about improving the performance of a 50000 point wxPython drawlines which is 10x slower with numarray than Numeric. Tim H. and Chris B. have nailed this down (mostly) to the numarray sequence protocol and destructor, __del__. Regards, Todd From perry at stsci.edu Thu Jul 1 13:57:02 2004 From: perry at stsci.edu (Perry Greenfield) Date: Thu Jul 1 13:57:02 2004 Subject: [Numpy-discussion] Numarray header PEP In-Reply-To: <40E46CD3.9090802@sympatico.ca> Message-ID: Collin J. Williams Wrote: > I feel lower on the understanding tree with respect to what is being > proposed in the draft PEP, but would still like to offer my 2 cents > worth. I get the feeling that numarray is being bent out of shape to > fit Numeric. > Todd and Gerard address this point well. > It was my understanding that Numeric had certain weakness which made it > unacceptable as a Python component and that numarray was intended to > provide the same or better functionality within a pythonic framework. > Let me reiterate what our motivations were. We wanted to use an array package for our software, and Numeric had enough shortcomings that we needed some changes in behavior (e.g., type coercion for scalars), changes in performance (particularly with regard to memory usage), and enhancements in capabilities (e.g., memory mapping, record arrays, etc.). 
It was the opinion of some (Paul Dubois, for example) that a rewrite was in order in any case since the code was not that maintainable (not everyone felt this way, though at the time that wasn't as clear). At the same time there was some hope that Numeric could be accepted into the standard Python distribution. That's something we thought would be good (but wasn't the highest priority for us), and I've come to believe that perhaps a better solution in that regard is what this PEP is trying to address. In any case, Guido made it clear that he would not accept Numeric in its (then) current form.

That it be written mostly in Python was something suggested by Guido, and we started off that way, mainly because it would get us going much faster than writing it all in C. We definitely understood that it would also have the consequence of making small array performance worse. We said as much when we started; it wasn't as clear then as it is now that many users objected to a factor of a few slower performance (as it turned out, a mostly Python based implementation was more than an order of magnitude slower for small arrays).

> numarray has not achieved the expected performance level to date, but
> progress is being made and I believe that, for larger arrays, numarray
> has been shown to be superior to Numeric - please correct me if I'm
> wrong here.

We never expected numarray to ever reach the performance level for small arrays that Numeric has. If it were within a factor of two I would be thrilled (it's more like a factor of 3 or 4 currently for simple ufuncs). I still don't think it ever will be as fast for small arrays. The focus all along was on handling large arrays, which I think it does quite well, both with regard to memory and speed. Yes, there are some functions and operations that may be much slower. Mainly they need to be called out so they can be improved. Generally we only notice performance issues that affect our software.
Others need to point out remaining large discrepancies. I'm still of the opinion that if small array performance is really important, a very different approach, with a completely different implementation, should be used. I would think that improvements of an order of magnitude over what Numeric does now are possible. But since that isn't important to us (STScI), don't expect us to work on that :-)

> The shock came for me when Todd Miller said:
>
> <>
> I looked at this some, and while INCREFing __dict__ may be the right
> idea, I forgot that there *is no* Python NumArray.__init__ anymore.
>
> Wasn't it the intent of numarray to work towards the full use of the
> Python class structure to provide the benefits which it offers?
>
> The Python class has two constructors and one destructor.
>
> The constructors are __init__ and __new__, the latter only provides the
> shell of an instance which later has to be initialized. In version 0.9,
> which I use, there is no __new__, but there is a new function which has
> functionality similar to that intended for __new__. Thus, with this
> change, numarray appears to be moving further away from being pythonic.

I'll agree that optimization is driving the underlying implementation to one that is more complex, and that is the drawback (no surprise there). There's Pythonic in use and Pythonic in implementation. We are certainly receptive to better ideas for the implementation, but I doubt that a heavily Python-based implementation is ever going to be competitive for small arrays (unless something like psyco becomes universal, but I think there are a whole mess of problems to be solved for that kind of approach to work well generically).
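The `__new__`/`__init__` two-step debated in this exchange is plain Python protocol, not a numarray invention. A minimal sketch with an ordinary class (the `Box` class is hypothetical, and the code is modern Python rather than the 2.2-era syntax of the thread):

```python
class Box:
    """Hypothetical class showing the two-step construction protocol."""
    def __new__(cls, *args):
        # __new__ allocates the bare instance -- the "shell" described above.
        self = object.__new__(cls)
        self.initialized = False
        return self

    def __init__(self, value):
        # __init__ then fills the shell in.
        self.value = value
        self.initialized = True

b = Box(42)                  # normal construction runs __new__ then __init__
assert b.initialized and b.value == 42

shell = Box.__new__(Box)     # like numarray.NumArray.__new__(numarray.NumArray):
assert not shell.initialized # an allocated but uninitialized shell
```

This mirrors Todd's interpreter session: calling `__new__` directly yields a shell with empty/default state, and it works the same whether the two methods are written in Python or in C.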
Perry From perry at stsci.edu Thu Jul 1 15:01:04 2004 From: perry at stsci.edu (Perry Greenfield) Date: Thu Jul 1 15:01:04 2004 Subject: [Numpy-discussion] numarray and SMP In-Reply-To: Message-ID: Christopher T King wrote: > > (I originally posted this in comp.lang.python and was redirected here) > > In a quest to speed up numarray computations, I tried writing a 'threaded > array' class for use on SMP systems that would distribute its workload > across the processors. I hit a snag when I found out that since > the Python > interpreter is not reentrant, this effectively disables parallel > processing in Python. I've come up with two solutions to this problem, > both involving numarray's C functions that perform the actual vector > operations: > > 1) Surround the C vector operations with Py_BEGIN_ALLOW_THREADS and > Py_END_ALLOW_THREADS, thus allowing the vector operations (which don't > access Python structures) to run in parallel with the interpreter. > Python glue code would take care of threading and locking. > > 2) Move the parallelization into the C vector functions themselves. This > would likely get poorer performance (a chain of vector operations > couldn't be combined into one threaded operation). > > I'd much rather do #1, but will playing around with the interpreter state > like that cause any problems? > I don't think so, but it raises a number of questions that I ask just below. > Update from original posting: > > I've partially implemented method #1 for Float64s. Running on four 2.4GHz > Xeons (possibly two with hyperthreading?), I get about a 30% speedup while > dividing 10 million Float64s, but a small (<10%) slowdown doing addition > or multiplication. The operation was repeated 100 times, with the threads > created outside of the loop (i.e. the threads weren't recreated for each > iteration). Is there really that much overhead in Python? I can post the > code I'm using and the numarray patch if it's requested. 
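Chris's method #1 — one Python thread per chunk of the array, with the heavy loop ideally running while the GIL is released — can be sketched in pure Python. Note this shows only the chunking and thread bookkeeping: with a plain Python loop in the worker no real parallelism is gained, because the GIL is only released inside C code bracketed by Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS. The function and its chunking scheme are illustrative, not Chris's actual patch.

```python
import threading

def chunked_divide(a, b, nthreads=4):
    """Divide sequence a by sequence b elementwise, splitting the work
    into nthreads chunks, each handled by its own thread."""
    n = len(a)
    out = [None] * n

    def worker(lo, hi):
        # In the real scheme this loop would be a C vector op that
        # releases the GIL; here it is plain Python for illustration.
        for i in range(lo, hi):
            out[i] = a[i] / b[i]

    step = (n + nthreads - 1) // nthreads
    threads = [threading.Thread(target=worker, args=(lo, min(lo + step, n)))
               for lo in range(0, n, step)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out

a = [float(i) for i in range(1, 101)]
b = [2.0] * 100
assert chunked_divide(a, b) == [x / 2.0 for x in a]
```

Creating the threads once outside a timing loop, as Chris describes, avoids repaying thread start-up cost on every iteration.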
> Questions and comments: 1) I suppose you did this for generated ufunc code? (ideally one would put this in the codegenerator stuff but for the purposes of testing it would be fine). I guess we would like to see how you actually changed the code fragment (you can email me or Todd Miller directly if you wish) 2) How much improvement you would see depends on many details. But if you were doing this for 10 million element arrays, I'm surprised you saw such a small improvement (30% for 4 processors isn't worth the trouble it would seem). So seeing the actual test code would be helpful. If the array operation you are doing for numarray aren't simple (that's a specialized use of the word; by that I mean if the arrays are not the same type, aren't contiguous, aren't aligned, or aren't of proper byte-order) then there are a number of other issues that may slow it down quite a bit (and there are ways of improving these for parallel processing). 3) I don't speak as an expert on threading or parallel processors, but I believe so long as you don't call any Python API functions (either directly or indirectly) between the global interpreter lock release and reacquisition, you should be fine. The vector ufunc code in numarray should satisfy this fine. Perry Greenfield From squirrel at WPI.EDU Fri Jul 2 06:37:20 2004 From: squirrel at WPI.EDU (Christopher T King) Date: Fri Jul 2 06:37:20 2004 Subject: [Numpy-discussion] numarray and SMP In-Reply-To: Message-ID: On Thu, 1 Jul 2004, Perry Greenfield wrote: > 1) I suppose you did this for generated ufunc code? (ideally one > would put this in the codegenerator stuff but for the purposes > of testing it would be fine). I guess we would like to see > how you actually changed the code fragment (you can email > me or Todd Miller directly if you wish) Yep, I didn't know it was automatically generated :P > 2) How much improvement you would see depends on many details. 
> But if you were doing this for 10 million element arrays, I'm > surprised you saw such a small improvement (30% for 4 processors > isn't worth the trouble it would seem). So seeing the actual > test code would be helpful. If the array operation you are doing > for numarray aren't simple (that's a specialized use of the word; > by that I mean if the arrays are not the same type, aren't > contiguous, aren't aligned, or aren't of proper byte-order) > then there are a number of other issues that may slow it down > quite a bit (and there are ways of improving these for > parallel processing). I've been careful not to use anything to cause discontiguities in the arrays, and to keep them all the same type (Float64 in this case). See my next post for the code I'm using. From haase at msg.ucsf.edu Fri Jul 2 08:28:01 2004 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri Jul 2 08:28:01 2004 Subject: [Numpy-discussion] bug in numarray.maximum.reduce ? In-Reply-To: <200406291705.55454.haase@msg.ucsf.edu> References: <200406291705.55454.haase@msg.ucsf.edu> Message-ID: <200407020827.05407.haase@msg.ucsf.edu> On Tuesday 29 June 2004 05:05 pm, Sebastian Haase wrote: > Hi, > > Is this a bug?: > >>> # (import numarray as na ; 'd' is a 3 dimensional array) > >>> d.type() > > Float32 > > >>> d[80, 136, 122] > > 80.3997039795 > > >>> na.maximum.reduce(d[:,136, 122]) > > 85.8426361084 > > >>> na.maximum.reduce(d) [136, 122] > > 37.3658103943 > > >>> na.maximum.reduce(d,0)[136, 122] > > 37.3658103943 > > >>> na.maximum.reduce(d,1)[136, 122] > > Traceback (most recent call last): > File "", line 1, in ? > IndexError: Index out of range > > I was using na.maximum.reduce(d) to get a "pixelwise" maximum along Z > (axis 0). But as seen above it does not get it right. 
I then tried to > reproduce > > this with some simple arrays, but here it works just fine: > >>> a = na.arange(4*4*4) > >>> a.shape=(4,4,4) > >>> na.maximum.reduce(a) > > [[48 49 50 51] > [52 53 54 55] > [56 57 58 59] > [60 61 62 63]] > > >>> a = na.arange(4*4*4).astype(na.Float32) > >>> a.shape=(4,4,4) > >>> na.maximum.reduce(a) > > [[ 48. 49. 50. 51.] > [ 52. 53. 54. 55.] > [ 56. 57. 58. 59.] > [ 60. 61. 62. 63.]] > > > Any hint ? > > Regards, > Sebastian Haase Hi again, I think the reason that no one responded to this is that it just sounds to unbelievable ... Sorry for the missing piece of information, but 'd' is actually a memmapped array ! >>> d.info() class: shape: (80, 150, 150) strides: (90000, 600, 4) byteoffset: 0 bytestride: 4 itemsize: 4 aligned: 1 contiguous: 1 data: byteorder: big byteswap: 1 type: Float32 >>> dd = d.copy() >>> na.maximum.reduce(dd[:,136, 122]) 85.8426361084 >>> na.maximum.reduce(dd)[136, 122] 85.8426361084 >>> Apparently we are using memmap so frequently now that I didn't even think about that - which is good news for everyone, because it means that it works (mostly). I just see that 'byteorder' is 'big' - I'm running this on an Intel Linux PC. Could this be the problem? Please some comments ! Thanks, Sebastian From jmiller at stsci.edu Fri Jul 2 09:03:08 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jul 2 09:03:08 2004 Subject: [Numpy-discussion] bug in numarray.maximum.reduce ? 
In-Reply-To: <200407020827.05407.haase@msg.ucsf.edu> References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> Message-ID: <1088784157.26482.14.camel@halloween.stsci.edu> On Fri, 2004-07-02 at 11:27, Sebastian Haase wrote: > On Tuesday 29 June 2004 05:05 pm, Sebastian Haase wrote: > > Hi, > > > > Is this a bug?: > > >>> # (import numarray as na ; 'd' is a 3 dimensional array) > > >>> d.type() > > > > Float32 > > > > >>> d[80, 136, 122] > > > > 80.3997039795 > > > > >>> na.maximum.reduce(d[:,136, 122]) > > > > 85.8426361084 > > > > >>> na.maximum.reduce(d) [136, 122] > > > > 37.3658103943 > > > > >>> na.maximum.reduce(d,0)[136, 122] > > > > 37.3658103943 > > > > >>> na.maximum.reduce(d,1)[136, 122] > > > > Traceback (most recent call last): > > File "", line 1, in ? > > IndexError: Index out of range > > > > I was using na.maximum.reduce(d) to get a "pixelwise" maximum along Z > > (axis 0). But as seen above it does not get it right. I then tried to > > reproduce > > > > this with some simple arrays, but here it works just fine: > > >>> a = na.arange(4*4*4) > > >>> a.shape=(4,4,4) > > >>> na.maximum.reduce(a) > > > > [[48 49 50 51] > > [52 53 54 55] > > [56 57 58 59] > > [60 61 62 63]] > > > > >>> a = na.arange(4*4*4).astype(na.Float32) > > >>> a.shape=(4,4,4) > > >>> na.maximum.reduce(a) > > > > [[ 48. 49. 50. 51.] > > [ 52. 53. 54. 55.] > > [ 56. 57. 58. 59.] > > [ 60. 61. 62. 63.]] > > > > > > Any hint ? > > > > Regards, > > Sebastian Haase > > Hi again, > I think the reason that no one responded to this is that it just sounds to > unbelievable ... This just slipped through the cracks for me. > Sorry for the missing piece of information, but 'd' is actually a memmapped > array ! 
> >>> d.info() > class: > shape: (80, 150, 150) > strides: (90000, 600, 4) > byteoffset: 0 > bytestride: 4 > itemsize: 4 > aligned: 1 > contiguous: 1 > data: > byteorder: big > byteswap: 1 > type: Float32 > >>> dd = d.copy() > >>> na.maximum.reduce(dd[:,136, 122]) > 85.8426361084 > >>> na.maximum.reduce(dd)[136, 122] > 85.8426361084 > >>> > > Apparently we are using memmap so frequently now that I didn't even think > about that - which is good news for everyone, because it means that it works > (mostly). > > I just see that 'byteorder' is 'big' - I'm running this on an Intel Linux PC. > Could this be the problem? I think byteorder is a good guess at this point. What version of Python and numarray are you using? Regards, Todd From haase at msg.ucsf.edu Fri Jul 2 10:46:01 2004 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri Jul 2 10:46:01 2004 Subject: [Numpy-discussion] bug in numarray.maximum.reduce ? In-Reply-To: <1088784157.26482.14.camel@halloween.stsci.edu> References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> <1088784157.26482.14.camel@halloween.stsci.edu> Message-ID: <200407021045.00866.haase@msg.ucsf.edu> On Friday 02 July 2004 09:02 am, Todd Miller wrote: > On Fri, 2004-07-02 at 11:27, Sebastian Haase wrote: > > On Tuesday 29 June 2004 05:05 pm, Sebastian Haase wrote: > > > Hi, > > > > > > Is this a bug?: > > > >>> # (import numarray as na ; 'd' is a 3 dimensional array) > > > >>> d.type() > > > > > > Float32 > > > > > > >>> d[80, 136, 122] > > > > > > 80.3997039795 > > > > > > >>> na.maximum.reduce(d[:,136, 122]) > > > > > > 85.8426361084 > > > > > > >>> na.maximum.reduce(d) [136, 122] > > > > > > 37.3658103943 > > > > > > >>> na.maximum.reduce(d,0)[136, 122] > > > > > > 37.3658103943 > > > > > > >>> na.maximum.reduce(d,1)[136, 122] > > > > > > Traceback (most recent call last): > > > File "", line 1, in ? 
> > > IndexError: Index out of range > > > > > > I was using na.maximum.reduce(d) to get a "pixelwise" maximum along Z > > > (axis 0). But as seen above it does not get it right. I then tried to > > > reproduce > > > > > > this with some simple arrays, but here it works just fine: > > > >>> a = na.arange(4*4*4) > > > >>> a.shape=(4,4,4) > > > >>> na.maximum.reduce(a) > > > > > > [[48 49 50 51] > > > [52 53 54 55] > > > [56 57 58 59] > > > [60 61 62 63]] > > > > > > >>> a = na.arange(4*4*4).astype(na.Float32) > > > >>> a.shape=(4,4,4) > > > >>> na.maximum.reduce(a) > > > > > > [[ 48. 49. 50. 51.] > > > [ 52. 53. 54. 55.] > > > [ 56. 57. 58. 59.] > > > [ 60. 61. 62. 63.]] > > > > > > > > > Any hint ? > > > > > > Regards, > > > Sebastian Haase > > > > Hi again, > > I think the reason that no one responded to this is that it just sounds > > to unbelievable ... > > This just slipped through the cracks for me. > > > Sorry for the missing piece of information, but 'd' is actually a > > memmapped array ! > > > > >>> d.info() > > > > class: > > shape: (80, 150, 150) > > strides: (90000, 600, 4) > > byteoffset: 0 > > bytestride: 4 > > itemsize: 4 > > aligned: 1 > > contiguous: 1 > > data: > > byteorder: big > > byteswap: 1 > > type: Float32 > > > > >>> dd = d.copy() > > >>> na.maximum.reduce(dd[:,136, 122]) > > > > 85.8426361084 > > > > >>> na.maximum.reduce(dd)[136, 122] > > > > 85.8426361084 > > > > > > Apparently we are using memmap so frequently now that I didn't even think > > about that - which is good news for everyone, because it means that it > > works (mostly). > > > > I just see that 'byteorder' is 'big' - I'm running this on an Intel Linux > > PC. Could this be the problem? > > I think byteorder is a good guess at this point. What version of Python > and numarray are you using? Python 2.2.1 (#1, Feb 28 2004, 00:52:10) [GCC 2.95.4 20011002 (Debian prerelease)] on linux2 numarray 0.9 - from CVS on 2004-05-13. 
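Todd's byteorder guess is easy to illustrate with the struct module: the same eight bytes decoded big-endian vs. little-endian give two different, plausible-looking Float64 values, which is exactly the failure mode of a missed byteswap on a little-endian Intel box. This is a sketch of the symptom only, not the actual numarray code path.

```python
import struct

value = 85.8426361084           # the true maximum from the memmapped array
big = struct.pack(">d", value)  # stored big-endian on disk

# Correct read: interpret the bytes as big-endian.
assert abs(struct.unpack(">d", big)[0] - value) < 1e-9

# Buggy read: a missed byteswap interprets them as native little-endian,
# yielding a wrong but not obviously insane number.
wrong = struct.unpack("<d", big)[0]
assert wrong != value
```

If one code path (the reduction over a memmapped, byteswapped array) skips the swap while another (reduction over a copy) performs it, the two results diverge just as in the report above.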
Regards, Sebastian Haase From jmiller at stsci.edu Fri Jul 2 12:34:09 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jul 2 12:34:09 2004 Subject: [Numpy-discussion] bug in numarray.maximum.reduce ? In-Reply-To: <200407021045.00866.haase@msg.ucsf.edu> References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> <1088784157.26482.14.camel@halloween.stsci.edu> <200407021045.00866.haase@msg.ucsf.edu> Message-ID: <1088796821.5974.15.camel@halloween.stsci.edu> On Fri, 2004-07-02 at 13:45, Sebastian Haase wrote: > On Friday 02 July 2004 09:02 am, Todd Miller wrote: > > On Fri, 2004-07-02 at 11:27, Sebastian Haase wrote: > > > On Tuesday 29 June 2004 05:05 pm, Sebastian Haase wrote: > > > > Hi, > > > > > > > > Is this a bug?: > > > > >>> # (import numarray as na ; 'd' is a 3 dimensional array) > > > > >>> d.type() > > > > > > > > Float32 > > > > > > > > >>> d[80, 136, 122] > > > > > > > > 80.3997039795 > > > > > > > > >>> na.maximum.reduce(d[:,136, 122]) > > > > > > > > 85.8426361084 > > > > > > > > >>> na.maximum.reduce(d) [136, 122] > > > > > > > > 37.3658103943 > > > > > > > > >>> na.maximum.reduce(d,0)[136, 122] > > > > > > > > 37.3658103943 > > > > > > > > >>> na.maximum.reduce(d,1)[136, 122] > > > > > > > > Traceback (most recent call last): > > > > File "", line 1, in ? > > > > IndexError: Index out of range > > > > > > > > I was using na.maximum.reduce(d) to get a "pixelwise" maximum along Z > > > > (axis 0). But as seen above it does not get it right. I then tried to > > > > reproduce > > > > > > > > this with some simple arrays, but here it works just fine: > > > > >>> a = na.arange(4*4*4) > > > > >>> a.shape=(4,4,4) > > > > >>> na.maximum.reduce(a) > > > > > > > > [[48 49 50 51] > > > > [52 53 54 55] > > > > [56 57 58 59] > > > > [60 61 62 63]] > > > > > > > > >>> a = na.arange(4*4*4).astype(na.Float32) > > > > >>> a.shape=(4,4,4) > > > > >>> na.maximum.reduce(a) > > > > > > > > [[ 48. 49. 50. 51.] > > > > [ 52. 53. 54. 
55.] > > > > [ 56. 57. 58. 59.] > > > > [ 60. 61. 62. 63.]] > > > > > > > > > > > > Any hint ? > > > > > > > > Regards, > > > > Sebastian Haase > > > > > > Hi again, > > > I think the reason that no one responded to this is that it just sounds > > > to unbelievable ... > > > > This just slipped through the cracks for me. > > > > > Sorry for the missing piece of information, but 'd' is actually a > > > memmapped array ! > > > > > > >>> d.info() > > > > > > class: > > > shape: (80, 150, 150) > > > strides: (90000, 600, 4) > > > byteoffset: 0 > > > bytestride: 4 > > > itemsize: 4 > > > aligned: 1 > > > contiguous: 1 > > > data: > > > byteorder: big > > > byteswap: 1 > > > type: Float32 > > > > > > >>> dd = d.copy() > > > >>> na.maximum.reduce(dd[:,136, 122]) > > > > > > 85.8426361084 > > > > > > >>> na.maximum.reduce(dd)[136, 122] > > > > > > 85.8426361084 > > > > > > > > > Apparently we are using memmap so frequently now that I didn't even think > > > about that - which is good news for everyone, because it means that it > > > works (mostly). > > > > > > I just see that 'byteorder' is 'big' - I'm running this on an Intel Linux > > > PC. Could this be the problem? > > > > I think byteorder is a good guess at this point. What version of Python > > and numarray are you using? > > Python 2.2.1 (#1, Feb 28 2004, 00:52:10) > [GCC 2.95.4 20011002 (Debian prerelease)] on linux2 > > numarray 0.9 - from CVS on 2004-05-13. > > Regards, > Sebastian Haase Hi Sebastian, I logged this on SF as a bug but won't get to it until next week after numarray-1.0 comes out. Regards, Todd From jmiller at stsci.edu Fri Jul 2 14:06:13 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jul 2 14:06:13 2004 Subject: [Numpy-discussion] ANN: numarray-1.0 released Message-ID: <1088802348.5974.28.camel@halloween.stsci.edu> Release Notes for numarray-1.0 Numarray is an array processing package designed to efficiently manipulate large multi-dimensional arrays. 
Numarray is modeled after Numeric and features c-code generated from python template scripts, the capacity to operate directly on arrays in files, and improved type promotions.

I. ENHANCEMENTS

1. User added ufuncs

There's a setup.py file in numarray-1.0/Examples/ufunc which demonstrates how a numarray user can define their own universal functions of one or two parameters. Ever wanted to write your own bessel() function for use on arrays? Now you can. Your ufunc can use exactly the same machinery as add().

2. Ports of Numeric functions

A bunch of Numeric functions were ported to numarray in the new libnumeric module. To get these, import from numarray.numeric. Most notable among these are put, putmask, take, argmin, and argmax. Also added were sort, argsort, concatenate, repeat, and resize. These are independent ports/implementations in C done for the purpose of best Numeric compatibility and small array performance. The numarray versions, which handle additional cases, still exist and are the default in numarray proper.

3. Faster matrix multiply

The setup for numarray's matrix multiply was moved into C-code. This makes it faster for small matrices.

4. The numarray "header PEP"

A PEP has been started for the inclusion of numarray (and possibly Numeric) C headers into the Python core. The PEP will demonstrate how to provide optional support for arrays (the end-user may or may not have numarray installed and the extension will still work). It may also (eventually) demonstrate how to build extensions which support both numarray and Numeric. Thus, the PEP is seeking to make it possible to distribute extensions which will still compile when numarray (or either) is not present in a user's Python installation, which will work when numarray (or either) is not installed, and which will improve performance when either is installed. The PEP is now in numarray-1.0/Doc/header_pep.txt in docutils format.
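At the Python level, the optional-support idea reads like the pattern below: the module imports and runs whether or not numarray is installed. The `asarray_or_list` helper and its fallback behavior are hypothetical, invented for illustration.

```python
# Optional backend import: the module loads whether or not numarray exists.
try:
    import numarray as backend      # preferred fast path when installed
    HAVE_NUMARRAY = True
except ImportError:
    backend = None
    HAVE_NUMARRAY = False

def asarray_or_list(seq):
    """Use the array backend when present, else degrade gracefully
    (hypothetical helper, for illustration only)."""
    if HAVE_NUMARRAY:
        return backend.array(seq)
    return list(seq)

result = asarray_or_list([1, 2, 3])
assert HAVE_NUMARRAY or result == [1, 2, 3]
```

The header PEP aims to make the analogous C-level pattern possible: the extension compiles against bundled headers and probes for the package at import time.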
We want feedback and consensus before we submit to python-dev, so please consider reading it and commenting. For the PEP, the C-API has been partitioned into two parts: a relatively simple Numeric compatible part and the numarray native part. This broke source and binary compatibility with numarray-0.9. See CAUTIONS below for more information.

5. Changes to the manual

There are now brief sections on numarray.mlab and numarray.objects in the manual. The discussion of the C-API has been updated.

II. CAUTIONS

1. The numarray-1.0 C-API is neither completely source level nor binary compatible with numarray-0.9. First, this means that some 3rd party extensions will no longer compile without errors. Second, this means that binary packages built against numarray-0.9 will fail, probably disastrously, using numarray-1.0. Don't install numarray-1.0 until you are ready to recompile or replace your extensions with numarray-1.0 binaries, because 0.9 binaries will not work.

In order to support the header PEP, the numarray C-API was partitioned into two parts: Numeric compatible and numarray extensions. You can use the Numeric compatible API (the PyArray_* functions) by including arrayobject.h and calling import_array() in your module init function. You can use the extended API (the NA_* functions) by including libnumarray.h and calling import_libnumarray() in your init function. Because of the partitioning, all numarray extensions must be recompiled to work with 1.0. Extensions using *both* APIs must include both files in order to compile, and must do both imports in order to run. Both APIs share a common PyArrayObject struct.

2. numarray extension writers should note that the documented use of PyArray_INCREF and PyArray_XDECREF (in numarray) was found to be incompatible with Numeric; these functions have therefore been removed from the supported API and will now result in errors.

3. The numarray.objects.ObjectArray parameter order was changed.

4.
The undocumented API function PyArray_DescrFromTypeObj was removed from the Numeric compatible API because it is not provided by Numeric. III. BUGS FIXED / CLOSED See http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse for more details. 979834 convolve2d parameter order issues 979775 ObjectArray parameter order 979712 No exception for invalid axis 979702 too many slices fails silently 979123 A[n:n] = x no longer works 979028 matrixmultiply precision 976951 Unpickled numarray types unsable? 977472 CharArray concatenate 970356 bug in accumulate contiguity status 969162 object array bug/ambiguity 963921 bitwise_not over Bool type fails 963706 _reduce_out: problem with otype 942804 numarray C-API include file 932438 suggest moving mlab up a level 932436 mlab docs missing 857628 numarray allclose returns int 839401 Argmax's behavior has changed for ties 817348 a+=1j # Not converted to complex 957089 PyArray_FromObject dim check broken 923046 numarray.objects incompatibility 897854 Type conflict when embedding on OS X 793421 PyArray_INCREF / PyArray_XDECREF deprecated 735479 Build failure on Cygwin 1.3.22 (very current install). 870660 Numarray: CFLAGS build problem 874198 numarray.random_array.random() broken? 874207 not-so random numbers in numarray.random_array 829662 Downcast from Float64 to UInt8 anomaly 867073 numarray diagonal bug? 806705 a tale of two rank-0's 863155 Zero size numarray breaks for loop 922157 argmax returns integer in some cases 934514 suggest nelements -> size 953294 choose bug 955314 strings.num2char bug? 
955336 searchsorted has strange behaviour 955409 MaskedArray problems 953567 Add read-write requirement to NA_InputArray 952705 records striding for > 1D arrays 944690 many numarray array methods not documented 915015 numarray/Numeric incompatabilities 949358 UsesOpPriority unexpected behavior 944678 incorrect help for "size" func/method 888430 NA_NewArray() creates array with wrong endianess 922798 The document Announce.txt is out of date 947080 numarray.image.median bugs 922796 Manual has some dated MA info 931384 What does True mean in a mask? 931379 numeric.ma called MA in manual 933842 Bool arrays don't allow bool assignment 935588 problem parsing argument "nbyte" in callStrideConvCFunc() 936162 problem parsing "nbytes" argument in copyToString() 937680 Error in Lib/numerictypes.py ? 936539 array([cmplx_array, int_array]) fails 936541 a[...,1] += 0 crashes interpreter. 940826 Ufunct operator don't work 935882 take for character arrays? 933783 numarray, _ufuncmodule.c: problem setting buffersize 930014 fromstring typecode param still broken 929841 searchsorted type coercion 924841 numarray.objects rank-0 results 925253 numarray.objects __str__ and __repr__ 913782 Minor error in chapter 12: NUM_ or not? 
889591 wrong header file for C extensions 925073 API manual comments 924854 take() errors 925754 arange() with large argument crashes interpreter 926246 ufunc reduction crash 902153 can't compile under RH9/gcc 3.2.2 916876 searchsorted/histogram broken in versions 0.8 and 0.9 920470 numarray arange() problem 915736 numarray-0.9: Doc/CHANGES not up to date WHERE ----------- Numarray-1.0 windows executable installers, source code, and manual are here: http://sourceforge.net/project/showfiles.php?group_id=1369 Numarray is hosted by Source Forge in the same project which hosts Numeric: http://sourceforge.net/projects/numpy/ The web page for Numarray information is at: http://stsdas.stsci.edu/numarray/index.html Trackers for Numarray Bugs, Feature Requests, Support, and Patches are at the Source Forge project for NumPy at: http://sourceforge.net/tracker/?group_id=1369 REQUIREMENTS ------------------------------ numarray-1.0 requires Python 2.2.2 or greater. AUTHORS, LICENSE ------------------------------ Numarray was written by Perry Greenfield, Rick White, Todd Miller, JC Hsu, Paul Barrett, Phil Hodge at the Space Telescope Science Institute. We'd like to acknowledge the assistance of Francesc Alted, Paul Dubois, Sebastian Haase, Tim Hochberg, Nadav Horesh, Edward C. Jones, Eric Jones, Jochen Küpper, Travis Oliphant, Pearu Peterson, Peter Verveer, Colin Williams, and everyone else who has contributed with comments, bug reports, or patches. Numarray is made available under a BSD-style License. See LICENSE.txt in the source distribution for details. -- Todd Miller jmiller at stsci.edu From paustin at eos.ubc.ca Sat Jul 3 10:11:03 2004 From: paustin at eos.ubc.ca (Philip Austin) Date: Sat Jul 3 10:11:03 2004 Subject: [Numpy-discussion] Bug in numarray.typecode()?
In-Reply-To: <1088796821.5974.15.camel@halloween.stsci.edu> References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> <1088784157.26482.14.camel@halloween.stsci.edu> <200407021045.00866.haase@msg.ucsf.edu> <1088796821.5974.15.camel@halloween.stsci.edu> Message-ID: <16614.59532.288486.645869@gull.eos.ubc.ca> I'm in the process of switching to numarray, but I still need typecode(). I notice that, although it's discouraged, the typecode ids have been extended to all new numarray types described in table 4.1 (p. 19) of the manual, except UInt64. That is, the following script: import numarray as Na print "Numarray version: ",Na.__version__ print Na.array([1],'Int8').typecode() print Na.array([1],'UInt8').typecode() print Na.array([1],'Int16').typecode() print Na.array([1],'UInt16').typecode() print Na.array([1],'Int32').typecode() print Na.array([1],'UInt32').typecode() print Na.array([1],'Float32').typecode() print Na.array([1],'Float64').typecode() print Na.array([1],'Complex32').typecode() print Na.array([1],'Complex64').typecode() print Na.array([1],'Bool').typecode() print Na.array([1],'UInt64').typecode() prints: Numarray version: 1.0 1 b s w l u f d F D 1 Traceback (most recent call last): File "<stdin>", line 14, in ? File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 1092, in typecode return _nt.typecode[self._type] KeyError: UInt64 Should this print 'U'? Regards, Phil Austin From curzio.basso at unibas.ch Tue Jul 6 02:42:06 2004 From: curzio.basso at unibas.ch (Curzio Basso) Date: Tue Jul 6 02:42:06 2004 Subject: [Numpy-discussion] inconsistencies between docs and C headers? Message-ID: <40EA73C9.7070604@unibas.ch> Hi all, can someone explain to me why in the docs functions like NA_NewArray() return a PyObject*, while in the headers they return a PyArrayObject*? Is it just the documentation which is slow to catch up with the development? Or am I missing something?
thanks, curzio From jmiller at stsci.edu Tue Jul 6 06:35:11 2004 From: jmiller at stsci.edu (Todd Miller) Date: Tue Jul 6 06:35:11 2004 Subject: [Numpy-discussion] Bug in numarray.typecode()? In-Reply-To: <16614.59532.288486.645869@gull.eos.ubc.ca> References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> <1088784157.26482.14.camel@halloween.stsci.edu> <200407021045.00866.haase@msg.ucsf.edu> <1088796821.5974.15.camel@halloween.stsci.edu> <16614.59532.288486.645869@gull.eos.ubc.ca> Message-ID: <1089120859.25460.3.camel@halloween.stsci.edu> On Sat, 2004-07-03 at 13:10, Philip Austin wrote: > I'm in the process of switching to numarray, but I still > need typecode(). I notice that, although it's discouraged, > the typecode ids have been extended to all new numarray > types described in table 4.1 (p. 19) of the manual, except UInt64. > That is, the following script: > > import numarray as Na > print "Numarray version: ",Na.__version__ > print Na.array([1],'Int8').typecode() > print Na.array([1],'UInt8').typecode() > print Na.array([1],'Int16').typecode() > print Na.array([1],'UInt16').typecode() > print Na.array([1],'Int32').typecode() > print Na.array([1],'UInt32').typecode() > print Na.array([1],'Float32').typecode() > print Na.array([1],'Float64').typecode() > print Na.array([1],'Complex32').typecode() > print Na.array([1],'Complex64').typecode() > print Na.array([1],'Bool').typecode() > print Na.array([1],'UInt64').typecode() > > prints: > > Numarray version: 1.0 > 1 > b > s > w > l > u > f > d > F > D > 1 > Traceback (most recent call last): > File "", line 14, in ? > File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 1092, in typecode > return _nt.typecode[self._type] > KeyError: UInt64 > > Should this print 'U'? I think it could, but I wouldn't go so far as to say it should. typecode() is there for backward compatibility with Numeric. 
Since 'U' doesn't work for Numeric, I see no point in adding it to numarray. I'm not sure it would hurt anything other than create the illusion that something which works on numarray will also work on Numeric. If anyone has a good reason to add it, please speak up. Regards, Todd From jmiller at stsci.edu Tue Jul 6 06:58:09 2004 From: jmiller at stsci.edu (Todd Miller) Date: Tue Jul 6 06:58:09 2004 Subject: [Numpy-discussion] inconsistencies between docs and C headers? In-Reply-To: <40EA73C9.7070604@unibas.ch> References: <40EA73C9.7070604@unibas.ch> Message-ID: <1089122261.25460.41.camel@halloween.stsci.edu> On Tue, 2004-07-06 at 05:41, Curzio Basso wrote: > Hi all, > can someone explain me why in the docs functions like NA_NewArray() > return a PyObject*, while in the headers they return a PyArrayObject*? > Is it just the documentation which is slow to catch up with the > development? Yes, it's a bona fide inconsistency. It's not great, but it's fairly harmless since a PyArrayObject is a PyObject. From paustin at eos.ubc.ca Tue Jul 6 09:31:05 2004 From: paustin at eos.ubc.ca (Philip Austin) Date: Tue Jul 6 09:31:05 2004 Subject: [Numpy-discussion] Bug in numarray.typecode()? In-Reply-To: <1089120859.25460.3.camel@halloween.stsci.edu> References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> <1088784157.26482.14.camel@halloween.stsci.edu> <200407021045.00866.haase@msg.ucsf.edu> <1088796821.5974.15.camel@halloween.stsci.edu> <16614.59532.288486.645869@gull.eos.ubc.ca> <1089120859.25460.3.camel@halloween.stsci.edu> Message-ID: <16618.54200.934079.44467@gull.eos.ubc.ca> Todd Miller writes: > > > > Should this print 'U'? > > I think it could, but I wouldn't go so far as to say it should. > typecode() is there for backward compatibility with Numeric. Since 'U' > doesn't work for Numeric, I see no point in adding it to numarray. 
I'm > not sure it would hurt anything other than create the illusion that > something which works on numarray will also work on Numeric. > > If anyone has a good reason to add it, please speak up. > I don't necessarily need typecode, but I couldn't find the inverse of a = array([10], type = 'UInt8') (p. 19) in the manual. That is, I need a method that returns the string representation of a numarray type in a single call (as opposed to the two-step repr(array.type()). This is for code that uses the Boost C++ bindings to numarray. These bindings work via callbacks to python (which eliminates the need to link to the numarray or numeric api). Currently I use typecode() to get an index into a map of types when I need to check that the type of a passed argument is correct: void check_type(boost::python::numeric::array arr, string expected_type){ string actual_type = arr.typecode(); if (actual_type != expected_type) { std::ostringstream stream; stream << "expected Numeric type " << kindstrings[expected_type] << ", found Numeric type " << kindstrings[actual_type] << std::ends; PyErr_SetString(PyExc_TypeError, stream.str().c_str()); throw_error_already_set(); } return; } Unless I'm missing something, without typecode I need a second interpreter call to repr, or I need to import numarray and load all the types into storage for a type object comparison. It's not a showstopper, but since I check every argument in every call, I'd like to avoid this unless absolutely necessary. Regards, Phil From jmiller at stsci.edu Tue Jul 6 11:40:08 2004 From: jmiller at stsci.edu (Todd Miller) Date: Tue Jul 6 11:40:08 2004 Subject: [Numpy-discussion] Missing header_pep.txt Message-ID: <1089139173.26741.2.camel@halloween.stsci.edu> Somehow header_pep.txt didn't make it into the numarray-1.0 source tar-ball. It's now in CVS and also attached. Regards, Todd -------------- next part -------------- An embedded message was scrubbed... 
From: unknown sender Subject: no subject Date: no date Size: 38 URL: From jmiller at stsci.edu Tue Jul 6 10:15:27 2004 From: jmiller at stsci.edu (Todd Miller) Date: 06 Jul 2004 10:15:27 -0400 Subject: ANN: numarray-1.0 released In-Reply-To: <40C2E65B0000343B@cpfe4.be.tisc.dk> References: <40C2E65B0000343B@cpfe4.be.tisc.dk> Message-ID: <1089123327.25460.57.camel@halloween.stsci.edu> On Tue, 2004-07-06 at 02:59, jjm at tiscali.dk wrote: > > The PEP is now in > > numarray-1.0/Doc/header_pep.txt in docutils format. We want feedback > > and consensus before we submit to python-dev so please consider > > reading it and commenting. > > I can't find header_pep.txt! It is not in numarray-1.0.tar.gz. Oops, you're right. I attached it. Apparently I forgot to add it to CVS. Todd -------------- next part -------------- PEP: XXX Title: numerical array headers Version: $Revision: 1.3 $ Last-Modified: $Date: 2002/08/30 04:11:20 $ Author: Todd Miller , Perry Greenfield Discussions-To: numpy-discussion at lists.sf.net Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 02-Jun-2004 Python-Version: 2.4 Post-History: 30-Aug-2002 Abstract ======== We propose the inclusion of three numarray header files within the CPython distribution to facilitate use of numarray array objects as an optional data format for 3rd party modules. The PEP illustrates a simple technique by which a 3rd party extension may support numarrays as input or output values if numarray is installed, and yet the 3rd party extension does not require numarray to be installed to be built. Nothing needs to be changed in the setup.py or makefile for installing with or without numarray, and a subsequent installation of numarray will allow its use without rebuilding the 3rd party extension. Specification ============= This PEP applies only to the CPython platform and only to numarray.
Analogous PEPs could be written for Jython and Python.NET and Numeric, but what is discussed here is a speed optimization that is tightly coupled to CPython and numarray. Three header files to support the numarray C-API should be included in the CPython distribution within a numarray subdirectory of the Python include directory: * numarray/arraybase.h * numarray/libnumeric.h * numarray/arrayobject.h The files are shown prefixed with "numarray" to leave the door open for doing similar PEPs with other packages, such as Numeric. If a plethora of such header contributions is anticipated, a further refinement would be to locate the headers under something like "third_party/numarray". In order to provide enhanced performance for array objects, an extension writer would start by including the numarray C-API in addition to any other Python headers: :: #include "numarray/arrayobject.h" Not shown in this PEP are the API calls which operate on numarrays. These are documented in the numarray manual. What is shown here are two calls which are guaranteed to be safe even when numarray is not installed: * PyArray_Present() * PyArray_isArray() In an extension function that wants to access the numarray API, a test needs to be performed to determine if the API functions are safely callable: :: PyObject * some_array_returning_function(PyObject *m, PyObject *args) { int param; PyObject *result; if (!PyArg_ParseTuple(args, "i", &param)) return NULL; if (PyArray_Present()) { result = numarray_returning_function(param); } else { result = list_returning_function(param); } return result; } Within **numarray_returning_function**, a subset of the numarray C-API (the Numeric compatible API) is available for use so it is possible to create and return numarrays. Within **list_returning_function**, only the standard Python C-API can be used because numarray is assumed to be unavailable in that particular Python installation.
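The same conditional-support idea the PEP expresses in C can be sketched at the Python level (an illustrative sketch, not code from the PEP; the module flag and make_range function are invented for the example): try to import numarray at load time, record whether it worked, and branch on that flag, just as PyArray_Present() branches on the API pointer.

```python
# Python-level analogue of the PEP's optional-support pattern
# (an illustrative sketch, not code from the PEP itself).
try:
    import numarray
    HAVE_NUMARRAY = True          # plays the role of PyArray_Present()
except ImportError:
    numarray = None
    HAVE_NUMARRAY = False

def make_range(n):
    """Return 0..n-1 as a numarray if available, else as a plain list."""
    if HAVE_NUMARRAY:
        return numarray.arange(n)   # fast path: numarray installed
    return list(range(n))           # fallback: standard Python only
```

Callers see the same interface either way; only the performance (and the concrete return type) changes with the installation.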
In an extension function that wants to accept numarrays as inputs and provide improved performance over the Python sequence protocol, an additional convenience function exists which diverts arrays to specialized code when numarray is present and the input is an array: :: PyObject * some_array_accepting_function(PyObject *m, PyObject *args) { PyObject *sequence, *result; if (!PyArg_ParseTuple(args, "O", &sequence)) return NULL; if (PyArray_isArray(sequence)) { result = numarray_input_function(sequence); } else { result = sequence_input_function(sequence); } return result; } During module initialization, a numarray enhanced extension must call **import_array()**, a macro which imports numarray and assigns a value to a static API pointer: PyArray_API. Since the API pointer starts with the value NULL and remains so if the numarray import fails, the API pointer serves as a flag that indicates that numarray was successfully imported whenever it is non-NULL. :: static void initfoo(void) { PyObject *m = Py_InitModule3( "foo", _foo_functions, _foo__doc__); if (m == NULL) return; import_array(); } **PyArray_Present()** indicates that numarray was successfully imported. It is defined in terms of the API function pointer as: :: #define PyArray_Present() (PyArray_API != NULL) **PyArray_isArray(s)** indicates that numarray was successfully imported and the given parameter is a numarray instance. It is defined as: :: #define PyArray_isArray(s) (PyArray_Present() && PyArray_Check(s)) Motivation ========== The use of numeric arrays as an interchange format is eminently sensible for many kinds of modules. For example, image, graphics, and audio modules all can accept or generate large amounts of numerical data that could easily use the numarray format.
But since numarray is not part of the standard distribution, some authors of 3rd party extensions may be reluctant to add a dependency on a different 3rd party extension that isn't absolutely essential for its use, for fear of dissuading users who may be put off by extra installation requirements. Yet, not allowing easy interchange with numarray introduces annoyances that need not be present. Normally, in the absence of an explicit ability to generate or use numarray objects, one must write conversion utilities to convert from the data representation used to that for numarray. This typically involves excess copying of data (usually from internal to string to numarray). In cases where the 3rd party uses buffer objects, the data may not need copying at all. Either many users may have to develop their own conversion routines or numarray will have to include adapters for many other 3rd party packages. Since numarray is used by many projects, it makes more sense to put the conversion logic on the other side of the fence. There is a clear need for a mechanism that allows 3rd party software to use numarray objects if it is available without requiring numarray's presence to build and install properly. Rationale ========= One solution is to make numarray part of the standard distribution. That may be a good long-term solution, but at the moment, the numeric community is in a transition period between the Numeric and numarray packages which may take years to complete. It is not likely that numarray will be considered for adoption until the transition is complete. Numarray is also a large package, and there is legitimate concern about its inclusion as regards the long-term commitment to support. We can solve that problem by making a few include files part of the Python Standard Distribution and demonstrating how extension writers can write code that uses numarray conditionally.
The API submitted in this PEP is the subset of the numarray API which is most source compatible with Numeric. The headers consist of two handwritten files (arraybase.h and arrayobject.h) and one generated file (libnumeric.h). arraybase.h contains typedefs and enumerations which are important to both the API presented here and to the larger numarray specific API. arrayobject.h glues together arraybase and libnumeric and is needed for Numeric compatibility. libnumeric.h consists of macros generated from a template and a list of function prototypes. The macros themselves are somewhat intricate in order to provide the compile time checking effect of function prototypes. Further, the interface takes two forms: one form is used to compile numarray and defines static function prototypes. The other form is used to compile extensions which use the API and defines macros which execute function calls through pointers which are found in a table located using a single public API pointer. These macros also test the value of the API pointer in order to deliver a fatal error should a developer forget to initialize by calling import_array(). The interface chosen here is the subset of numarray most useful for porting existing Numeric code or creating new extensions which can be compiled for either numarray or Numeric. There are a number of other numarray API functions which are omitted here for the sake of simplicity. By choosing to support only the Numeric compatible subset of the numarray C-API, concerns about interface stability are minimized because the Numeric API is well established. However, it should be made clear that the numarray API subset proposed here is source compatible, not binary compatible, with Numeric. 
Resources ========= * numarray/arraybase.h (http://cvs.sourceforge.net/viewcvs.py/numpy/numarray/Include/numarray/arraybase.h) * numarray/libnumeric.h (http://cvs.sourceforge.net/viewcvs.py/numpy/numarray/Include/numarray/libnumeric.h) * numarray/arrayobject.h (http://cvs.sourceforge.net/viewcvs.py/numpy/numarray/Include/numarray/arrayobject.h) * numarray-1.0 manual PDF * numarray-1.0 source distribution * numarray website at STSCI (http://www.stsci.edu/resources/software_hardware/numarray) * example numarray enhanced extension References ========== .. [1] PEP 1, PEP Purpose and Guidelines, Warsaw, Hylton (http://www.python.org/peps/pep-0001.html) .. [2] PEP 9, Sample Plaintext PEP Template, Warsaw (http://www.python.org/peps/pep-0009.html) Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: From paustin at eos.ubc.ca Tue Jul 6 16:09:02 2004 From: paustin at eos.ubc.ca (Philip Austin) Date: Tue Jul 6 16:09:02 2004 Subject: [Numpy-discussion] non-intuitive behaviour for isbyteswapped()? In-Reply-To: <16614.59532.288486.645869@gull.eos.ubc.ca> References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> <1088784157.26482.14.camel@halloween.stsci.edu> <200407021045.00866.haase@msg.ucsf.edu> <1088796821.5974.15.camel@halloween.stsci.edu> <16614.59532.288486.645869@gull.eos.ubc.ca> Message-ID: <16619.12490.596884.579782@gull.eos.ubc.ca> With numarray 1.0 and Mandrake 10 i686 I get the following: >>> y=N.array([1,1,2,1],type="Float64") >>> y array([ 1., 1., 2., 1.]) >>> y.byteswap() >>> y array([ 3.03865194e-319, 3.03865194e-319, 3.16202013e-322, 3.03865194e-319]) >>> y.isbyteswapped() 0 Should this be 1? 
Thanks, Phil From paustin at eos.ubc.ca Tue Jul 6 18:43:49 2004 From: paustin at eos.ubc.ca (Philip Austin) Date: Tue Jul 6 18:43:49 2004 Subject: [Numpy-discussion] optional arguments to the array constructor Message-ID: <16619.21771.686179.152410@gull.eos.ubc.ca> (for numpy v1.0 on Mandrake 10 i686) As noted on p. 25 the array constructor takes up to 5 optional arguments array(sequence=None, type=None, shape=None, copy=1, savespace=0, typecode=None) (and raises an exception if both type and typecode are set). Is there any way to make an alias (copy=0) of an array without passing keyword values? That is, specifying the copy keyword alone works: test=N.array((1., 3), "Float64", shape=(2,), copy=1, savespace=0) a=N.array(test, copy=0) a[1]=999 print test >>> [ 1. 999.] But when intervening keywords are specified copy won't toggle: test=N.array((1., 3)) a=N.array(sequence=test, type="Float64", shape=(2,), copy=0) a[1]=999. print test >>> [ 1. 3.] Which is also the behaviour I see when I drop the keywords: test=N.array((1., 3)) a=N.array(test, "Float64", (2,), 0) a[1]=999. print test >>> [ 1. 3.] an additional puzzle is that adding the savespace parameter raises the following exception: >>> a=N.array(test, "Float64", (2,), 0,0) Traceback (most recent call last): File "<stdin>", line 1, in ? File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 312, in array type = getTypeObject(sequence, type, typecode) File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 256, in getTypeObject rtype = _typeFromTypeAndTypecode(type, typecode) File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 243, in _typeFromTypeAndTypecode raise ValueError("Can't define both 'type' and 'typecode' for an array.") ValueError: Can't define both 'type' and 'typecode' for an array.
Thanks for any insights -- Phil From jmiller at stsci.edu Wed Jul 7 07:58:05 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Jul 7 07:58:05 2004 Subject: [Numpy-discussion] non-intuitive behaviour for isbyteswapped()?
In-Reply-To: <16619.12490.596884.579782@gull.eos.ubc.ca> References: <200406291705.55454.haase@msg.ucsf.edu> <200407020827.05407.haase@msg.ucsf.edu> <1088784157.26482.14.camel@halloween.stsci.edu> <200407021045.00866.haase@msg.ucsf.edu> <1088796821.5974.15.camel@halloween.stsci.edu> <16614.59532.288486.645869@gull.eos.ubc.ca> <16619.12490.596884.579782@gull.eos.ubc.ca> Message-ID: <1089212251.29456.212.camel@halloween.stsci.edu> On Tue, 2004-07-06 at 19:07, Philip Austin wrote: > With numarray 1.0 and Mandrake 10 i686 I get the following: > > >>> y=N.array([1,1,2,1],type="Float64") > >>> y > array([ 1., 1., 2., 1.]) > >>> y.byteswap() > >>> y > array([ 3.03865194e-319, 3.03865194e-319, 3.16202013e-322, > 3.03865194e-319]) > >>> y.isbyteswapped() > 0 > > Should this be 1? The behavior of byteswap() has been controversial in the past, at one time implementing exactly the behavior I think you expected. Without giving any guarantee for the future, here's how things work now: byteswap() just swaps the bytes. There's a related method, togglebyteorder(), which inverts the sense of the byteorder: >>> y.byteswap() >>> y.togglebyteorder() >>> y.isbyteswapped() 1 The ability to munge bytes and change the sense of byteorder independently is definitely needed... but you're certainly not the first one to ask this question. 
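The distinction Todd draws here (byteswap() rearranges the bytes; togglebyteorder() updates the recorded byte order) can be demonstrated without numarray at all, using only the standard struct module. This is a standalone sketch; the byteswap8 helper is invented for the illustration and is not a numarray function:

```python
import struct

def byteswap8(buf):
    """Reverse the byte order of each 8-byte (Float64) element in buf."""
    return b''.join(buf[i:i + 8][::-1] for i in range(0, len(buf), 8))

little = struct.pack('<4d', 1.0, 1.0, 2.0, 1.0)  # little-endian Float64s
swapped = byteswap8(little)

# Swapping alone (what y.byteswap() does) leaves the data meaningless if
# you keep reading it with the old byte order -- hence the tiny denormal
# values Phil saw...
garbage = struct.unpack('<4d', swapped)
assert garbage != (1.0, 1.0, 2.0, 1.0)
# ...while reading with the *other* byte order (the sense change that
# togglebyteorder() records) recovers the original data.
assert struct.unpack('>4d', swapped) == (1.0, 1.0, 2.0, 1.0)
```

Doing both steps together corresponds to the Numeric-compatible byteswapped() method described below, except that byteswapped() returns a copy.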
There is also (Numeric compatible) byteswapped(), which both swaps and changes sense, but it creates a copy rather than operating in place: >>> x = y.byteswapped() >>> (x is not y) and (x._data is not y._data) 1 Regards, Todd From jmiller at stsci.edu Wed Jul 7 08:13:05 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Jul 7 08:13:05 2004 Subject: [Numpy-discussion] optional arguments to the array constructor In-Reply-To: <16619.21771.686179.152410@gull.eos.ubc.ca> References: <16619.21771.686179.152410@gull.eos.ubc.ca> Message-ID: <1089213153.29456.229.camel@halloween.stsci.edu> On Tue, 2004-07-06 at 21:42, Philip Austin wrote: > (for numpy v1.0 on Mandrake 10 i686) My guess is you're talking about numarray here. Please be charitable if I'm talking out of turn... I tend to see everything as a numarray issue. > As noted on p. 25 the array constructor takes up to 5 optional arguments > > array(sequence=None, type=None, shape=None, copy=1, savespace=0,typecode=None) > (and raises an exception if both type and typecode are set). > > Is there any way to make an alias (copy=0) of an array without passing > keyword values? In numarray, all you have to do to get an alias is: >>> b = a.view() It's an alias because: >>> b._data is a._data True > That is, specifying the copy keyword alone works: > > test=N.array((1., 3), "Float64", shape=(2,), copy=1, savespace=0) > a=N.array(test, copy=0) > a[1]=999 > print test > > >>> [ 1. 999.] > > But when intervening keywords are specified copy won't toggle: > > test=N.array((1., 3)) > a=N.array(sequence=test, type="Float64", shape=(2,), copy=0) > a[1]=999. > print test > >>> [ 1. 3.] > > Which is also the behaviour I see when I drop the keywords: > > test=N.array((1., 3)) > a=N.array(test, "Float64", (2,), 0) > a[1]=999. > print test > >>> [ 1. 3.] 
> > an additional puzzle is that adding the savespace parameter raises > the following exception: > > > >>> a=N.array(test, "Float64", (2,), 0,0) > Traceback (most recent call last): > File "", line 1, in ? > File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 312, in array > type = getTypeObject(sequence, type, typecode) > File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 256, in getTypeObject > rtype = _typeFromTypeAndTypecode(type, typecode) > File "/usr/lib/python2.3/site-packages/numarray/numarraycore.py", line 243, in _typeFromTypeAndTypecode > raise ValueError("Can't define both 'type' and 'typecode' for an array.") > ValueError: Can't define both 'type' and 'typecode' for an array. All this looks like a documentation problem. The numarray array() signature has been tortured by Numeric backward compatibility, so there has been more flux in it than you would expect. Anyway, the manual is out of date. Here's the current signature from the code: def array(sequence=None, typecode=None, copy=1, savespace=0, type=None, shape=None): Sorry about the confusion, Todd From paustin at eos.ubc.ca Wed Jul 7 11:26:11 2004 From: paustin at eos.ubc.ca (Philip Austin) Date: Wed Jul 7 11:26:11 2004 Subject: [Numpy-discussion] optional arguments to the array constructor In-Reply-To: <1089213153.29456.229.camel@halloween.stsci.edu> References: <16619.21771.686179.152410@gull.eos.ubc.ca> <1089213153.29456.229.camel@halloween.stsci.edu> Message-ID: <16620.16395.603789.28730@gull.eos.ubc.ca> Todd Miller writes: > On Tue, 2004-07-06 at 21:42, Philip Austin wrote: > > (for numpy v1.0 on Mandrake 10 i686) > > My guess is you're talking about numarray here. Please be charitable if > I'm talking out of turn... I tend to see everything as a numarray > issue. Right -- I'm still working through the boost test suite for numarray, which is failing a couple of tests that passed (around numarray v0.3). > All this looks like a documentation problem. 
The numarray array() > signature has been tortured by Numeric backward compatibility, so there > has been more flux in it than you would expect. Anyway, the manual is > out of date. Here's the current signature from the code: > > def array(sequence=None, typecode=None, copy=1, savespace=0, > type=None, shape=None): > Actually, it seems to be a difference in the way that Numeric and numarray treat the copy flag when typecode is specified. In Numeric, if no change in type is requested and copy=0, then the constructor goes ahead and produces a view: import Numeric as nc test=nc.array([1,2,3],'i') a=nc.array(test,'i',0) a[0]=99 print test >> [99 2 3] but makes a copy if a cast is required: test=nc.array([1,2,3],'i') a=nc.array(test,'F',0) a[0]=99 print test >>> [1 2 3] Looking at numarraycore.py line 305 I see that: if type is None and typecode is None: if copy: a = sequence.copy() else: a = sequence i.e. numarray skips the check for a type match and ignores the copy flag, even if the type is preserved: import numarray as ny test=ny.array([1,2,3],'i') a=ny.array(test,'i',0) a._data is test._data >>> False It looks like there might have been a comment about this in the docstring, but it got clipped at some point?: array() constructs a NumArray by calling NumArray, one of its factory functions (fromstring, fromfile, fromlist), or by making a copy of an existing array.
If copy=0, array() will create a new array only if sequence specifies the contents or storage for the array Thanks, Phil From jmiller at stsci.edu Wed Jul 7 12:47:02 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Jul 7 12:47:02 2004 Subject: [Numpy-discussion] optional arguments to the array constructor In-Reply-To: <16620.16395.603789.28730@gull.eos.ubc.ca> References: <16619.21771.686179.152410@gull.eos.ubc.ca> <1089213153.29456.229.camel@halloween.stsci.edu> <16620.16395.603789.28730@gull.eos.ubc.ca> Message-ID: <1089229573.29456.544.camel@halloween.stsci.edu> On Wed, 2004-07-07 at 14:25, Philip Austin wrote: > Todd Miller writes: > > On Tue, 2004-07-06 at 21:42, Philip Austin wrote: > > > (for numpy v1.0 on Mandrake 10 i686) > > > > My guess is you're talking about numarray here. Please be charitable if > > I'm talking out of turn... I tend to see everything as a numarray > > issue. > > Right -- I'm still working through the boost test suite for numarray, which is > failing a couple of tests that passed (around numarray v0.3). > > > All this looks like a documentation problem. The numarray array() > > signature has been tortured by Numeric backward compatibility, so there > > has been more flux in it than you would expect. Anyway, the manual is > > out of date. Here's the current signature from the code: > > > > def array(sequence=None, typecode=None, copy=1, savespace=0, > > type=None, shape=None): > > > > Actually, it seems to be a difference in the way that numeric and > numarray treat the copy flag when typecode is specified. 
In numeric, > if no change in type is requested and copy=0, then the constructor > goes ahead and produces a view: > > import Numeric as nc > test=nc.array([1,2,3],'i') > a=nc.array(test,'i',0) > a[0]=99 > print test > >> [99 2 3] > > but makes a copy if a cast is required: > > test=nc.array([1,2,3],'i') > a=nc.array(test,'F',0) > a[0]=99 > print test > >>> [1 2 3] > > Looking at numarraycore.py line 305 I see that: > > if type is None and typecode is None: > if copy: > a = sequence.copy() > else: > a = sequence > > i.e. numarray skips the check for a type match and ignores > the copy flag, even if the type is preserved: > > import numarray as ny > test=ny.array([1,2,3],'i') > a=ny.array(test,'i',0) > a._data is test._data > >>> False > OK, I think I see what you're after and agree that it's a bug. Here's how I'll change the behavior: >>> import numarray >>> a = numarray.arange(10) >>> b = numarray.array(a, copy=0) >>> a is b True >>> b = numarray.array(a, copy=1) >>> a is b False One possible point of note is that array() doesn't return views for copy=0; neither does Numeric; both return the original sequence. Regards, Todd From paustin at eos.ubc.ca Wed Jul 7 13:15:04 2004 From: paustin at eos.ubc.ca (Philip Austin) Date: Wed Jul 7 13:15:04 2004 Subject: [Numpy-discussion] optional arguments to the array constructor In-Reply-To: <1089229573.29456.544.camel@halloween.stsci.edu> References: <16619.21771.686179.152410@gull.eos.ubc.ca> <1089213153.29456.229.camel@halloween.stsci.edu> <16620.16395.603789.28730@gull.eos.ubc.ca> <1089229573.29456.544.camel@halloween.stsci.edu> Message-ID: <16620.22921.791432.143944@gull.eos.ubc.ca> Todd Miller writes: > > OK, I think I see what you're after and agree that it's a bug. 
Here's > how I'll change the behavior: > > >>> import numarray > >>> a = numarray.arange(10) > >>> b = numarray.array(a, copy=0) > >>> a is b > True > >>> b = numarray.array(a, copy=1) > >>> a is b > False Just to be clear -- the above is the current numarray v1.0 behavior (at least on my machine). Numeric compatibility would additionally require that import numarray a = numarray.arange(10) theTypeCode=repr(a.type()) b = numarray.array(a, theTypeCode, copy=0) print a is b b = numarray.array(a, copy=1) print a is b produce True False While currently it produces True True Having said this, I can work around this difference -- so either a note in the documentation or just removing the copy flag from numarray.array would also be ok. -- Thanks, Phil From paustin at eos.ubc.ca Wed Jul 7 13:17:03 2004 From: paustin at eos.ubc.ca (Philip Austin) Date: Wed Jul 7 13:17:03 2004 Subject: [Numpy-discussion] Re: Correction -- optional arguments to the array constructor In-Reply-To: <1089229573.29456.544.camel@halloween.stsci.edu> References: <16619.21771.686179.152410@gull.eos.ubc.ca> <1089213153.29456.229.camel@halloween.stsci.edu> <16620.16395.603789.28730@gull.eos.ubc.ca> <1089229573.29456.544.camel@halloween.stsci.edu> Message-ID: <16620.23066.506262.410021@gull.eos.ubc.ca> Oops, note the change below at ---> Todd Miller writes: > > OK, I think I see what you're after and agree that it's a bug. Here's > how I'll change the behavior: > > >>> import numarray > >>> a = numarray.arange(10) > >>> b = numarray.array(a, copy=0) > >>> a is b > True > >>> b = numarray.array(a, copy=1) > >>> a is b > False Just to be clear -- the above is the current numarray v1.0 behavior (at least on my machine).
Numeric compatibility would additionally require that import numarray a = numarray.arange(10) theTypeCode=repr(a.type()) b = numarray.array(a, theTypeCode, copy=0) print a is b b = numarray.array(a, copy=1) print a is b produce True False While currently it produces ---> False False Having said this, I can work around this difference -- so either a note in the documentation or just removing the copy flag from numarray.array would also be ok. -- Thanks, Phil From wlanger at bigpond.net.au Thu Jul 8 10:29:01 2004 From: wlanger at bigpond.net.au (Wendy Langer) Date: Thu Jul 8 10:29:01 2004 Subject: [Numpy-discussion] "buffer not aligned on 8 byte boundary" errors when running numarray.testall.test() Message-ID: Hi there all :) I am having trouble with my installation of numarray. :( I am a python newbie and a numarray extreme-newbie, so it could be that I don't yet have the first clue what I am doing. ;) Python 2.3.3 (#51, Feb 16 2004, 04:07:52) [MSC v.1200 32 bit (Intel)] on win32 numarray 1.0 The Python I am using is the one that comes with the "Enthought" version (www.enthought.com), a distro specifically designed to be useful for scientists, so it comes with numerical stuff and scipy and chaco and things like that preinstalled. I used the windows binary installer. However it came with Numeric and not numarray, so I installed numarray "by hand". This seemed to go ok, and it seems that there is no problem having both Numeric and numarray in the same installation, since they have (obviously) different names (still getting used to this whole modules and namespaces &c &c) At the bottom of this email I have pasted an example of what it was I was trying to do, and the error messages that the interpreter gave me - but before anyone bothers reading them in any detail, the essential error seems to be as follows: error: multiply_Float64_scalar_vector: buffer not aligned on 8 byte boundary.
I have no idea what this means, but I do recall that when I ran the numarray.testall.test() procedure after first completing my installation a couple of days ago, it reported a *lot* of problems, many of which sounded quite similar to this. I hoped for the best and thought that perhaps I had "run the test wrong"(!) since numarray seemed to be working ok, and I had investigated many of the examples in chapters 3 and 4 of the user manual without any obvious problems (chapter 3 = "high level overview" and chapter 4 = "array basics") I decided at the time to leave well enough alone until I actually came across odd or mysterious behaviour ...however that time has come all-too-soon... The procedure I am using to run the test is as described on page 11 of the excellent user's manual (release 0.8 at http://www.pfdubois.com/numpy/numarray.pdf): --------------------------------------------- Testing your Installation Once you have installed numarray, test it with: C:\numarray> python Python 2.2.2 (#18, Dec 30 2002, 02:26:03) [MSC 32 bit (Intel)] on win32 Type "copyright", "credits" or "license" for more information. >>> import numarray.testall as testall >>> testall.test() numeric: (0, 1115) records: (0, 48) strings: (0, 166) objects: (0, 72) memmap: (0, 75) Each line in the above output indicates that 0 of X tests failed. X grows steadily with each release, so the numbers shown above may not be current. ------------------------------------------------------------------------ Anyway, when I ran this, instead of the nice, comforting output above, I got about a million(!) errors and then a final count of 320 failures. This number is not always constant - I recall the first time I ran it it was 209. [I just ran it again and this time it was 324...it all has a rather disturbing air of semi-randomness...]
So below is the (heavily snipped) output from the testall.test() run, and below that is the code where I first noticed a possibly similar error, and below *that* is the output of that code, with the highly suspicious error.... Any suggestions greatly appreciated! I can give you more info about the setup on my computer and so on if you need :) wendy langer ====================================================================== ==================================== IDLE 1.0.2 ==== No Subprocess ==== >>> import numarray.testall as testall >>> testall.test() ***************************************************************** Failure in example: x+y from line #50 of first pass Exception raised: Traceback (most recent call last): File "C:\PYTHON23\lib\doctest.py", line 442, in _run_examples_inner compileflags, 1) in globs File "", line 1, in ? File "C:\PYTHON23\Lib\site-packages\numarray\numarraycore.py", line 733, in __add__ return ufunc.add(self, operand) error: Int32asFloat64: buffer not aligned on 8 byte boundary. ***************************************************************** Failure in example: x[:] = 0.1 from line #72 of first pass Exception raised: Traceback (most recent call last): File "C:\PYTHON23\lib\doctest.py", line 442, in _run_examples_inner compileflags, 1) in globs File "", line 1, in ? error: Float64asBool: buffer not aligned on 8 byte boundary. ***************************************************************** Failure in example: y from line #74 of first pass Expected: array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]) Got: array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]) ***************************************************************** Failure in example: x + z from line #141 of first pass Exception raised: Traceback (most recent call last): File "C:\PYTHON23\lib\doctest.py", line 442, in _run_examples_inner compileflags, 1) in globs File "", line 1, in ?
File "C:\PYTHON23\Lib\site-packages\numarray\numarraycore.py", line 733, in __add__ return ufunc.add(self, operand) error: Int32asFloat64: buffer not aligned on 8 byte boundary. ***************************************************************** ***************************************************************** Failure in example: a2dma = average(a2dm, axis=1) from line #812 of numarray.ma.dtest Exception raised: Traceback (most recent call last): File "C:\PYTHON23\lib\doctest.py", line 442, in _run_examples_inner compileflags, 1) in globs File "", line 1, in ? File "C:\PYTHON23\Lib\site-packages\numarray\ma\MA.py", line 1686, in average w = Numeric.choose(mask, (1.0, 0.0)) File "C:\PYTHON23\Lib\site-packages\numarray\ufunc.py", line 1666, in choose return _choose(selector, population, outarr, clipmode) File "C:\PYTHON23\Lib\site-packages\numarray\ufunc.py", line 1573, in __call__ result = self._doit(computation_mode, woutarr, cfunc, ufargs, 0) File "C:\PYTHON23\Lib\site-packages\numarray\ufunc.py", line 1558, in _doit blockingparameters) error: choose8bytes: buffer not aligned on 8 byte boundary. ***************************************************************** Failure in example: alltest(a2dma, [1.5, 4.0]) from line #813 of numarray.ma.dtest Exception raised: Traceback (most recent call last): File "C:\PYTHON23\lib\doctest.py", line 442, in _run_examples_inner compileflags, 1) in globs File "", line 1, in ? NameError: name 'a2dma' is not defined ***************************************************************** 1 items had failures: 320 of 671 in numarray.ma.dtest ***Test Failed*** 320 failures. 
numarray.ma: (320, 671) ========================================================================= ======================== import numarray class anXmatrix: def __init__(self, stepsize = 3): self.stepsize = stepsize self.populate_matrix() def describe(self): print "I am a ", self.__class__ print "my stepsize is", self.stepsize print "my matrix is: \n" print self.matrix def populate_matrix(self): def xvalues(i,j): return self.stepsize*j mx = numarray.fromfunction(xvalues, (4,4)) self.matrix = mx if __name__ == '__main__': print " " print "Making anXmatrix..." r = anXmatrix(stepsize = 5) r.describe() r = anXmatrix(stepsize = 0.02) r.describe() ============================================================================ ======== Making anXmatrix... I am a __main__.anXmatrix my stepsize is 5 my matrix is: [[ 0 5 10 15] [ 0 5 10 15] [ 0 5 10 15] [ 0 5 10 15]] Traceback (most recent call last): File "C:\Python23\Lib\site-packages\WendyStuff\wendycode\propagatorstuff\core_obj ects\domain_objects.py", line 97, in ? r = anXmatrix(stepsize = 0.02) File "C:\Python23\Lib\site-packages\WendyStuff\wendycode\propagatorstuff\core_obj ects\domain_objects.py", line 72, in __init__ self.populate_matrix() File "C:\Python23\Lib\site-packages\WendyStuff\wendycode\propagatorstuff\core_obj ects\domain_objects.py", line 86, in populate_matrix mx = numarray.fromfunction(xvalues, (4,4)) File "C:\PYTHON23\Lib\site-packages\numarray\generic.py", line 1094, in fromfunction return apply(function, tuple(indices(dimensions))) File "C:\Python23\Lib\site-packages\WendyStuff\wendycode\propagatorstuff\core_obj ects\domain_objects.py", line 84, in xvalues return self.stepsize*j File "C:\PYTHON23\Lib\site-packages\numarray\numarraycore.py", line 772, in __rmul__ r = ufunc.multiply(operand, self) error: multiply_Float64_scalar_vector: buffer not aligned on 8 byte boundary. 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ============================================================================ ======== "You see, wire telegraph is a kind of a very, very long cat. You pull his tail in New York and his head is meowing in Los Angeles. Do you understand this? And radio operates exactly the same way: you send signals here, they receive them there. The only difference is that there is no cat." Albert Einstein From Chris.Barker at noaa.gov Thu Jul 8 10:58:07 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jul 8 10:58:07 2004 Subject: [Numpy-discussion] How to read data from text files fast? In-Reply-To: <40E473A9.5040109@colorado.edu> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <40E470D9.8060603@noaa.gov> <40E473A9.5040109@colorado.edu> Message-ID: <40ED8A6D.5050505@noaa.gov> Thanks to Fernando Perez and Travis Oliphant for pointing me to: > scipy.io.read_array In testing, I've found that it's very slow (for my needs), though quite nifty in other ways, so I'm sure I'll find a use for it in the future. Travis Oliphant wrote: > Alternatively, we could move some of the Python code in read_array to > C to improve the speed. That was beyond me, so I wrote a very simple module in C that does what I want, and it is very much faster than read_array or straight python version. It has two functions: FileScan(file) """ Reads all the values in rest of the ascii file, and produces a Numeric vector full of Floats (C doubles). All text in the file that is not part of a floating point number is skipped over. """ FileScanN(file, N) """ Reads N values in the ascii file, and produces a Numeric vector of length N full of Floats (C doubles). 
Raises an exception if there are fewer than N numbers in the file. All text in the file that is not part of a floating point number is skipped over. After reading N numbers, the file is left before the next non-whitespace character in the file. This will often leave the file at the start of the next line, after scanning a line full of numbers. """ I implemented them separately, 'cause I wasn't sure how to deal with optional arguments in a C function. They could easily have wrapped in a Python function if you wanted one interface. FileScan was much more complex, as I had to deal with all the dynamic memory allocation. I probably took a more complex approach to this than I had to, but it was an exercise for me, being a newbie at C. I also decided not to specify a shape for the resulting array, always returning a rank-1 array, as that made the code easier, and you can always set A.shape afterward. This could be put in a Python wrapper as well. It has the obvious limitation that it only does doubles. I'd like to add longs as well, but probably won't have a need for anything else. The way memory is these days, it seems just as easy to read the long ones, and convert afterward if you want. Here is a quick benchmark (see below) run with a file that is 63,000 lines, with two comma-delimited numbers on each line. Run on a 1GHz P4 under Linux. Reading with read_array it took 16.351712 seconds to read the file with read_array Reading with Standard Python methods it took 2.832078 seconds to read the file with standard Python methods Reading with FileScan it took 0.444431 seconds to read the file with FileScan Reading with FileScanN it took 0.407875 seconds to read the file with FileScanN As you can see, read_array is painfully slow for this kind of thing, straight Python is OK, and FileScan is pretty darn fast. I've enclosed the C code and setup.py, if anyone wants to take a look, and use it, or give suggestions or bug fixes or whatever, that would be great. 
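[Editor's note: the FileScan behavior described above — skip anything that isn't part of a floating point number, collect the rest — can be approximated in pure Python with a regular expression. This is only an illustrative sketch, not the poster's C implementation; `file_scan` and the pattern are made up for the example:]

```python
import re
from io import StringIO

# Matches one floating-point token; everything else in the stream is
# skipped over, as in the FileScan description above.
_FLOAT = re.compile(r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?')

def file_scan(f):
    """Return every number in the rest of the open file as a list of floats."""
    return [float(tok) for tok in _FLOAT.findall(f.read())]

print(file_scan(StringIO("1.5, 2\nsome text 3e2 -4")))  # [1.5, 2.0, 300.0, -4.0]
```

[Converting the returned list to a rank-1 array afterward mirrors the C version's final copy into a Numeric vector.]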
In particular, I don't think I've structured the code very well, and there could be a memory leak, which I have not tested carefully for. Tested only on Linux with Python2.3.3, Numeric 23.1. If someone wants to port it to numarray, that would be great too. -Chris The benchmark: def test6(): """ Testing various IO options """ from scipy.io import array_import filename = "JunkBig.txt" file = open(filename) print "Reading with read_array" start = time.time() A = array_import.read_array(file,",") print "it took %f seconds to read the file with read_array"%(time.time() - start) file.close() file = open(filename) print "Reading with Standard Python methods" start = time.time() A = [] for line in file: A.append( map ( float, line.strip().split(",") ) ) A = array(A) print "it took %f seconds to read the file with standard Python methods"%(time.time() - start) file.close() file = open(filename) print "Reading with FileScan" start = time.time() A = FileScanner.FileScan(file) A.shape = (-1,2) print "it took %f seconds to read the file with FileScan"%(time.time() - start) file.close() file = open(filename) print "Reading with FileScanN" start = time.time() A = FileScanner.FileScanN(file, product(A.shape) ) A.shape = (-1,2) print "it took %f seconds to read the file with FileScanN"%(time.time() - start) -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: FileScan_module.c URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed...
Name: setup.py URL: From jmiller at stsci.edu Thu Jul 8 12:05:02 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 8 12:05:02 2004 Subject: [Numpy-discussion] "buffer not aligned on 8 byte boundary" errors when running numarray.testall.test() In-Reply-To: References: Message-ID: <1089313446.2639.55.camel@halloween.stsci.edu> On Thu, 2004-07-08 at 13:28, Wendy Langer wrote: > Hi there all :) > > I am having trouble with my installation of numarray. :( > > I am a python newbie and a numarray extreme-newbie, so it could be that I > don't yet have the first clue what I am doing. ;) > > > > Python 2.3.3 (#51, Feb 16 2004, 04:07:52) [MSC v.1200 32 bit (Intel)] on > win32 > numarray 1.0 > > > The Python I am using is the one that comes with the "Enthought" version > (www.enthought.com), a distro specifically designed to be useful for > scientists, so it comes with numerical stuff and scipy and chaco and things > like that preinstalled. > > I used the windows binary installer. However it came with Numeric and not > numarray, so I installed numarray "by hand". This seemed to go ok, and it > seems that there is no problem having both Numeric and numarray in the same > installation, since they have (obviously) different names (still getting > used to this whole modules and namespaces &c &c) I don't normally use SciPy, but I normally have both numarray and Numeric installed so there's no inherent conflict there. > At the bottom of this email I have pasted an example of what it was I was > trying to do, and the error messages that the interpreter gave me - but > before anyone bothers reading them in any detail, the essential error seems > to be as follows: > > error: multiply_Float64_scalar_vector: buffer not aligned on 8 byte > boundary. This is a low level exception triggered by a misaligned data buffer. It's low level so it's impossible to tell what the real problem is without more information. 
> I have no idea what this means, but I do recall that when I ran the > numarray.testall.test() procedure after first completing my installation a > couple of days ago, it reported a *lot* of problems, many of which sounded > quite similar to this. That sounds pretty bad. Here's roughly how it should look these days: % python >>> import numarray.testall as testall >>> testall.test() numarray: ((0, 1165), (0, 1165)) numarray.records: (0, 48) numarray.strings: (0, 176) numarray.memmap: (0, 82) numarray.objects: (0, 105) numarray.memorytest: (0, 16) numarray.examples.convolve: ((0, 20), (0, 20), (0, 20), (0, 20)) numarray.convolve: (0, 52) numarray.fft: (0, 75) numarray.linear_algebra: ((0, 46), (0, 51)) numarray.image: (0, 27) numarray.nd_image: (0, 390) numarray.random_array: (0, 53) numarray.ma: (0, 671) The tuple results for your test should all have leading zeros as above. The number of tests varies from release to release. > I hoped for the best and thought that perhaps I had "run the test wrong"(!) > since numarray seemed to be working ok, and I had investigated many of the > examples in chapters 3 and 4 of the user manual withour any obvious problems > (chapter 3 = "high level overview" and chapter 4 = "array basics") > > I decided at the time to leave well enough alone until I actually came > across odd or mysterious behaviour ...however that time has come > all-too-soon... > > > > > The procedure I am using to run the test is as described on page 11 of the > excellent user's manual (release 0.8 at > http://www.pfdubois.com/numpy/numarray.pdf): There's an updated manual here: http://prdownloads.sourceforge.net/numpy/numarray-1.0.pdf?download > -- > Testing your Installation > Once you have installed numarray, test it with: > C:\numarray> python > Python 2.2.2 (#18, Dec 30 2002, 02:26:03) [MSC 32 bit (Intel)] on win32 > Type "copyright", "credits" or "license" for more information. 
> >>> import numarray.testall as testall > >>> testall.test() > numeric: (0, 1115) > records: (0, 48) > strings: (0, 166) > objects: (0, 72) > memmap: (0, 75) > Each line in the above output indicates that 0 of X tests failed. X grows > steadily with each release, so the numbers > shown above may not be current. > -- > > Anyway, when I ran this, instead of the nice, comforting output above, I > got about a million(!) errors and then a final count of 320 failures. This > number is not always constant - I recall the first time I ran it it was 209. > [I just ran it again and this time it was 324...it all has a rather > disturbing air of semi-randomness...] > > > So below is the (heavily snipped) output from the testall.test() run, and > below that is the code where I first noticed a possibly similar error, and > below *that* is the output of that code, with the highly suspicous > error.... > > > Any suggestions greatly appreciated! If you've ever had numarray installed before, go to your site-packages directory and delete numarray as well as any numarray.pth. Then reinstall numarray-1.0. Also, just do: >>> import numarray >>> numarray and see what kind of path is involved getting to the numarray module. > I can give you more info about the setup on my computer and so on if you > need :) I think you already included everything important; the exact variant of Windows you're using might be helpful; I'm not aware of any problems there though. It looks like you're on a well supported platform. I just tested pretty much the same configuration on Windows 2000 Pro, with Python-2.3.4, and it worked fine even with SciPy-0.3. > wendy langer > > > ====================================================================== > > There's something hugely wrong with your test output. I've never seen anything like it other than during development. 
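[Editor's note: Todd's `>>> import numarray` / `>>> numarray` check works because a module's repr shows the file it was loaded from; the same information is available from `__file__`. A sketch, using a stdlib module as a stand-in since numarray itself is unlikely to be installed today:]

```python
import json      # stand-in for numarray in this sketch
import os.path

# __file__ reveals which installation Python actually picked up; a stale
# copy lingering in site-packages (or a leftover .pth file) shows up here.
print(json.__file__)
assert os.path.isfile(json.__file__)
```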
> > > ========================================================================= > > ======================== > > > import numarray > > class anXmatrix: > def __init__(self, stepsize = 3): > self.stepsize = stepsize > self.populate_matrix() > > > def describe(self): > print "I am a ", self.__class__ > print "my stepsize is", self.stepsize > print "my matrix is: \n" > print self.matrix > > def populate_matrix(self): > > def xvalues(i,j): > return self.stepsize*j > > mx = numarray.fromfunction(xvalues, (4,4)) > self.matrix = mx > > > if __name__ == '__main__': > > > print " " > print "Making anXmatrix..." > r = anXmatrix(stepsize = 5) > r.describe() > r = anXmatrix(stepsize = 0.02) > r.describe() > > > > ============================================================================ Here's what I get when I run your code, windows or linux: Making anXmatrix... I am a __main__.anXmatrix my stepsize is 5 my matrix is: [[ 0 5 10 15] [ 0 5 10 15] [ 0 5 10 15] [ 0 5 10 15]] I am a __main__.anXmatrix my stepsize is 0.02 my matrix is: [[ 0. 0.02 0.04 0.06] [ 0. 0.02 0.04 0.06] [ 0. 0.02 0.04 0.06] [ 0. 0.02 0.04 0.06]] Regards, Todd From Fernando.Perez at colorado.edu Thu Jul 8 12:25:07 2004 From: Fernando.Perez at colorado.edu (Fernando.Perez at colorado.edu) Date: Thu Jul 8 12:25:07 2004 Subject: [Numpy-discussion] How to read data from text files fast? 
In-Reply-To: <40ED8A6D.5050505@noaa.gov> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <40E470D9.8060603@noaa.gov> <40E473A9.5040109@colorado.edu> <40ED8A6D.5050505@noaa.gov> Message-ID: <1089314664.40ed9f68e1db5@webmail.colorado.edu> Quoting Chris Barker : > Thanks to Fernando Perez and Travis Oliphant for pointing me to: > > > scipy.io.read_array > > In testing, I've found that it's very slow (for my needs), though quite nifty in other ways, so I'm sure I'll find a use for it in the future. Just a quick note Travis sent to me privately: he suggested using io.numpyio.fread instead of Numeric.fromstring() for speed reasons. I don't know if it will help in your case, I just mention it in case it helps. Cheers, F From Chris.Barker at noaa.gov Thu Jul 8 12:41:06 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jul 8 12:41:06 2004 Subject: [Numpy-discussion] How to read data from text files fast? In-Reply-To: <1089314664.40ed9f68e1db5@webmail.colorado.edu> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <40E470D9.8060603@noaa.gov> <40E473A9.5040109@colorado.edu> <40ED8A6D.5050505@noaa.gov> <1089314664.40ed9f68e1db5@webmail.colorado.edu> Message-ID: <40EDA2A8.9030300@noaa.gov> Fernando.Perez at colorado.edu wrote: > Just a quick note Travis sent to me privately: he suggested using > io.numpyio.fread instead of Numeric.fromstring() for speed reasons. I don't > know if it will help in your case, I just mention it in case it helps.
Thanks, but those are for binary files, which I have to do sometimes, so I'll keep it in mind. However, my problem at hand is text files, and my solution is working nicely, though I'd love a pair of more experienced eyes on the code.... -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Thu Jul 8 13:50:03 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jul 8 13:50:03 2004 Subject: [Numpy-discussion] How to read data from text files fast? In-Reply-To: <004c01c46524$ab808090$ebeca782@stsci.edu> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <40E470D9.8060603@noaa.gov> <40E473A9.5040109@colorado.edu> <40ED8A6D.5050505@noaa.gov> <1089314664.40ed9f68e1db5@webmail.colorado.edu> <40EDA2A8.9030300@noaa.gov> <004c01c46524$ab808090$ebeca782@stsci.edu> Message-ID: <40EDB2BD.4080809@noaa.gov> Todd Miller wrote: > I looked this over to see how hard it would be to port to numarray. At > first glance, it looks easy. I didn't really read it closely enough to > pick up bugs, but what I saw looks good. One thing I did notice was a > calloc of temporary data space. That seemed like a possible waste: can't > you just preallocate the array and read your data directly into it? The short answer is that I'm not very smart! The longer answer is that this is because at first I misunderstood what PyArray_FromDimsAndData was for. For ScanFileN, I'll re-do it as you suggest. For ScanFile, it is unknown at the beginning how big the final array is, and I devised a scheme that would allocate the memory as it went, in reasonable sized chunks.
However, this does require a full copy, which is a problem. Since posting, I thought of a MUCH easier scheme:

scan the file, without storing the data, to see how many numbers there are
rewind the file
allocate the Array
read the data

This requires scanning the file twice, which would cost some time, but would be easier, and prevent an unnecessary copy of the data. I hope I'll get a chance to try it out and see what the performance is like. In the meantime, anyone else have any thoughts? By the way, does it matter whether I use malloc or calloc? I can't really tell the difference from K&R. > This is > probably a very minor speed issue, but might be a significant storage issue > as people are starting to max out 32-bit systems. yup. This is all pointless if it's not a lot of data, after all. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Thu Jul 8 16:21:16 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jul 8 16:21:16 2004 Subject: [Numpy-discussion] How to read data from text files fast? In-Reply-To: <40EDB2BD.4080809@noaa.gov> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040701053355.M99698@grenoble.cnrs.fr> <40E470D9.8060603@noaa.gov> <40E473A9.5040109@colorado.edu> <40ED8A6D.5050505@noaa.gov> <1089314664.40ed9f68e1db5@webmail.colorado.edu> <40EDA2A8.9030300@noaa.gov> <004c01c46524$ab808090$ebeca782@stsci.edu> <40EDB2BD.4080809@noaa.gov> Message-ID: <40EDD64A.1060508@noaa.gov> Chris Barker wrote: >> can't >> you just preallocate the array and read your data directly into it? > > The short answer is that I'm not very smart!
> The longer answer is that this is because at first I misunderstood what
> PyArray_FromDimsAndData was for. For ScanFileN, I'll re-do it as you suggest.

I've re-done it. Now I don't double allocate storage for ScanFileN. There was no noticeable difference in performance, but why use memory you don't have to? For ScanFile, it is unknown at the beginning how big the final array is, so I now have two versions. One is what I had before: it allocates memory in blocks of some Buffersize as it reads the file (now set to 1024 elements). Once it's all read in, it creates an appropriately sized PyArray, and copies the data to it. This results in a double copy of all the data until the temporary memory is freed. I now also have a ScanFile2, which scans the whole file first, then creates a PyArray, and re-reads the file to fill it up. This version takes about twice as long, confirming my expectation that the time to allocate and copy data is tiny compared to reading and parsing the file. Here's a simple benchmark:

Reading with Standard Python methods
(62936, 2)
it took 2.824013 seconds to read the file with standard Python methods
Reading with FileScan
(62936, 2)
it took 0.400936 seconds to read the file with FileScan
Reading with FileScan2
(62936, 2)
it took 0.752649 seconds to read the file with FileScan2
Reading with FileScanN
(62936, 2)
it took 0.441714 seconds to read the file with FileScanN

So it takes twice as long to count the numbers first, but it's still three times as fast as just doing all this with Python. However, I don't usually think it's worth all this effort for a 3 times speed up, and I tend to make copies of my arrays all over the place with NumPy anyway, so I'm inclined to stick with the first method. Also, if you are really that tight on memory, you could always read it in chunks with ScanFileN. Any feedback anyone wants to give is very welcome. -Chris -- Christopher Barker, Ph.D.
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: FileScan_module.c URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: setup.py URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: TestFileScan.py URL: From falted at pytables.org Fri Jul 9 03:55:03 2004 From: falted at pytables.org (Francesc Alted) Date: Fri Jul 9 03:55:03 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion Message-ID: <200407091254.06579.falted@pytables.org> Hi, As Perry said not too long ago that the numarray crew would ask for suggestions for RecArray improvements, I'm going to suggest a couple. I find the .tolist() method quite inconvenient when applied to RecArray objects as it is now:

>>> r[2:4]
array(
[(3, 33.0, 'c'),
(4, 44.0, 'd')],
formats=['1UInt8', '1Float32', '1a1'],
shape=2,
names=['c1', 'c2', 'c3'])
>>> r[2:4].tolist()
[<numarray.records.Record instance at ...>,
<numarray.records.Record instance at ...>]

The suggested behaviour would be:

>>> r[2:4].tolist()
[(3, 33.0, 'c'), (4, 44.0, 'd')]

Another thing is that an element of a recarray should be returned as a tuple instead of as a records.Record object:

>>> r[2]
<numarray.records.Record instance at ...>

The suggested behaviour would be:

>>> r[2]
(3, 33.0, 'c')

I think the latter would be consistent with the convention that a __getitem__(int) of a NumArray object returns a python type instead of a rank-0 array. In the same way, a __getitem__(int) of a RecArray should return a python type (a tuple in this case). Below is the code that I use right now to simulate this behaviour, but it would be nice if the code were included in the numarray.records module.
def tolist(arr):
    """Converts a RecArray or Record to a list of rows"""
    outlist = []
    if isinstance(arr, records.Record):
        for i in range(arr.array._nfields):
            outlist.append(arr.array.field(i)[arr.row])
        outlist = tuple(outlist)  # return a tuple for records
    elif isinstance(arr, records.RecArray):
        for j in range(arr.nelements()):
            tmplist = []
            for i in range(arr._nfields):
                tmplist.append(arr.field(i)[j])
            outlist.append(tuple(tmplist))
    return outlist

Cheers, -- Francesc Alted From thomas_karlsson_569 at hotmail.com Fri Jul 9 08:02:44 2004 From: thomas_karlsson_569 at hotmail.com (Thomas Karlsson) Date: Fri Jul 9 08:02:44 2004 Subject: [Numpy-discussion] Numpy compiling error... Help! Message-ID: Hi I'm trying to compile/install numpy on a RH9 machine. When doing so I run into problems. I give the command: python setup.py install and get a long answer, with this error at the end:

gcc -shared build/temp.linux-i686-2.2/lapack_litemodule.o -L/usr/lib/atlas -llapack -lcblas -lf77blas -latlas -lg2c -o build/lib.linux-i686-2.2/lapack_lite.so
/usr/bin/ld: cannot find -llapack
collect2: ld returned 1 exit status
error: command 'gcc' failed with exit status 1

Does anyone know what I've done wrong? I've spent a lot of time on this and really need help now... Regards Thomas _________________________________________________________________ Hitta rätt på nätet med MSN Sök http://search.msn.se/ From Chris.Barker at noaa.gov Fri Jul 9 09:44:12 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Fri Jul 9 09:44:12 2004 Subject: [Numpy-discussion] How to read data from text files fast? In-Reply-To: <3afee4a2.5cf5a1c3.8234000@expms6.cites.uiuc.edu> References: <3afee4a2.5cf5a1c3.8234000@expms6.cites.uiuc.edu> Message-ID: <40EECAB8.3050900@noaa.gov> Bruce, Thanks for your feedback.
Bruce Southey wrote: > While I am not really following your thread, I just wanted to comment that the > Python Cookbook (at least the printed version) has some ways to count lines in a > file - assuming that the number of lines provides the size. The number of lines does not necessarily provide the size. In the general case, it doesn't at all. My whole goal here is the general case: being able to read a bunch of numbers out of any format of text file. This can be used as part of a parser for many file formats. If I were shooting for just one format, this would be easier, but not general purpose. Now that I have this, I can write a number of file format parsers in python with improved performance and easier syntax.

> Under Unix (but not windows),

I am aiming for a portable solution.

> Alternatively if sufficient memory is available, storing the file in memory > (during the counting of elements) should always be faster than reading it a > second time from the hard disk. The primary reason to scan the file ahead of time to count the elements is to save the memory of duplicate copies of data. The other reason is to make memory management easier, but since I've already solved that problem, I'm done. thanks, -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From perry at stsci.edu Mon Jul 12 14:15:01 2004 From: perry at stsci.edu (Perry Greenfield) Date: Mon Jul 12 14:15:01 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: <200407091254.06579.falted@pytables.org> Message-ID: Francesc Alted wrote: > > As Perry said not too long ago that numarray crew would ask for > suggestions > for RecArray improvements, I'm going to suggest a couple.
> > I find quite inconvenient the .tolist() method when applied to RecArray > objects as it is now:
>
> >>> r[2:4]
> array(
> [(3, 33.0, 'c'),
> (4, 44.0, 'd')],
> formats=['1UInt8', '1Float32', '1a1'],
> shape=2,
> names=['c1', 'c2', 'c3'])
> >>> r[2:4].tolist()
> [<numarray.records.Record instance at ...>,
> <numarray.records.Record instance at ...>]
>
> The suggested behaviour would be:
>
> >>> r[2:4].tolist()
> [(3, 33.0, 'c'), (4, 44.0, 'd')]
>
> Another thing is that an element of a recarray should be returned as a tuple > instead of as a records.Record object:
>
> >>> r[2]
> <numarray.records.Record instance at ...>
>
> The suggested behaviour would be:
>
> >>> r[2]
> (3, 33.0, 'c')
>
> I think the latter would be consistent with the convention that a > __getitem__(int) of a NumArray object returns a python type instead of a > rank-0 array. In the same way, a __getitem__(int) of a RecArray should > return a python type (a tuple in this case).

These are good examples of where improvements are needed (we are also looking at how best to handle multidimensional arrays and should have a proposal this week). What I'm wondering about is what a single element of a record array should be. Returning a tuple has an undeniable simplicity to it. On the other hand, we've been using recarrays that allow naming the various columns (which we refer to as "fields"). If one can refer to fields of a recarray, shouldn't one be able to refer to a field (by name) of one of its elements? Or are you proposing that basic recarrays not have that sort of capability (something added by a subclass)? Perry From rowen at u.washington.edu Mon Jul 12 16:09:00 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Mon Jul 12 16:09:00 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: References: Message-ID: At 5:14 PM -0400 2004-07-12, Perry Greenfield wrote: >What I'm wondering about is what a single element of a record array >should be. Returning a tuple has an undeniable simplicity to it.
>On the other hand, we've been using recarrays that allow naming the >various columns (which we refer to as "fields"). If one can refer >to fields of a recarray, shouldn't one be able to refer to a field >(by name) of one of its elements? Or are you proposing that basic >recarrays not have that sort of capability (something added by a >subclass)?

In my opinion, a single item of a record array should be a RecordItem object that is a dictionary that keeps items in field order. Thus:

- use the standard dictionary interface to deal with values by name (except the keys are always in the correct order).
- one can also get and set all the data at once as a tuple. This is NOT a standard dictionary interface, but is essential. Functions such as getvalues(), setvalues(dataTuple) should do it.

Adopting the full dictionary interface means one gets a standard, mature and fairly complete set of features. Also, a RecordItem object can then be used wherever a dictionary object is needed. I suspect it's also useful to have named field access: RecordItem.fieldname but am a bit reluctant to suggest so many different ways of getting to the data. I assume it will continue to be easy to get all data for a field by naming the appropriate field. That's a really nice feature. It would be even better if a masked array could be used, but I have no idea how hard this would be. Which brings up a side issue: any hope of integrating masked arrays into numarray, such that they could be used wherever a numarray array could be used? Areas where I particularly find myself needing them include nd_image filtering and writing C extensions. -- Russell P.S. I submitted several feature requests and bug reports for records on sourceforge months ago. I hope they'll not be overlooked during the review process.
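Russell's RecordItem idea can be sketched briefly in plain modern Python (the class and method names here are hypothetical; nothing like this exists in numarray.records):

```python
class RecordItem(dict):
    """One row of a record array, acting as a dictionary whose keys
    (the field names) stay in field order."""

    def __init__(self, fields, values):
        # a modern Python dict preserves insertion order, which gives
        # "keys always in the correct order" for free
        super().__init__(zip(fields, values))
        self._fields = list(fields)

    def getvalues(self):
        """Return all the data at once, as a tuple in field order."""
        return tuple(self[name] for name in self._fields)

    def setvalues(self, values):
        """Set all the data at once from a tuple."""
        if len(values) != len(self._fields):
            raise ValueError("expected %d values" % len(self._fields))
        for name, value in zip(self._fields, values):
            self[name] = value

# one row of the r[2:4] example from earlier in this thread
item = RecordItem(["c1", "c2", "c3"], (3, 33.0, "c"))
```

Because RecordItem subclasses dict, it can be passed anywhere a dictionary is expected, which is the interoperability point made above.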
From falted at pytables.org Tue Jul 13 01:30:55 2004 From: falted at pytables.org (Francesc Alted) Date: Tue Jul 13 01:30:55 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: References: Message-ID: <200407131028.04791.falted@pytables.org> On Monday 12 July 2004 23:14, Perry Greenfield wrote: > What I'm wondering about is what a single element of a record array > should be. Returning a tuple has an undeniable simplicity to it. Yeah, this is why I'm strongly biased toward this possibility. > On the other hand, we've been using recarrays that allow naming the > various columns (which we refer to as "fields"). If one can refer > to fields of a recarray, shouldn't one be able to refer to a field > (by name) of one of its elements? Or are you proposing that basic > recarrays not have that sort of capability (something added by a > subclass)? Well, I'm not sure about that. But just in case most people would like to access records by field as well as by index, I would advocate for the possibility that the Record instances would behave as similarly as possible to a tuple (or dictionary?). That includes creating appropriate __str__() *and* __repr__() methods as well as a __getitem__() that supports both field names and indices. I'm not sure about whether providing an __getattr__() method would be ok, but for the sake of simplicity, and in order to have (preferably) only one way to do things, I would say no.
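A Record that behaves as much as possible like a tuple while still supporting field-name lookup, as described above, could look roughly like this (a sketch with hypothetical names, not the actual numarray.records implementation):

```python
class Record:
    """One row of a record array, indexable by position or by field name."""

    def __init__(self, names, values):
        self._names = list(names)
        self._values = tuple(values)

    def __getitem__(self, key):
        if isinstance(key, (int, slice)):
            return self._values[key]                 # tuple-like: rec[0], rec[1:]
        return self._values[self._names.index(key)]  # by field name: rec["c1"]

    def __repr__(self):
        # print like the plain tuple, e.g. (3, 33.0, 'c')
        return repr(self._values)

    __str__ = __repr__

rec = Record(["c1", "c2", "c3"], (3, 33.0, "c"))
```

This keeps one access method, __getitem__(), for both styles, in line with the preference for only one way to do things.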
Regards, -- Francesc Alted From falted at pytables.org Tue Jul 13 02:07:00 2004 From: falted at pytables.org (Francesc Alted) Date: Tue Jul 13 02:07:00 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: <200407131028.04791.falted@pytables.org> References: <200407131028.04791.falted@pytables.org> Message-ID: <200407131106.19557.falted@pytables.org> On Tuesday 13 July 2004 10:28, Francesc Alted wrote: > On Monday 12 July 2004 23:14, Perry Greenfield wrote: > > What I'm wondering about is what a single element of a record array > > should be. Returning a tuple has an undeniable simplicity to it. > > Yeah, this is why I'm strongly biased toward this possibility. > > > On the other hand, we've been using recarrays that allow naming the > > various columns (which we refer to as "fields"). If one can refer > > to fields of a recarray, shouldn't one be able to refer to a field > > (by name) of one of its elements? Or are you proposing that basic > > recarrays not have that sort of capability (something added by a > > subclass)? > > Well, I'm not sure about that. But just in case most people would like to > access records by field as well as by index, I would advocate for the > possibility that the Record instances would behave as similarly as possible > to a tuple (or dictionary?). That includes creating appropriate __str__() > *and* __repr__() methods as well as a __getitem__() that supports both field > names and indices. I'm not sure about whether providing an __getattr__() > method would be ok, but for the sake of simplicity, and in order to have > (preferably) only one way to do things, I would say no. I've been thinking that one way to make returning a tuple for a single element of a RecArray compatible with still being able to retrieve a field by name is to play with RecArray.__getitem__ and let it support key names in addition to indices.
This would be better seen as an example. Right now, one can say:

>>> r=records.array([(1,"asds", 24.),(2,"pwdw", 48.)], "1i4,1a4,1f8")
>>> r._fields["c1"]
array([1, 2])
>>> r._fields["c1"][1]
2

What I propose is to be able to say:

>>> r["c1"]
array([1, 2])
>>> r["c1"][1]
2

Which would replace the notation:

>>> r[1]["c1"]
2

which was recently suggested. I.e. the suggestion is to realize RecArrays as a collection of columns, as well as a collection of rows. -- Francesc Alted From falted at pytables.org Tue Jul 13 02:13:03 2004 From: falted at pytables.org (Francesc Alted) Date: Tue Jul 13 02:13:03 2004 Subject: [Numpy-discussion] PyTables 0.8.1 released Message-ID: <200407131112.15345.falted@pytables.org> PyTables is a hierarchical database package designed to efficiently manage very large amounts of data. PyTables is built on top of the HDF5 library and the numarray package. It features an object-oriented interface that, combined with natural naming and C-code generated from Pyrex sources, makes it a fast, yet extremely easy-to-use tool for interactively saving and retrieving different kinds of datasets. It also provides flexible indexed access on disk to anywhere in the data. The primary purpose of this release is to incorporate updates related to the newly released numarray 1.0. I've taken the opportunity to backport some improvements added in PyTables 0.9 (in alpha stage) as well as to fix the known problems.

Improvements:

- The logic for computing the buffer sizes has been revamped. As a consequence, the performance of writing/reading tables with large record sizes has improved by a factor of ten or more, now exceeding 70 MB/s for writing and 130 MB/s for reading (using compression).
- The maximum record size for tables has been raised to 512 KB (before it was 8 KB, due to some internal limitations).
- Documentation has been improved in many minor details.
As a result of a fix in the underlying documentation system (tbook), chapters now start at odd pages, instead of even. So those of you who want to print double-sided will probably have better luck now when aligning pages ;). Another one is that the HTML documentation has improved its look as well.

Bug Fixes:

- Indexing of Arrays with list or tuple flavors (#968131)

When retrieving single elements from an array with 'List' or 'Tuple' flavors, an error occurred. This has been corrected and now you can retrieve fileh.root.array[2] without problems for 'List' or 'Tuple' flavored (E, VL)Arrays.

- Iterators on Arrays with list or tuple flavors fail (#968132)

When using iterators with Array objects with 'List' or 'Tuple' flavors, an error occurred. This has been corrected.

- Last Index (-1) of Arrays doesn't work (#968149)

When accessing the last element in an Array using the notation -1, an empty list (or tuple or array) was returned instead of the proper value. This happened in general with all negative indices. Fixed.

- Table.read(flavor="List") should return pure lists (#972534)

However, it used to return pointers to numarray.records.Record instances, as in:

>>> fileh.root.table.read(1,2,flavor="List")
[<numarray.records.Record instance at ...>]
>>> fileh.root.table.read(1,3,flavor="List")
[<numarray.records.Record instance at ...>, <numarray.records.Record instance at ...>]

Now the following records are returned:

>>> fileh.root.table.read(1,2, flavor="List")
[(' ', 1, 1.0)]
>>> fileh.root.table.read(1,3, flavor="List")
[(' ', 1, 1.0), (' ', 2, 2.0)]

In addition, when reading a single row of a table, a numarray.records.Record pointer was returned:

>>> fileh.root.table[1]
<numarray.records.Record instance at ...>

Now, it returns a tuple:

>>> fileh.root.table[1]
(' ', 1, 1.0)

Which I think is more consistent, and more Pythonic.

- Copy of leaves fails... (#973370)

Attempting to copy leaves (Table or Array with different flavors) on top of themselves caused an internal error in PyTables. This has been corrected by silently avoiding the copy and returning the original Leaf as a result.
Minor changes:

- When assigning a value to a non-existing field in a table row, a KeyError is now raised, instead of the AttributeError that was issued before. I think this is more consistent with the type of error.
- Tests have been improved so as to pass the whole suite when compiled in 64 bit mode on a Linux/PowerPC machine (namely a dual-G5 Powermac running a 64-bit, 2.6.4 Linux kernel and the preview YDL distribution for G5, with a 64-bit GCC toolchain). Thanks to Ciro Cattuto for testing and reporting the modifications that were needed.

Where can PyTables be applied?
------------------------------

PyTables is not designed to work as a relational database competitor, but rather as a teammate. If you want to work with large datasets of multidimensional data (for example, for multidimensional analysis), or just provide a categorized structure for some portions of your cluttered RDBMS, then give PyTables a try. It works well for storing data from data acquisition systems (DAS), simulation software, network data monitoring systems (for example, traffic measurements of IP packets on routers), very large XML files, or for creating a centralized repository for system logs, to name only a few possible uses.

What is a table?
----------------

A table is defined as a collection of records whose values are stored in fixed-length fields. All records have the same structure and all values in each field have the same data type. The terms "fixed-length" and "strict data types" seem to be quite a strange requirement for a language like Python that supports dynamic data types, but they serve a useful function if the goal is to save very large quantities of data (such as is generated by many scientific applications, for example) in an efficient manner that reduces demand on CPU time and I/O resources.

What is HDF5?
-------------

For those people who know nothing about HDF5, it is a general purpose library and file format for storing scientific data made at NCSA.
HDF5 can store two primary objects: datasets and groups. A dataset is essentially a multidimensional array of data elements, and a group is a structure for organizing objects in an HDF5 file. Using these two basic constructs, one can create and store almost any kind of scientific data structure, such as images, arrays of vectors, and structured and unstructured grids. You can also mix and match them in HDF5 files according to your needs.

Platforms
---------

I'm using Linux (Intel 32-bit) as the main development platform, but PyTables should be easy to compile/install on many other UNIX machines. This package has also passed all the tests on an UltraSparc platform with Solaris 7 and Solaris 8. It also compiles and passes all the tests on an SGI Origin2000 with MIPS R12000 processors, with the MIPSPro compiler and running IRIX 6.5. It also runs fine on Linux 64-bit platforms, like an AMD Opteron running SuSe Linux Enterprise Server or a PowerPC G5 with Linux 2.6.x in 64-bit mode. It has also been tested on MacOS X platforms (10.2, but it should also work on newer versions). Regarding Windows platforms, PyTables has been tested with Windows 2000 and Windows XP (using the Microsoft Visual C compiler), but it should work with other flavors as well.

An example?
-----------

For online code examples, have a look at http://pytables.sourceforge.net/html/tut/tutorial1-1.html and, for the newly introduced Variable Length Arrays: http://pytables.sourceforge.net/html/tut/vlarray2.html

Web site
--------

Go to the PyTables web site for more details: http://pytables.sourceforge.net/

Share your experience
---------------------

Let me know of any bugs, suggestions, gripes, kudos, etc. you may have. Enjoy!
-- Francesc Alted From jmiller at stsci.edu Tue Jul 13 10:42:04 2004 From: jmiller at stsci.edu (Todd Miller) Date: Tue Jul 13 10:42:04 2004 Subject: [Numpy-discussion] numarray-1.0 Bug Alert Message-ID: <1089740511.9509.372.camel@halloween.stsci.edu>

Overview

There is a bug in numarray's Numeric compatible C-API. The bug has been latent for a long time, since numarray-0.3 was released roughly two years ago. It is serious because it results in wrong answers for certain extension functions fed a certain class of arrays.

What's affected

The bug affects numarray's add-on packages and third party extension functions which use the Numeric compatibility C-API. Generally, this means C-code that was either ported from Numeric or was written with both Numeric and numarray in mind. This includes the add-on packages numarray.linear_algebra, numarray.fft, numarray.random_array, and numarray.mlab. More recently, it includes the ports of core Numeric functions to numarray.numeric. Because numarray.ma uses numarray.numeric, the bug also affects numarray.ma. Finally, for numarray-1.0 this bug affects the functions numarray.argmin and numarray.argmax; these should be the only two functions in core numarray which are affected.

Detailed Bug Description

The bug is exposed by calling an extension function (written using the Numeric compatible C-API) with an array that has a non-zero _byteoffset attribute. Arrays with non-zero _byteoffset are typically created as a result of partially indexing higher dimensional arrays or slicing arrays. Partially indexing or slicing an array generally results in a sub-array, a view which often refers to an interior region of the original array buffer. Because numarray's PyArrayObject does not currently include its ->byteoffset in its ->data pointer as the Numeric compatibility API assumes it does, an extension function sees the base region of the original array rather than the region belonging to the sub-array.
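The sub-array situation described above can be illustrated with modern numpy, used here purely as an analogue (numarray's _byteoffset attribute has no direct numpy equivalent, but the buffer arithmetic is the same): a slice is a view whose data pointer starts partway into the base array's buffer, and an extension that ignores that offset reads the wrong elements.

```python
import numpy as np

a = np.arange(64).reshape(8, 8)   # base array; its view offset is zero
b = a[2:4]                        # sub-array: a view into a's buffer

# byte offset of the view's data pointer from the base buffer
base_addr = a.__array_interface__["data"][0]
view_addr = b.__array_interface__["data"][0]
offset = view_addr - base_addr

# the view starts two full rows into the buffer
expected = 2 * 8 * a.itemsize

# an extension reading from base_addr instead of view_addr would see
# a[0] where it should see a[2] -- which is the bug described above
```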
Immediate User Workaround

A simple user level workaround for people that need to use the affected packages and functions today is one like the following:

def make_safe_for_numeric_api(a):
    a = numarray.asarray(a)
    if a._byteoffset != 0:
        return a.copy()
    else:
        return a

The array inputs to an affected extension function need to be wrapped with calls to make_safe_for_numeric_api(). Since this is intrusive and a real fix should be released in the near future, this approach is not recommended.

Long Term Fix

The real fix for the bug appears to be to redefine the semantics of numarray's PyArrayObject ->data pointer to include ->byteoffset, altering the C-API. This should make most existing Numeric compatible extension functions work without modification or recompilation, but will necessitate the re-compilation of some extension functions written using the native numarray API approaches (the NA_* functions and macros). This recompilation will be required because key macros will change, most notably NA_OFFSETDATA. This fix is not the only possible one, and other suggestions are welcome, but changing the semantics of ->data appears to be the best way to facilitate numarray/Numeric interoperability. By doing this fix, numarray operates more like Numeric so fewer changes need to be made in the future to perform ports of Numeric code to numarray.

Impact of Proposed Fix

Regrettably, the proposed fix will break binary compatibility for clients of the numarray-1.0 native C-API. So, extensions built using the numarray native C-API will need to be rebuilt for numarray-1.1. Extensions that have made direct access to PyArrayObject's ->data and require the original offsetless meaning will also need to change code for numarray-1.1. This is something we *really* wanted to avoid... it just isn't going to happen this time.

The Plan

The current plan is to fix the Numeric compatible API by changing the semantics of ->data and release numarray-1.1 relatively soon, hopefully within 2 weeks.
I'm sorry for any inconvenience this has caused numarray users. Regards, Todd Miller From zingale at ucolick.org Tue Jul 13 12:54:02 2004 From: zingale at ucolick.org (Mike Zingale) Date: Tue Jul 13 12:54:02 2004 Subject: [Numpy-discussion] differencing numarray arrays. Message-ID: Hi, I am trying to efficiently compute a difference of two 2-d flux arrays, as arises quite commonly in finite-difference/finite-volume methods. Ex: a = arange(64) a.shape = (8,8) I want to do create a new array, b, of shape such that b[i,j] = a[i,j] - a[i-1,j] for 1 <= i < 8 0 <= i < 8 I can obviously do this through loops, but this is quite slow. In IDL, which is often compared to numarray/python, this is simple to do with the shift() function, but I cannot find an efficient way to do it with numarray arrays. I tried defining a list i = range(8) im1[1:9] = im1[1:9] - 1 and indexing with im1, but this does not work. Any suggestions? For large array, this simple differencing in python is very expensive when using loops. Thanks, Mike ------------------------------------------------------------------------------ Michael Zingale UCO/Lick Observatory UCSC Santa Cruz, CA 95064 phone: (831) 459-5246 fax: (831) 459-5265 e-mail: zingale at ucolick.org web: http://www.ucolick.org/~zingale ``Don't worry head, the computer will do our thinking now'' -- Homer From tim.hochberg at cox.net Tue Jul 13 12:59:00 2004 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Jul 13 12:59:00 2004 Subject: [Numpy-discussion] differencing numarray arrays. In-Reply-To: References: Message-ID: <40F43EC4.70903@cox.net> Mike Zingale wrote: >Hi, I am trying to efficiently compute a difference of two 2-d flux >arrays, as arises quite commonly in finite-difference/finite-volume >methods. Ex: > >a = arange(64) >a.shape = (8,8) > >I want to do create a new array, b, of shape such that > >b[i,j] = a[i,j] - a[i-1,j] > >for 1 <= i < 8 > 0 <= i < 8 > > That's supposed to be a j in the second eq., right? 
If I understand you right, what you want is: b = a[1:] - a[:-1] -tim >I can obviously do this through loops, but this is quite slow. In IDL, >which is often compared to numarray/python, this is simple to do with the >shift() function, but I cannot find an efficient way to do it with >numarray arrays. > >I tried defining a list > >i = range(8) >im1[1:9] = im1[1:9] - 1 > >and indexing with im1, but this does not work. > >Any suggestions? For large array, this simple differencing in python is >very expensive when using loops. > >Thanks, > >Mike > >------------------------------------------------------------------------------ >Michael Zingale >UCO/Lick Observatory >UCSC >Santa Cruz, CA 95064 > >phone: (831) 459-5246 >fax: (831) 459-5265 >e-mail: zingale at ucolick.org >web: http://www.ucolick.org/~zingale > >``Don't worry head, the computer will do our thinking now'' -- Homer > > > >------------------------------------------------------- >This SF.Net email sponsored by Black Hat Briefings & Training. >Attend Black Hat Briefings & Training, Las Vegas July 24-29 - >digital self defense, top technical experts, no vendor pitches, >unmatched networking opportunities. Visit www.blackhat.com >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > From rkern at ucsd.edu Tue Jul 13 13:01:04 2004 From: rkern at ucsd.edu (Robert Kern) Date: Tue Jul 13 13:01:04 2004 Subject: [Numpy-discussion] differencing numarray arrays. In-Reply-To: References: Message-ID: <40F43F65.9040208@ucsd.edu> Mike Zingale wrote: > Hi, I am trying to efficiently compute a difference of two 2-d flux > arrays, as arises quite commonly in finite-difference/finite-volume > methods. 
Ex: > > a = arange(64) > a.shape = (8,8) > > I want to do create a new array, b, of shape such that > > b[i,j] = a[i,j] - a[i-1,j] > > for 1 <= i < 8 > 0 <= i < 8 Try b = a[1:] - a[:-1] -- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From zingale at ucolick.org Tue Jul 13 13:42:02 2004 From: zingale at ucolick.org (Mike Zingale) Date: Tue Jul 13 13:42:02 2004 Subject: [Numpy-discussion] differencing numarray arrays. In-Reply-To: <40F44766.9010009@pfdubois.com> References: <40F44766.9010009@pfdubois.com> Message-ID: thanks, all these responses helped. I guess I was still a little unclear with the slicing abilities in numarray. Mike On Tue, 13 Jul 2004, Paul Dubois wrote: > Two of the responses to your question, while correct, might have seemed > mysterious to a beginner. > > a[1:] - a[:-1] > > is actually shorthand for: > > a[1:, :] - a[:-1, :] > > Or to be even more explicit: > > n = 8 > a[1:n, 0:n] - a[0:(n-1), 0:n] > > If you had wanted the difference in the second index, you have to use > the more explicit forms. > > > From rowen at u.washington.edu Tue Jul 13 17:11:49 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Tue Jul 13 17:11:49 2004 Subject: [Numpy-discussion] differencing numarray arrays. In-Reply-To: References: <40F44766.9010009@pfdubois.com> Message-ID: At 1:41 PM -0700 2004-07-13, Mike Zingale wrote: >thanks, all these responses helped. I guess I was still a little >unclear with the slicing abilities in numarray... Also note that there is a shift function: numarray.nd_image.shift In your case I suspect slicing is better, but there are times when one really does want to shift the data (e.g. when one wants the resulting array to be the same shape as the original). 
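The slicing idioms from the replies above can be written out as a short sketch (modern numpy here as a stand-in for Numeric/numarray; the idiom is identical):

```python
import numpy as np

a = np.arange(64).reshape(8, 8)

# difference along the first axis, shorthand for a[1:, :] - a[:-1, :];
# element [i, j] holds a[i+1, j] - a[i, j], so the result has 7 rows
b = a[1:] - a[:-1]

# difference along the second index needs the explicit form
c = a[:, 1:] - a[:, :-1]   # shape (8, 7)
```

For this particular a, consecutive rows differ by 8 everywhere and consecutive columns by 1, which makes the result easy to check by eye.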
-- Russell From kyeser at earthlink.net Tue Jul 13 19:35:39 2004 From: kyeser at earthlink.net (Hee-Seng Kye) Date: Tue Jul 13 19:35:39 2004 Subject: [Numpy-discussion] a 'for' loop within another 'for' loop? Message-ID: Hi. I wrote a program to calculate sums of every possible combinations of two indices of a list. The main body of the program looks something like this: r = [0,2,5,6,8] l = [] for x in range(0, len(r)): for y in range(0, len(r)): k = r[x]+r[y] l.append(k) print l 1. I've heard that it's not a good idea to have a 'for' loop within another 'for' loop, and I was wondering if there is a more efficient way to do this. 2. Does anyone know if there is a built-in function or module that would do the above task in NumPy or Numarray (or even in Python)? I would really appreciate it if anyone could let me know. Thanks for your help! -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/enriched Size: 715 bytes Desc: not available URL: From focke at slac.stanford.edu Tue Jul 13 22:02:08 2004 From: focke at slac.stanford.edu (Warren Focke) Date: Tue Jul 13 22:02:08 2004 Subject: [Numpy-discussion] a 'for' loop within another 'for' loop? In-Reply-To: References: Message-ID: l = Numeric.add.outer(r, r).flat oughta do the trick. Should work for numarray, too. On Tue, 13 Jul 2004, Hee-Seng Kye wrote: > Hi. I wrote a program to calculate sums of every possible combinations > of two indices of a list. The main body of the program looks something > like this: > > r = [0,2,5,6,8] > l = [] > > for x in range(0, len(r)): > for y in range(0, len(r)): > k = r[x]+r[y] > l.append(k) > print l > > 1. I've heard that it's not a good idea to have a 'for' loop within > another 'for' loop, and I was wondering if there is a more efficient > way to do this. > > 2. Does anyone know if there is a built-in function or module that > would do the above task in NumPy or Numarray (or even in Python)? 
> > I would really appreciate it if anyone could let me know. > > Thanks for your help! From eric at enthought.com Tue Jul 13 22:09:01 2004 From: eric at enthought.com (eric jones) Date: Tue Jul 13 22:09:01 2004 Subject: [Numpy-discussion] ANN: Reminder -- SciPy 04 is coming up Message-ID: <40F4BF9E.8060103@enthought.com> Hey folks, Just a reminder that SciPy 04 is coming up. More information is here: http://www.scipy.org/wikis/scipy04 About the Conference and Keynote Speaker --------------------------------------------- The 1st annual *SciPy Conference* will be held this year at Caltech, September 2-3, 2004. As some of you may know, we've experienced great participation in two SciPy "Workshops" (with ~70 attendees in both 2002 and 2003) and this year we're graduating to a "conference." With the prestige of a conference comes the responsibility of a keynote address. This year, Jim Hugunin has answered the call and will be speaking to kickoff the meeting on Thursday September 2nd. Jim is the creator of Numeric Python, Jython, and co-designer of AspectJ. Jim is currently working on IronPython--a fast implementation of Python for .NET and Mono. Presenters ----------- We still have room for a few more standard talks, and there is plenty of room for lightning talks. Because of this, we are extending the abstract deadline until July 23rd. Please send your abstract to abstracts at scipy.org. Travis Oliphant is organizing the presentations this year. (Thanks!) Once accepted, papers and/or presentation slides are acceptable and are due by August 20, 2004. Registration ------------- Early registration ($100.00) has been extended to July 23rd. Follow the links off of the main conference site: http://www.scipy.org/wikis/scipy04 After July 23rd, registration will be $150.00. Registration includes breakfast and lunch Thursday & Friday and a very nice dinner Thursday night. Please register as soon as possible as it will help us in planning for food, room sizes, etc. 
Sprints -------- As of now, we really haven't had much of a call for coding sprints for the 3 days prior to SciPy 04. Below is the original announcement about sprints. If you would like to suggest a topic and see if others are interested, please send a message to the list. Otherwise, we'll forgo the sprints session this year. We're also planning three days of informal "Coding Sprints" prior to the conference -- August 30 to September 1, 2004. Conference registration is not required to participate in the sprints. Please email the list, however, if you plan to attend. Topics for these sprints will be determined via the mailing lists as well, so please submit any suggestions for topics to the scipy-user list: list signup: http://www.scipy.org/mailinglists/ list address: scipy-user at scipy.org thanks, eric From kyeser at earthlink.net Tue Jul 13 23:30:13 2004 From: kyeser at earthlink.net (Hee-Seng Kye) Date: Tue Jul 13 23:30:13 2004 Subject: [Numpy-discussion] a 'for' loop within another 'for' loop? In-Reply-To: References: Message-ID: <34CF38C4-D55F-11D8-8504-000393479EE8@earthlink.net> Thank you so much. It works beautifully! On Jul 14, 2004, at 1:01 AM, Warren Focke wrote: > l = Numeric.add.outer(r, r).flat > oughta do the trick. Should work for numarray, too. > > On Tue, 13 Jul 2004, Hee-Seng Kye wrote: > >> Hi. I wrote a program to calculate sums of every possible >> combinations >> of two indices of a list. The main body of the program looks >> something >> like this: >> >> r = [0,2,5,6,8] >> l = [] >> >> for x in range(0, len(r)): >> for y in range(0, len(r)): >> k = r[x]+r[y] >> l.append(k) >> print l >> >> 1. I've heard that it's not a good idea to have a 'for' loop within >> another 'for' loop, and I was wondering if there is a more efficient >> way to do this. >> >> 2. Does anyone know if there is a built-in function or module that >> would do the above task in NumPy or Numarray (or even in Python)? 
>> >> I would really appreciate it if anyone could let me know. >> >> Thanks for your help! > From falted at pytables.org Wed Jul 14 02:37:06 2004 From: falted at pytables.org (Francesc Alted) Date: Wed Jul 14 02:37:06 2004 Subject: [Numpy-discussion] numarray-1.0 Bug Alert In-Reply-To: <1089740511.9509.372.camel@halloween.stsci.edu> References: <1089740511.9509.372.camel@halloween.stsci.edu> Message-ID: <200407141136.09436.falted@pytables.org> A Dimarts 13 Juliol 2004 19:41, Todd Miller va escriure: > The real fix for the bug appears to be to redefine the semantics of > numarray's PyArrayObject ->data pointer to include ->byteoffset, > altering the C-API. Oh well, I'm afraid that I'll be affected by that :(. Just to understand that fully, you mean that real data for an array will start in the future at narr->data, instead of narr->data+narr->byteoffset as it does now?
-- Francesc Alted From jmiller at stsci.edu Wed Jul 14 04:38:09 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Jul 14 04:38:09 2004 Subject: [Numpy-discussion] numarray-1.0 Bug Alert In-Reply-To: <200407141136.09436.falted@pytables.org> References: <1089740511.9509.372.camel@halloween.stsci.edu> <200407141136.09436.falted@pytables.org> Message-ID: <1089805021.3741.62.camel@localhost.localdomain> On Wed, 2004-07-14 at 05:36, Francesc Alted wrote: > A Dimarts 13 Juliol 2004 19:41, Todd Miller va escriure: > > The real fix for the bug appears to be to redefine the semantics of > > numarray's PyArrayObject ->data pointer to include ->byteoffset, > > altering the C-API. > > Oh well, I'm afraid that I'll be affected by that :(. Just to understand > that fully, you mean that real data for an array will start in the future at > narr->data, instead of narr->data+narr->byteoffset as it does now? That is the current plan. I was thinking developers could just replace the new narr->data with (narr->data - narr->byteoffset) if needed. I'm assuming the planned changes will cost at most a few edits and package redistribution, which I understand is still a major pain in the neck; let me know if the cost is higher than that for some reason. Regards, Todd From paul at pfdubois.com Wed Jul 14 05:57:07 2004 From: paul at pfdubois.com (Paul F. Dubois) Date: Wed Jul 14 05:57:07 2004 Subject: [Numpy-discussion] a 'for' loop within another 'for' loop? In-Reply-To: References: Message-ID: <40F52D8B.9050601@pfdubois.com> >>> add.reduce(take(r,indices([len(r),len(r)]))).flat array([ 0, 2, 5, 6, 8, 2, 4, 7, 8, 10, 5, 7, 10, 11, 13, 6, 8, 11, 12, 14, 8, 10, 13, 14, 16]) Always like a good challenge in the morning. God, it is like the old rush of writing APL. Hee-Seng Kye wrote: > Hi. I wrote a program to calculate sums of every possible combinations > of two indices of a list. 
The main body of the program looks something > like this: > > r = [0,2,5,6,8] > l = [] > > for x in range(0, len(r)): > for y in range(0, len(r)): > k = r[x]+r[y] > l.append(k) > print l > > 1. I've heard that it's not a good idea to have a 'for' loop within > another 'for' loop, and I was wondering if there is a more efficient way > to do this. > > 2. Does anyone know if there is a built-in function or module that would > do the above task in NumPy or Numarray (or even in Python)? > > I would really appreciate it if anyone could let me know. > > Thanks for your help! From Sebastien.deMentendeHorne at electrabel.com Wed Jul 14 08:41:09 2004 From: Sebastien.deMentendeHorne at electrabel.com (Sebastien.deMentendeHorne at electrabel.com) Date: Wed Jul 14 08:41:09 2004 Subject: [Numpy-discussion] a 'for' loop within another 'for' loop? Message-ID: <035965348644D511A38C00508BF7EAEB145CAF2A@seacex03.eib.electrabel.be> I could not resist to propose an other solution: r = array([0,2,5,6,8]) l = (r[:,NewAxis] + r[NewAxis,:]).flat -----Original Message----- From: Hee-Seng Kye [mailto:kyeser at earthlink.net] Sent: mercredi 14 juillet 2004 4:22 To: numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] a 'for' loop within another 'for' loop? Hi. I wrote a program to calculate sums of every possible combinations of two indices of a list. The main body of the program looks something like this: r = [0,2,5,6,8] l = [] for x in range(0, len(r)): for y in range(0, len(r)): k = r[x]+r[y] l.append(k) print l 1. I've heard that it's not a good idea to have a 'for' loop within another 'for' loop, and I was wondering if there is a more efficient way to do this. 2. Does anyone know if there is a built-in function or module that would do the above task in NumPy or Numarray (or even in Python)? I would really appreciate it if anyone could let me know. Thanks for your help! -------------- next part -------------- An HTML attachment was scrubbed... 
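The three vectorized answers in this thread all agree with the original nested loops. A sketch in modern NumPy, where Numeric's NewAxis is spelled np.newaxis and .flat is replaced by ravel() to get a fresh flattened array:

```python
import numpy as np

r = np.array([0, 2, 5, 6, 8])

# Nested-loop reference, as in the original post
loops = [r[x] + r[y] for x in range(len(r)) for y in range(len(r))]

# Warren's outer-product form
outer = np.add.outer(r, r).ravel()

# Sebastien's broadcasting form (NewAxis -> np.newaxis)
bcast = (r[:, np.newaxis] + r[np.newaxis, :]).ravel()

assert (outer == loops).all() and (bcast == loops).all()
```

Both vectorized forms build the full len(r) x len(r) table of pairwise sums in one shot, which is exactly what the double for loop accumulates one element at a time.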
URL: From rowen at u.washington.edu Wed Jul 14 08:48:07 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Wed Jul 14 08:48:07 2004 Subject: [Numpy-discussion] How to median filter a masked array? Message-ID: I want to 3x3 median filter a masked array (2-d array of ints -- an astronomical image), where the masked data and points off the edge are excluded from the local median calculation. Any suggestions for how to do this efficiently? I suspect I have to write it in C, which is an unpleasant prospect. I tried using NaN for points to mask out, but the median filter seems to handle those as "infinity", or something equally inappropriate. In a related vein, has Python come along far enough that it would be reasonable to add support for NaN to numarray -- in the sense that statistics calculations, filters, etc. could be convinced to ignore NaNs? Obviously this support would be contingent on compiling python with IEEE floating point support, but I suspect that's the default on most platforms these days. -- Russell From jdhunter at ace.bsd.uchicago.edu Wed Jul 14 09:51:12 2004 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Wed Jul 14 09:51:12 2004 Subject: [Numpy-discussion] ANN matplotlib-0.60.2: python graphs and charts Message-ID: matplotlib is a 2D plotting library for python. You can use matplotlib interactively from a python shell or IDE, or embed it in GUI applications (WX, GTK, and Tkinter). matplotlib supports many plot types: line plots, bar charts, log plots, images, pseudocolor plots, legends, date plots, finance charts and more. What's new since matplotlib 0.50 This is the first wide release in 5 months and there has been a tremendous amount of development since then, with new backends, many optimizations, new plotting types, new backends and enhanced text support. See http://matplotlib.sourceforge.net/whats_new.html for details. 
* Todd Miller's tkinter backend (tkagg) with good support for interactive plotting using the standard python shell, ipython or others. matplotlib now runs on windows out of the box with python + numeric/numarray * Full Numeric / numarray integration with Todd Miller's numerix module. Prebuilt installers for numeric and numarray on win32. Others, please set your numerix settings before building matplotlib, as described on http://matplotlib.sourceforge.net/faq.html#NUMARRAY * Mathtext: you can write TeX style math expressions anywhere in your figure. http://matplotlib.sourceforge.net/screenshots.html#mathtext_demo. * Images - figure and axes images with optional interpolated resampling, alpha blending of multiple images, and more with the imshow and figimage commands. Interactive control of colormaps, intensity scaling and colorbars - http://matplotlib.sourceforge.net/screenshots.html#layer_images * Text: freetype2 support, newline separated strings with arbitrary rotations, Paul Barrett's cross platform font manager. http://matplotlib.sourceforge.net/screenshots.html#align_text * Jared Wahlstrand's SVG backend (alpha) * Support for popular financial plot types - http://matplotlib.sourceforge.net/screenshots.html#finance_work2 * Many optimizations and extension code to remove performance bottlenecks. pcolors and scatters are an order of magnitude faster. * GTKAgg, WXAgg, TkAgg backends for http://antigrain.com (agg) rendering in the GUI canvas. Now all the major GUIs (WX, GTK, Tk) can be used with a common (agg) renderer. * Many new examples and demos - see http://matplotlib.sf.net/examples or download the src distribution and look in the examples dir. Documentation and downloads available at http://matplotlib.sourceforge.net. John Hunter From verveer at embl-heidelberg.de Wed Jul 14 10:39:59 2004 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Wed Jul 14 10:39:59 2004 Subject: [Numpy-discussion] How to median filter a masked array?
In-Reply-To: References: Message-ID: <1122AA7E-D5B4-11D8-8510-000A95C92C8E@embl-heidelberg.de> On 14 Jul 2004, at 17:47, Russell E Owen wrote: > I want to 3x3 median filter a masked array (2-d array of ints -- an > astronomical image), where the masked data and points off the edge are > excluded from the local median calculation. Any suggestions for how to > do this efficiently? I don't think that you can do it very efficiently right now with the functions that are available in numarray. > I suspect I have to write it in C, which is an unpleasant prospect. Yes, that is unpleasant, trust me :-) However, in version 1.0 of numarray in the nd_image package, I have added some support for writing filter functions. The generic_filter() function iterates over the array and applies a user-defined filter function at each element. The user-defined function can be written in python or in C, and is called at each element with the values within the filter-footprint as an argument. You would write a function that finds the median of these values, excluding the NaNs (or whatever value that flags the mask.) I would suggest to prototype this function in python and move that to C as soon as it works to your satisfaction. See the numarray manual for more details. Cheers, Peter From rowen at u.washington.edu Wed Jul 14 10:44:39 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Wed Jul 14 10:44:39 2004 Subject: [Numpy-discussion] How to median filter a masked array? In-Reply-To: <40F56462.2030000@pfdubois.com> References: <40F56462.2030000@pfdubois.com> Message-ID: At 9:50 AM -0700 2004-07-14, Paul F. Dubois wrote: >The median filter is prepared to take an argument of a numarray >array but ignorant of and unprepared to deal with masked values. >Using the __array__ trick, both Numeric.MA and numarray.ma would >'know' this and therefore replace the missing values in the filter's >argument with the 'fill value' for that type -- a big number in the >case of real arrays. 
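As a concrete starting point for the prototype Peter suggests, here is a pure-Python/NumPy 3x3 masked median. The name masked_median3 and the separate boolean-mask argument are choices of this sketch, not an existing numarray API, and edge pixels simply see a smaller window:

```python
import numpy as np

def masked_median3(img, mask):
    """3x3 median filter that skips masked pixels and points off the edge.

    img  -- 2-D array; mask -- boolean array, True where data is invalid.
    Prototype only: a production version would be moved to C, as the
    thread discusses.
    """
    out = np.empty(img.shape, dtype=float)
    ny, nx = img.shape
    for i in range(ny):
        for j in range(nx):
            # Clip the 3x3 footprint at the array edges
            i0, i1 = max(i - 1, 0), min(i + 2, ny)
            j0, j1 = max(j - 1, 0), min(j + 2, nx)
            window = img[i0:i1, j0:j1]
            good = window[~mask[i0:i1, j0:j1]]
            # If every neighbour is masked, there is nothing to take
            out[i, j] = np.median(good) if good.size else np.nan
    return out
```

On a 3x3 ramp np.arange(9).reshape(3, 3), the centre output is 4.0; masking the corner value 0 shifts it to 4.5, showing the masked pixel really is excluded rather than filled.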
You could explicitly choose that value (say >using the overall median of the data m) by passing x.filled(m) >rather than x to the filter. > >If there is no such value, you probably do have to do it in C. If >you wrote it in C, how would you treat missing elements? BTW it >wouldn't be that hard; just pass both the array and its mask as >separate elements to a C routine and use SWIG to hook it up. I already have routines that handle masked data in C to create a radial profiles from 2-d integer data (since I could not figure out how to do that in numarray). I chose to pass the mask as a separate array, since I could not find any C interface for numarray.ma and since NaN made no sense for integer data. That code was pretty straightforward. I wish I could have found a simple way to support multiple array types. I thought using C++ with prototypes would be the ticket, but absent any examples and after looking through the numarray code, I gave up and took the easy way out. (I didn't use SWIG, though, I just hand coded everything. Maybe that was a mistake.) I confess that makes me worry about the underpinnings of numarray. It seems an obvious candidate to be written in C++ with prototypes. I hate to think what the developers have to go through, instead. In any case, writing a median filter is a bigger deal than taking a radial profile, and since one already existed I thought I'd ask. >I doubt NaN would help you here; you'd still have to figure out what >to do in those places. Numeric did not have support for NaN because >there were portability problems. Probably still are. And you still >are stuck in a lot of cases anyway. Well, NaN isn't very general in any case, since it's meaningless for integer data. So maybe that's a red herring. (Though if NaN had worked to mask data I would cheerfully have converted my images to floats to take advantage of it!). What's really wanted is a more unified approach to masked data. 
I suppose it's pie in the sky, but I sure wish most the numarray functions took an optional mask array (or accepted a numarray.ma object -- nice for the user, but probably too painful for words under the hood). I don't think there are major issues with what to do with masked data. Simply ignoring it works in most cases, e.g. mean, std dev, sum, max... In some cases one needs the new mask as output (e.g. matrix multiply). Filtering is a bit subtle: can masked data be treated the same as data off the edge? I hope so, but I'm not sure. Anyway, I am grateful for what we do have. Without Numeric or numarray I would have to write all my image processing code in a different language. -- Russell From gazzar at email.com Wed Jul 14 21:00:03 2004 From: gazzar at email.com (Gary Ruben) Date: Wed Jul 14 21:00:03 2004 Subject: [Numpy-discussion] sum() and mean() broken? Message-ID: <20040715035046.C8BFE1535C5@ws3-1.us4.outblaze.com> I'm getting tracebacks on even the most basic sum() and mean() calls in numarray 1.0 under Windows. Apologies if this has already been reported. Gary >>> from numarray import * >>> arange(10).sum() Traceback (most recent call last): File "", line 1, in -toplevel- arange(10).sum() File "C:\APPS\PYTHON23\Lib\site-packages\numarray\numarraycore.py", line 1106, in sum return ufunc.add.reduce(ufunc.add.areduce(self, type=type).flat, type=type) error: Int32asInt64: buffer not aligned on 8 byte boundary. -- _______________________________________________ Talk More, Pay Less with Net2Phone Direct(R), up to 1500 minutes free! http://www.net2phone.com/cgi-bin/link.cgi?143 From jmiller at stsci.edu Thu Jul 15 06:18:04 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 15 06:18:04 2004 Subject: [Numpy-discussion] sum() and mean() broken? 
In-Reply-To: <20040715035046.C8BFE1535C5@ws3-1.us4.outblaze.com> References: <20040715035046.C8BFE1535C5@ws3-1.us4.outblaze.com> Message-ID: <1089897432.2637.34.camel@halloween.stsci.edu> numarray-1.0 is known to have problems with Windows-98, etc. (My guess is any Pre-NT windows). I haven't seen any problems with Windows XP or Windows 2000 Pro. Which windows variant are you running? Does the numarray selftest pass? It should look something like: >>> import numarray.testall as testall >>> testall.test() numarray: ((0, 1178), (0, 1178)) numarray.records: (0, 48) numarray.strings: (0, 176) numarray.memmap: (0, 82) numarray.objects: (0, 105) numarray.memorytest: (0, 16) numarray.examples.convolve: ((0, 20), (0, 20), (0, 20), (0, 20)) numarray.convolve: (0, 52) numarray.fft: (0, 75) numarray.linear_algebra: ((0, 46), (0, 51)) numarray.image: (0, 27) numarray.nd_image: (0, 390) numarray.random_array: (0, 53) numarray.ma: (0, 671) On Wed, 2004-07-14 at 23:50, Gary Ruben wrote: > I'm getting tracebacks on even the most basic sum() and mean() calls in numarray 1.0 under Windows. Apologies if this has already been reported. > Gary > > >>> from numarray import * > >>> arange(10).sum() > > Traceback (most recent call last): > File "", line 1, in -toplevel- > arange(10).sum() > File "C:\APPS\PYTHON23\Lib\site-packages\numarray\numarraycore.py", line 1106, in sum > return ufunc.add.reduce(ufunc.add.areduce(self, type=type).flat, type=type) > error: Int32asInt64: buffer not aligned on 8 byte boundary. -- From mathieu.gontier at fft.be Thu Jul 15 06:29:04 2004 From: mathieu.gontier at fft.be (Mathieu Gontier) Date: Thu Jul 15 06:29:04 2004 Subject: [Numpy-discussion] static void** libnumarray_API Message-ID: <200407151528.16261.mathieu.gontier@fft.be> Hello, I am developping FEM bendings from a C++ code to Python with Numarray. So, I have the following problem.
In the distribution file 'libnumarray.h', the variable 'libnumarray_API' is defined as a static variable (because of the symbol NO_IMPORT is not defined). Then, I understand that all the examples are implemented in a unique file. But, in my project, I must edit header files and source files in order to solve other problems (like cycle includes). So, I have two different source files which use numarray : - the file containing the 'init' function which call the function 'import_libnumarray()' (which initialize 'libnumarray_API') - a file containing implementations, more precisely an implementation calling numarray functionnalities: with is 'static' state, this 'libnumarray_API' is NULL... I tried to compile NumArray with the symbol 'NO_IMPORT' (see libnumarray.h) in order to have an extern variable. But this symbol doesn't allow to import numarray in the python environment. So, does someone have a solution allowing to use NumArray API with header/source files ? Thanks, Mathieu Gontier From curzio.basso at unibas.ch Thu Jul 15 07:22:01 2004 From: curzio.basso at unibas.ch (Curzio Basso) Date: Thu Jul 15 07:22:01 2004 Subject: [Numpy-discussion] NA.dot transposing in place Message-ID: <40F692CC.3000103@unibas.ch> Hi all. 
I wonder if anyone noticed the following behaviour (new in 1.0) of the dot/matrixmultiply functions: >>> alpha = NA.arange(10, shape = (10,1)) >>> beta = NA.arange(10, shape = (10,1)) >>> NA.dot(alpha, alpha) array([[285]]) >>> alpha.shape # here it looks like it's doing the transpose in place (1, 10) >>> NA.dot(beta, alpha) array([[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18], [ 0, 3, 6, 9, 12, 15, 18, 21, 24, 27], [ 0, 4, 8, 12, 16, 20, 24, 28, 32, 36], [ 0, 5, 10, 15, 20, 25, 30, 35, 40, 45], [ 0, 6, 12, 18, 24, 30, 36, 42, 48, 54], [ 0, 7, 14, 21, 28, 35, 42, 49, 56, 63], [ 0, 8, 16, 24, 32, 40, 48, 56, 64, 72], [ 0, 9, 18, 27, 36, 45, 54, 63, 72, 81]]) >>> alpha.shape, beta.shape # but not the second time ((1, 10), (10, 1)) ------------------------------------------------- Can someone explain me what's going on? thanks, curzio From jmiller at stsci.edu Thu Jul 15 07:36:11 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 15 07:36:11 2004 Subject: [Numpy-discussion] static void** libnumarray_API In-Reply-To: <200407151528.16261.mathieu.gontier@fft.be> References: <200407151528.16261.mathieu.gontier@fft.be> Message-ID: <1089902141.2637.61.camel@halloween.stsci.edu> On Thu, 2004-07-15 at 09:28, Mathieu Gontier wrote: > Hello, > > I am developping FEM bendings from a C++ code to Python with Numarray. > So, I have the following problem. > > In the distribution file 'libnumarray.h', the variable 'libnumarray_API' is > defined as a static variable (because of the symbol NO_IMPORT is not > defined). > > Then, I understand that all the examples are implemented in a unique file. > > But, in my project, I must edit header files and source files in order to > solve other problems (like cycle includes). 
So, I have two different source > files which use numarray : > - the file containing the 'init' function which call the function > 'import_libnumarray()' (which initialize 'libnumarray_API') > - a file containing implementations, more precisely an implementation calling > numarray functionnalities: with is 'static' state, this 'libnumarray_API' is > NULL... > > I tried to compile NumArray with the symbol 'NO_IMPORT' (see libnumarray.h) in > order to have an extern variable. But this symbol doesn't allow to import > numarray in the python environment. > > So, does someone have a solution allowing to use NumArray API with > header/source files ? The good news is that the 1.0 headers, at least, work. I intended to capture this form of multi-compilation-unit module in the numpy_compat example... but didn't. I think there's two "tricks" missing in the example. In *a* module of the several modules you're linking together, do the following: #define NO_IMPORT 1 /* This prevents the definition of the static version of the API var. The extern won't conflict with the real definition below. */ #include "libnumarray.h" void **libnumarray_API; /* This defines the missing API var for *all* your compilation units */ This variable will be assigned the API pointer by the import_libnumarray() call. I fixed the numpy_compat example to demonstrate this in CVS but they have a Numeric flavor. The same principles apply to libnumarray. Note that for numarray-1.0 you must include/import both the Numeric compatible and native numarray APIs separately if you use both. Regards, Todd From gazzar at email.com Thu Jul 15 07:37:01 2004 From: gazzar at email.com (Gary Ruben) Date: Thu Jul 15 07:37:01 2004 Subject: [Numpy-discussion] sum() and mean() broken? Message-ID: <20040715143500.2CD321CE306@ws3-6.us4.outblaze.com> Thanks Todd, It's under Win98 as you suspected and the selftest definitely doesn't pass. Are you planning on supporting Win98? If so, I'll revert to numarray 0.9. 
Otherwise, I'll just use Numeric for this task and restrict playing with numarray 1.0 to my Win2k laptop. thanks, Gary -- _______________________________________________ Talk More, Pay Less with Net2Phone Direct(R), up to 1500 minutes free! http://www.net2phone.com/cgi-bin/link.cgi?143 From jmiller at stsci.edu Thu Jul 15 07:38:00 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 15 07:38:00 2004 Subject: [Numpy-discussion] NA.dot transposing in place In-Reply-To: <40F692CC.3000103@unibas.ch> References: <40F692CC.3000103@unibas.ch> Message-ID: <1089902251.2637.64.camel@halloween.stsci.edu> On Thu, 2004-07-15 at 10:21, Curzio Basso wrote: > Hi all. > > I wonder if anyone noticed the following behaviour (new in 1.0) of the > dot/matrixmultiply functions: > > >>> alpha = NA.arange(10, shape = (10,1)) > > >>> beta = NA.arange(10, shape = (10,1)) > > >>> NA.dot(alpha, alpha) > array([[285]]) > > >>> alpha.shape # here it looks like it's doing the transpose in place > (1, 10) > > >>> NA.dot(beta, alpha) > array([[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], > [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], > [ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18], > [ 0, 3, 6, 9, 12, 15, 18, 21, 24, 27], > [ 0, 4, 8, 12, 16, 20, 24, 28, 32, 36], > [ 0, 5, 10, 15, 20, 25, 30, 35, 40, 45], > [ 0, 6, 12, 18, 24, 30, 36, 42, 48, 54], > [ 0, 7, 14, 21, 28, 35, 42, 49, 56, 63], > [ 0, 8, 16, 24, 32, 40, 48, 56, 64, 72], > [ 0, 9, 18, 27, 36, 45, 54, 63, 72, 81]]) > > >>> alpha.shape, beta.shape # but not the second time > ((1, 10), (10, 1)) > > ------------------------------------------------- > > Can someone explain me what's going on? It's a bug introduced in numarray-1.0. It'll be fixed for 1.1 in a couple weeks. Regards, Todd From jmiller at stsci.edu Thu Jul 15 07:49:14 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 15 07:49:14 2004 Subject: [Numpy-discussion] sum() and mean() broken? 
In-Reply-To: <20040715143500.2CD321CE306@ws3-6.us4.outblaze.com> References: <20040715143500.2CD321CE306@ws3-6.us4.outblaze.com> Message-ID: <1089902892.2637.75.camel@halloween.stsci.edu> On Thu, 2004-07-15 at 10:35, Gary Ruben wrote: > Thanks Todd, > It's under Win98 as you suspected and the selftest definitely doesn't pass. > Are you planning on supporting Win98? I'm planning to debug this particular problem because I'm concerned that it's just latent in the newer windows variants. To the degree that Win98 is "free" under the umbrella of win32, it will continue to be supported. An ongoing issue will likely be that Win98 testing doesn't get done on a regular basis... just as problems are reported. Regards, Todd From curzio.basso at unibas.ch Thu Jul 15 07:51:01 2004 From: curzio.basso at unibas.ch (Curzio Basso) Date: Thu Jul 15 07:51:01 2004 Subject: [Numpy-discussion] NA.dot transposing in place In-Reply-To: <1089902251.2637.64.camel@halloween.stsci.edu> References: <40F692CC.3000103@unibas.ch> <1089902251.2637.64.camel@halloween.stsci.edu> Message-ID: <40F6999C.2050101@unibas.ch> Todd Miller wrote: > It's a bug introduced in numarray-1.0. It'll be fixed for 1.1 in a > couple weeks. Ah, ok. Is it related with the bug announced a couple of days ago? From jmiller at stsci.edu Thu Jul 15 08:14:10 2004 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jul 15 08:14:10 2004 Subject: [Numpy-discussion] NA.dot transposing in place In-Reply-To: <40F6999C.2050101@unibas.ch> References: <40F692CC.3000103@unibas.ch> <1089902251.2637.64.camel@halloween.stsci.edu> <40F6999C.2050101@unibas.ch> Message-ID: <1089904417.2637.147.camel@halloween.stsci.edu> On Thu, 2004-07-15 at 10:50, Curzio Basso wrote: > Todd Miller wrote: > > > It's a bug introduced in numarray-1.0. It'll be fixed for 1.1 in a > > couple weeks. > > Ah, ok. Is it related with the bug announced a couple of days ago? Only peripherally. 
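For the record, the intended semantics that the 1.1 fix restored are that dot() computes the product without touching its operands' shapes. A sketch in modern NumPy, where the (1,10) by (10,1) product is written with an explicit transpose rather than numarray-1.0's buggy in-place one:

```python
import numpy as np

alpha = np.arange(10).reshape(10, 1)
beta = np.arange(10).reshape(10, 1)

inner = np.dot(alpha.T, beta)   # (1,10) . (10,1) -> (1,1)
outer = np.dot(beta, alpha.T)   # (10,1) . (1,10) -> (10,10)

assert inner[0, 0] == 285       # 0^2 + 1^2 + ... + 9^2
assert alpha.shape == (10, 1)   # operands are left untouched,
assert beta.shape == (10, 1)    # unlike the behaviour Curzio reports
```

In Curzio's transcript, alpha.shape silently became (1, 10) after the first dot() call; with the fixed semantics, a second NA.dot(beta, alpha) would be a shape mismatch rather than the 10x10 outer product he got by accident.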
The Numeric compatibility layer problem was discovered as a result of porting a bunch of Numeric functions to numarray... ports done to try to get better small array speed. Similarly, the setup for matrixmultiply was moved into C for numarray-1.0... to try to get better small array speed. numarray-1.0 is disappointingly buggy, but the interest generated by the 1.0 moniker is making the open source model work well so I think 1.1 will be much more solid as a result of strong user feedback. So, thanks for the report. Regards, Todd From cjw at sympatico.ca Thu Jul 15 08:22:07 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Thu Jul 15 08:22:07 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: <200407131106.19557.falted@pytables.org> References: <200407131028.04791.falted@pytables.org> <200407131106.19557.falted@pytables.org> Message-ID: <40F6A106.6020606@sympatico.ca> Francesc Alted wrote: >A Dimarts 13 Juliol 2004 10:28, Francesc Alted va escriure: > > >>A Dilluns 12 Juliol 2004 23:14, Perry Greenfield va escriure: >> >> >>>What I'm wondering about is what a single element of a record array >>>should be. Returning a tuple has an undeniable simplicity to it. >>> >>> >>Yeah, this why I'm strongly biased toward this possibility. >> >> >> >>>On the other hand, we've been using recarrays that allow naming the >>>various columns (which we refer to as "fields"). If one can refer >>>to fields of a recarray, shouldn't one be able to refer to a field >>>(by name) of one of it's elements? Or are you proposing that basic >>>recarrays not have that sort of capability (something added by a >>>subclass)? >>> >>> >>Well, I'm not sure about that. But just in case most of people would like to >>access records by field as well as by index, I would advocate for the >>possibility that the Record instances would behave as similar as possible as >>a tuple (or dictionary?). 
>>That includes creating appropriate __str__() *and*
>>__repr__() methods as well as a __getitem__() that supports both field names
>>and indices. I'm not sure about whether providing a __getattr__() method
>>would be ok, but for the sake of simplicity and in order to have (preferably)
>>only one way to do things, I would say no.
>
>I've been thinking that one way to reconcile returning a tuple for a
>single element of a RecArray with still being able to retrieve a field by
>name is to play with RecArray.__getitem__ and let it support key names
>in addition to indices. This would be better seen as an example:
>
>Right now, one can say:
>
>>>>r=records.array([(1,"asds", 24.),(2,"pwdw", 48.)], "1i4,1a4,1f8")
>>>>r._fields["c1"]
>array([1, 2])
>>>>r._fields["c1"][1]
>2
>
>What I propose is to be able to say:
>
>>>>r["c1"]
>array([1, 2])
>>>>r["c1"][1]

I would suggest going a step beyond this, so that one can have r.c1[1], see the script below. I have not explored the assignment of a value to r.c1[1], but it seems to be achievable. If changes along this line are acceptable, it is suggested that fields be renamed cols, or some such, to indicate the wider impact.

Colin W.

>2
>
>Which would replace the notation:
>
>>>>r[1]["c1"]
>2
>
>which was recently suggested.
>
>I.e. the suggestion is to realize RecArrays as a collection of columns,
>as well as a collection of rows.

# tRecord.py to explore RecArray
import numarray.records as _rec
import sys
#
class Rec1(_rec.RecArray):

    def __new__(cls, buffer, formats, shape=0, names=None, byteoffset=0,
                bytestride=None, byteorder=sys.byteorder, aligned=0):
        # This calls RecArray.__init__ - reason unclear.
        # Why can't the instance be fully created by RecArray.__init__?
        return _rec.RecArray.__new__(cls, buffer, formats=formats,
                                     shape=shape, names=names,
                                     byteorder=byteorder, aligned=aligned)

    def __init__(self, buffer, formats, shape=0, names=None, byteoffset=0,
                 bytestride=None, byteorder=sys.byteorder, aligned=0):
        arr = _rec.array(buffer, formats=formats, shape=shape, names=names,
                         byteorder=byteorder, aligned=aligned)
        self.__setstate__(arr.__getstate__())

    def __getattr__(self, name):
        # We reach here if the attribute does not belong to the basic Rec1 set
        return self._fields[name]

    def __getattribute__(self, name):
        return _rec.RecArray.__getattribute__(self, name)

    def __repr__(self):
        return self.__class__.__name__ + _rec.RecArray.__repr__(self)[8:]

    def __setattr__(self, name, value):
        return _rec.RecArray.__setattr__(self, name, value)

    def __str__(self):
        return self.__class__.__name__ + _rec.RecArray.__str__(self)[8:]

if __name__ == '__main__':
    # Francesc Alted 13-Jul-04 05:06
    r = _rec.array([(1, "asds", 24.), (2, "pwdw", 48.)], "1i4,1a4,1f8")
    print r._fields["c1"]
    print r._fields["c1"][1]
    r1 = Rec1([(1, "asds", 24.), (2, "pwdw", 48.)], "1i4,1a4,1f8")
    print r1._fields["c1"]
    print r1._fields["c1"][1]
    # r1.zz= 99 # acceptable
    print r1.c1
    print r1.c1[1]
    try:
        x = r1.ugh
    except:
        print 'ugh not recognized as an attribute'

'''
The above delivers:
[1 2]
2
[1 2]
2
[1 2]
2
ugh not recognized as an attribute
'''

From falted at pytables.org Thu Jul 15 09:12:08 2004
From: falted at pytables.org (Francesc Alted)
Date: Thu Jul 15 09:12:08 2004
Subject: [Numpy-discussion] RecArray.tolist() suggestion
In-Reply-To: <40F6A106.6020606@sympatico.ca>
References: <200407131106.19557.falted@pytables.org> <40F6A106.6020606@sympatico.ca>
Message-ID: <200407151811.20359.falted@pytables.org>

On Thursday 15 July 2004 17:21, Colin J. Williams wrote:
> >What I propose is to be able to say:
> >>>>r["c1"][1]
> I would suggest going a step beyond this, so that one can have r.c1[1],
> see the script below.

Yeah.
I've implemented something similar to access column elements for pytables Table objects. However, the problem in this case is that there are already attributes that "pollute" the column namespace, so that a column named "size" collides with the size() method.

I came up with a solution by adding a new "cols" attribute to the Table object that is an instance of a simple class named Cols with no attributes that can pollute the namespace (except some starting with "__" or "_v_"). Then, it is just a matter of providing functionality to access the different columns. In that case, when a column is referenced, another object (an instance of the Column class) is returned. This Column object is basically an accessor to column values with __getitem__() and __setitem__() methods. That might sound complicated, but it is not. I'm attaching part of the relevant code below.

I personally like that solution in the context of pytables because it extends the "natural naming" convention quite naturally. A similar approach could be applied to RecArray objects as well, although numarray might (and probably does) have other usage conventions.

> I have not explored the assignment of a value to r.c1[1], but it seems
> to be achievable.

In the scheme I've just proposed, the following should be feasible:

value = r.cols.c1[1]
r.cols.c1[1] = value

--
Francesc Alted

-----------------------------------------------------------------
class Cols(object):
    """This is a container for columns in a table

    It provides methods to get Column objects that give access to the
    data in the column.

    Like with Group instances and AttributeSet instances, the natural
    naming is used, i.e. you can access the columns on a table as if
    they were normal Cols attributes.

    Instance variables:

        _v_table -- The parent table instance
        _v_colnames -- List with all column names

    Methods:

        __getitem__(colname)
    """

    def __init__(self, table):
        """Create the container to keep the column information.

        table -- The parent table
        """
        self.__dict__["_v_table"] = table
        self.__dict__["_v_colnames"] = table.colnames
        # Put the column in the local dictionary
        for name in table.colnames:
            self.__dict__[name] = Column(table, name)

    def __len__(self):
        return self._v_table.nrows

    def __getitem__(self, name):
        """Get the column named "name" as an item."""
        if not isinstance(name, types.StringType):
            raise TypeError, \
"Only strings are allowed as keys of a Cols instance. You passed object: %s" % name
        # If the column does not exist, raise an AttributeError
        if not name in self._v_colnames:
            raise AttributeError, \
"Column name '%s' does not exist in table:\n'%s'" % (name, str(self._v_table))
        return self.__dict__[name]

    def __str__(self):
        """The string representation for this object."""
        # The pathname
        pathname = self._v_table._v_pathname
        # Get this class name
        classname = self.__class__.__name__
        # The number of columns
        ncols = len(self._v_colnames)
        return "%s.cols (%s), %s columns" % (pathname, classname, ncols)

    def __repr__(self):
        """A detailed string representation for this object."""
        out = str(self) + "\n"
        for name in self._v_colnames:
            # Get this class name
            classname = getattr(self, name).__class__.__name__
            # The shape for this column
            shape = self._v_table.colshapes[name]
            # The type
            tcol = self._v_table.coltypes[name]
            if shape == 1:
                shape = (1,)
            out += "  %s (%s%s, %s)" % (name, classname, shape, tcol) + "\n"
        return out


class Column(object):
    """This is an accessor for the actual data in a table column

    Instance variables:

        table -- The parent table instance
        name -- The name of the associated column

    Methods:

        __getitem__(key)
    """

    def __init__(self, table, name):
        """Create the container to keep the column information.

        table -- The parent table instance
        name -- The name of the column that is associated with this object
        """
        self.table = table
        self.name = name
        # Check whether an index exists or not
        iname = "_i_" + table.name + "_" + name
        self.index = None
        if iname in table._v_parent._v_indices:
            self.index = Index(where=self, name=iname,
                               expectedrows=table._v_expectedrows)
        else:
            self.index = None

    def __getitem__(self, key):
        """Returns a column element or slice

        It takes different actions depending on the type of the "key"
        parameter:

        If "key" is an integer, the corresponding element in the column
        is returned as a NumArray/CharArray, or a scalar object,
        depending on its shape. If "key" is a slice, the row slice
        determined by this slice is returned as a NumArray or CharArray
        object (whatever is appropriate).
        """
        if isinstance(key, types.IntType):
            if key < 0:
                # To support negative values
                key += self.table.nrows
            (start, stop, step) = processRange(self.table.nrows, key, key+1, 1)
            return self.table._read(start, stop, step, self.name, None)[0]
        elif isinstance(key, types.SliceType):
            (start, stop, step) = processRange(self.table.nrows, key.start,
                                               key.stop, key.step)
            return self.table._read(start, stop, step, self.name, None)
        else:
            raise TypeError, "'%s' key type is not valid in this context" % \
                  (key)

    def __str__(self):
        """The string representation for this object."""
        # The pathname
        pathname = self.table._v_pathname
        # Get this class name
        classname = self.__class__.__name__
        # The shape for this column
        shape = self.table.colshapes[self.name]
        if shape == 1:
            shape = (1,)
        # The type
        tcol = self.table.coltypes[self.name]
        return "%s.cols.%s (%s%s, %s)" % (pathname, self.name, classname,
                                          shape, tcol)

    def __repr__(self):
        """A detailed string representation for this object."""
        return str(self)

From perry at stsci.edu Thu Jul 15 10:39:06 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Thu Jul 15 10:39:06 2004
Subject: [Numpy-discussion] RecArray.tolist() suggestion
In-Reply-To:
<200407151811.20359.falted@pytables.org> Message-ID:

Francesc Alted wrote:
> On Thursday 15 July 2004 17:21, Colin J. Williams wrote:
> > >What I propose is to be able to say:
> > >>>>r["c1"][1]
> > I would suggest going a step beyond this, so that one can have r.c1[1],
> > see the script below.
>
> Yeah. I've implemented something similar to access column elements for
> pytables Table objects. However, the problem in this case is that there are
> already attributes that "pollute" the column namespace, so that a column
> named "size" collides with the size() method.
>
The idea of mapping field names to attributes occurs to everyone quickly, but for the reasons Francesc gives (as well as another I'll mention) we were reluctant to implement it. The other reason is that it would be nice to allow field names that are not legal attributes (e.g., that include spaces or other illegal attribute characters). There are potentially people with data in databases or other similar formats that would like to map field names exactly. While one can certainly still use the attribute approach and not support all field names (or column, or col...), it does introduce another glitch in the user interface when it works only for a subset of legal names.

> I came up with a solution by adding a new "cols" attribute to the Table
> object that is an instance of a simple class named Cols with no attributes
> that can pollute the namespace (except some starting by "__" or "_v_").
> Then, it is just a matter of provide functionality to access the different
> columns. In that case, when a reference of a column is made, another object
> (instance of Column class) is returned. This Column object is basically an
> accessor to column values with a __getitem__() and __setitem__() methods.
> That might sound complicated, but it is not. I'm attaching part of the
> relevant code below.
> > I personally like that solution in the context of pytables because it
> > extends the "natural naming" convention quite naturally. A similar approach
> > could be applied to RecArray objects as well, although numarray might (and
> > probably do) have other usage conventions.
> >
> > > I have not explored the assignment of a value to r.c1.[1], but it seems
> > > to be achievable.
> >
> > in the schema I've just proposed the next should be feasible:
> >
> > value = r.cols.c1[1]
> > r.cols.c1[1] = value
>
This solution avoids name collisions but doesn't handle the other problem. This is worth considering, but I thought I'd hear comments about the other issue before deciding on it (there is also the "more than one way" issue as well; but this guideline seems to bend quite often to pragmatic concerns).

We're still chewing on all the other issues and plan to start floating some proposals, rationales and questions before long.

Perry

From falted at pytables.org Thu Jul 15 11:21:10 2004
From: falted at pytables.org (Francesc Alted)
Date: Thu Jul 15 11:21:10 2004
Subject: [Numpy-discussion] RecArray.tolist() suggestion
In-Reply-To: References: Message-ID: <200407152020.00873.falted@pytables.org>

On Thursday 15 July 2004 19:37, Perry Greenfield wrote:
> formats that would like to map field name exactly. Well certainly
> one can still use the attribute approach and not support all field
> names (or column, or col...) it does introduce another glitch in
> the user interface when it works only for a subset of legal names.

Yep. I forgot that issue. My particular workaround on that was to provide an optional trMap dictionary during Table (in our case, RecArray) creation time to map those original names that are not valid python names to valid ones.
That would read something like:

>>> r=records.array([(1,"as")], "1i4,1a2", names=["c 1", "c2"], trMap={"c1": "c 1"})

That would indicate that the "c 1" column, which is not a valid python name (it has a space in the middle), can be accessed using the string "c1", which is a valid python id. That way, r.cols.c1 would access column "c 1". And although I must admit that this solution is not very elegant, it allows one to cope with those situations where the column names are not valid python names.

--
Francesc Alted

From cjw at sympatico.ca Thu Jul 15 17:22:42 2004
From: cjw at sympatico.ca (Colin J. Williams)
Date: Thu Jul 15 17:22:42 2004
Subject: [Numpy-discussion] RecArray.tolist() suggestion
In-Reply-To: References: Message-ID: <40F71F9C.9040008@sympatico.ca>

Perry Greenfield wrote:
>Francesc Alted wrote:
>>On Thursday 15 July 2004 17:21, Colin J. Williams wrote:
>>>>What I propose is to be able to say:
>>>>>>>r["c1"][1]
>>>I would suggest going a step beyond this, so that one can have r.c1[1],
>>>see the script below.
>>
>>Yeah. I've implemented something similar to access column elements for
>>pytables Table objects. However, the problem in this case is that
>>there are
>>already attributes that "pollute" the column namespace, so that a column
>>named "size" collides with the size() method.
>
>The idea of mapping field names to attributes occurs to everyone
>quickly, but for the reasons Francesc gives (as well as another I'll
>mention) we were reluctant to implement it. The other reason is that
>it would be nice to allow field names that are not legal attributes
>(e.g., that include spaces or other illegal attribute characters).
>There are potentially people with data in databases or other similar
>formats that would like to map field name exactly. Well certainly
>one can still use the attribute approach and not support all field
>names (or column, or col...)
>it does introduce another glitch in
>the user interface when it works only for a subset of legal names.

It would, I suggest, not be unduly restrictive to bar the existing attribute names but, if that's not acceptable, Francesc has suggested the .cols workaround, although I would prefer to avoid the added clutter. Incidentally, there is no current protection against wiping out an existing method:

[Dbg]>>> r1.size= 0
[Dbg]>>> r1.size
0
[Dbg]>>>

>>I came up with a solution by adding a new "cols" attribute to the Table
>>object that is an instance of a simple class named Cols with no attributes
>>that can pollute the namespace (except some starting by "__" or "_v_").
>>Then, it is just a matter of provide functionality to access the different
>>columns. In that case, when a reference of a column is made,
>>another object
>>(instance of Column class) is returned. This Column object is basically an
>>accessor to column values with a __getitem__() and __setitem__() methods.
>>That might sound complicated, but it is not. I'm attaching part of the
>>relevant code below.
>>
>>I personally like that solution in the context of pytables because it
>>extends the "natural naming" convention quite naturally. A
>>similar approach
>>could be applied to RecArray objects as well, although numarray might (and
>>probably do) have other usage conventions.
>>
>>>I have not explored the assignment of a value to r.c1.[1], but it seems
>>>to be achievable.
>>
>>in the schema I've just proposed the next should be feasible:
>>
>>value = r.cols.c1[1]
>>r.cols.c1[1] = value
>
>This solution avoids name collisions but doesn't handle the other
>problem. This is worth considering, but I thought I'd hear comments
>about the other issue before deciding it (there is also the
>"more than one way" issue as well; but this guideline seems to bend
>quite often to pragmatic concerns).
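Colin's observation above, that nothing stops an assignment like r1.size = 0 from silently replacing the size() method, could be addressed with a __setattr__ guard. A minimal pure-Python sketch of the idea; the class name and the set of protected method names are hypothetical, not numarray's actual RecArray:

```python
class GuardedRecord(object):
    """Hypothetical record-like object that refuses attribute
    assignments which would shadow an existing method."""

    # Assumed method names to protect; a real RecArray has many more.
    _reserved = frozenset(['size', 'tolist', 'field'])

    def __init__(self, **fields):
        # Bypass __setattr__ while installing the field dict itself.
        self.__dict__['_fields'] = dict(fields)

    def __setattr__(self, name, value):
        if name in self._reserved:
            raise AttributeError(
                "%r is a method name; assign via item syntax instead" % name)
        self._fields[name] = value

    def __getattr__(self, name):
        # Called only when normal lookup fails, so methods still win.
        try:
            return self.__dict__['_fields'][name]
        except KeyError:
            raise AttributeError(name)

    def size(self):
        return len(self._fields)
```

With such a guard, r1.size = 0 would raise AttributeError instead of wiping out the method, at the cost of making the reserved-name list part of the interface.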
> To allow for multi-word column names, assignment could replace a space by an underscore and, in retrieval, the reverse could be done - ie. underscore would be banned for a column name. Colin W. > >We're still chewing on all the other issues and plan to start floating >some proposals, rationales and questions before long. > >Perry > > > > From falted at pytables.org Fri Jul 16 02:12:11 2004 From: falted at pytables.org (Francesc Alted) Date: Fri Jul 16 02:12:11 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: <40F71F9C.9040008@sympatico.ca> References: <40F71F9C.9040008@sympatico.ca> Message-ID: <200407161111.41626.falted@pytables.org> A Divendres 16 Juliol 2004 02:21, Colin J. Williams va escriure: > To allow for multi-word column names, assignment could replace a space > by an underscore > and, in retrieval, the reverse could be done - ie. underscore would be > banned for a column name. That's not so easy. What about other chars like '/&%@$()' that cannot be part of python names? Finding a biunivocal map between them and allowed chars would be difficult (if possible at all). Besides, the resulting colnames might become a real mess. Regards, -- Francesc Alted From cjw at sympatico.ca Fri Jul 16 05:41:12 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Fri Jul 16 05:41:12 2004 Subject: [Numpy-discussion] RecArray.tolist() suggestion In-Reply-To: <200407161111.41626.falted@pytables.org> References: <40F71F9C.9040008@sympatico.ca> <200407161111.41626.falted@pytables.org> Message-ID: <40F7CBC6.2030607@sympatico.ca> Francesc Alted wrote: >A Divendres 16 Juliol 2004 02:21, Colin J. Williams va escriure: > > >>To allow for multi-word column names, assignment could replace a space >>by an underscore >>and, in retrieval, the reverse could be done - ie. underscore would be >>banned for a column name. >> >> > >That's not so easy. What about other chars like '/&%@$()' that cannot be >part of python names? 
>Finding a biunivocal map between them and allowed
>chars would be difficult (if possible at all). Besides, the resulting
>colnames might become a real mess.
>
>Regards,

Yes, if the objective is to include special characters or facilitate multi-lingual column names (and it probably should be), then my suggestion is quite inadequate.

Perhaps there could be a simple name -> column number mapping in place of _names. References to a column, or a field in a record, could then be through this dictionary. Basic access to data in a record would be by position number, rather than name, but the dictionary would facilitate access by name.

Data could be referenced either through the column name: r1.c2[1] or through the record r1[1].c2, with the possibility that the index is multi-dimensional in either case.

Colin W.

From rowen at u.washington.edu Fri Jul 16 10:55:23 2004
From: rowen at u.washington.edu (Russell E Owen)
Date: Fri Jul 16 10:55:23 2004
Subject: [Numpy-discussion] RecArray.tolist() suggestion
In-Reply-To: <200407161111.41626.falted@pytables.org>
References: <40F71F9C.9040008@sympatico.ca> <200407161111.41626.falted@pytables.org>
Message-ID:

>On Friday 16 July 2004 02:21, Colin J. Williams wrote:
>> To allow for multi-word column names, assignment could replace a space
>> by an underscore
>> and, in retrieval, the reverse could be done - ie. underscore would be
>> banned for a column name.
>
>That's not so easy. What about other chars like '/&%@$()' that cannot be
>part of python names? Finding a biunivocal map between them and allowed
>chars would be difficult (if possible at all). Besides, the resulting
>colnames might become a real mess.

Personally, I think the idea of allowing access to fields via attributes is fatally flawed.
The problems raised (non-obvious mapping between field names with special characters and allowable attribute names and also the collision with existing instance variable and method names) clearly show it would be forced and non-pythonic.

The obvious solution seems to be some combination of the dict interface (an ordered dict that keeps its keys in original field order) and the list interface. My personal leaning is:

- Offer most of the dict methods, including __get/setitem__, keys, values and all iterators, but NOT setdefault, popitem or anything else that adds or deletes a field.
- Offer the list version of __get/setitem__ as well, but NONE of list's methods.
- Make the default iterator iterate over values, not keys (field names), i.e. have the item act like a list, not a dict, when used as an iterator.

In other words, the following all work (where item is one element of a numarray.record array):

item[0] = 10 # set value of field 0 to 10
x = item[0:5] # get value of fields 0 through 4
item[:] = list of replacement values
item["afield"] = 10
"%(afield)s" % item

the methods iterkeys, itervalues, iteritems, keys, values, has_key all work;
the method update might work, but it's an error to add new fields

-- Russell

P.S. Folks are welcome to use my ordered dictionary implementation RO.Alg.OrderedDictionary, which is part of the RO package. It is fully standalone (despite its location in my hierarchy) and is used in production code.

From barrett at stsci.edu Fri Jul 16 11:49:01 2004
From: barrett at stsci.edu (Paul Barrett)
Date: Fri Jul 16 11:49:01 2004
Subject: [Numpy-discussion] RecArray.tolist() suggestion
In-Reply-To: References: <40F71F9C.9040008@sympatico.ca> <200407161111.41626.falted@pytables.org>
Message-ID: <40F822E0.5010406@stsci.edu>

Russell E Owen wrote:
>> A Divendres 16 Juliol 2004 02:21, Colin J.
Williams va escriure:
>>> To allow for multi-word column names, assignment could replace a space
>>> by an underscore
>>> and, in retrieval, the reverse could be done - ie. underscore would be
>>> banned for a column name.
>>
>> That's not so easy. What about other chars like '/&%@$()' that cannot be
>> part of python names? Finding a biunivocal map between them and allowed
>> chars would be difficult (if possible at all). Besides, the resulting
>> colnames might become a real mess.
>
> Personally, I think the idea of allowing access to fields via
> attributes is fatally flawed. The problems raised (non-obvious mapping
> between field names with special characters and allowable attribute
> names and also the collision with existing instance variable and
> method names) clearly show it would be forced and non-pythonic.

+1

It also makes it difficult to do the following:

a = item[:10, ('age', 'surname', 'firstname')]

where field (or column) 1 is 'firstname', field 2 is 'surname', and field 10 is 'age'.

-- Paul

--
Paul Barrett, PhD          Space Telescope Science Institute
Phone: 410-338-4475        ESS/Science Software Branch
FAX: 410-338-4767          Baltimore, MD 21218

From jmiller at stsci.edu Fri Jul 16 12:43:02 2004
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jul 16 12:43:02 2004
Subject: [Numpy-discussion] I move your "Bugs" reports...
Message-ID: <1090006936.7264.66.camel@halloween.stsci.edu>

Not infrequently even very experienced numarray contributors file bug reports in the numpy "Bugs" tracker. Because numpy is a shared SF project with both Numeric and numarray, numarray bugs are actually tracked in the "Numarray Bugs" tracker, here:

http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse

"Numarray Bugs" can also be found through the "Tracker" link at the top of any numpy SF web page. So, don't worry, your painstaking reports are not getting deleted, they're getting relocated to a place where *only* numarray bugs live.
There's probably a better way to do this, but until I find it or someone tells me about it, I thought I should tell everyone what's going on. Thanks to everybody who takes the time to fill out bug reports to make numarray better...

Regards,
Todd

From hsu at stsci.edu Fri Jul 16 13:19:00 2004
From: hsu at stsci.edu (Jin-chung Hsu)
Date: Fri Jul 16 13:19:00 2004
Subject: [Numpy-discussion] multidimensional record arrays
Message-ID: <200407162018.ANW09710@donner.stsci.edu>

There have been a number of questions and suggestions about how the record array facility in numarray could be improved. We've been talking about these internally and thought it would be useful to air some proposals, along with discussions of the rationale behind each proposal as well as discussions of drawbacks and some remaining open questions. Rather than do this in one long message, we will do this in pieces. The first addresses how to improve the handling of multidimensional record arrays. These will not discuss how or when we implement the proposed enhancements or changes. We first want to come to some consensus (or, lacking that, a decision) about what the target should be.

*********************************************************

Proposal for records module enhancement, to handle record arrays of dimension (rank) higher than 1.

Background:

The current records module in numarray doesn't handle record arrays of dimension higher than one well. Even though most of the infrastructure for higher dimensionality is already in place, the current implementation of record arrays was based on the implicit assumption that record arrays are 1-D. This limitation is reflected in the areas of the input user interface, indexing, and output. Indexing and output are more straightforward to modify, so I'll discuss them first. Although it is possible to create a multi-dimensional record array, indexing does not work properly for 2 or more dimensions.
For example, for a 2-D record array r, r[i,j] does not give the correct result (but r[i][j] does). This will be fixed. At present, a user cannot print record arrays higher than 1-D. This will also be fixed, as well as incorporating some numarray features (e.g., printing only the beginning and end of an array for large arrays, as is done for numarrays now).

Input Interface:

There are currently several different ways to construct a record array using the array() function. These include setting the buffer argument to:

(1) None
(2) File object
(3) String object or appropriate buffer object (i.e., binary data)
(4) a list of records (in the form of sequences), for example: [(1,'abc', 2.3), (2,'xyz', 2.4)]
(5) a list of numarrays/chararrays for each field (e.g., effectively 'zipping' the arrays into records)

The first three types of input are very general and can be used to generate multi-dimensional record arrays in the current implementation. All these options need to specify the "shape" argument. The input options that do not work for multi-dimensional record arrays now are the last two.

Option 4 (sequence of 'records')

If a user has a multi-dimensional record array and one or more fields are also multidimensional arrays, using this option is potentially confusing: there can be ambiguity regarding what part of a nested sequence structure is the structure of the record array and what should be considered part of the record, since record elements themselves may be arrays.
(Some of the same issues arise for object arrays.) As an example:

--> r=rec.array([([1,2],[3,4]),([11,12],[13,14])])

could be interpreted as a 1-D record array, where each cell is a (num)array:

RecArray[
(array([1, 2]), array([3, 4])),
(array([11, 12]), array([13, 14]))
]

or a 2-D record array, where each cell is just a number:

RecArray(
[[(1, 2), (3, 4)],
[(11, 12), (13, 14)]])

Thus we propose a new argument "rank" (following the convention used in object arrays) to specify the dimensionality of the output record array. In the first example above rank is 1, and in the second rank is 2. If rank is set to None, the highest possible rank will be assumed (in this example, 2).

We propose to eventually generalize that to accept any sequence object for the array structure (though there will be the same requirement that exists for other arrays, namely that the nested sequences be of the same type). As would be expected, strings are not permitted as the enclosing sequence. In this future implementation the record 'item' itself must be either:

1) A tuple
2) A subclass of tuple
3) A Record object (this may be taken care of by 2 if we make Record a subclass of tuple; this will be discussed in a subsequent proposal)

This requirement allows distinguishing the sequence of records from Option 5 below. For tuples (or tuple-derived elements), the items of the tuple must be one of the following: basic data types such as int, float, boolean, or string; a numarray or chararray; or an object that can be converted to a numarray or chararray.

Option 5 (List of Arrays)

Using a list of arrays to construct an N-D record array should be easier than using the previous option. The input syntax is simply:

[array1, array2, array3,...]

The shape of the record array will be determined from the shape of the input arrays as described below. All the user needs to do is to construct the arrays in the list.
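Both input options hinge on the same question: where does the record-array structure stop and the per-cell structure begin? Under the tuple-as-record requirement proposed for option 4, rank and shape inference become mechanical: descend through lists and stop at the first tuple. A rough pure-Python illustration (hypothetical helpers, not numarray's implementation):

```python
def infer_shape(data):
    """Shape of a nested-sequence record array under the proposed rule:
    descend through lists, stop at the first tuple (a record)."""
    if isinstance(data, list) and data:
        return (len(data),) + infer_shape(data[0])
    return ()  # a tuple (record) or a scalar: no further dimensions

def infer_rank(data):
    """Highest rank consistent with the tuple-as-record rule."""
    return len(infer_shape(data))
```

Note that under this reading the ambiguous example above disappears: [([1,2],[3,4]), ([11,12],[13,14])] infers rank 1 (the tuples mark the records, whose cells are arrays), while [[(1,2),(3,4)], [(11,12),(13,14)]] infers rank 2; an explicit "rank" argument is then only needed for inputs that do not follow the convention.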
There is, similar to option 4, a possible ambiguity: if all the arrays are of shape, say, (2,3), then the user may intend a 1-D record array of 2 rows where each cell is an array of shape (3,), or a 2-D record array of shape (2,3) where each cell is a single number or string. Thus, the user must explicitly specify either "shape" or "rank". We propose the following behavior via examples:

Example 1: given:

array1.shape=(2,3,4,5)
array2.shape=(2,3,4)
array3.shape=(2,3)

Rank can only be specified as rank=1 (the record array's shape will then be (2,)) or rank=2 (the record array's shape will then be (2,3)). For rank=None the record shape will be (2,3), i.e. the "highest common denominator": each cell in the first field will be an array of shape (4,5), each cell in the second field will be an array of shape (4,), and each cell in the 3rd field will be a single number or a string. If "shape" is specified, it will take precedence over "rank" and its allowed value in this example will be either 2, or (2,3).

Example 2:

array1.shape=(3,4,5)
array2.shape=(4,5)

This will raise an exception because the 'slowest' axes do not match.

*********

For both the sequence-of-records and list-of-arrays input options, we propose that the default value for "rank" be None (the current default is 1). This gives behavior consistent with object arrays but does change the current behavior. Also, for both cases, specifying a shape inconsistent with the supplied data will raise an exception.

From cjw at sympatico.ca Fri Jul 16 19:46:09 2004
From: cjw at sympatico.ca (Colin J. Williams)
Date: Fri Jul 16 19:46:09 2004
Subject: [Numpy-discussion] RecArray.tolist() suggestion
In-Reply-To: <40F822E0.5010406@stsci.edu>
References: <40F71F9C.9040008@sympatico.ca> <200407161111.41626.falted@pytables.org> <40F822E0.5010406@stsci.edu>
Message-ID: <40F892B2.7090706@sympatico.ca>

Paul Barrett wrote:
> Russell E Owen wrote:
>>> A Divendres 16 Juliol 2004 02:21, Colin J.
Williams va escriure:
>>>> To allow for multi-word column names, assignment could replace a
>>>> space
>>>> by an underscore
>>>> and, in retrieval, the reverse could be done - ie. underscore
>>>> would be
>>>> banned for a column name.
>>>
>>> That's not so easy. What about other chars like '/&%@$()' that
>>> cannot be
>>> part of python names? Finding a biunivocal map between them and allowed
>>> chars would be difficult (if possible at all). Besides, the resulting
>>> colnames might become a real mess.
>>
>> Personally, I think the idea of allowing access to fields via
>> attributes is fatally flawed. The problems raised (non-obvious
>> mapping between field names with special characters and allowable
>> attribute names and also the collision with existing instance
>> variable and method names) clearly show it would be forced and
>> non-pythonic.
>
> +1

Paul,

Below, I've appended my response to Francesc's 08:36 message; it was copied to the list but does not appear in the archive.

> It also make it difficult to do the following:
>
> a = item[:10, ('age', 'surname', 'firstname')]
>
> where field (or column) 1 is 'firstname, field 2 is 'surname', and
> field 10 is 'age'.
>
> -- Paul

Could you clarify what you have in mind here, please? Is this a proposed extension to records.py, as it exists in version 1.0?

Colin W.

------------------------------------------------------------------------

Yes, if the objective is to include special characters or facilitate multi-lingual column names (and it probably should be), then my suggestion is quite inadequate.

Perhaps there could be a simple name -> column number mapping in place of _names. References to a column, or a field in a record, could then be through this dictionary. Basic access to data in a record would be by position number, rather than name, but the dictionary would facilitate access by name.
Data could be referenced either through the column name: r1.c2[1] or through the record r1[1].c2, with the possibility that the index is multi-dimensional in either case. Colin W. From gerard.vermeulen at grenoble.cnrs.fr Sun Jul 18 14:25:10 2004 From: gerard.vermeulen at grenoble.cnrs.fr (gerard.vermeulen at grenoble.cnrs.fr) Date: Sun Jul 18 14:25:10 2004 Subject: [Numpy-discussion] Follow-up Numarray header PEP In-Reply-To: <1088632459.7526.213.camel@halloween.stsci.edu> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> Message-ID: <20040718212443.M21561@grenoble.cnrs.fr> Hi Todd, This is a follow-up on the 'header pep' discussion. The attachment numnum-0.1.tar.gz contains the sources for the extension modules pep and numnum. At least on my systems, both modules behave as described in the 'numarray header PEP' when the extension modules implementing the C-API are not present (a situation not foreseen by the macros import_array() of Numeric and especially numarray). IMO, my solution is 'bona fide', but requires further testing. The pep module shows how to handle the colliding C-APIs of the Numeric and numarray extension modules and how to implement automagical conversion between Numeric and numarray arrays. For a technical reason explained in the README, the hard work of doing the conversion between Numeric and numarray arrays has been delegated to the numnum module. The numnum module is useful when one needs to convert from one array type to the other to use an extension module which only exists for the other type (eg. combining numarray's image processing extensions with pygame's Numeric interface): Python 2.3+ (#1, Jan 7 2004, 09:17:35) [GCC 3.3.1 (SuSE Linux)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numnum; import Numeric as np; import numarray as na
>>> np1 = np.array([[1, 2], [3, 4]]); na1 = numnum.toNA(np1)
>>> na2 = na.array([[1, 2, 3], [4, 5, 6]]); np2 = numnum.toNP(na2)
>>> print type(np1); np1; type(np2); np2
array([[1, 2],
       [3, 4]])
array([[1, 2, 3],
       [4, 5, 6]],'i')
>>> print type(na1); na1; type(na2); na2
array([[1, 2],
       [3, 4]])
array([[1, 2, 3],
       [4, 5, 6]])
>>>

The pep module shows how to implement array processing functions which use the Numeric, numarray or Sequence C-API:

static PyObject *
wysiwyg(PyObject *dummy, PyObject *args)
{
    PyObject *seq1, *seq2;
    PyObject *result;

    if (!PyArg_ParseTuple(args, "OO", &seq1, &seq2))
        return NULL;

    switch(API) {
    case NumericAPI:
    {
        PyObject *np1 = NN_API->toNP(seq1);
        PyObject *np2 = NN_API->toNP(seq2);
        result = np_wysiwyg(np1, np2);
        Py_XDECREF(np1);
        Py_XDECREF(np2);
        break;
    }
    case NumarrayAPI:
    {
        PyObject *na1 = NN_API->toNA(seq1);
        PyObject *na2 = NN_API->toNA(seq2);
        result = na_wysiwyg(na1, na2);
        Py_XDECREF(na1);
        Py_XDECREF(na2);
        break;
    }
    case SequenceAPI:
        result = seq_wysiwyg(seq1, seq2);
        break;
    default:
        PyErr_SetString(PyExc_RuntimeError, "Should never happen");
        return 0;
    }

    return result;
}

See the README for an example session using the pep module showing that it is possible to pass a mix of Numeric and numarray arrays to pep.wysiwyg().

Notes:

- it is straightforward to adapt pep and numnum so that the conversion functions are linked into pep instead of imported.

- numnum is still 'proof of concept'. I am thinking about methods to make those techniques safer if the numarray (and Numeric?) header files never make it into the Python headers (or to make it safer to use those techniques with Python < 2.4). In particular it would be helpful if the numerical C-APIs exported an API version number, similar to the versioning scheme of shared libraries -- see the libtool->versioning info pages.
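The libtool scheme Gerard points to can be stated as a one-line compatibility test. A sketch of how an importing extension might check an exported (current, age) pair (the function and numbers are illustrative; neither Numeric nor numarray exported such a version at the time):

```python
def api_compatible(built_against, current, age):
    """libtool-style check: a C-API advertising (current, age) implements
    interface numbers current-age .. current, so an extension compiled
    against interface `built_against` should load only in that range."""
    return current - age <= built_against <= current

# A hypothetical numarray exporting current=3, age=1 would accept
# extensions built against interfaces 2 and 3, but reject 1 and 4.
print(api_compatible(2, 3, 1), api_compatible(4, 3, 1))  # True False
```

The same test done in C at import time would turn today's silent crashes on API mismatch into a clean ImportError.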
I am considering three possibilities to release a more polished version of numnum (3rd party extension writers may prefer to link rather than import numnum's functionality): 1. release it from PyQwt's project page 2. register an independent numnum project at SourceForge 3. hand numnum over to the Numerical Python project (frees me from worrying about API changes). Regards -- Gerard Vermeulen -------------- next part -------------- A non-text attachment was scrubbed... Name: numnum-0.1.tar.gz Type: application/gzip Size: 12851 bytes Desc: not available URL: From jmiller at stsci.edu Tue Jul 20 05:49:04 2004 From: jmiller at stsci.edu (Todd Miller) Date: Tue Jul 20 05:49:04 2004 Subject: [Numpy-discussion] Follow-up Numarray header PEP In-Reply-To: <20040718212443.M21561@grenoble.cnrs.fr> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040718212443.M21561@grenoble.cnrs.fr> Message-ID: <1090327693.3749.257.camel@localhost.localdomain> On Sun, 2004-07-18 at 17:24, gerard.vermeulen at grenoble.cnrs.fr wrote: > Hi Todd, > > This is a follow-up on the 'header pep' discussion. Great! I was afraid you were going to disappear back into the ether. Sorry I didn't respond to this yesterday... I saw it but accidentally marked it as "read" and then forgot about it as the day went on. > The attachment numnum-0.1.tar.gz contains the sources for the > extension modules pep and numnum. At least on my systems, both > modules behave as described in the 'numarray header PEP' when the > extension modules implementing the C-API are not present (a situation > not foreseen by the macros import_array() of Numeric and especially > numarray). For numarray, this was *definitely* foreseen at some point, so I'm wondering what doesn't work now... 
> IMO, my solution is 'bona fide', but requires further > testing. I'll look it over today or tomorrow and comment more then. > The pep module shows how to handle the colliding C-APIs of the Numeric > and numarray extension modules and how to implement automagical > conversion between Numeric and numarray arrays. Nice; the conversion code sounds like a good addition to me. > For a technical reason explained in the README, the hard work of doing > the conversion between Numeric and numarray arrays has been delegated > to the numnum module. The numnum module is useful when one needs to > convert from one array type to the other to use an extension module > which only exists for the other type (eg. combining numarray's image > processing extensions with pygame's Numeric interface): > > Python 2.3+ (#1, Jan 7 2004, 09:17:35) > [GCC 3.3.1 (SuSE Linux)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import numnum; import Numeric as np; import numarray as na > >>> np1 = np.array([[1, 2], [3, 4]]); na1 = numnum.toNA(np1) > >>> na2 = na.array([[1, 2, 3], [4, 5, 6]]); np2 = numnum.toNP(na2) > >>> print type(np1); np1; type(np2); np2 > > array([[1, 2], > [3, 4]]) > > array([[1, 2, 3], > [4, 5, 6]],'i') > >>> print type(na1); na1; type(na2); na2 > > array([[1, 2], > [3, 4]]) > > array([[1, 2, 3], > [4, 5, 6]]) > >>> > > The pep module shows how to implement array processing functions which > use the Numeric, numarray or Sequence C-API: > > static PyObject * > wysiwyg(PyObject *dummy, PyObject *args) > { > PyObject *seq1, *seq2; > PyObject *result; > > if (!PyArg_ParseTuple(args, "OO", &seq1, &seq2)) > return NULL; > > switch(API) { We'll definitely need to cover API in the PEP. There is a design choice here which needs to be discussed some and any resulting consensus documented. I haven't looked at the attachment yet. 
> case NumericAPI:
> {
>     PyObject *np1 = NN_API->toNP(seq1);
>     PyObject *np2 = NN_API->toNP(seq2);
>     result = np_wysiwyg(np1, np2);
>     Py_XDECREF(np1);
>     Py_XDECREF(np2);
>     break;
> }
> case NumarrayAPI:
> {
>     PyObject *na1 = NN_API->toNA(seq1);
>     PyObject *na2 = NN_API->toNA(seq2);
>     result = na_wysiwyg(na1, na2);
>     Py_XDECREF(na1);
>     Py_XDECREF(na2);
>     break;
> }
> case SequenceAPI:
>     result = seq_wysiwyg(seq1, seq2);
>     break;
> default:
>     PyErr_SetString(PyExc_RuntimeError, "Should never happen");
>     return 0;
> }
>
> return result;
> }
>
> See the README for an example session using the pep module showing that
> it is possible pass a mix of Numeric and numarray arrays to pep.wysiwyg().
>
> Notes:
>
> - it is straightforward to adapt pep and numnum so that the conversion
>   functions are linked into pep instead of imported.
>
> - numnum is still 'proof of concept'. I am thinking about methods to
>   make those techniques safer if the numarray (and Numeric?) header
>   files make it never into the Python headers (or make it safer to
>   use those techniques with Python < 2.4). In particular it would
>   be helpful if the numerical C-APIs export an API version number,
>   similar to the versioning scheme of shared libraries -- see the
>   libtool->versioning info pages.

I've thought about this a few times; there's certainly a need for it in numarray anyway... and I'm always one release too late. Thanks for the tip on libtool->versioning.

> I am considering three possibilities to release a more polished
> version of numnum (3rd party extension writers may prefer to link
> rather than import numnum's functionality):
>
> 1. release it from PyQwt's project page
> 2. register an independent numnum project at SourceForge
> 3. hand numnum over to the Numerical Python project (frees me from
>    worrying about API changes).
>
> Regards -- Gerard Vermeulen

(3) sounds best to me, for the same reason that numarray is a part of the numpy project and because numnum is a Numeric/numarray tool.
There is a small issue of sub-project organization (separate bug tracking, etc.), but I figure if SF can handle Python, it can handle Numeric, numarray, and probably a number of other packages as well. Something like numnum should not be a problem, and to promote it, it would be good to keep it where people can find it without having to look too hard. For now, I'm again marking your post as "unread" and will revisit it later this week. In the meantime, thanks very much for your efforts with numnum and the PEP. Regards, Todd

From perry at stsci.edu Tue Jul 20 09:05:02 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jul 20 09:05:02 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story In-Reply-To: Message-ID: We now turn to the behavior of Records. We'll note that many of the current proposals had been considered in the past but not implemented; we took more of a 'wait and see' attitude toward what was really necessary, wanting to avoid too many ways of doing the same thing before there was a real call for them. This proposal deals with the behavior of record array 'items', i.e., what we call Record objects now. The primary issues that have been raised with regard to Record behavior are summarized as follows:

1) Items should be tuples instead of Records.
2) Items should be objects, but present tuple- and/or dictionary-consistent behavior.
3) Field (or column) names should be accessible as Record (and record array) attributes.

Issue 1: Should record array items be tuples instead of Records?

Francesc Alted made this suggestion recently. Essentially the argument is that tuples are a natural way of representing records. Unfortunately, tuples do not provide a means of accessing fields of a record by name, but only by number. For this reason alone, tuples don't appear to be adequate. Francesc proposed allowing dictionary-like indexing of record arrays to facilitate access to tuple entries by name.
However, it seems that if rarr is a record array, both rarr['column 1'][2] and rarr[2]['column 1'] should work, not just the former. So the short answer is "No".

It should be noted that using tuples would force another change in current behavior. The current Record objects are actually views into the record array: changing a value within a record object changes the record array. Use of tuples won't allow that, since tuples are not mutable; if single elements of record arrays were set by and returned from tuples, whole records would have to be changed in their entirety. But his comments (as well as those of others) do point out a number of problems with the current implementation that could be improved, and making the Record object support tuple behaviors is quite reasonable. Hence:

Issue 2: Should record array items present tuple and/or dictionary compatible behaviors?

The short answer is, yes, we do agree that they should. This includes many of the proposals made, including:

1) supporting all tuple capabilities with the following differences:

a) fields are mutable (unlike tuple items) so long as the assigned value is coerceable to the expected type. For example, the current methods of doing so are:

>>> cell = oneRec.field(1)
>>> oneRec.setfield(1, newValue)

This proposal would allow:

>>> cell = oneRec[1]
>>> oneRec[1] = newValue

b) slice assignments are permitted so long as they don't change the size of the record (i.e., no insertion of extra items) and the items can be assigned as permitted for a).
E.g.,

>>> oneRec[2:4] = (3, 'abc')

c) __str__ will result in a display looking like that for tuples; __repr__ will show a Record constructor:

>>> print oneRec # as is currently implemented
(1.1, 2, 'abc', 3)
>>> oneRec
Record((1.1, 2, 'abc', 3), formats=['1Float32', '1Int16', '1a3', '1Int32'], names=['abc', 'c2', 'xyz', 'c4'])

(note that how best to handle formats is still being thought about)

2) supporting all dictionary capabilities with the following differences:

a) keys and items are ordered.
b) keys are restricted to being integers or strings only.
c) new keys cannot be dynamically added or deleted as for dictionaries.
d) no support for any other dictionary capabilities that can change the number or names of items.
e) __str__ will not show a result looking like a dictionary (see 1c).
f) values must meet the Record object's required type (or be coerceable to it).

For example, the current:

>>> cell = oneRec.field('c2')
>>> oneRec.setfield('c2', newValue)

And the proposed added indexing capability:

>>> cell = oneRec['c2']
>>> oneRec['c2'] = newValue

Issue 3: Field (or column) names should be accessible as Record (and record array) attributes.

As much as the attribute approach has appeal for simple usage, the problems of name collisions and of mismatches between acceptable field names and attribute names strike us, as they do Russell Owen, as very problematic. The technique Francesc suggests of using a special attribute (in his case, cols) that contains the field name attributes solves the name collision problem, but not the legality issue: particularly with regard to illegal characters, it's hard to imagine easily remembered mappings between legal attribute representations and the actual field names. We are inclined to pass (for now anyway) on mapping fields to attributes in any way.
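The tuple- and dictionary-style item access proposed under issues 1 and 2 can be sketched as one __getitem__/__setitem__ pair that dispatches on the key type. This is a simplified model, not the actual numarray implementation; type coercion is reduced to nothing and the names are illustrative:

```python
class Record:
    """Sketch of the proposed item behavior: integer keys and slices act
    like a tuple, string keys like an ordered, fixed-key dictionary."""
    def __init__(self, values, names):
        self._names = list(names)
        self._values = list(values)

    def _index(self, key):
        return self._names.index(key) if isinstance(key, str) else key

    def __getitem__(self, key):
        if isinstance(key, slice):
            return tuple(self._values[key])
        return self._values[self._index(key)]

    def __setitem__(self, key, value):
        if isinstance(key, slice):
            if len(value) != len(self._values[key]):   # point 1b: size fixed
                raise ValueError("slice assignment must preserve record size")
            self._values[key] = list(value)
        else:
            self._values[self._index(key)] = value

    def __str__(self):                                  # point 1c: tuple-like
        return str(tuple(self._values))

oneRec = Record((1.1, 2, 'abc', 3), names=('c1', 'c2', 'xyz', 'c4'))
oneRec[1] = 5             # tuple-style mutable field
oneRec['xyz'] = 'def'     # dictionary-style access
oneRec[2:4] = ('ghi', 7)  # slice assignment of matching size
print(oneRec)             # (1.1, 5, 'ghi', 7)
```

Keys stay ordered and fixed simply because the underlying lists never change length, matching points 2a-2d.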
It seems to us that indexing by name should be convenient enough, as well as flexible enough to really satisfy all needs. (Indexing by name is needed in any case, since attributes are a clumsy way to access a field when the field is specified by a variable; yes, one can use getattr(), but it's clumsy.)

*******************************************

Record array behavior changes:

1) It will be possible to assign any sequence to a record array item, so long as the sequence contains the right number of fields and each item of the sequence can be coerced to what the record array expects for the corresponding field of the record (addressing numarray feature request 928473 by Russell Owen). I.e.,

>>> recArr[1] = (2, 3.2, 'xyz', 3)

2) One may assign a record to a record array so long as the record matches the record format of the record array (current behavior).

3) Easier construction and initialization of recarrays with default field values, as requested in numarray bug report 928479.

4) Support for lists of field names and formats, as detailed in numarray bug report 928488.

5) Field name indexing for record arrays. It will be possible to index record arrays with a field name; i.e., if the index is a string, then what will be returned is a numarray/chararray for that column. (Note that it won't be possible to index record arrays by field number, for obvious reasons.) I.e., currently:

>>> col = recArr.field('abc')

Can also be:

>>> col = recArr['abc']

But the current:

>>> col = recArr.field(1)

Cannot become:

>>> col = recArr[1]

On the other hand, it will not be permitted to mix a field index with an array index in the same brackets; e.g., rarr[10, 'column 2'] will not be supported. Allowing indexing to have two different interpretations is a bit worrying. But if record array items may be indexed in this manner, it seems natural to permit the same indexing for the record array.
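Behavior change 5, together with the ban on mixed indices, can be modeled with a __getitem__ that treats string keys as field names and everything else as ordinary row indexing. A toy sketch, not the actual numarray machinery (names are illustrative):

```python
class RecArraySketch:
    """String index -> whole column; integer index -> row, as today."""
    def __init__(self, rows, names):
        self._rows = [list(r) for r in rows]
        self._names = list(names)

    def __getitem__(self, key):
        if isinstance(key, str):               # field-name indexing
            col = self._names.index(key)
            return [row[col] for row in self._rows]
        if isinstance(key, tuple):             # e.g. rarr[10, 'column 2']
            raise TypeError("mixing field and array indices is not supported")
        return self._rows[key]                 # plain row indexing

recArr = RecArraySketch([(1, 'a'), (2, 'b'), (3, 'c')], names=('num', 'txt'))
print(recArr['txt'])   # column by field name
print(recArr[1])       # row by position, unchanged
```

Field numbers stay unavailable as indices simply because integers already mean rows, which is the "obvious reason" cited above.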
Mixing the two kinds of indexing in one index seems of limited usefulness in the first place and it makes inheriting the existing indexing machinery for NDArrays more complicated (any efficiency gains in avoiding the intermediate object creation by using two separate index operations will likely be offset by the slowness of handling much more complicated mixed indices). Perhaps someone can argue for why mixing field indices with array indices is important, but for now we will prohibit this mode of indexing. This does point to a possible enhancement for the field indexing, namely being able to provide the equivalent of index arrays (e.g., a list of field names) to generate a new record array with a subset of fields. Are there any other issues that should be addressed for improving record arrays? From rowen at u.washington.edu Tue Jul 20 10:15:05 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Tue Jul 20 10:15:05 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story In-Reply-To: References: Message-ID: At 12:04 PM -0400 2004-07-20, Perry Greenfield wrote: >...(a detailed summary of proposed changes to numarray record arrays) +1 on all of it with one exception noted below. This sounds like a first-rate overhaul and is much appreciated. Will it be possible, when creating a new records array, to specify types of a record array as a list of normal numarray types? Currently one has to specify the types as a "formats" string, which is nonstandard. I'm unhappy about one proposal: >... >Record array behavior changes: >... >5) Field name indexing for record arrays. It will be possible to index >record arrays with a field name, i.e., if the index is a string, then what >will be returned is a numarray/chararray for that column. (Note that it >won't be possible to index record arrays by field number for obvious >reasons). > >I.e. 
Currently > >>>> col = recArr.field('doc') > >Can also be > >>>> col = recArr['abc'] > >But the current > >>>> col = recArr.field(1) > >Cannot become > >>>> col = recArr[1]

I think recarray[field name] is too easily confused with recarray[index] and is unnecessary. I suggest one of two solutions:

- Do nothing. Make users use field(field name or index)

or

- Allow access to the fields via an indexable entity. Simplest for the user would be to use "field" itself:

recArr.field[1]
recArr.field["abc"]

(i.e. field becomes an object that can be called or can be accessed via __getitem__) This could easily support index arrays (a topic you brought up and that sounds appealing to me):

recArr.field[index array]

and it might even be practical to support:

recArr.field[sequence of field indices and/or names]

e.g.

recArr.field[(ind 1, field name 2, ind 3...)]

You asked about other issues. One that comes to mind is record arrays of record arrays. Should they be allowed? My gut reaction is yes if it's not too hard. Folks always seem to find a use for generality if it's offered. On the other hand, if it's hard, it's not worth the effort. If they are allowed, users are going to want some efficient way to get to a particular field (i.e. in one call even if the field is several recArrays deep). That could get messy. Thanks for a great posting. The improvements to record arrays sound first-rate. -- Russell

From hsu at stsci.edu Wed Jul 21 11:53:40 2004 From: hsu at stsci.edu (Jin-chung Hsu) Date: Wed Jul 21 11:53:40 2004 Subject: [Numpy-discussion] formats in record array Message-ID: <200407211850.AOO09987@donner.stsci.edu> > From: Russell E Owen > Subject: Re: [Numpy-discussion] Proposed record array behavior: the rest > of the story > Will it be possible, when creating a new records array, to specify > types of a record array as a list of normal numarray types? Currently > one has to specify the types as a "formats" string, which is > nonstandard.
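Russell's suggestion above, that "field" be an entity which is both callable and indexable, can be sketched with a small helper object (all names here are illustrative only; this is not records.py code):

```python
class FieldAccessor:
    """Supports recArr.field(1), recArr.field[1], recArr.field['abc'],
    and sequences of mixed field names and indices."""
    def __init__(self, columns, names):
        self._columns = columns        # one sequence per column
        self._names = list(names)

    def __call__(self, key):           # preserves the current call spelling
        return self[key]

    def __getitem__(self, key):
        if isinstance(key, (list, tuple)):     # sequence of indices/names
            return [self[k] for k in key]
        if isinstance(key, str):
            key = self._names.index(key)
        return self._columns[key]

class RecArraySketch:
    def __init__(self, columns, names):
        self.field = FieldAccessor(columns, names)

rec = RecArraySketch([[1, 2], ['a', 'b']], names=('num', 'txt'))
print(rec.field(1))           # current call style still works
print(rec.field['num'])       # new: indexing by name
print(rec.field[('txt', 0)])  # new: mixed names and indices
```

Because the accessor owns the lookup, field names never collide with array attributes or methods.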
In theory it is easy to do that except you can't specify cell arrays, i.e. how do you specify the equivalent of: formats=['3Int16', '(4,5)Float32'] with the numarray type instances? JC Hsu From rlw at stsci.edu Wed Jul 21 12:23:07 2004 From: rlw at stsci.edu (Rick White) Date: Wed Jul 21 12:23:07 2004 Subject: [Numpy-discussion] formats in record array In-Reply-To: <200407211850.AOO09987@donner.stsci.edu> Message-ID: On Wed, 21 Jul 2004, Jin-chung Hsu wrote: > > From: Russell E Owen > > Subject: Re: [Numpy-discussion] Proposed record array behavior: the rest > > of the story > > > > Will it be possible, when creating a new records array, to specify > > types of a record array as a list of normal numarray types? Currently > > one has to specify the types as a "formats" string, which is > > nonstandard. > > In theory it is easy to do that except you can't specify cell arrays, i.e. > how do you specify the equivalent of: > > formats=['3Int16', '(4,5)Float32'] > > with the numarray type instances? > > JC Hsu Well, how about one (or both) of these: formats = 3*(Int16,), 4*(5*(Float32,),) formats = (3,Int16), ((4,5), Float32) From kyeser at earthlink.net Wed Jul 21 18:19:07 2004 From: kyeser at earthlink.net (Hee-Seng Kye) Date: Wed Jul 21 18:19:07 2004 Subject: [Numpy-discussion] Is there a better way to do this? Message-ID: <16A7C641-DB7D-11D8-A37A-000393479EE8@earthlink.net> My question is not directly related to NumPy, but since many people here deal with numbers, I was wondering if I could get some help; it would be even better if there is a NumPy (or Numarray) function that takes care of what I want! I'm trying to write a program that computes six-digit numbers, in which the left digit is always smaller than its following digit (i.e., it's always ascending). 
The best I could do was to have many nested 'for' statements:

c = 1
for p0 in range(0, 7):
    for p1 in range(1, 12):
        for p2 in range(2, 12):
            for p3 in range(3, 12):
                for p4 in range(4, 12):
                    for p5 in range(5, 12):
                        if p0 < p1 < p2 < p3 < p4 < p5:
                            print repr(c).rjust(3), "\t",
                            print "%X %X %X %X %X %X" % (p0, p1, p2, p3, p4, p5)
                            c += 1
print "...Done"

This works, except that it's very slow. I need to get it up to nine-digit numbers, in which case it's significantly slower. I was wondering if there is a more efficient way to do this. I would highly appreciate it if anyone could help. Many thanks. -Kye

From jcollins_boulder at earthlink.net Wed Jul 21 18:49:10 2004 From: jcollins_boulder at earthlink.net (Jeffery D. Collins) Date: Wed Jul 21 18:49:10 2004 Subject: [Numpy-discussion] Is there a better way to do this? In-Reply-To: <16A7C641-DB7D-11D8-A37A-000393479EE8@earthlink.net> References: <16A7C641-DB7D-11D8-A37A-000393479EE8@earthlink.net> Message-ID: <40FF1D11.8090606@earthlink.net> Hee-Seng Kye wrote:

> My question is not directly related to NumPy, but since many people
> here deal with numbers, I was wondering if I could get some help; it
> would be even better if there is a NumPy (or Numarray) function that
> takes care of what I want!
>
> I'm trying to write a program that computes six-digit numbers, in
> which the left digit is always smaller than its following digit (i.e.,
> it's always ascending). The best I could do was to have many embedded
> 'for' statement:
>
> c = 1
> for p0 in range(0, 7):
>     for p1 in range(1, 12):
>         for p2 in range(2, 12):
>             for p3 in range(3, 12):
>                 for p4 in range(4, 12):
>                     for p5 in range(5, 12):
>                         if p0 < p1 < p2 < p3 < p4 < p5:
>                             print repr(c).rjust(3), "\t",
>                             print "%X %X %X %X %X %X" % (p0, p1, p2, p3, p4, p5)
>                             c += 1
> print "...Done"
>
> This works, except that it's very slow. I need to get it up to
> nine-digit numbers, in which case it's significantly slow. I was
> wondering if there is a more efficient way to do this.
> > I would highly appreciate it if anyone could help.

This appears to give the same results and is significantly faster.

def vers1():
    c = 1
    for p0 in range(0, 7):
        for p1 in range(p0+1, 12):
            for p2 in range(p1+1, 12):
                for p3 in range(p2+1, 12):
                    for p4 in range(p3+1, 12):
                        for p5 in range(p4+1, 12):
                            print repr(c).rjust(3), "\t",
                            print "%X %X %X %X %X %X" % (p0, p1, p2, p3, p4, p5)
                            c += 1
    print "...Done"

> > Many thanks.
> > -Kye

-- Jeff

From rlw at stsci.edu Wed Jul 21 22:03:03 2004 From: rlw at stsci.edu (Rick White) Date: Wed Jul 21 22:03:03 2004 Subject: [Numpy-discussion] Is there a better way to do this? In-Reply-To: <16A7C641-DB7D-11D8-A37A-000393479EE8@earthlink.net> Message-ID: On Wed, 21 Jul 2004, Hee-Seng Kye wrote:

> I'm trying to write a program that computes six-digit numbers, in which
> the left digit is always smaller than its following digit (i.e., it's
> always ascending).

Here's another version that is a little faster still:

def f3():
    c = 1
    for p0 in range(0, 7):
        for p1 in range(p0+1, 8):
            for p2 in range(p1+1, 9):
                for p3 in range(p2+1, 10):
                    for p4 in range(p3+1, 11):
                        for p5 in range(p4+1, 12):
                            print repr(c).rjust(3), "\t",
                            print "%X %X %X %X %X %X" % (p0, p1, p2, p3, p4, p5)
                            c += 1
    print "...Done"

This is plenty fast even for 9-digit numbers. In fact it gets a little faster for larger numbers of digits. This problem is completely equivalent to the problem of finding all combinations of 6 numbers chosen from the digits 0..11. If you sort the digits of each combination in ascending order, you get your numbers. So if you search for something like "Python permutations combinations" you can find other algorithms that work. Here's a recursive version:

def f4(n, digits=range(12)):
    if n==0:
        return [[]]
    rv = []
    for i in range(len(digits)):
        for cc in f4(n-1, digits[i+1:]):
            rv.append([digits[i]]+cc)
    return rv

That returns a list of all the number sets having n digits. It's slower than the loop version but is more general.
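As Rick notes, the search is exactly combination enumeration. In Python versions later than this thread (itertools.combinations arrived in 2.6), the whole thing collapses to a single library call:

```python
from itertools import combinations

# All ascending 6-digit selections from the digits 0..11, already sorted.
digits = list(combinations(range(12), 6))
print(len(digits))            # 924 == C(12, 6)
print(digits[0], digits[-1])  # (0, 1, 2, 3, 4, 5) (6, 7, 8, 9, 10, 11)
```

Scaling to nine digits is just combinations(range(12), 9), still only C(12, 9) = 220 tuples.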
There are fast C versions of this sort of thing out there, I think. Rick White

From falted at pytables.org Thu Jul 22 02:47:27 2004 From: falted at pytables.org (Francesc Alted) Date: Thu Jul 22 02:47:27 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story In-Reply-To: References: Message-ID: <200407221146.41319.falted@pytables.org> Hi, I think the numarray team's overhaul of RecArray access modes is very good, and I agree with most of it. A Dimarts 20 Juliol 2004 19:14, Russell E Owen va escriure: > I think recarray[field name] is too easily confused with > recarray[index] and is unnecessary. Yeah, maybe you are right. > I suggest one of two solutions: > - Do nothing. Make users use field(field name or index) > or > - Allow access to the fields via an indexable entity. Simplest for > the user would be to use "field" itself: > recArr.field[1] > recArr.field["abc"] > (i.e. field becomes an object that can be called or can be accessed > via __getitem__) I prefer the second one. Although I know that you don't like the __getattr__ method, the field object can be used to host one. The main advantage I see in having such a __getattr__ method is that I'm very used to pressing TAB twice in the python console with its completion capabilities activated. It would be a very nice way of interactively discovering the fields of a RecArray object. I don't know whether this feature is used a lot or not out there, but for me it is just great. I understand, however, that having to include a map to support non-valid Python names for field names can be quite inconvenient. Regards, -- Francesc Alted

From cjw at sympatico.ca Thu Jul 22 05:22:01 2004 From: cjw at sympatico.ca (Colin J.
Williams) Date: Thu Jul 22 05:22:01 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story In-Reply-To: <200407221146.41319.falted@pytables.org> References: <200407221146.41319.falted@pytables.org> Message-ID: <40FFB132.10103@sympatico.ca> Francesc Alted wrote: >Hi, > >I agree that numarray team's overhaul of RecArray access modes is very good >and I agree most of it. > >A Dimarts 20 Juliol 2004 19:14, Russell E Owen va escriure: > > >>I think recarray[field name] is too easily confused with >>recarray[index] and is unnecessary. >> >> > >Yeah, maybe you are right. > > > >>I suggest one of two solutions: >>- Do nothing. Make users use field(field name or index) >>or >>- Allow access to the fields via an indexable entity. Simplest for >>the user would be to use "field" itself: >> recArr.field[1] >> recArr.field["abc"] >>(i.e. field becomes an object that can be called or can be accessed >>via __getitem__) >> >> > >I prefer the second one. Although I know that you don't like the __getattr__ >method, the field object can be used to host one. The main advantage I see >having such a __getattr__ method is that I'm very used to press TAB twice in >the python console with its completion capabilities activated. It would be a >very nice way of interactively discovering the fields of a RecArray object. >I don't know whether this feature is used a lot or not out there, but for me >is just great. I understand, however, that having to include a map to >suport non-vbalid python names for field names can be quite inconvenient. > >Regards, > > Perry's issue 3. Perhaps there is a need to separate the name or identifier of a column in a RecArray or a field in a Record from its label. The labels, for display purposes, would default to the column names. The column names would default, as at present, to the Cn form. I like the use of attributes for the column names, it avoids the problem Russell Owen mentioned above. 
Suppose we have a simple RecArray with the fields "name" and "age"; it's much simpler to write rec.name or rec.age than rec["name"] or rec["age"]. The problems with the use of attributes, which must be Python names, are (1) they cannot have accented or special characters, e.g. ?, ?, @, &, *, etc., and (2) there is a danger of conflict with existing properties or attributes. My guess is that the special characters would be required primarily for display purposes. Thus, the label could meet that need. The danger of conflict could be addressed by raising an exception. There remains a possible problem where identifiers are passed on from some other system, perhaps a database. Thus, the primary identifier of a row in a RecArray would be an integer index, and that of a column or field would be a standard Python identifier. Although, at times, it would be useful to be able to index the individual fields (or columns) as part of the usual indexing scheme. Thus rec[2, 3, 4] could identify a record, and rec[2, 3, 4].age or rec[2, 3, 4, 5] could identify the sixth field in that record. The use of attributes raises the possibility that one could have nested records. For example, suppose one has an address record:

addressRecord
    streetNumber
    streetName
    postalCode
    ...

There could then be a personal record:

personRecord
    ...
    officeAddress
    homeAddress
    ...

One could address a component as rec.homeAddress.postalCode. Finally, there was mention, earlier in the discussion, of facilitating the indexing of a RecArray. I hope that some way will be found to do this. Colin W.

From kyeser at earthlink.net Thu Jul 22 13:24:06 2004 From: kyeser at earthlink.net (Hee-Seng Kye) Date: Thu Jul 22 13:24:06 2004 Subject: [Numpy-discussion] Is there a better way to do this? In-Reply-To: References: Message-ID: Thanks a lot everyone for suggestions.
On my slow machine (667 MHz), inefficient programs run even slower, and when I expand the program to calculate 9-digit numbers, there is almost a 2-minute difference! Thanks again. Best, Kye From sag at hydrosphere.com Thu Jul 22 15:34:11 2004 From: sag at hydrosphere.com (sag at hydrosphere.com) Date: Thu Jul 22 15:34:11 2004 Subject: [Numpy-discussion] Unpickling python 2.2 UserArray objs in python 2.3 Message-ID: <40FFF0A2.26467.FBF2E27@localhost> I have a large bunch of objects that subclass UserArray from Numeric 22. These objects were created and pickled in binary mode in Python 2.2 and stored in a MySQL database on Red Hat 8. Using Python 2.2, I can easily retrieve and unpickle the objects. I have just upgraded the system to Fedora Core 2, which supplies Python 2.3.3. After much hassle, I have been able to compile Numeric 1.0 (ver 23) and have tried to unpickle these objects. Now, I get a failure in the loads call. The code is: import cPickle obj = cPickle.loads(str(blob)) When this is called, the python interpreter (via IDLE) goes into a loop in the UserArray __getattr__ function (line 198): return getattr(self.array,attr) >> File "/usr/lib/python2.3/site-packages/Numeric/UserArray.py" line 198, in __getattr__ >> return getattr(self.array,attr) No other error is reported, just a stack full of these lines. It seems that at this point, UserArray doesn't know that it has an 'array' attr. This worked just fine in Python 2.2. Has something changed in Python 2.3 cPickle functions or in how Numeric 23 handles pickle/unpickle that would make my Python 2.2 blobs unusable in Python 2.3? Is there a solution for this, other than remaking my blobs (not an option - there are literally millions of them), or must I figure out how to access Python 2.2 for this code? So far as I can tell, the string I get back is exactly the same for both versions. Any help you can give me would be appreciated.
Thanks sue giller From kyeser at earthlink.net Fri Jul 23 07:31:07 2004 From: kyeser at earthlink.net (Hee-Seng Kye) Date: Fri Jul 23 07:31:07 2004 Subject: [Numpy-discussion] A bit long, but would appreciate anyone's help, if time permits! Message-ID: Hi. Like my previous post, my question is not directly related to Numpy, but I couldn't help posting it since many people here deal with numbers. I have a question that requires a bit of explanation. I would highly appreciate it if anyone could read this and offer any suggestions, whenever time permits. I'm trying to write a program that 1) gives all possible rotations of an ordered list, 2) chooses the ordering that has the smallest difference from first to last element of the rotation, and 3) continues to compare the difference from first to second-to-last element, and so on, if there was a tie in step 2. The following is the output of a function I wrote. The first 6 lines are all possible rotations of [0,1,3,6,7,10], and this takes care of step 1 mentioned above. The last line provides the differences (mod 12). If the last line were denoted as r, r[0] lists the differences from first to last element of each rotation (p0 through p5), r[1] the differences from first to second-to-last element, and so on. >>> from normal import normal >>> normal([0,1,3,6,7,10]) [0, 1, 3, 6, 7, 10] #p0 [1, 3, 6, 7, 10, 0] #p1 [3, 6, 7, 10, 0, 1] #p2 [6, 7, 10, 0, 1, 3] #p3 [7, 10, 0, 1, 3, 6] #p4 [10, 0, 1, 3, 6, 7] #p5 [[10, 11, 10, 9, 11, 9], [7, 9, 9, 7, 8, 8], [6, 6, 7, 6, 6, 5], [3, 5, 4, 4, 5, 3], [1, 2, 3, 1, 3, 2]] #r Here is my question. I'm having trouble realizing step 2 (and 3, if necessary). In the above case, the smallest number in r[0] is 9, which is present in both r[0][3] and r[0][5]. This means that p3 and p5 and only p3 and p5 need to be further compared. 
r[1][3] is 7, and r[1][5] is 8, so the comparison ends here, and the final result I'm looking for is p3, [6,7,10,0,1,3] (the final 'n' value for 'pn' corresponds to the final 'y' value for 'r[x][y]'). How would I find the smallest values of a list r[0], take only those values (r[0][3] and r[0][5]) for further comparison (r[1][3] and r[1][5]), and finally print a p3? Thanks again for reading this. If there is anything unclear, please let me know. Best, Kye

My code begins here:

#normal.py
def normal(s):
    s.sort()
    r = []
    q = []
    v = []

    for x in range(0, len(s)):
        k = s[x:]+s[0:x]
        r.append(k)

    for y in range(0, len(s)):
        print r[y], '\t'
        d = []
        for yy in range(len(s)-1, 0, -1):
            w = (r[y][yy]-r[y][0])%12
            d.append(w)
        q.append(d)

    for z in range(0, len(s)-1):
        d = []
        for zz in range(0, len(s)):
            w = q[zz][z]
            d.append(w)
        v.append(d)
    print '\n', v

From sag at hydrosphere.com Fri Jul 23 10:09:11 2004 From: sag at hydrosphere.com (sag at hydrosphere.com) Date: Fri Jul 23 10:09:11 2004 Subject: [Numpy-discussion] re: Unpickling python 2.2 userArray objs in python 2.3 Message-ID: <4100F5DD.17007.13BB9C82@localhost> I have further information on my problem of unpickling an object that is based on the Numeric.UserArray class. I can recreate the endless getattr loop with the following code, which is a small subsection of my class:

data = Numeric.ones(31, savespace=1)
ua = UserArray(data)
blob = cPickle.dumps(ua)
obj = cPickle.loads(blob)   # <-- fails here

If you pickle the data object, everything works. This code works in Python 2.2. Is this a bug? Is it fixable?
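The endless loop above is characteristic of a delegating __getattr__ under pickle: the unpickler rebuilds the instance without calling __init__, so when it probes the half-built object for attributes (such as __setstate__), __getattr__ runs before self.array exists and re-enters itself. A minimal sketch of the mechanism and one possible guard, written for modern Python -- this is not the actual Numeric UserArray source, and the class names here are invented:

```python
# Minimal sketch of the runaway-__getattr__ failure mode -- NOT the
# actual Numeric UserArray code; Broken and Guarded are invented names.
import pickle

class Broken:
    """Delegates every unknown attribute to self.array, unguarded."""
    def __init__(self, data):
        self.array = data
    def __getattr__(self, attr):
        # If 'array' is not in the instance dict yet, this lookup
        # re-enters __getattr__ and never terminates.
        return getattr(self.array, attr)

class Guarded:
    """Same delegation, but safe before 'array' has been set."""
    def __init__(self, data):
        self.array = data
    def __getattr__(self, attr):
        if attr == 'array' or 'array' not in self.__dict__:
            raise AttributeError(attr)
        return getattr(self.array, attr)

# The unpickler rebuilds instances without calling __init__, roughly:
b = Broken.__new__(Broken)
try:
    b.array                  # loops: __getattr__ -> self.array -> ...
except RecursionError:
    pass

# The guarded version survives a full pickle round trip.
obj = pickle.loads(pickle.dumps(Guarded([1, 2, 3])))
assert obj.array == [1, 2, 3]
```

A change between Python 2.2 and 2.3 in how cPickle probes instance attributes during rebuild could expose exactly this kind of latent recursion, which would fit the symptoms Sue describes.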
sue From jmiller at stsci.edu Fri Jul 23 10:30:15 2004 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jul 23 10:30:15 2004 Subject: [Numpy-discussion] Follow-up Numarray header PEP In-Reply-To: <20040718212443.M21561@grenoble.cnrs.fr> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040718212443.M21561@grenoble.cnrs.fr> Message-ID: <1090603727.7138.33.camel@halloween.stsci.edu> Hi Gerard, I finally got to your numnum stuff today... awesome work! You've got lots of good suggestions. Here are some comments:

1. Thanks for catching the early return problem with numarray's import_array(). It's not just bad, it's wrong. It'll be fixed for 1.1.

2. That said, I think expanding the macros in-line in numnum is a mistake. It seems to me that "import_array(); PyErr_Clear();" or something like it ought to be enough... after numarray-1.1 anyway.

3. I think there's a problem in numnum.toNP() because of numarray's array "behavior" issues. A test needs to be done to ensure that the incoming array is not byteswapped or misaligned; if it is, the easy fix is to make a numarray copy of the array before copying it to Numeric.

4. Kudos for the LP64 stuff. numconfig is a thorn in the side of the PEP, so I'll put your techniques into numarray for 1.1. HAS_FLOAT128 is not currently used, so it might be time to ditch it. Anyway, thanks!

5. PyArray_Present() and isArray() are superfluous *now*. I was planning to add them to Numeric.

6. The LGPL may be a problem for us and is probably an issue if we ever try to get numnum into the Python distribution. It would be better to release numnum under the modified BSD license, same as numarray.

7. Your API struct was very clean. Eventually I'll regenerate numarray like that.

8.
I logged your comments and bug reports on Source Forge and eventually they'll get fixed. A to Z the numnum/pep code is beautiful. Next stop, header PEP update. Regards, Todd On Sun, 2004-07-18 at 17:24, gerard.vermeulen at grenoble.cnrs.fr wrote: > Hi Todd, > > This is a follow-up on the 'header pep' discussion. > > The attachment numnum-0.1.tar.gz contains the sources for the > extension modules pep and numnum. At least on my systems, both > modules behave as described in the 'numarray header PEP' when the > extension modules implementing the C-API are not present (a situation > not foreseen by the macros import_array() of Numeric and especially > numarray). IMO, my solution is 'bona fide', but requires further > testing. > > The pep module shows how to handle the colliding C-APIs of the Numeric > and numarray extension modules and how to implement automagical > conversion between Numeric and numarray arrays. > > For a technical reason explained in the README, the hard work of doing > the conversion between Numeric and numarray arrays has been delegated > to the numnum module. The numnum module is useful when one needs to > convert from one array type to the other to use an extension module > which only exists for the other type (eg. combining numarray's image > processing extensions with pygame's Numeric interface): > > Python 2.3+ (#1, Jan 7 2004, 09:17:35) > [GCC 3.3.1 (SuSE Linux)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. 
> >>> import numnum; import Numeric as np; import numarray as na
> >>> np1 = np.array([[1, 2], [3, 4]]); na1 = numnum.toNA(np1)
> >>> na2 = na.array([[1, 2, 3], [4, 5, 6]]); np2 = numnum.toNP(na2)
> >>> print type(np1); np1; type(np2); np2
>
> array([[1, 2],
>        [3, 4]])
>
> array([[1, 2, 3],
>        [4, 5, 6]],'i')
> >>> print type(na1); na1; type(na2); na2
>
> array([[1, 2],
>        [3, 4]])
>
> array([[1, 2, 3],
>        [4, 5, 6]])
> >>>
>
> The pep module shows how to implement array processing functions which
> use the Numeric, numarray or Sequence C-API:
>
> static PyObject *
> wysiwyg(PyObject *dummy, PyObject *args)
> {
>     PyObject *seq1, *seq2;
>     PyObject *result;
>
>     if (!PyArg_ParseTuple(args, "OO", &seq1, &seq2))
>         return NULL;
>
>     switch(API) {
>     case NumericAPI:
>     {
>         PyObject *np1 = NN_API->toNP(seq1);
>         PyObject *np2 = NN_API->toNP(seq2);
>         result = np_wysiwyg(np1, np2);
>         Py_XDECREF(np1);
>         Py_XDECREF(np2);
>         break;
>     }
>     case NumarrayAPI:
>     {
>         PyObject *na1 = NN_API->toNA(seq1);
>         PyObject *na2 = NN_API->toNA(seq2);
>         result = na_wysiwyg(na1, na2);
>         Py_XDECREF(na1);
>         Py_XDECREF(na2);
>         break;
>     }
>     case SequenceAPI:
>         result = seq_wysiwyg(seq1, seq2);
>         break;
>     default:
>         PyErr_SetString(PyExc_RuntimeError, "Should never happen");
>         return 0;
>     }
>
>     return result;
> }
>
> See the README for an example session using the pep module showing that
> it is possible to pass a mix of Numeric and numarray arrays to pep.wysiwyg().
>
> Notes:
>
> - it is straightforward to adapt pep and numnum so that the conversion
> functions are linked into pep instead of imported.
>
> - numnum is still 'proof of concept'. I am thinking about methods to
> make those techniques safer if the numarray (and Numeric?) header
> files never make it into the Python headers (or make it safer to
> use those techniques with Python < 2.4).
In particular it would > be helpful if the numerical C-APIs export an API version number, > similar to the versioning scheme of shared libraries -- see the > libtool->versioning info pages. > > I am considering three possibilities to release a more polished > version of numnum (3rd party extension writers may prefer to link > rather than import numnum's functionality): > > 1. release it from PyQwt's project page > 2. register an independent numnum project at SourceForge > 3. hand numnum over to the Numerical Python project (frees me from > worrying about API changes). > > > Regards -- Gerard Vermeulen -- From eric at enthought.com Fri Jul 23 10:56:07 2004 From: eric at enthought.com (eric jones) Date: Fri Jul 23 10:56:07 2004 Subject: [Numpy-discussion] ANN: SciPy04 -- Last day for abstracts and early registration! Message-ID: <4101510B.9050005@enthought.com> Hey Group, Just a reminder that this is the last day to submit abstracts for SciPy04. It is also the last day for early registration. More information is here: http://www.scipy.org/wikis/scipy04 About the Conference and Keynote Speaker --------------------------------------------- The 1st annual *SciPy Conference* will be held this year at Caltech, September 2-3, 2004. As some of you may know, we've experienced great participation in two SciPy "Workshops" (with ~70 attendees in both 2002 and 2003) and this year we're graduating to a "conference." With the prestige of a conference comes the responsibility of a keynote address. This year, Jim Hugunin has answered the call and will be speaking to kickoff the meeting on Thursday September 2nd. Jim is the creator of Numeric Python, Jython, and co-designer of AspectJ. Jim is currently working on IronPython--a fast implementation of Python for .NET and Mono. Presenters ----------- We still have room for a few more standard talks, and there is plenty of room for lightning talks. Because of this, we are extending the abstract deadline until July 23rd. 
Please send your abstract to abstracts at scipy.org. Travis Oliphant is organizing the presentations this year. (Thanks!) Once accepted, papers and/or presentation slides are acceptable and are due by August 20, 2004. Registration ------------- Early registration ($100.00) has been extended to July 23rd. Follow the links off of the main conference site: http://www.scipy.org/wikis/scipy04 After July 23rd, registration will be $150.00. Registration includes breakfast and lunch Thursday & Friday and a very nice dinner Thursday night. Please register as soon as possible as it will help us in planning for food, room sizes, etc. Sprints -------- As of now, we really haven't had much of a call for coding sprints for the 3 days prior to SciPy 04. Below is the original announcement about sprints. If you would like to suggest a topic and see if others are interested, please send a message to the list. Otherwise, we'll forgo the sprints session this year. We're also planning three days of informal "Coding Sprints" prior to the conference -- August 30 to September 1, 2004. Conference registration is not required to participate in the sprints. Please email the list, however, if you plan to attend. Topics for these sprints will be determined via the mailing lists as well, so please submit any suggestions for topics to the scipy-user list: list signup: http://www.scipy.org/mailinglists/ list address: scipy-user at scipy.org thanks, eric From cjw at sympatico.ca Sat Jul 24 07:18:04 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Sat Jul 24 07:18:04 2004 Subject: [Numpy-discussion] A bit long, but would appreciate anyone's help, if time permits! In-Reply-To: References: Message-ID: <41026F91.3090706@sympatico.ca> Hee-Seng Kye wrote: > Hi. Like my previous post, my question is not directly related to Numpy, True, but numarray can be of help. > but I couldn't help posting it since many people here deal with > numbers. I have a question that requires a bit of explanation. 
I > would highly appreciate it if anyone could read this and offer any > suggestions, whenever time permits. > > I'm trying to write a program that 1) gives all possible rotations of > an ordered list, 2) chooses the ordering that has the smallest > difference from first to last element of the rotation, and 3) > continues to compare the difference from first to second-to-last > element, and so on, if there was a tie in step 2. > > The following is the output of a function I wrote. The first 6 lines > are all possible rotations of [0,1,3,6,7,10], and this takes care of > step 1 mentioned above. The last line provides the differences (mod > 12). If the last line were denoted as r, r[0] lists the differences > from first to last element of each rotation (p0 through p5), r[1] the > differences from first to second-to-last element, and so on. > > >>> from normal import normal > >>> normal([0,1,3,6,7,10]) > [0, 1, 3, 6, 7, 10] #p0 > [1, 3, 6, 7, 10, 0] #p1 > [3, 6, 7, 10, 0, 1] #p2 > [6, 7, 10, 0, 1, 3] #p3 > [7, 10, 0, 1, 3, 6] #p4 > [10, 0, 1, 3, 6, 7] #p5 > > [[10, 11, 10, 9, 11, 9], [7, 9, 9, 7, 8, 8], [6, 6, 7, 6, 6, 5], [3, > 5, 4, 4, 5, 3], [1, 2, 3, 1, 3, 2]] #r > > Here is my question. I'm having trouble realizing step 2 (and 3, if > necessary). In the above case, the smallest number in r[0] is 9, > which is present in both r[0][3] and r[0][5]. This means that p3 and > p5 and only p3 and p5 need to be further compared. r[1][3] is 7, and > r[1][5] is 8, so the comparison ends here, and the final result I'm > looking for is p3, [6,7,10,0,1,3] (the final 'n' value for 'pn' > corresponds to the final 'y' value for 'r[x][y]'). > > How would I find the smallest values of a list r[0], take only those > values (r[0][3] and r[0][5]) for further comparison (r[1][3] and > r[1][5]), and finally print a p3? > > Thanks again for reading this. If there is anything unclear, please > let me know. 
> Best,
> Kye
>
> My code begins here:
[snip]

The following reproduces your result, but I'm not sure that it does what you want to do. Best wishes. Colin W.

# Kye.py
#normal.py
def normal(s):
    s.sort()
    r = []
    q = []
    v = []

    for x in range(0, len(s)):
        k = s[x:]+s[0:x]
        r.append(k)

    for y in range(0, len(s)):
        print r[y], '\t'
        d = []
        for yy in range(len(s)-1, 0, -1):
            w = (r[y][yy]-r[y][0])%12
            d.append(w)
        q.append(d)

    for z in range(0, len(s)-1):
        d = []
        for zz in range(0, len(s)):
            w = q[zz][z]
            d.append(w)
        v.append(d)
    print '\n', v

def findMinima(i, lst):
    global diff
    print 'lst:', lst, 'i:', i
    res= []
    dataRow= diff[i].take(lst)
    fnd= dataRow.argmin()
    val= val0= dataRow[fnd]
    while val == val0:
        fndRes= lst[fnd]    # This will become the result iff no duplicate found
        res.append(fnd)
        dataRow[fnd]= 100
        fnd= dataRow.argmin()
        val0= dataRow[fnd]
    if len(res) == 1:
        return fndRes
    else:
        ret= findMinima(i-1, res)
        return ret

def normal1(s):
    import numarray.numarraycore as _num
    import numarray.numerictypes as _nt
    global diff
    s= _num.array(s)
    s.sort()
    rl= len(s)
    r= _num.zeros(shape= (rl, rl), type= _nt.Int)
    for i in range(rl):
        r[i, 0:rl-i]= s[i:]
        if i:
            r[i, rl-i:]= s[0:i]
    subtr= r[0].repeat(5, 1).resize(6, 5)
    subtr.transpose()
    neg= r[1:] < subtr
    diff= r[1:]-subtr + 12 * neg
    return 'The selected rotation is:', r[findMinima(diff._shape[0]-1, range(diff._shape[1]))]

if __name__ == '__main__':
    print normal1([0,1,3,6,7,10])

> #normal.py
> def normal(s):
>     s.sort()
>     r = []
>     q = []
>     v = []
>
>     for x in range(0, len(s)):
>         k = s[x:]+s[0:x]
>         r.append(k)
>
>     for y in range(0, len(s)):
>         print r[y], '\t'
>         d = []
>         for yy in range(len(s)-1, 0, -1):
>             w = (r[y][yy]-r[y][0])%12
>             d.append(w)
>         q.append(d)
>
>     for z in range(0, len(s)-1):
>         d = []
>         for zz in range(0, len(s)):
>             w = q[zz][z]
>             d.append(w)
>         v.append(d)
>     print '\n', v
>
> -------------------------------------------------------
> This SF.Net email is sponsored by BEA Weblogic Workshop
> FREE Java Enterprise J2EE developer tools!
> Get your free copy of BEA WebLogic Workshop 8.1 today. > http://ads.osdn.com/?ad_id=4721&alloc_id=10040&op=click > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From riiuwjjnivge at yahoo.com Sat Jul 24 08:38:04 2004 From: riiuwjjnivge at yahoo.com (riiuwjjnivge at yahoo.com) Date: Sat Jul 24 08:38:04 2004 Subject: [Numpy-discussion] Hot Stock Newsflash, ARMM expecting Mass|ve M0nday Ga1ns R753KT98 Message-ID: <249974lbl4oi11j$1so1q6g39$95a678wba@airmen.yahoo.com> E.fficiency Technologies, Inc.'s New Centrif.ugal Chiller Efficiency and Management Tool Can He.lp S.ave Industry Bi.llions in Energy C.osts ARMM lau.nch n.ew s.ervice (EffHVAC) D.ont miss this g.reat inves.tment issue! ARMM is another ho.t public tr.aded comp.any that is set to so.ar on Monday, July 26th.. BIG PR camp.aign sta.rting on 26th of July for ARMM - S.t0ck will e.xpl0de - Just read the news --------------------- P.rice on Friday: 10Cents In our o.pinion N.ext 3 days p.otential p.rice: 35Cents In our o.pinion N.ext 10 days p.otential p.rice: 45Cents --------------------- G.et on B.oard with ARMM and e.njoy some i.ncredible p.rofits in the n.ext 3-10 days_!_! ALL T.ECHNICAL I.NDICATORS SAY - B.U.Y ARMM @ up to 35cents! Significant short term t.rading p.rofits in ARMM are being p.redicted, great n.ews a.lready issued by the c.ompany and big PR c.ampaign on the way in the n.ext few days. C.OMPANY P.ROFILE --------------> American Resource Management, Inc., through its w.holly-owned s.ubsidiary, E.fficiency T.echnologies, Inc. ("EffTec") is a Tulsa, Oklahoma based c.ompany d.edicated to developing energy efficiency m.onitoring programs for c.ommercial/i.ndustrial HVAC systems principally made up of c.entrifugal chillers and boilers. 
Centrifugal chillers are the single largest energy-using components in most facilities and can typically consume more than 50% of the total electrical usage. Centrifugal chillers running inefficiently result in substantially higher e.nergy c.osts, decreased equipment reliability and shortened l.ifespan. EffTec has developed a p.owerful, easy-to-use, online d.iagnostic s.ervice called EffHVAC that gives f.acilities the a.bility to document, m.onitor, e.valuate and m.anage c.entrifugal c.hiller system p.erformance. EffHVAC c.reated detailed reports that contain a w.ealth of i.nformation that can be used to improve operations and save t.housands of d.ollars in u.tility c.osts. EffTec offers c.omprehensive and f.lexible HVAC consulting and training. Our t.eam consists of industry-recognized e.xperts in HVAC system design, efficiency, preventive and proactive maintenance, repair, chemistry, computer programming and m.arketing. Combine EffHVAC with our consulting services and start d.eveloping a w.orld-class HVAC program to improve your b.ottom line. Inform.ation within this email contains "f.orward look.ing state.ments" within the meaning of Sect.ion 27A of the Sec.urities Ac.t of 1933 and Sect.ion 21B of the Securit.ies Exc.hange Ac.t of 1934. Any stat.ements that express or involve discu.ssions with resp.ect to pre.dictions, goa.ls, expec.tations, be.liefs, pl.ans, proje.ctions, object.ives, assu.mptions or fut.ure eve.nts or perform.ance are not stat.ements of histo.rical fact and may be "forw.ard loo.king stat.ements." For.ward looking state.ments are based on expect.ations, estim.ates and project.ions at the time the statem.ents are made that involve a number of risks and uncertainties which could cause actual results or events to differ materially from those prese.ntly anticipated. 
Forward look.ing statements in this action may be identified through the use of words su.ch as: "pro.jects", "for.esee", "expects", "est.imates," "be.lieves," "underst.ands" "wil.l," "part of: "anticip.ates," or that by stat.ements indi.cating certain actions "may," "cou.ld," or "might" occur. All information provided within this em.ail pertai.ning to inv.esting, st.ocks, securi.ties must be under.stood as informa.tion provided and not investm.ent advice. Eme.rging Equity Al.ert advi.ses all re.aders and subscrib.ers to seek advice from a registered profe.ssional secu.rities represent.ative before dec.iding to trade in sto.cks featured within this ema.il. None of the mate.rial within this rep.ort shall be constr.ued as any kind of invest.ment advi.ce. Please have in mind that the interpr.etation of the witer of this newsl.etter about the news published by the company does not represent the com.pany official sta.tement and in fact may differ from the real meaning of what the news rele.ase meant to say. Please read the news release by your.self and judge by yourself about the detai.ls in it. In compli.ance with Sec.tion 17(b), we discl.ose the hol.ding of ARMM s.hares prior to the publi.cation of this report. Be aware of an inher.ent co.nflict of interest res.ulting from such holdi.ngs due to our intent to pro.fit from the liqui.dation of these shares. Sh.ares may be s.old at any time, even after posi.tive state.ments have been made regard.ing the above company. Since we own sh.ares, there is an inher.ent conf.lict of inte.rest in our statem.ents and opin.ions. Readers of this publi.cation are cauti.oned not to place und.ue relia.nce on forw.ard-looki.ng statements, which are based on certain assump.tions and expectati.ons invo.lving various risks and uncert.ainties, that could cause results to differ materi..ally from those set forth in the forw.ard- looking state.ments. 
Please be advi.sed that noth.ing within this em.ail shall cons.titute a solic.itation or an offer to buy or sell any s.ecurity menti.oned her.ein. This news.letter is neither a regi.stered inves.tment ad.visor nor affil.iated with any brok.er or dealer. All statements made are our e.xpress o.pinion only and should be treated as such. We may own, buy and sell any securi.ties menti.oned at any time. This r.eport includes forw.ard-looki.ng stat.ements within the meaning of The Pri.vate Securi.ties Litig.ation Ref.orm Ac.t of 1995. These state.ments may include terms as "expe.ct", "bel.ieve", "ma.y", "wi.ll", "mo.ve","und.ervalued" and "inte.nd" or simil.ar terms. This news.letter was paid 11500 dollars from th.ird p.arty to se.nd this report. PL.EASE DO YOUR OWN D.UE DI.LIGENCE B.EFORE INVES.TING IN ANY PRO.FILED COMP.ANY. You may lo.se mon.ey from inve.sting in Pen.ny St.ocks. A_RM_M - our NEW stck pick - GREAT N.EWS V650OE49 >A.RMM - our NEW s_t_0_c_k p1ck = GREAT N_E_WS V3501136 NnnEW St_ock Pick - Hug.e Mon-day - /ArMm\ m468MV68 NewW Stoc-k Pick + Hug.e Mon-day - ArMm = Earn_1ngs 1497cJ72 Mas_sive G.a1ns - F0r-casted For Mond#y g984iJ69 Monday F0rcaSST is A>R.M.M - Read & Earnn Z8697B79 In-Creased Earn-ings Report - AR-MM - For Monday Morning l547BH81 EX PLO SIVE Gain-s - ALERT for MONDAY T288xC38 NewsWire - Double your Monday Earn>ings! q664qv16 A,L,E,R,T - A>R>M>M- This st0ck is h0t - They announced great news l993L941 A>RM is about to EXPL0DE - A c t n_o_w Z484TE26 - Ma-jor TradeeE Al_ert! !e330vH15 1O to 2O cent in=crease monday. Ma_jor ALer.t. c8620c55 New P1ck Bownd to Dou_ble & Tri_ple. A.R/M.M.. 
I942qD93 B1gGa1ns For-M0nday = (2X)Double Your Pr0fits!y747s506 UpCOMING Mondays Hot/test St O CK {2x} PROF!TS L572lS00 Get Ins1ders SEcrEt_s - A|R|M|M Sets to Expl0de U812Jb41 Ab0ut To Expl0de - y142qK13 Hot Stock Newsflash, ARMM expecting Mass|ve M0nday Ga1ns 7074WE36 M0nday Ga1ns, *ARMM*, St0ck NewsW1re g504mo93 {3x} Ur m0nDay Pr0FITS - A\R\M\M w433T229 Break.ing New.s for ARM.M - American Resource Management, Inc. E.fficiency Technologies, Inc.'s New Centrif.ugal Chiller Efficiency and Management Tool Can He.lp S.ave Industry Bi.llions in Energy C.osts ARMM lau.nch n.ew s.ervice (EffHVAC) D.ont miss this g.reat inves.tment issue! ARMM is another ho.t public tr.aded comp.any that is set to so.ar on Monday, July 26th.. BIG PR camp.aign sta.rting on 26th of July for ARMM - S.t0ck will e.xpl0de - Just read the news --------------------- P.rice on Friday: 10Cents In our o.pinion N.ext 3 days p.otential p.rice: 35Cents In our o.pinion N.ext 10 days p.otential p.rice: 45Cents --------------------- G.et on B.oard with ARMM and e.njoy some i.ncredible p.rofits in the n.ext 3-10 days_!_! ALL T.ECHNICAL I.NDICATORS SAY - B.U.Y ARMM @ up to 35cents! Significant short term t.rading p.rofits in ARMM are being p.redicted, great n.ews a.lready issued by the c.ompany and big PR c.ampaign on the way in the n.ext few days. C.OMPANY P.ROFILE --------------> American Resource Management, Inc., through its w.holly-owned s.ubsidiary, E.fficiency T.echnologies, Inc. ("EffTec") is a Tulsa, Oklahoma based c.ompany d.edicated to developing energy efficiency m.onitoring programs for c.ommercial/i.ndustrial HVAC systems principally made up of c.entrifugal chillers and boilers. Centrifugal chillers are the single largest energy-using components in most facilities and can typically consume more than 50% of the total electrical usage. Centrifugal chillers running inefficiently result in substantially higher e.nergy c.osts, decreased equipment reliability and shortened l.ifespan. 
EffTec has developed a p.owerful, easy-to-use, online d.iagnostic s.ervice called EffHVAC that gives f.acilities the a.bility to document, m.onitor, e.valuate and m.anage c.entrifugal c.hiller system p.erformance. EffHVAC c.reated detailed reports that contain a w.ealth of i.nformation that can be used to improve operations and save t.housands of d.ollars in u.tility c.osts. EffTec offers c.omprehensive and f.lexible HVAC consulting and training. Our t.eam consists of industry-recognized e.xperts in HVAC system design, efficiency, preventive and proactive maintenance, repair, chemistry, computer programming and m.arketing. Combine EffHVAC with our consulting services and start d.eveloping a w.orld-class HVAC program to improve your b.ottom line. Inform.ation within this email contains "f.orward look.ing state.ments" within the meaning of Sect.ion 27A of the Sec.urities Ac.t of 1933 and Sect.ion 21B of the Securit.ies Exc.hange Ac.t of 1934. Any stat.ements that express or involve discu.ssions with resp.ect to pre.dictions, goa.ls, expec.tations, be.liefs, pl.ans, proje.ctions, object.ives, assu.mptions or fut.ure eve.nts or perform.ance are not stat.ements of histo.rical fact and may be "forw.ard loo.king stat.ements." For.ward looking state.ments are based on expect.ations, estim.ates and project.ions at the time the statem.ents are made that involve a number of risks and uncertainties which could cause actual results or events to differ materially from those prese.ntly anticipated. Forward look.ing statements in this action may be identified through the use of words su.ch as: "pro.jects", "for.esee", "expects", "est.imates," "be.lieves," "underst.ands" "wil.l," "part of: "anticip.ates," or that by stat.ements indi.cating certain actions "may," "cou.ld," or "might" occur. All information provided within this em.ail pertai.ning to inv.esting, st.ocks, securi.ties must be under.stood as informa.tion provided and not investm.ent advice. 
Eme.rging Equity Al.ert advi.ses all re.aders and subscrib.ers to seek advice from a registered profe.ssional secu.rities represent.ative before dec.iding to trade in sto.cks featured within this ema.il. None of the mate.rial within this rep.ort shall be constr.ued as any kind of invest.ment advi.ce. Please have in mind that the interpr.etation of the witer of this newsl.etter about the news published by the company does not represent the com.pany official sta.tement and in fact may differ from the real meaning of what the news rele.ase meant to say. Please read the news release by your.self and judge by yourself about the detai.ls in it. In compli.ance with Sec.tion 17(b), we discl.ose the hol.ding of ARMM s.hares prior to the publi.cation of this report. Be aware of an inher.ent co.nflict of interest res.ulting from such holdi.ngs due to our intent to pro.fit from the liqui.dation of these shares. Sh.ares may be s.old at any time, even after posi.tive state.ments have been made regard.ing the above company. Since we own sh.ares, there is an inher.ent conf.lict of inte.rest in our statem.ents and opin.ions. Readers of this publi.cation are cauti.oned not to place und.ue relia.nce on forw.ard-looki.ng statements, which are based on certain assump.tions and expectati.ons invo.lving various risks and uncert.ainties, that could cause results to differ materi..ally from those set forth in the forw.ard- looking state.ments. Please be advi.sed that noth.ing within this em.ail shall cons.titute a solic.itation or an offer to buy or sell any s.ecurity menti.oned her.ein. This news.letter is neither a regi.stered inves.tment ad.visor nor affil.iated with any brok.er or dealer. All statements made are our e.xpress o.pinion only and should be treated as such. We may own, buy and sell any securi.ties menti.oned at any time. This r.eport includes forw.ard-looki.ng stat.ements within the meaning of The Pri.vate Securi.ties Litig.ation Ref.orm Ac.t of 1995. 
These state.ments may include terms as "expe.ct", "bel.ieve", "ma.y", "wi.ll", "mo.ve","und.ervalued" and "inte.nd" or simil.ar terms. This news.letter was paid 11500 dollars from th.ird p.arty to se.nd this report. PL.EASE DO YOUR OWN D.UE DI.LIGENCE B.EFORE INVES.TING IN ANY PRO.FILED COMP.ANY. You may lo.se mon.ey from inve.sting in Pen.ny St.ocks. barycentric deform conservator cacophony critter addison armament complain difluoride boris discriminatory boron abo deoxyribose boorish compote belfast carolingian court albania accentuate belshazzar bridesmaid breakwater brandish average bolshevism coppery
Readers of this publi.cation are cauti.oned not to place und.ue relia.nce on forw.ard-looki.ng statements, which are based on certain assump.tions and expectati.ons invo.lving various risks and uncert.ainties, that could cause results to differ materi..ally from those set forth in the forw.ard- looking state.ments. Please be advi.sed that noth.ing within this em.ail shall cons.titute a solic.itation or an offer to buy or sell any s.ecurity menti.oned her.ein. This news.letter is neither a regi.stered inves.tment ad.visor nor affil.iated with any brok.er or dealer. All statements made are our e.xpress o.pinion only and should be treated as such. We may own, buy and sell any securi.ties menti.oned at any time. This r.eport includes forw.ard-looki.ng stat.ements within the meaning of The Pri.vate Securi.ties Litig.ation Ref.orm Ac.t of 1995. These state.ments may include terms as "expe.ct", "bel.ieve", "ma.y", "wi.ll", "mo.ve","und.ervalued" and "inte.nd" or simil.ar terms. This news.letter was paid 11500 dollars from th.ird p.arty to se.nd this report. PL.EASE DO YOUR OWN D.UE DI.LIGENCE B.EFORE INVES.TING IN ANY PRO.FILED COMP.ANY. You may lo.se mon.ey from inve.sting in Pen.ny St.ocks. A_RM_M - our NEW stck pick - GREAT N.EWS V650OE49 >A.RMM - our NEW s_t_0_c_k p1ck = GREAT N_E_WS V3501136 NnnEW St_ock Pick - Hug.e Mon-day - /ArMm\ m468MV68 NewW Stoc-k Pick + Hug.e Mon-day - ArMm = Earn_1ngs 1497cJ72 Mas_sive G.a1ns - F0r-casted For Mond#y g984iJ69 Monday F0rcaSST is A>R.M.M - Read & Earnn Z8697B79 In-Creased Earn-ings Report - AR-MM - For Monday Morning l547BH81 EX PLO SIVE Gain-s - ALERT for MONDAY T288xC38 NewsWire - Double your Monday Earn>ings! q664qv16 A,L,E,R,T - A>R>M>M- This st0ck is h0t - They announced great news l993L941 A>RM is about to EXPL0DE - A c t n_o_w Z484TE26 - Ma-jor TradeeE Al_ert! !e330vH15 1O to 2O cent in=crease monday. Ma_jor ALer.t. c8620c55 New P1ck Bownd to Dou_ble & Tri_ple. A.R/M.M.. 
I942qD93 B1gGa1ns For-M0nday = (2X)Double Your Pr0fits!y747s506 UpCOMING Mondays Hot/test St O CK {2x} PROF!TS L572lS00 Get Ins1ders SEcrEt_s - A|R|M|M Sets to Expl0de U812Jb41 Ab0ut To Expl0de - y142qK13 Hot Stock Newsflash, ARMM expecting Mass|ve M0nday Ga1ns 7074WE36 M0nday Ga1ns, *ARMM*, St0ck NewsW1re g504mo93 {3x} Ur m0nDay Pr0FITS - A\R\M\M w433T229 Break.ing New.s for ARM.M - American Resource Management, Inc. E.fficiency Technologies, Inc.'s New Centrif.ugal Chiller Efficiency and Management Tool Can He.lp S.ave Industry Bi.llions in Energy C.osts ARMM lau.nch n.ew s.ervice (EffHVAC) D.ont miss this g.reat inves.tment issue! ARMM is another ho.t public tr.aded comp.any that is set to so.ar on Monday, July 26th.. BIG PR camp.aign sta.rting on 26th of July for ARMM - S.t0ck will e.xpl0de - Just read the news --------------------- P.rice on Friday: 10Cents In our o.pinion N.ext 3 days p.otential p.rice: 35Cents In our o.pinion N.ext 10 days p.otential p.rice: 45Cents --------------------- G.et on B.oard with ARMM and e.njoy some i.ncredible p.rofits in the n.ext 3-10 days_!_! ALL T.ECHNICAL I.NDICATORS SAY - B.U.Y ARMM @ up to 35cents! Significant short term t.rading p.rofits in ARMM are being p.redicted, great n.ews a.lready issued by the c.ompany and big PR c.ampaign on the way in the n.ext few days. C.OMPANY P.ROFILE --------------> American Resource Management, Inc., through its w.holly-owned s.ubsidiary, E.fficiency T.echnologies, Inc. ("EffTec") is a Tulsa, Oklahoma based c.ompany d.edicated to developing energy efficiency m.onitoring programs for c.ommercial/i.ndustrial HVAC systems principally made up of c.entrifugal chillers and boilers. Centrifugal chillers are the single largest energy-using components in most facilities and can typically consume more than 50% of the total electrical usage. Centrifugal chillers running inefficiently result in substantially higher e.nergy c.osts, decreased equipment reliability and shortened l.ifespan. 
EffTec has developed a p.owerful, easy-to-use, online d.iagnostic s.ervice called EffHVAC that gives f.acilities the a.bility to document, m.onitor, e.valuate and m.anage c.entrifugal c.hiller system p.erformance. EffHVAC c.reated detailed reports that contain a w.ealth of i.nformation that can be used to improve operations and save t.housands of d.ollars in u.tility c.osts. EffTec offers c.omprehensive and f.lexible HVAC consulting and training. Our t.eam consists of industry-recognized e.xperts in HVAC system design, efficiency, preventive and proactive maintenance, repair, chemistry, computer programming and m.arketing. Combine EffHVAC with our consulting services and start d.eveloping a w.orld-class HVAC program to improve your b.ottom line. Inform.ation within this email contains "f.orward look.ing state.ments" within the meaning of Sect.ion 27A of the Sec.urities Ac.t of 1933 and Sect.ion 21B of the Securit.ies Exc.hange Ac.t of 1934. Any stat.ements that express or involve discu.ssions with resp.ect to pre.dictions, goa.ls, expec.tations, be.liefs, pl.ans, proje.ctions, object.ives, assu.mptions or fut.ure eve.nts or perform.ance are not stat.ements of histo.rical fact and may be "forw.ard loo.king stat.ements." For.ward looking state.ments are based on expect.ations, estim.ates and project.ions at the time the statem.ents are made that involve a number of risks and uncertainties which could cause actual results or events to differ materially from those prese.ntly anticipated. Forward look.ing statements in this action may be identified through the use of words su.ch as: "pro.jects", "for.esee", "expects", "est.imates," "be.lieves," "underst.ands" "wil.l," "part of: "anticip.ates," or that by stat.ements indi.cating certain actions "may," "cou.ld," or "might" occur. All information provided within this em.ail pertai.ning to inv.esting, st.ocks, securi.ties must be under.stood as informa.tion provided and not investm.ent advice. 
Eme.rging Equity Al.ert advi.ses all re.aders and subscrib.ers to seek advice from a registered profe.ssional secu.rities represent.ative before dec.iding to trade in sto.cks featured within this ema.il. None of the mate.rial within this rep.ort shall be constr.ued as any kind of invest.ment advi.ce. Please have in mind that the interpr.etation of the witer of this newsl.etter about the news published by the company does not represent the com.pany official sta.tement and in fact may differ from the real meaning of what the news rele.ase meant to say. Please read the news release by your.self and judge by yourself about the detai.ls in it. In compli.ance with Sec.tion 17(b), we discl.ose the hol.ding of ARMM s.hares prior to the publi.cation of this report. Be aware of an inher.ent co.nflict of interest res.ulting from such holdi.ngs due to our intent to pro.fit from the liqui.dation of these shares. Sh.ares may be s.old at any time, even after posi.tive state.ments have been made regard.ing the above company. Since we own sh.ares, there is an inher.ent conf.lict of inte.rest in our statem.ents and opin.ions. Readers of this publi.cation are cauti.oned not to place und.ue relia.nce on forw.ard-looki.ng statements, which are based on certain assump.tions and expectati.ons invo.lving various risks and uncert.ainties, that could cause results to differ materi..ally from those set forth in the forw.ard- looking state.ments. Please be advi.sed that noth.ing within this em.ail shall cons.titute a solic.itation or an offer to buy or sell any s.ecurity menti.oned her.ein. This news.letter is neither a regi.stered inves.tment ad.visor nor affil.iated with any brok.er or dealer. All statements made are our e.xpress o.pinion only and should be treated as such. We may own, buy and sell any securi.ties menti.oned at any time. This r.eport includes forw.ard-looki.ng stat.ements within the meaning of The Pri.vate Securi.ties Litig.ation Ref.orm Ac.t of 1995. 
From kyeser at earthlink.net Sun Jul 25 04:25:14 2004 From: kyeser at earthlink.net (Hee-Seng Kye) Date: Sun Jul 25 04:25:14 2004 Subject: [Numpy-discussion] Permutation in Numpy Message-ID: <3DC9B4D2-DE2D-11D8-A7E1-000393479EE8@earthlink.net>

#perm.py
def perm(k):
    # Compute the list of all permutations of k
    if len(k) <= 1:
        return [k]
    r = []
    for i in range(len(k)):
        s = k[:i] + k[i+1:]
        p = perm(s)
        for x in p:
            r.append(k[i:i+1] + x)
    return r

Does anyone know if there is a built-in function in Numpy (or Numarray) that does the above task faster (computes the list of all permutations of a list, k)? Or is there a way to make the above function run faster using Numpy? I'm asking because I need to create a very large list which contains all permutations of range(12), in which case there would be 12! permutations. I created a file test.py:

#!/usr/bin/env python
from perm import perm
print perm(range(12))

And ran the program:

$ ./test.py >> list.txt

The program ran for about 90 minutes and was still running on my machine (667 MHz PowerPC G4, 512 MB SDRAM) until I quit the process as I was getting nervous (and impatient). I would highly appreciate anyone's suggestions.
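For scale: 12! = 479,001,600 permutations, far too many to build as a single in-memory list on a 512 MB machine, which is why the run never finishes. A lazy generator sidesteps the memory problem. The sketch below uses itertools.permutations, which was only added to the standard library in Python 2.6 (after this thread), so treat it as a present-day alternative rather than an answer available to the original poster:

```python
from itertools import islice, permutations

def perm_lazy(k):
    """Yield the permutations of k one at a time instead of
    materializing all 12! = 479,001,600 of them as one list."""
    for p in permutations(k):
        yield list(p)

# Sanity check on a small input: 4! = 24 permutations.
perms4 = list(perm_lazy(range(4)))
print(len(perms4))       # 24
print(perms4[0])         # [0, 1, 2, 3]

# For range(12), pull only what you actually need:
first_three = list(islice(perm_lazy(range(12)), 3))
print(first_three[0])    # [0, 1, 2, ..., 11]
```

Writing each permutation to a file as it is produced, instead of printing the whole list, keeps memory flat no matter how large the input.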
Many thanks, Kye From gerard.vermeulen at grenoble.cnrs.fr Sun Jul 25 22:49:12 2004 From: gerard.vermeulen at grenoble.cnrs.fr (gerard.vermeulen at grenoble.cnrs.fr) Date: Sun Jul 25 22:49:12 2004 Subject: [Numpy-discussion] Follow-up Numarray header PEP In-Reply-To: <1090603727.7138.33.camel@halloween.stsci.edu> References: <1088451653.3744.200.camel@localhost.localdomain> <20040629194456.44a1fa7f.gerard.vermeulen@grenoble.cnrs.fr> <1088536183.17789.346.camel@halloween.stsci.edu> <20040629211800.M55753@grenoble.cnrs.fr> <1088632459.7526.213.camel@halloween.stsci.edu> <20040718212443.M21561@grenoble.cnrs.fr> <1090603727.7138.33.camel@halloween.stsci.edu> Message-ID: <20040726050416.M83815@grenoble.cnrs.fr> Hi Todd, Attached is a new version of numnum (including 'topbot', an alternative implementation of numnum). The README contains some additional comments with respect to numarray and Numeric (new comments are preceded by '+', old comments by '-'). There were still some other bugs in numnum, too. On 23 Jul 2004 13:28:47 -0400, Todd Miller wrote > I finally got to your numnum stuff today... awesome work! You've got > lots of good suggestions. Here are some comments: > > 1. Thanks for catching the early return problem with numarray's > import_array(). It's not just bad, it's wrong. It'll be fixed for 1.1. > > 2. That said, I think expanding the macros in-line in numnum is a > mistake. It seems to me that "import_array(); PyErr_Clear();" or > something like it ought to be enough... after numarray-1.1 anyway. > Indeed, but I am spoiled by C++ and was falling back on gcc -E for debugging. > > 3. I think there's a problem in numnum.toNP() because of numarray's > array "behavior" issues. A test needs to be done to ensure that the > incoming array is not byteswapped or misaligned; if it is, the easy > fix is to make a numarray copy of the array before copying it to Numeric. > Done, but what would be the best function to do this?
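Point 3 above (copy ill-behaved arrays before handing them to Numeric) can be captured in a few lines. This is a hedged sketch: the predicate names (isbyteswapped/isaligned/iscontiguous) are quoted from memory of the numarray Python API, not from this thread, and the stand-in class exists only so the sketch runs without numarray installed. The key fact is that a fresh copy() is aligned, native-endian and contiguous by construction:

```python
def well_behaved(a):
    """Return `a` itself if it is safe to hand to Numeric,
    otherwise a normalized copy of it."""
    if a.isbyteswapped() or not a.isaligned() or not a.iscontiguous():
        return a.copy()  # copy() normalizes all three properties
    return a

# Minimal stand-in for a numarray array, for illustration only.
class FakeArray:
    def __init__(self, swapped=False, copied=False):
        self.swapped, self.copied = swapped, copied
    def isbyteswapped(self):
        return self.swapped
    def isaligned(self):
        return True
    def iscontiguous(self):
        return True
    def copy(self):
        return FakeArray(swapped=False, copied=True)

print(well_behaved(FakeArray()).copied)              # False: passed through
print(well_behaved(FakeArray(swapped=True)).copied)  # True: a copy was made
```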
And the documentation could insist a little more on the possibility of ill-behaved arrays (see README). > > 4. Kudos for the LP64 stuff. numconfig is a thorn in the side of the > PEP, so I'll put your techniques into numarray for 1.1. > HAS_FLOAT128 is not currently used, so it might be time to ditch > it. Anyway, thanks! > There is a difference between the PEP header files and internal numarray usage. I find in my CVS working copy:

[packer at slow numarray]$ grep HAS_FLOAT */*
Src/_ndarraymodule.c:#if HAS_FLOAT128

and

[packer at slow numarray]$ grep HAS_UINT64 */*
Src/buffer.ch: #if HAS_UINT64
Src/buffer.ch: #if HAS_UINT64
Src/buffer.ch: #if HAS_UINT64
Src/buffer.ch: #if HAS_UINT64
Src/buffer.ch: #if HAS_UINT64
Src/libnumarraymodule.c: #if HAS_UINT64
Src/libnumarraymodule.c: #if HAS_UINT64
Src/libnumarraymodule.c: #if HAS_UINT64
Src/libnumarraymodule.c: #if HAS_UINT64
Src/libnumarraymodule.c: #if HAS_UINT64

but that is not true for the header files (more important for the PEP):

[packer at slow Include]$ grep HAS_UINT64 */*
[packer at slow Include]$ grep HAS_FLOAT128 */*
numarray/arraybase.h:#if HAS_FLOAT128

> > 5. PyArray_Present() and isArray() are superfluous *now*. I was > planning to add them to Numeric. > > 6. The LGPL may be a problem for us and is probably an issue if we ever > try to get numnum into the Python distribution. It would be better > to release numnum under the modified BSD license, same as numarray. > Done, with certain regrets because I believe in (L)GPL. The minutes of the last board meeting of the PSF tipped the scale ( http://www.python.org/psf/records/board/minutes-2004-06-18.html ) What remains to be done is showing how to add numnum's functionality to a 3rd party extension by linking numnum's object files to the extension instead of importing numnum's C-API (numnum should not become another dependency) Gerard > > 7. Your API struct was very clean. Eventually I'll regenerate numarray > like that. > > 8.
I logged your comments and bug reports on Source Forge and eventually > they'll get fixed. > > A to Z the numnum/pep code is beautiful. Next stop, header PEP update. > > Regards, > Todd > > > On Sun, 2004-07-18 at 17:24, gerard.vermeulen at grenoble.cnrs.fr wrote: > > Hi Todd, > > > > This is a follow-up on the 'header pep' discussion. > > > > The attachment numnum-0.1.tar.gz contains the sources for the > > extension modules pep and numnum. At least on my systems, both > > modules behave as described in the 'numarray header PEP' when the > > extension modules implementing the C-API are not present (a situation > > not foreseen by the macros import_array() of Numeric and especially > > numarray). IMO, my solution is 'bona fide', but requires further > > testing. > > > > The pep module shows how to handle the colliding C-APIs of the Numeric > > and numarray extension modules and how to implement automagical > > conversion between Numeric and numarray arrays. > > > > For a technical reason explained in the README, the hard work of doing > > the conversion between Numeric and numarray arrays has been delegated > > to the numnum module. The numnum module is useful when one needs to > > convert from one array type to the other to use an extension module > > which only exists for the other type (eg. combining numarray's image > > processing extensions with pygame's Numeric interface): > > > > Python 2.3+ (#1, Jan 7 2004, 09:17:35) > > [GCC 3.3.1 (SuSE Linux)] on linux2 > > Type "help", "copyright", "credits" or "license" for more information. 
> > >>> import numnum; import Numeric as np; import numarray as na
> > >>> np1 = np.array([[1, 2], [3, 4]]); na1 = numnum.toNA(np1)
> > >>> na2 = na.array([[1, 2, 3], [4, 5, 6]]); np2 = numnum.toNP(na2)
> > >>> print type(np1); np1; type(np2); np2
> >
> > array([[1, 2],
> >        [3, 4]])
> >
> > array([[1, 2, 3],
> >        [4, 5, 6]],'i')
> > >>> print type(na1); na1; type(na2); na2
> >
> > array([[1, 2],
> >        [3, 4]])
> >
> > array([[1, 2, 3],
> >        [4, 5, 6]])
> > >>>
> >
> > The pep module shows how to implement array processing functions which
> > use the Numeric, numarray or Sequence C-API:
> >
> > static PyObject *
> > wysiwyg(PyObject *dummy, PyObject *args)
> > {
> >     PyObject *seq1, *seq2;
> >     PyObject *result;
> >
> >     if (!PyArg_ParseTuple(args, "OO", &seq1, &seq2))
> >         return NULL;
> >
> >     switch(API) {
> >     case NumericAPI:
> >     {
> >         PyObject *np1 = NN_API->toNP(seq1);
> >         PyObject *np2 = NN_API->toNP(seq2);
> >         result = np_wysiwyg(np1, np2);
> >         Py_XDECREF(np1);
> >         Py_XDECREF(np2);
> >         break;
> >     }
> >     case NumarrayAPI:
> >     {
> >         PyObject *na1 = NN_API->toNA(seq1);
> >         PyObject *na2 = NN_API->toNA(seq2);
> >         result = na_wysiwyg(na1, na2);
> >         Py_XDECREF(na1);
> >         Py_XDECREF(na2);
> >         break;
> >     }
> >     case SequenceAPI:
> >         result = seq_wysiwyg(seq1, seq2);
> >         break;
> >     default:
> >         PyErr_SetString(PyExc_RuntimeError, "Should never happen");
> >         return 0;
> >     }
> >
> >     return result;
> > }
> >
> > See the README for an example session using the pep module showing that
> > it is possible to pass a mix of Numeric and numarray arrays to pep.wysiwyg().
> >
> > Notes:
> >
> > - it is straightforward to adapt pep and numnum so that the conversion
> >   functions are linked into pep instead of imported.
> >
> > - numnum is still 'proof of concept'. I am thinking about methods to
> >   make those techniques safer if the numarray (and Numeric?) header
> >   files never make it into the Python headers (or make it safer to
> >   use those techniques with Python < 2.4).
In particular it would > > be helpful if the numerical C-APIs export an API version number, > > similar to the versioning scheme of shared libraries -- see the > > libtool->versioning info pages. > > > > I am considering three possibilities to release a more polished > > version of numnum (3rd party extension writers may prefer to link > > rather than import numnum's functionality): > > > > 1. release it from PyQwt's project page > > 2. register an independent numnum project at SourceForge > > 3. hand numnum over to the Numerical Python project (frees me from > > worrying about API changes). > > > > > > Regards -- Gerard Vermeulen > > -- -- Open WebMail Project (http://openwebmail.org) -------------- next part -------------- A non-text attachment was scrubbed... Name: numnum-0.2.tar.gz Type: application/gzip Size: 19729 bytes Desc: not available URL: From perry at stsci.edu Mon Jul 26 08:44:06 2004 From: perry at stsci.edu (Perry Greenfield) Date: Mon Jul 26 08:44:06 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <40FFB132.10103@sympatico.ca> Message-ID: I'll try to see if I can address all the comments raised (please let me know if I missed something). 1) Russell Owen asked that indexing by field name not be permitted for record arrays and at least one other agreed. Since it is easier to add something like this later rather than take it away, I'll go along with that. So while it will be possible to index a Record by field name, it won't be for record arrays. 2) Russell asked if it would be possible to specify the types of the fields using numarray/chararray type objects. Yes, it will. We will adopt Rick White's 2nd suggestion for handling fields that themselves are arrays, I.e., formats = (3,Int16), ((4,5), Float32) For a 1-d Int16 cell of shape (3,) and a 2-d Float32 cell of shape (4,5) The first suggestion ("formats = 3*(Int16,), 4*(5*(Float32,),)") will not be supported. 
While it is very suggestive, it does allow for inconsistent nestings that must be checked and rejected (what if someone supplies (Int16, Int16, Float32) as one of the fields?) which complicates the code. It doesn't read as well. 3) Russell also suggested nesting record arrays. This sort of capability is not being ruled out, but there isn't a chance we can devote resources to this any time soon (can anyone else?) 4) To address the suggestions of Russell and Francesc, I'm proposing that the current "field" method now become an object (callable to retain backward compatibility) that supports: a) indexing by name or number (just like Records) b) name to attribute mapping (with restrictions). So that this means 3 ways to do things! As far as attribute access goes, I simply do not want to throw arbitrary attributes into the main object itself. The use of field is comparatively clean since it has no other public attributes. Aside from mapping '_' into spaces, no other illegal attribute characters will be mapped. (The identifier/label suggestion by Colin Williams has some merit, but on the whole, I think it brings more baggage than benefit). The mapping algorithm is such that it tries to map the attribute to any field name that has either a ' ' or '_' in the place of '_' in the attribute name. While all '_' in the name will take precedence over any other match, there will be no guaranteed order for other cases (e.g., 'x_y z' vs 'x y_z' vs 'x y z'; though 'x_y_z' would be guaranteed to be selected for field.x_y_z if present) Note that the only real need to support indexing other than consistency is to support slices. Only slices for numerical indexing will be supported (and not initially). The callable syntax can support index arrays just as easily.
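The matching rule described above can be sketched in a few lines. This is a hypothetical helper, not the proposal's actual implementation: an attribute name matches the identically spelled field first, and otherwise any field whose name differs from it only by having ' ' where the attribute has '_':

```python
def resolve_field(attr, field_names):
    """Map an attribute name to a field name under the proposed rule."""
    if attr in field_names:           # exact (all '_') match takes precedence
        return attr
    for name in field_names:
        if name.replace(' ', '_') == attr:
            return name               # first hit wins; order not guaranteed
    raise AttributeError(attr)

print(resolve_field('home_address', ['age', 'home address']))  # 'home address'
print(resolve_field('x_y_z', ['x_y z', 'x_y_z']))              # 'x_y_z'
```

Note the ambiguity Perry points out: for field.x_y_z with fields 'x_y z', 'x y_z' and 'x y z' all present (and no 'x_y_z'), the loop returns whichever happens to come first.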
To summarize

Rarr.field.home_address
Rarr.field['home address']
Rarr.field('home address')

Will all work for a field named "home address" ************************************************ Any comments on these changes to the proposal? Are there those that are opposed to supporting attribute access? Thanks, Perry From rowen at u.washington.edu Mon Jul 26 09:40:06 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Mon Jul 26 09:40:06 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: References: Message-ID: At 11:43 AM -0400 2004-07-26, Perry Greenfield wrote: >I'll try to see if I can address all the comments raised (please let me know >if I missed something). >...(nice proposal elided)... >Any comments on these changes to the proposal? Are there those that are >opposed to supporting attribute access? Overall this sounds great. However, I am still strongly against attribute access. Attributes are usually meant for names that are intrinsic to the design of an object, not to the user's "configuration" of the object. The name mapping proposal isn't bad (thank you for keeping it simple!), but it still feels like a kludge and it adds unnecessary clutter. Your explanation of these limitations was clear, but still, imagine putting that into the manual. It's a lot of "be careful of this" info. That's a red flag to me. Imagine all the folks who don't read carefully. Also imagine those who consider attribute access "the right way to do it" and so want to clean up the limitations. I think you'll see a steady stream of: "why can't I see my field..." "why can't you solve the collision problems" "why can't I use special character thus and so" I personally feel that when a feature is hard to document or adds strange limitations then it probably suggests a flawed design. In this case there is another mechanism that is more natural, has no funny corner cases, and is much more powerful.
Its only disadvantage is the need for typing 4 extra characters. Saving 4 characters is simply not sufficient reason to add this dubious feature. Before implementing attribute access I have two suggestions (which can be taken singly or together):

- Postpone the decision until after the rest of the proposal is implemented. See if folks are happy with the mechanisms that are available. I freely confess to hoping that momentum will then kill the idea.
- Discuss it on comp.lang.py. I'd like to see it aired more widely before being adopted. So far I've seen just a few voices for it and a few others against it. I realize it's not a democracy -- those who write the code get the final say. I also realize some folks will always want it, but that tension between simplicity and expressiveness is intrinsic to any language. If you add everything anybody wants you get a mess, and I want to avoid this mess while we still can.

I hope nobody takes offense. I certainly did not mean to imply that those who wish attribute access are inferior in any way. There are features of python I wish it had that will never occur. I honestly can see the appeal of attributes; I was in favor of them myself, early on. It adds an appealing expressiveness that makes some kind of code read more naturally. But I personally feel it has too many limitations and is unnecessary. Regards, -- Russell From falted at pytables.org Mon Jul 26 11:12:18 2004 From: falted at pytables.org (Francesc Alted) Date: Mon Jul 26 11:12:18 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: References: Message-ID: <200407262011.33067.falted@pytables.org> Hi, Perry, your last proposal sounds good to me. Just a couple of comments.
On Monday 26 July 2004 17:43, Perry Greenfield wrote: > 4) To address the suggestions of Russell and Francesc, I'm proposing that > the current "field" method now become an object (callable to retain backward > compatibility) that supports: > a) indexing by name or number (just like Records) > b) name to attribute mapping (with restrictions). > So that this means 3 ways to do things! As far as attribute access goes, I > simply do not want to throw arbitrary attributes into the main object > itself. The use of field is comparatively clean since it has no other > public attributes. Aside from mapping '_' into spaces, no other illegal > attribute characters will be mapped. (The identifier/label suggestion by > Colin Williams has some merit, but on the whole, I think it brings more > baggage than benefit). The mapping algorithm is such that it tries to map > the attribute to any field name that has either a ' ' or '_' in the place of > '_' in the attribute name. While all '_' in the name will take precedence > over any other match, there will be no guaranteed order for other cases > (e.g., 'x_y z' vs 'x y_z' vs 'x y z'; though 'x_y_z' would be guaranteed to > be selected for field.x_y_z if present) I guess that this mapping algorithm is weak enough to create some problems with special chars that are not supported. I'd prefer the dictionary/tuple of pairs mechanism in order to create a user-configured translation. I don't see the problem that Perry mentioned in an earlier message related to guaranteeing the persistence of such an object: we always have pickle, don't we? Or am I missing something?

> To summarize
>
> Rarr.field.home_address
> Rarr.field['home address']
> Rarr.field('home address')

Supporting Rarr.field['home address'] and Rarr.field('home address') at the same time sounds unnecessary to me. Moreover having a Rarr.field('home_address')[32] (for example) looks a bit strange, and I think Rarr.field['home_address'][32] would be better.
But I repeat, this is my personal feeling. I know that dropping support of __call__() in field will make the change backward incompatible, but perhaps now is a good time to define a better interface to the RecArray object. Another possibility may be to raise a deprecation warning for such a use for a couple of releases. Regards, -- Francesc Alted From barrett at stsci.edu Mon Jul 26 11:25:09 2004 From: barrett at stsci.edu (Paul Barrett) Date: Mon Jul 26 11:25:09 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: References: Message-ID: <41054B5E.8010801@stsci.edu> Russell E Owen wrote: > At 11:43 AM -0400 2004-07-26, Perry Greenfield wrote: > >> I'll try to see if I can address all the comments raised (please let >> me know >> if I missed something). >> ...(nice proposal elided)... >> Any comments on these changes to the proposal? Are there those that are >> opposed to supporting attribute access? > > > Overall this sounds great. > > However, I am still strongly against attribute access. > > Attributes are usually meant for names that are intrinsic to the design > of an object, not to the user's "configuration" of the object. The name > mapping proposal isn't bad (thank you for keeping it simple!), but it > still feels like a kludge and it adds unnecessary clutter. > > Your explanation of this limitations was clear, but still, imagine > putting that into the manual. It's a lot of "be careful of this" info. > That's a red flag to me. Imagine all the folks who don't read carefully. > Also imagine those who consider attribute access "the right way to do > it" and so want to clean up the limitations. I think you'll see a steady > stream of: > "why can't I see my field..." > "why can't you solve the collision problems" > "why can't I use special character thus and so" > > I personally feel that when a feature is hard to document or adds > strange limitations then it probably suggests a flawed design.
> > In this case there is another mechanism that is more natural, has no > funny corner cases, and is much more powerful. Its only disadvantage is > the need to type 4 extra characters. Saving 4 characters is simply > not sufficient reason to add this dubious feature. > > Before implementing attribute access I have two suggestions (which can > be taken singly or together): > - Postpone the decision until after the rest of the proposal is > implemented. See if folks are happy with the mechanisms that are > available. I freely confess to hoping that momentum will then kill the > idea. > - Discuss it on comp.lang.py. I'd like to see it aired more widely > before being adopted. So far I've seen just a few voices for it and a > few others against it. I realize it's not a democracy -- those who write > the code get the final say. I also realize some folks will always want > it, but that tension between simplicity and expressiveness is intrinsic > to any language. If you add everything anybody wants you get a mess, and > I want to avoid this mess while we still can. > > I hope nobody takes offense. I certainly did not mean to imply that > those who wish attribute access are inferior in any way. There are > features of python I wish it had that will never occur. I honestly can > see the appeal of attributes; I was in favor of them myself, early on. > It adds an appealing expressiveness that makes some kinds of code read > more naturally. But I personally feel it has too many limitations and is > unnecessary. That pretty much sums up my opinion.
:) -- Paul -- Paul Barrett, PhD Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Branch FAX: 410-338-4767 Baltimore, MD 21218 From falted at pytables.org Mon Jul 26 11:29:19 2004 From: falted at pytables.org (Francesc Alted) Date: Mon Jul 26 11:29:19 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: References: Message-ID: <200407262028.41129.falted@pytables.org> A Dilluns 26 Juliol 2004 18:38, Russell E Owen va escriure: > In this case there is another mechanism that is more natural, has no Well, I guess that depends on what you understand as "natural". For example, for me the "natural" way is adding attributes. However, I must recognize that my point of view could be biased because this can be far more advantageous in the context of large hierarchies of objects where you must specify the complete path to get somewhere. This is typical of software for treating XML documents or any kind of hierarchical data organization system. For a relatively plain structure like RecArray I can understand that this can be regarded as unnecessary. But nevertheless, its adoption continues to sound appealing to me. Anyway, I'd be happy with any decision (regarding field attribute adoption) that would be made. > I hope nobody takes offense. I certainly did not mean to imply that Not at all. Discussing is a good (the best?) way to learn more :) -- Francesc Alted From rowen at u.washington.edu Mon Jul 26 11:30:01 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Mon Jul 26 11:30:01 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <200407262011.33067.falted@pytables.org> References: <200407262011.33067.falted@pytables.org> Message-ID: At 8:11 PM +0200 2004-07-26, Francesc Alted wrote: >... >Supporting Rarr.field['home address'] and Rarr.field('home address') at the >same time sounds unnecessary to me.
Moreover having a >Rarr.field('home_address')[32] (for example) looks a bit strange, and I >think Rarr.field['home_address'][32] would be better. But I repeat, this is >my personal feeling. > >I know that dropping support of __call__() in field will make the change >backward incompatible, but perhaps now is a good time to define a better >interface to the RecArray object. Another possibility may be to raise a >deprecation warning for such use for a couple of releases. I completely agree. -- Russell From rlw at stsci.edu Mon Jul 26 11:45:11 2004 From: rlw at stsci.edu (Rick White) Date: Mon Jul 26 11:45:11 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: Message-ID: On Mon, 26 Jul 2004, Russell E Owen wrote: > Overall this sounds great. > > However, I am still strongly against attribute access. > > [...] > > In this case there is another mechanism that is more natural, has no > funny corner cases, and is much more powerful. Its only disadvantage > is the need to type 4 extra characters. Saving 4 characters > is simply not sufficient reason to add this dubious feature. I am sympathetic with Russell's point of view on this, but I do think there is more to gain than just typing 4 additional characters. When you read code that is using the dictionary version of attributes, you also are required to read and mentally parse those 4 additional characters. There is value to having clean, easily readable code that goes well beyond saving a little extra typing. If we didn't care about that, we'd probably all be using Perl. :-) Also, I like to use tab-completion during my interactive use of Python. I know how to make that work with attributes, even dynamically created attributes like those for record arrays. And it is really nice to be able to type and have it fill in a name or give a list of all the available columns.
Doing that with the string/dictionary approach could be possible, I guess, but it is a lot trickier. So I do think there are some good reasons for wanting attribute access. Whether they are strong enough to counter Russell's sensible arguments about not cluttering up the interface and documentation, I'm not sure. My personal preference would be to get rid of the mapping between blanks and underscore and to do no mapping of any kind. Then if a column has a name that maps to a legal Python variable, you can access it with an attribute, and if it doesn't then you can't. That doesn't sound particularly hard to understand or explain to me. Rick From hsu at stsci.edu Mon Jul 26 13:40:04 2004 From: hsu at stsci.edu (Jin-chung Hsu) Date: Mon Jul 26 13:40:04 2004 Subject: [Numpy-discussion] plot dense and large arrays, AGG limit? Message-ID: <200407262039.APA12769@donner.stsci.edu> One would expect the following will fill up the plot window: >>> n=zeros(20000) >>> n[::2]=1 >>> plot(n) The plot "stops" a little more than half way, as if it "runs out of ink". It happens on Linux as well as Solaris, using either numarray or Numeric, and both TkAgg and GTKAgg, but not GTK. Is this due to some AGG limitation? JC Hsu From cjw at sympatico.ca Mon Jul 26 14:42:01 2004 From: cjw at sympatico.ca (Colin J. Williams) Date: Mon Jul 26 14:42:01 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: References: Message-ID: <41057A71.40707@sympatico.ca> Russell E Owen wrote: > At 11:43 AM -0400 2004-07-26, Perry Greenfield wrote: > >> I'll try to see if I can address all the comments raised (please let >> me know >> if I missed something). >> ...(nice proposal elided)... >> Any comments on these changes to the proposal? Are there those that are >> opposed to supporting attribute access? > > > Overall this sounds great. > > However, I am still strongly against attribute access.
> > Attributes are usually meant for names that are intrinsic to the > design of an object, not to the user's "configuration" of the object. Russell, I hope that you will elaborate on this distinction between design and usage. On the face of it, I would have thought that the two should be closely related. > The name mapping proposal isn't bad (thank you for keeping it > simple!), but it still feels like a kludge and it adds unnecessary > clutter. > > Your explanation of these limitations was clear, but still, imagine > putting that into the manual. It's a lot of "be careful of this" info. > That's a red flag to me. Imagine all the folks who don't read > carefully. Also imagine those who consider attribute access "the right > way to do it" and so want to clean up the limitations. I think you'll > see a steady stream of: > "why can't I see my field..." > "why can't you solve the collision problems" > "why can't I use special character thus and so" > > I personally feel that when a feature is hard to document or adds > strange limitations then it probably suggests a flawed design. > > In this case there is another mechanism that is more natural, has no > funny corner cases, and is much more powerful. Its only disadvantage is > the need to type 4 extra characters. Saving 4 characters is > simply not sufficient reason to add this dubious feature. > > Before implementing attribute access I have two suggestions (which can > be taken singly or together): > - Postpone the decision until after the rest of the proposal is > implemented. See if folks are happy with the mechanisms that are > available. I freely confess to hoping that momentum will then kill the > idea. > - Discuss it on comp.lang.py. I'd like to see it aired more widely > before being adopted. So far I've seen just a few voices for it and a > few others against it. I realize it's not a democracy -- those who > write the code get the final say.
I also realize some folks will > always want it, but that tension between simplicity and expressiveness > is intrinsic to any language. If you add everything anybody wants you > get a mess, and I want to avoid this mess while we still can. There is merit to this suggestion. It would expose the proposal to other experiences. > > > I hope nobody takes offense. I certainly did not mean to imply that > those who wish attribute access are inferior in any way. There are > features of python I wish it had that will never occur. I honestly can > see the appeal of attributes; I was in favor of them myself, early on. > It adds an appealing expressiveness that makes some kinds of code read > more naturally. But I personally feel it has too many limitations and > is unnecessary. > > Regards, > > -- Russell Perry Greenfield summarized: Rarr.field.home_address Rarr.field['home address'] Rarr.field('home address') All will work for a field named "home address". This is good, it gives the desired functionality. One minor suggestion. We have Rarr.X.home_address, I believe that, in an earlier posting, someone suggested that X.home_address really identifies a column rather than a field. Suppose that home_address is field number 6 in the record. Would Rarr.field[6] be equivalent to the above? This may appear redundant, but it gives a method for selecting a group of columns, e.g. Rarr.field[6:9] Finally, would Rarr.field.home_address.city or Rarr.field.work_address.city be legitimate? As Russell Owen pointed out, at the end of the day Perry Greenfield will use his judgement as to the best arrangement and we will all live with it. Colin W.
From Fernando.Perez at colorado.edu Mon Jul 26 18:19:10 2004 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Mon Jul 26 18:19:10 2004 Subject: [Numpy-discussion] ANN: IPython 0.6.1 is officially out Message-ID: <4105AD66.6030002@colorado.edu> [Please forgive the cross-post, but since I know many scipy/numpy users are also ipython users, and this is a fairly significant update, I decided it was worth doing it.] Hi all, I've just officially uploaded IPython 0.6.1. Many thanks to all who contributed comments, bug reports, ideas and patches. I'd like in particular to thank Ville Vainio, who helped a lot with many of the features for pysh, and was willing to put code in front of his ideas. As always, a big Thank You goes to Enthought and the Scipy crowd for hosting ipython and all its attending support services (bug tracker, mailing lists, website and downloads, etc). The download location, as usual, is: http://ipython.scipy.org/dist A detailed NEWS file can be found here: http://ipython.scipy.org/NEWS, so I won't repeat it. I will only mention the highlights of this release compared to 0.6.0: * BACKWARDS-INCOMPATIBLE CHANGE: Users will need to update their ipythonrc files and replace '%n' with '\D' in their prompt_in2 settings everywhere. Sorry, but there's otherwise no clean way to get all prompts to properly align. The ipythonrc shipped with IPython has been updated. * 'pysh' profile, which allows you to use ipython as a system shell. This includes mechanisms for easily capturing shell output into python strings and lists, and for expanding python variables back to the shell. It is started, like all profiles, with 'ipython -p pysh'.
The following is a brief example of the possibilities: planck[~/test]|3> $$a=ls *.py planck[~/test]|4> type(a) <4> planck[~/test]|5> for f in a: |.> if f.startswith('e'): |.> wc -l $f |.> 113 error.py 9 err.py 2 exit2.py 10 exit.py You can get the necessary profile into your ~/.ipython directory by running 'ipython -upgrade', or by copying it from the IPython/UserConfig directory (ipythonrc-pysh). Note that running -upgrade will rename your existing config files to prevent clobbering them with new ones. This feature had been long requested by many users, and it's at last officially part of ipython. * Improved the @alias mechanism. It is now based on a fast, lightweight dictionary implementation, which was a requirement for making the pysh functionality possible. A new pair of magics, @rehash and @rehashx, allow you to load ALL of your $PATH into ipython as aliases at runtime. * New plot2 function added to the Gnuplot support module, to plot dictionaries and lists/tuples of arrays. Also added automatic EPS generation to hardcopy(). * History is now profile-specific. * New @bookmark magic to keep a list of directory bookmarks for quick navigation. * New mechanism for profile-specific persistent data storage. Currently only the new @bookmark system uses it, but it can be extended to hold arbitrary picklable data in the future. * New @system_verbose magic to view all system calls made by ipython. * For Windows users: all this functionality now works under Windows, but some external libraries are required. Details here: http://ipython.scipy.org/doc/manual/node2.html#sub:Under-Windows * Fix bugs with '_' conflicting with the gettext library. * Many, many other bugfixes and minor enhancements. See the NEWS file linked above for the full details. Enjoy, and please report any problems. Best, Fernando Perez. From cjw at sympatico.ca Tue Jul 27 11:22:27 2004 From: cjw at sympatico.ca (Colin J. 
Williams) Date: Tue Jul 27 11:22:27 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: References: <41057A71.40707@sympatico.ca> Message-ID: <41069D3A.5090903@sympatico.ca> Russell E Owen wrote: > At 5:41 PM -0400 2004-07-26, Colin J. Williams wrote: > >> Russell E Owen wrote: >> >>> At 11:43 AM -0400 2004-07-26, Perry Greenfield wrote: >>> >>>> I'll try to see if I can address all the comments raised (please >>>> let me know >>>> if I missed something). >>>> ...(nice proposal elided)... >>>> Any comments on these changes to the proposal? Are there those >>>> that are >>>> opposed to supporting attribute access? >>> >>> >>> >>> Overall this sounds great. >>> >>> However, I am still strongly against attribute access. >>> >>> Attributes are usually meant for names that are intrinsic to the >>> design of an object, not to the user's "configuration" of the object. >> >> >> Russell, I hope that you will elaborate on this distinction between >> design and usage. On the face of it, I would have thought that the >> two should be closely related. > > > To my mind, the design of an object describes the intended behavior of > the object: what kind of data can it deal with and what should it do > to that data. It tends to be "static" in the sense that it is not a > function of how the object is created or what data is contained in the > object. The design of the object usually drives the choice of the > attributes of the object (variables and methods). > > On the other hand, the user's "configuration" of the object is what > the user has done to make a particular instance of an object unique -- > the data the user has loaded into the object. > > I consider the particular named fields of a record array to fall into > the latter category. But it is a gray area. Somebody else might argue > that the record array constructor is an object factory, turning out > an object designed by the user.
From that alternative perspective, > adding attributes to represent field names is perhaps more natural as > a design. > > I think the main issues are: > - Are there too many ways to address things? (I say yes) This could be true. I guess the test is whether there is a rational justification for each way. > > - Field name mapping: there is no trivial 1:1 mapping between valid > field names and valid attribute names. If one starts with the assumption that field/attribute names are compatible with Python names, then I don't see that this is a problem. The question has been raised as to whether a wider range of names should be permitted, e.g. including such characters as ~`()!???. My view is that such characters should be considered acceptable for data labels, but not for data names. i.e. they are for display, not for manipulation. > > - Nested access. Not sure about this one, but I'd like to hear more. A RecArray is made up of a number of records, each of the same length and data configuration. Each field of a record is of fixed length and type. It wouldn't be a big leap to permit another record in one of the fields. Suppose we have an address record aRec and a personnel record pRec and that rArr is an array of pRec. aRec: street a30, city a20, postalCode a7. pRec: id i4, firstName a15, lastName a20, homeAddress aRec, workAddress aRec. Then rArr[16].homeAddress.city could give us the home city for person 16 in rArr. > > > If we do end up with attributes for field names, I really like Rick > White's suggestion of adding an attribute for a field only if the > field name is already a valid attribute name. That neatly avoids the > collision issue and is simple to document. > > -- Russell Best wishes, Colin W.
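Colin's nested-record layout can be mocked up in plain Python. This is purely an illustrative sketch with hypothetical names and data; numarray's RecArray of the time did not support nested records, which is exactly what the "Nested access" bullet is asking about.

```python
# Plain-Python mock-up of the nested record layout sketched above
# (aRec nested inside pRec). All names and data values are hypothetical.

class Record:
    """A trivial record: keyword arguments become attributes."""
    def __init__(self, **fields):
        self.__dict__.update(fields)

def make_person(pid, first, last, home_city, work_city):
    # aRec-like address records nested inside a pRec-like person record
    return Record(id=pid, firstName=first, lastName=last,
                  homeAddress=Record(street='', city=home_city, postalCode=''),
                  workAddress=Record(street='', city=work_city, postalCode=''))

rArr = [make_person(i, 'First%d' % i, 'Last%d' % i, 'Baltimore', 'Laurel')
        for i in range(20)]

rArr[16].homeAddress.city   # -> 'Baltimore'
```

A real RecArray would store the fields in fixed-width packed form (a30, i4, etc.) rather than as Python objects, but the access chain is the point of the example.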
From falted at pytables.org Tue Jul 27 11:48:00 2004 From: falted at pytables.org (Francesc Alted) Date: Tue Jul 27 11:48:00 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <41069D3A.5090903@sympatico.ca> References: <41069D3A.5090903@sympatico.ca> Message-ID: <200407272046.52761.falted@pytables.org> A Dimarts 27 Juliol 2004 20:21, Colin J. Williams va escriure: > If one starts with the assumption that field/attribute names are > compatible with Python names, then I don't see that this is a problem. > The question has been raised as to whether a wider range of names should > be permitted, e.g. including such characters as ~`()!???. My view is > that such characters should be considered acceptable for data labels, > but not for data names. i.e. they are for display, not for manipulation. I finally was able to see your point. You mean that naming a field with a non-Python identifier would be forbidden, and another attribute would be provided (like 'title', for example) in case the user wants to add some kind of data label. Kind of: records.array([...], names=["c1","c2","c3"], titles=["F one","time&dime","??"]) and have a new attribute called "titles" that keeps this info. Well, I think that would be a very nice solution IMO. -- Francesc Alted From gerard.vermeulen at grenoble.cnrs.fr Tue Jul 27 13:05:06 2004 From: gerard.vermeulen at grenoble.cnrs.fr (gerard.vermeulen at grenoble.cnrs.fr) Date: Tue Jul 27 13:05:06 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <200407272046.52761.falted@pytables.org> References: <41069D3A.5090903@sympatico.ca> <200407272046.52761.falted@pytables.org> Message-ID: <20040727191434.M48392@grenoble.cnrs.fr> On Tue, 27 Jul 2004 20:46:52 +0200, Francesc Alted wrote > A Dimarts 27 Juliol 2004 20:21, Colin J.
Williams va escriure: > > If one starts with the assumption that field/attribute names are > > compatible with Python names, then I don't see that this is a problem. > > The question has been raised as to whether a wider range of names should > > be permitted e.g.. including such characters as ~`()!???. My view is > > that such characters should be considered acceptable for data labels, > > but not for data names. i.e. they are for display, not for manipulation. > > I finally was able to see your point. You mean that naming a field > with a non-python identifier would be forbidden, and provide another > attribute > (like 'title', for example) in case the user wants to add some kind > of data label. Kind of: > > records.array([...], names=["c1","c2","c3"], titles=["F one", > "time&dime","??"]) > > and have a new attribute called "titles" that keeps this info. > > Well, I think that would be a very nice solution IMO. > I agree with Rick, Colin and Francesc on this point: symbolic names are important and I like the commandline completion too. However, I have another concern: Introducing recordArray["column"] as an alternative for recordArray.field("column") breaks a symmetry between for instance 1-d record arrays and 2-d normal arrays. (the symmetry is strongly suggested by their representation: a record array prints almost as a list of tuples and a 2-d normal array almost as a list of lists). Indexing a column of a 2-d normal array is done by normalArray[:, column], so why not recArray[:, "column"] ? It removes the ambiguity between indexing with integers and with strings. Also, leaving the indices in 'natural' order becomes especially important when one envisages (record) arrays containing (record) arrays containing .... 
I understand that this seems to open the door to recArray[32, "column"], but if it is really not feasible to mix integers and strings (or attribute names) as indices, I prefer to use recordArray.column[32] and/or recordArray[32].column rather than recordArray["column"][32]. Even indexing with integers only seems more natural to me than e.g. recordArray["column"][32], since I can always do: column = 7 recordArray[32, column] Regards -- Gerard From rowen at u.washington.edu Tue Jul 27 13:44:02 2004 From: rowen at u.washington.edu (Russell E Owen) Date: Tue Jul 27 13:44:02 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <41057A71.40707@sympatico.ca> References: <41057A71.40707@sympatico.ca> Message-ID: At 5:41 PM -0400 2004-07-26, Colin J. Williams wrote: >Russell E Owen wrote: > >> At 11:43 AM -0400 2004-07-26, Perry Greenfield wrote: >> >>> I'll try to see if I can address all the comments raised (please >>>let me know >>> if I missed something). >>> ...(nice proposal elided)... >>> Any comments on these changes to the proposal? Are there those that are >>> opposed to supporting attribute access? >> >> >> Overall this sounds great. >> >> However, I am still strongly against attribute access. >> >> Attributes are usually meant for names that are intrinsic to the >>design of an object, not to the user's "configuration" of the >>object. > >Russell, I hope that you will elaborate on this distinction between >design and usage. On the face of it, I would have thought that the >two should be closely related. To my mind, the design of an object describes the intended behavior of the object: what kind of data can it deal with and what should it do to that data. It tends to be "static" in the sense that it is not a function of how the object is created or what data is contained in the object. The design of the object usually drives the choice of the attributes of the object (variables and methods).
On the other hand, the user's "configuration" of the object is what the user has done to make a particular instance of an object unique -- the data the user has loaded into the object. I consider the particular named fields of a record array to fall into the latter category. But it is a gray area. Somebody else might argue that the record array constructor is an object factory, turning out an object designed by the user. From that alternative perspective, adding attributes to represent field names is perhaps more natural as a design. I think the main issues are: - Are there too many ways to address things? (I say yes) - Field name mapping: there is no trivial 1:1 mapping between valid field names and valid attribute names. - Nested access. Not sure about this one, but I'd like to hear more. If we do end up with attributes for field names, I really like Rick White's suggestion of adding an attribute for a field only if the field name is already a valid attribute name. That neatly avoids the collision issue and is simple to document. -- Russell From falted at pytables.org Wed Jul 28 03:01:23 2004 From: falted at pytables.org (Francesc Alted) Date: Wed Jul 28 03:01:23 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <20040727191434.M48392@grenoble.cnrs.fr> References: <200407272046.52761.falted@pytables.org> <20040727191434.M48392@grenoble.cnrs.fr> Message-ID: <200407281200.41748.falted@pytables.org> A Dimarts 27 Juliol 2004 22:04, gerard.vermeulen at grenoble.cnrs.fr va escriure: > Introducing recordArray["column"] as an alternative for > recordArray.field("column") breaks a symmetry between for instance 1-d > record arrays and 2-d normal arrays. (the symmetry is strongly suggested > by their representation: a record array prints almost as a list of tuples > and a 2-d normal array almost as a list of lists).
> > Indexing a column of a 2-d normal array is done by normalArray[:, column], > so why not recArray[:, "column"] ? Well, I must recognize that this has its beauty (by revealing the symmetry that you mentioned). However, mixing integers and strings in indices can be, in my opinion, rather confusing for most people. Then, I guess that the implementation wouldn't be easy. > I prefer to use > > recordArray.column[32] > > and/or > > recordArray[32].column > > rather than recordArray["column"][32]. I would prefer: recordArray.fields.column[32] or recordArray.cols.column[32] (note the use of the plural in fields and cols, which I think is more consistent about its functionality) The problem with: recordArray[32].fields.column is that I don't see it as natural and besides, completion capabilities would be broken after the [] parenthesis. Anyway, as Russell suggested, I don't like recordArray["column"][32], because it would be unnecessary (you can get the same result using recordArray[column_idx][32]). Although I recognize that a recordArray.cols["column"][32] would not hurt my eyes so much. This is because although indices continue to mix ints and strings, the difference is that ".cols" is placed first, giving a new (and unmistakable) meaning to the "column" index.
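Gerard's recArray[:, "column"] idea can be prototyped with a __getitem__ that walks a mixed index tuple from left to right. This is purely a sketch under assumed semantics (string selects a field, integer or slice selects rows); neither numarray nor Numeric indexed records this way.

```python
# Sketch of the proposal that a string index selects a field while
# integer/slice indices select rows, interpreted left to right.
# Purely illustrative; not how numarray or Numeric actually worked.

class RecArraySketch:
    def __init__(self, names, rows):
        self._names = list(names)  # field names, in order
        self._rows = list(rows)    # each row is a tuple of field values

    def __getitem__(self, key):
        if not isinstance(key, tuple):
            key = (key,)
        result = self._rows
        for k in key:
            if isinstance(k, str):
                col = self._names.index(k)
                if isinstance(result, list):   # many rows: field of each row
                    result = [row[col] for row in result]
                else:                          # a single row (a tuple)
                    result = result[col]
            else:                              # int or slice: row indexing
                result = result[k]
        return result

ra = RecArraySketch(['id', 'city'], [(1, 'Grenoble'), (2, 'Baltimore')])
ra[:, 'city']   # -> ['Grenoble', 'Baltimore']
ra[1, 'city']   # -> 'Baltimore'
```

The list-versus-tuple check is a crude stand-in for the rank bookkeeping a real array would do, which hints at why the thread suspects the implementation "wouldn't be easy".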
Cheers, -- Francesc Alted From gerard.vermeulen at grenoble.cnrs.fr Wed Jul 28 07:00:11 2004 From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen) Date: Wed Jul 28 07:00:11 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <200407281200.41748.falted@pytables.org> References: <200407272046.52761.falted@pytables.org> <20040727191434.M48392@grenoble.cnrs.fr> <200407281200.41748.falted@pytables.org> Message-ID: <20040728155908.28cc135e.gerard.vermeulen@grenoble.cnrs.fr> On Wed, 28 Jul 2004 12:00:40 +0200 Francesc Alted wrote: > A Dimarts 27 Juliol 2004 22:04, gerard.vermeulen at grenoble.cnrs.fr va escriure: > > Introducing recordArray["column"] as an alternative for > > recordArray.field("column") breaks a symmetry between for instance 1-d > > record arrays and 2-d normal arrays. (the symmetry is strongly suggested > > by their representation: a record array prints almost as a list of tuples > > and a 2-d normal array almost as a list of lists). > > > > Indexing a column of a 2-d normal array is done by normalArray[:, column], > > so why not recArray[:, "column"] ? > > Well, I must recognize that this has its beauty (by revealing the simmetry > that you mentioned). However, mixing integer and strings on indices can > be, in my opinion, rather confusing for most people. Then, I guess that > the implementation wouldn't be easy. > > > I prefer to use > > > > recordArray.column[32] > > > > and/or > > > > recordArray[32].column > > > > rather than recordArray["column"][32]. > > I would prefer better: > > recordArray.fields.column[32] > > or > > recordArray.cols.column[32] > > (note the use of the plural in fields and cols, which I think is more > consistent about its functionality) > > The problem with: > > recordArray[32].fields.column > > is that I don't see it as natural and besides, completion capabilities > would be broken after the [] parenthesis. > Two points: 1. 
This is true for vanilla Python but not for IPython-0.6.2: packer at zombie:~> ipython Python 2.3+ (#1, Jan 7 2004, 09:17:35) Type "copyright", "credits" or "license" for more information. IPython 0.6.2 -- An enhanced Interactive Python. ? -> Introduction to IPython's features. @magic -> Information about IPython's 'magic' @ functions. help -> Python's own help system. object? -> Details about 'object'. ?object also works, ?? prints more. In [1]: d = {'Francesc': 0} In [2]: d['Francesc'].__a d['Francesc'].__abs__ d['Francesc'].__add__ d['Francesc'].__and__ In [2]: d['Francesc'].__a You see, the completion mechanism of ipython recognizes d['Francesc'] as an integer. 2. If one accepts that a "field_name" can be used as an attribute, one must be able to say: record.field_name ( == record.field("field_name") ) and (since recordArray[32] returns a record) also: recordArray[32].field_name and not recordArray[32].cols.field_name (sorry, I abhor this) > > Anyway, as Russell suggested, I don't like recordArray["column"][32], > because it would be unnecessary (you can get the same result using > recordArray[column_idx][32]). > Thank you for this little slip, you mean recordArray["column"][32] is recordArray[32][column_idx], isn't it? > > Although I recognize that a recordArray.cols["column"][32] would not hurt > my eyes so much. This is because although indices continue to mix ints > and strings, the difference is that ".cols" is placed first, giving a new > (and unmistakable) meaning to the "column" index. > I am just worried that future generalization of indexing will be impossible if the meaning of an indexing operation ("get row" or "get column or field") depends on whether an index is a string or an integer: IMO the meaning should depend on the position in the index list. The example has been chosen to show that I don't mind indexing by strings at all.
If I see array[13, 'ab', 31, 'ba'], I know that 'ab' and 'ba' index record fields as long as the indices are in 'normal' order. Nevertheless, I am aware that Utopia may be hard to implement efficiently, but this reflects my mental picture of nested (record) arrays. (ipython in Utopia would allow me to figure out array[13].ab[31].ba by tab completion and I would translate this to array[13, 'ab', 31, 'ba'] for efficiency in a real program) I think that we agree that recordArray.cols["column"] is better than recordArray["column"], but I don't see why recordArray.cols["column"] is better than the original recordArray.field("column"). Cheers -- Gerard PS: after reading the above, there may be a case to accept only indexing which can be read from left to right, so recordArray[32].field_name is OK, but recordArray.field_name[32] is not. From falted at pytables.org Wed Jul 28 11:16:12 2004 From: falted at pytables.org (Francesc Alted) Date: Wed Jul 28 11:16:12 2004 Subject: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated In-Reply-To: <20040728155908.28cc135e.gerard.vermeulen@grenoble.cnrs.fr> References: <200407281200.41748.falted@pytables.org> <20040728155908.28cc135e.gerard.vermeulen@grenoble.cnrs.fr> Message-ID: <200407282015.48875.falted@pytables.org> A Dimecres 28 Juliol 2004 15:59, Gerard Vermeulen va escriure: > Two points: > > 1. This is true for vanilla Python but not for IPython-0.6.2: > You see, the completion mechanism of ipython recognizes d['Francesc'] as an > integer. Ok. That's nice. IPython is more powerful than I realized :) > 2.
If one accepts that a "field_name" can be used as an attribute,
> one must be able to say:
>
> record.field_name ( == record.field("field_name") )
>
> and (since recordArray[32] returns a record) also:
>
> recordArray[32].field_name
>
> and not
>
> recordArray[32].cols.field_name (sorry, I abhor this)

Mmm, are you maybe suggesting that the records.Record class have all its
methods start with a reserved prefix (like "_" or, better, "_v_" for attrs
and "_f_" for methods), and that field names be forbidden from starting with
these prefixes, so that no collision problems with field names would occur?
Well, in such a case, adopting this convention for records.Record objects
would be far more feasible than doing the same for records.RecArray objects,
just because the former has very few attrs and methods. I think it's a good
idea overall.

> > Anyway, as Russell suggested, I don't like recordArray["column"][32],
> > because it would be unnecessary (you can get same result using
> > recordArray[column_idx][32]).
>
> Thank you for this little slip, you mean recordArray["column"][32] is
> recordArray[32][column_idx], isn't it?

Uh, my bad. I was (badly) trying to express the same as Russell Owen in a
message dated 20th July:

"""
I think recarray[field name] is too easily confused with recarray[index]
and is unnecessary.
"""

> I think that we agree that recordArray.cols["column"] is better than
> recordArray["column"], but I don't see why recordArray.cols["column"] is
> better than the original recordArray.field("column").

Good question. Me neither. Are you proposing to just keep
recordArray.cols.column as the only way to access columns?

> PS: after reading the above, there may be a case to accept only indexing
> which can be read from left to right, so
> recordArray[32].field_name is OK, but recordArray.field_name[32] is not.

Sorry, I don't see the point here (it is most probably my fault, given the
hour at which I'm writing this :(). Could you elaborate on that?
Cheers,

-- Francesc Alted

From perry at stsci.edu Wed Jul 28 15:02:04 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Wed Jul 28 15:02:04 2004
Subject: FW: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated
In-Reply-To:
Message-ID:

I guess I've seen enough discussion to try to refine the last delta into
what is the last (or next-to-last) version. So here are the changes to the
last updated proposal:

1) I originally intended to narrow attribute access to strictly legal names
as Rick White suggested, but something got into me to try to handle spaces.
I agree with Rick on this. I see that as a very simple rule to remember, and
I don't see it as confusing to allow this.

2) Attribute access still won't be permitted directly on record arrays or
records. I'm very much in agreement with Francesc that "fields" is more
suggestive than "field" for the record and record array object that permits
both indexing and attribute access by name. The use of the field method will
remain, but will eventually be deprecated. As for other names, namely cols,
I'll stick with fields since it started with that usage, and because "field"
is a more appropriate term when dealing with multidimensional record arrays
("columns" is much more suggestive of simple tables).

Non-changes:

3) It will not be possible to index record arrays by column name. So
Rarr["column 1"] will not be permitted, but Rarr.fields["column 1"] will.
Nor will Rarr[32, "column 1"] be permitted.

4) As for optional labels (for display purposes), I'd like to hold off. I
would like to have only one way to associate a name with a field, and until
it is clearer what extra record array functionality would be associated with
labels, I'd rather not include them. Even then, I'm not sure I want to see
too much more dragged in (e.g., units, display formats, etc.). These sorts
of things may be more appropriate for a subclass.
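[Editorial note: the access rules in 2) and 3) can be sketched in a few
lines of plain Python. The Fields class and the field names "intensity" and
"column 1" below are purely illustrative, not numarray's actual
implementation: indexing works for any field name or field number, while
attribute access works only for names that are legal Python identifiers.]

```python
# Illustrative sketch only -- a hypothetical stand-in for the proposed
# "fields" accessor; numarray's real implementation may differ.
class Fields:
    def __init__(self, names, columns):
        self._names = list(names)               # field names, in order
        self._data = dict(zip(names, columns))  # name -> column values

    def __getitem__(self, key):
        # Index by field number or by field name (any name, even with spaces).
        if isinstance(key, int):
            key = self._names[key]
        return self._data[key]

    def __getattr__(self, name):
        # Attribute access works only for legal-identifier field names.
        try:
            return self.__dict__["_data"][name]
        except KeyError:
            raise AttributeError(name)

fields = Fields(["intensity", "column 1"],
                [[1.5, 2.5], ["a", "b"]])
```

With this sketch, fields["column 1"], fields[1], and fields.intensity all
resolve to columns, while "column 1" is reachable only by indexing, matching
the rules above.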
I realize that no single person will be happy with these choices, but they
seem to me to be the best compromise without unduly complicating things,
restricting future enhancements, or being too hard to implement. Has
anything fallen through the cracks?

So what follows is an updated version of what I last sent out:

******************************************************************

1) Russell Owen asked that indexing by field name not be permitted for
record arrays, and at least one other agreed. Since it is easier to add
something like this later rather than take it away, I'll go along with that.
So while it will be possible to index a Record by field name, it won't be
for record arrays.

2) Russell asked if it would be possible to specify the types of the fields
using numarray/chararray type objects. Yes, it will. We will adopt Rick
White's 2nd suggestion for handling fields that themselves are arrays, i.e.,

formats = (3, Int16), ((4,5), Float32)

for a 1-d Int16 cell of shape (3,) and a 2-d Float32 cell of shape (4,5).

The first suggestion ("formats = 3*(Int16,), 4*(5*(Float32,),)") will not be
supported. While it is very suggestive, it does allow for inconsistent
nestings that must be checked and rejected (what if someone supplies
(Int16, Int16, Float32) as one of the fields?), which complicates the code.
It also doesn't read as well.

3) Russell also suggested nesting record arrays. This sort of capability is
not being ruled out, but there isn't a chance we can devote resources to
this any time soon (can anyone else?).

4) To address the suggestions of Russell and Francesc, I'm proposing that a
new attribute "fields" be added that allows:

a) indexing by name or number (just like Records)
b) names as attributes, so long as the name is allowable as a legal
attribute. No attempt will be made to map names that are not legal attribute
strings into a different attribute name.

The field method will remain and eventually be deprecated.
Note that the only real need to support indexing, other than consistency, is
to support slices. Only slices for numerical indexing will be supported (and
not initially). The callable syntax can support index arrays just as easily.

To summarize:

Rarr.fields['home address']
Rarr.field('home address')

will both work for a field named "home address", but this field cannot be
specified as an attribute of Rarr.fields. If there is a field named
"intensity", then

Rarr.fields.intensity

will be permitted.

From cookedm at physics.mcmaster.ca Wed Jul 28 16:06:03 2004
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Wed Jul 28 16:06:03 2004
Subject: [Numpy-discussion] Permutation in Numpy
In-Reply-To: <3DC9B4D2-DE2D-11D8-A7E1-000393479EE8@earthlink.net>
References: <3DC9B4D2-DE2D-11D8-A7E1-000393479EE8@earthlink.net>
Message-ID: <20040728230558.GA28651@arbutus.physics.mcmaster.ca>

On Sun, Jul 25, 2004 at 07:24:49AM -0400, Hee-Seng Kye wrote:
> #perm.py
> def perm(k):
>     # Compute the list of all permutations of k
>     if len(k) <= 1:
>         return [k]
>     r = []
>     for i in range(len(k)):
>         s = k[:i] + k[i+1:]
>         p = perm(s)
>         for x in p:
>             r.append(k[i:i+1] + x)
>     return r
>
> Does anyone know if there is a built-in function in Numpy (or Numarray)
> that does the above task faster (computes the list of all permutations
> of a list, k)? Or is there a way to make the above function run faster
> using Numpy?
>
> I'm asking because I need to create a very large list which contains
> all permutations of range(12), in which case there would be 12!
> permutations. I created a file test.py:

Do you really need a *list* of all those permutations? Think about it:
12! is about 0.5 billion, which is about as much RAM as your machine has.
Each permutation is going to be a list taking 20 bytes of overhead plus 4
bytes per entry, so 68 bytes per permutation. You need 32 GB of RAM to
store that.

You probably want to just be able to access them in order, so a generator
is a better bet.
That way, you're only storing the current permutation instead of all of
them. Something like

def perm(k):
    k = tuple(k)
    lk = len(k)
    if lk <= 1:
        yield k
    else:
        for i in range(lk):
            s = k[:i] + k[i+1:]
            t = (k[i],)
            for x in perm(s):
                yield t + x

Then:

for p in perm(range(12)):
    print p

(I'm using tuples instead of lists as that gives better performance here.)

For n = 9, your code takes 9.4 s on my machine. The above takes 3 s, and
will scale with n (n = 12 should take 3 s * 10*11*12 = 1.1 h). Your original
code won't scale with n, as more and more time will be taken up reallocating
the list of permutations.

We can get fancier and unroll it a bit more:

def perm(k):
    k = tuple(k)
    lk = len(k)
    if lk <= 1:
        yield k
    elif lk == 2:
        yield k
        yield (k[1], k[0])
    elif lk == 3:
        k0, k1, k2 = k
        yield k
        yield (k0, k2, k1)
        yield (k1, k0, k2)
        yield (k1, k2, k0)
        yield (k2, k0, k1)
        yield (k2, k1, k0)
    else:
        for i in range(lk):
            s = k[:i] + k[i+1:]
            t = (k[i],)
            for x in perm(s):
                yield t + x

This takes 1.3 s for n = 9 on my machine.

Hope this helps.

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From kyeser at earthlink.net Wed Jul 28 17:18:46 2004
From: kyeser at earthlink.net (Hee-Seng Kye)
Date: Wed Jul 28 17:18:46 2004
Subject: [Numpy-discussion] Permutation in Numpy
In-Reply-To: <20040728230558.GA28651@arbutus.physics.mcmaster.ca>
References: <3DC9B4D2-DE2D-11D8-A7E1-000393479EE8@earthlink.net> <20040728230558.GA28651@arbutus.physics.mcmaster.ca>
Message-ID: <7B005A28-E0F4-11D8-A333-000393479EE8@earthlink.net>

Thank you so much for your suggestion! You are right that I only need to
access permutations of 12 in order, so your suggestion of using a generator
is perfect. In fact, I only need to access the first half of the
permutations of 12 that begin with 0 (12! / 12 / 2, about 20 million), so
the last code you offered would really speed things up. Thanks again.
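[Editorial note: a quick cross-check of the generator approach. The snippet
below restates the recursive tuple-yielding generator in a self-contained
form; the helper names are illustrative, not from the thread. The number of
tuples produced should equal n!:]

```python
# Self-contained restatement of the recursive tuple-yielding generator.
def perm(k):
    k = tuple(k)
    if len(k) <= 1:
        yield k
    else:
        for i in range(len(k)):
            rest = k[:i] + k[i + 1:]
            for tail in perm(rest):
                yield (k[i],) + tail

def nperms(n):
    # Count the permutations of range(n); should equal n! (factorial of n).
    return sum(1 for _ in perm(range(n)))
```

Note that for range(n) the first tuple yielded is always the identity
ordering, and all permutations beginning with 0 come first, which fits the
need to enumerate only the permutations of 12 that begin with 0.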
Best,
Kye

On Jul 28, 2004, at 7:05 PM, David M. Cooke wrote:

> On Sun, Jul 25, 2004 at 07:24:49AM -0400, Hee-Seng Kye wrote:
>> #perm.py
>> def perm(k):
>>     # Compute the list of all permutations of k
>>     if len(k) <= 1:
>>         return [k]
>>     r = []
>>     for i in range(len(k)):
>>         s = k[:i] + k[i+1:]
>>         p = perm(s)
>>         for x in p:
>>             r.append(k[i:i+1] + x)
>>     return r
>>
>> Does anyone know if there is a built-in function in Numpy (or
>> Numarray)
>> that does the above task faster (computes the list of all permutations
>> of a list, k)? Or is there a way to make the above function run
>> faster
>> using Numpy?
>>
>> I'm asking because I need to create a very large list which contains
>> all permutations of range(12), in which case there would be 12!
>> permutations. I created a file test.py:
>
> Do you really need a *list* of all those permutations? Think about it:
> 12! is about 0.5 billion, which is about as much RAM as your machine
> has. Each permutation is going to be a list taking 20 bytes of overhead
> plus 4 bytes per entry, so 68 bytes per permutation. You need 32 GB of
> RAM to store that.
>
> You probably want to just be able to access them in order, so a
> generator is a better bet. That way, you're only storing the current
> permutation instead of all of them. Something like
>
> def perm(k):
>     k = tuple(k)
>     lk = len(k)
>     if lk <= 1:
>         yield k
>     else:
>         for i in range(lk):
>             s = k[:i] + k[i+1:]
>             t = (k[i],)
>             for x in perm(s):
>                 yield t + x
>
> Then:
>
> for p in perm(range(12)):
>     print p
>
> (I'm using tuples instead of lists as that gives a better performance
> here.)
>
> For n = 9, your code takes 9.4 s on my machine. The above takes 3 s, and
> will scale with n (n=12 should take 3s * 10*11*12 = 1.1 h). Your
> original code won't scale with n, as more and more time will be taken up
> reallocating the list of permutations.
>
> We can get fancier and unroll it a bit more:
>
> def perm(k):
>     k = tuple(k)
>     lk = len(k)
>     if lk <= 1:
>         yield k
>     elif lk == 2:
>         yield k
>         yield (k[1], k[0])
>     elif lk == 3:
>         k0, k1, k2 = k
>         yield k
>         yield (k0, k2, k1)
>         yield (k1, k0, k2)
>         yield (k1, k2, k0)
>         yield (k2, k0, k1)
>         yield (k2, k1, k0)
>     else:
>         for i in range(lk):
>             s = k[:i] + k[i+1:]
>             t = (k[i],)
>             for x in perm(s):
>                 yield t + x
>
> This takes 1.3 s for n = 9 on my machine.
>
> Hope this helps.
>
> --
> |>|\/|<
> /--------------------------------------------------------------------------\
> |David M. Cooke
> http://arbutus.physics.mcmaster.ca/dmc/
> |cookedm at physics.mcmaster.ca
>
>
> -------------------------------------------------------
> This SF.Net email is sponsored by BEA Weblogic Workshop
> FREE Java Enterprise J2EE developer tools!
> Get your free copy of BEA WebLogic Workshop 8.1 today.
> http://ads.osdn.com/?ad_id=4721&alloc_id=10040&op=click
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>

From falted at pytables.org Thu Jul 29 02:17:04 2004
From: falted at pytables.org (Francesc Alted)
Date: Thu Jul 29 02:17:04 2004
Subject: FW: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated
In-Reply-To:
References:
Message-ID: <200407291116.33599.falted@pytables.org>

Hi Perry,

Well, after the bunch of messages discussing an *apparently* silly question,
I must say that I mostly agree with your last proposal. The only thing that
I strongly miss is that you have decided not to include the "titles"
parameter in the constructor and the corresponding attribute. In my opinion,
this would make it possible to forbid illegal names as field names while
still providing full access to all attributes in *all* the ways you
proposed. I think this is a different kind of metainformation from units,
display formats, etc.
A "titles" attribute is about providing functionality, not just adding
information. But, as you said, there will always be somebody not completely
satisfied ;)

Anyway, thanks for listening to all of us and putting some good sense into
all the mess that this discussion provoked.

Cheers,

-- Francesc Alted

From Chris.Barker at noaa.gov Thu Jul 29 12:01:05 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Thu Jul 29 12:01:05 2004
Subject: [Numpy-discussion] The value of a native Blas
Message-ID: <41094891.4040103@noaa.gov>

Hi all,

I think this is a nifty bit of trivia. After getting my nifty Apple Dual G5,
I finally got around to doing a test I had wanted to do for a while. The
Numeric package uses LAPACK for the Linear Algebra stuff. For OS-X there are
two binary versions available for easy install:

One linked against the default, non-optimized version of BLAS (from Jack
Jansen's PackMan database)

One linked against the Apple-supplied vec-lib as the BLAS (from Bob
Ippolito's PackMan database, http://undefined.org/python/pimp/)

To compare performance, I wrote a little script that generates a random
matrix and vector, A and b, and solves the equation Ax = b for x:

import time
import RandomArray
from LinearAlgebra import solve_linear_equations

N = 1000
a = RandomArray.uniform(-1000, 1000, (N,N) )
b = RandomArray.uniform(-1000, 1000, (N,) )
start = time.clock()
x = solve_linear_equations(a,b)
print "It took %f seconds to solve a %iX%i system"%( time.clock()-start, N, N)

And here are the results:

With the non-optimized version:
It took 3.410000 seconds to solve a 1000X1000 system
It took 28.260000 seconds to solve a 2000X2000 system

With vec-Lib:
It took 0.360000 seconds to solve a 1000X1000 system
It took 2.580000 seconds to solve a 2000X2000 system

for a speed increase of over 10 times! Wow! Thanks Bob, for providing that
package.

I'd be interested to see similar tests on other platforms; I haven't gotten
around to figuring out how to use a native BLAS on my Linux box.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

From rsilva at ime.usp.br Thu Jul 29 12:38:06 2004
From: rsilva at ime.usp.br (Paulo J. S. Silva)
Date: Thu Jul 29 12:38:06 2004
Subject: [Numpy-discussion] The value of a native Blas
In-Reply-To: <41094891.4040103@noaa.gov>
References: <41094891.4040103@noaa.gov>
Message-ID: <1091129395.29646.44.camel@catirina>

> I haven't
> gotten around to figuring out how to use a native BLAS on my Linux
> box.

At least on a Debian box, you can install native ATLAS libraries, and they
come with blas and lapack. For example, if I search for atlas3 packages, I
find the following atlas packages available:

atlas3-base
atlas3-3dnow
atlas3-sse
atlas3-sse2

Best,

Paulo

--
Paulo José da Silva e Silva
Professor Assistente do Dep. de Ciência da Computação
(Assistant Professor of the Computer Science Dept.)
Universidade de São Paulo - Brazil

e-mail: rsilva at ime.usp.br
Web: http://www.ime.usp.br/~rsilva

Teoria é o que não entendemos o     (Theory is something we don't)
suficiente para chamar de prática.  (understand well enough to call)
                                    (practice)

From stephen.walton at csun.edu Thu Jul 29 12:57:00 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Thu Jul 29 12:57:00 2004
Subject: [Numpy-discussion] The value of a native Blas
In-Reply-To: <41094891.4040103@noaa.gov>
References: <41094891.4040103@noaa.gov>
Message-ID: <1091130954.9805.78.camel@freyer.sfo.csun.edu>

On Thu, 2004-07-29 at 11:57, Chris Barker wrote:
> One linked against the Apple Supplied vec-lib as the BLAS. (From Bob
> Ippolito's PackMan database (http://undefined.org/python/pimp/)

Well, I'm a sucker for trying to increase performance :-). AMD's Web site
recommends ATLAS as the best source for an Athlon-optimized BLAS.
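[Editorial note: a sketch for anyone repeating the thread's timing test
today. It uses NumPy's numpy.linalg.solve as a stand-in for Numeric's
solve_linear_equations; numpy, default_rng, perf_counter, and the function
name time_solve are substitutions of mine, not what the thread used, and
absolute timings will of course differ:]

```python
import time

import numpy as np

def time_solve(n, seed=0):
    """Time solving a random n-by-n system Ax = b, like the thread's script."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(-1000, 1000, (n, n))
    b = rng.uniform(-1000, 1000, (n,))
    start = time.perf_counter()
    x = np.linalg.solve(a, b)
    elapsed = time.perf_counter() - start
    residual = float(np.max(np.abs(a @ x - b)))  # sanity check on the solution
    return elapsed, residual
```

Calling time_solve(1000) reports how long the BLAS/LAPACK in use needs for
the 1000x1000 case; running the same script against builds linked with
different BLAS libraries reproduces the comparison made in this thread.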
I happen to have ATLAS installed, and the time for Chris Barker's test went from 4.95 seconds to 0.91 seconds on a dual-Athlon MP 2200+ system. To build numarray 1.0 with this setup, I had to modify addons.py a bit, both to use LAPACK and ATLAS and because ATLAS was built here with the Absoft Fortran compiler version 8.2 (I haven't tried g77). Is anyone interested in this? -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From perry at stsci.edu Thu Jul 29 13:01:05 2004 From: perry at stsci.edu (Perry Greenfield) Date: Thu Jul 29 13:01:05 2004 Subject: [Numpy-discussion] The value of a native Blas In-Reply-To: <1091130954.9805.78.camel@freyer.sfo.csun.edu> Message-ID: On 7/29/04 3:55 PM, "Stephen Walton" wrote: > On Thu, 2004-07-29 at 11:57, Chris Barker wrote: > >> One linked against the Apple Supplied vec-lib as the BLAS. (From Bob >> Ippolito's PackMan database (http://undefined.org/python/pimp/) > > Well, I'm a sucker for trying to increase performance :-) . AMD's Web > site recommends ATLAS as the best source for an Athlon-optimized BLAS. > I happen to have ATLAS installed, and the time for Chris Barker's test > went from 4.95 seconds to 0.91 seconds on a dual-Athlon MP 2200+ system. > > To build numarray 1.0 with this setup, I had to modify addons.py a bit, > both to use LAPACK and ATLAS and because ATLAS was built here with the > Absoft Fortran compiler version 8.2 (I haven't tried g77). Is anyone > interested in this? Well, I guess we are :-) Let us know what you had to do to get it to work. Thanks, Perry From stephen.walton at csun.edu Thu Jul 29 13:28:07 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Thu Jul 29 13:28:07 2004 Subject: [Numpy-discussion] The value of a native Blas In-Reply-To: References: Message-ID: <1091132833.9805.133.camel@freyer.sfo.csun.edu> On Thu, 2004-07-29 at 13:00, Perry Greenfield wrote: > Well, I guess we are :-) Let us know what you had to do to get it to work. 
This is so Absoft-specific that I'm not sure how much it helps others, but
here goes:

I built LAPACK after modifying the make.inc.LINUX file to set the compiler
and linker to /opt/absoft/bin/f77 instead of to g77, and the compile flags
to "-O3 -YNO_CDEC". I ran "make config" in the ATLAS directory and told the
setup that /opt/absoft/bin/f77 was my Fortran compiler, then did "make
install arch=", then followed the scipy.org instructions to combine LAPACK
with the one from ATLAS. Finally, I applied the attached patch to addons.py
in the numarray directory.

Interestingly, the example program runs in 1.43 seconds on a 2.26GHz P4 with
the default numarray install (as opposed to 4.95 seconds on the Athlon). I
haven't built ATLAS on this platform yet to find how much of an improvement
I get.

I suppose something similar would work with g77, replacing the Absoft
libraries with g2c, but I haven't tried it.

--
Stephen Walton
Dept. of Physics & Astronomy, Cal State Northridge

-------------- next part --------------
A non-text attachment was scrubbed...
Name: addons.diff
Type: text/x-patch
Size: 879 bytes
Desc: addons.py diffs
URL:

From stephen.walton at csun.edu Thu Jul 29 13:38:05 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Thu Jul 29 13:38:05 2004
Subject: [Numpy-discussion] The value of a native Blas
In-Reply-To:
References:
Message-ID: <1091133445.9805.147.camel@freyer.sfo.csun.edu>

An addition to my previous post: I also had to do a "setenv USE_LAPACK" in
the shell before "python setup.py build" in the numarray directory.

[Admin question: I'm not seeing my own posts to this list, even though I'm
supposed to according to my Sourceforge preferences.]

From Chris.Barker at noaa.gov Thu Jul 29 15:01:07 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Thu Jul 29 15:01:07 2004
Subject: [Numpy-discussion] Building Numeric with a native blas ?
In-Reply-To: <1091133445.9805.147.camel@freyer.sfo.csun.edu>
References: <1091133445.9805.147.camel@freyer.sfo.csun.edu>
Message-ID: <410972BD.8080903@noaa.gov>

Hi all,

I decided I want to try to get this working on my Gentoo Linux box. I
started by emerging the Gentoo atlas package. Now I've gone into the Numeric
setup.py, and have gotten confused. These seem to be the relevant lines
(unchanged from how they came with Numeric 23.3):

# delete all but the first one in this list if using your own LAPACK/BLAS
sourcelist = [os.path.join('Src', 'lapack_litemodule.c'),
#             os.path.join('Src', 'blas_lite.c'),
#             os.path.join('Src', 'f2c_lite.c'),
#             os.path.join('Src', 'zlapack_lite.c'),
#             os.path.join('Src', 'dlapack_lite.c')

That's all well and good, except that they are all deleted except the first
one. And it looks like I don't want that one either.

              ]
# set these to use your own BLAS;
library_dirs_list = ['/usr/lib/atlas']
libraries_list = ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c']
# if you also set `use_dotblas` (see below), you'll need:
# ['lapack', 'cblas', 'f77blas', 'atlas', 'g2c']

This also seems to be set already. I don't have a '/usr/lib/atlas', so I
set:

library_dirs_list = []

All the libraries in libraries_list are in /usr/lib/.

include_dirs = ['/usr/include/atlas'] # You may need to set this to find cblas.h

cblas.h is in /usr/include/, so I set this to:

include_dirs = []

Now everything compiled and installed just fine, but when I try to use it,
I get:

  File "/usr/lib/python2.3/site-packages/Numeric/LinearAlgebra.py", line 8, in ?
    import lapack_lite
ImportError: dynamic module does not define init function (initlapack_lite)

So I tried adding

sourcelist = [os.path.join('Src', 'lapack_litemodule.c')]

back in. Now I can build and install, but get:

Traceback (most recent call last):
  File "./TestBlas.py", line 4, in ?
    from LinearAlgebra import *
  File "/usr/lib/python2.3/site-packages/Numeric/LinearAlgebra.py", line 8, in ?
    import lapack_lite
ImportError: /usr/lib/python2.3/site-packages/Numeric/lapack_lite.so: undefined symbol: dgesdd_

Now I'm stuck.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

From Chris.Barker at noaa.gov Thu Jul 29 15:26:09 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Thu Jul 29 15:26:09 2004
Subject: [Numpy-discussion] Building Numeric with a native blas ?
In-Reply-To: <410972BD.8080903@noaa.gov>
References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> <410972BD.8080903@noaa.gov>
Message-ID: <41097891.8080906@noaa.gov>

By the way, I get these same errors when compiling with the setup.py
unchanged from how it's distributed with Numeric 23.3:

> Traceback (most recent call last):
>   File "./TestBlas.py", line 4, in ?
>     from LinearAlgebra import *
>   File "/usr/lib/python2.3/site-packages/Numeric/LinearAlgebra.py", line
> 8, in ?
>     import lapack_lite
> ImportError: /usr/lib/python2.3/site-packages/Numeric/lapack_lite.so:
> undefined symbol: dgesdd_

So something's weird.

Stephen Walton wrote:
> one has to merge an LAPACK library built separately with the one
> generated by ATLAS to get a 'complete' LAPACK.

I'll try this, but it's odd that it didn't give an error when compiling or
linking.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

From stephen.walton at csun.edu Thu Jul 29 15:31:13 2004
From: stephen.walton at csun.edu (Stephen Walton)
Date: Thu Jul 29 15:31:13 2004
Subject: [Numpy-discussion] Building Numeric with a native blas ?
In-Reply-To: <41097891.8080906@noaa.gov>
References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> <410972BD.8080903@noaa.gov> <41097891.8080906@noaa.gov>
Message-ID: <1091140216.9805.381.camel@freyer.sfo.csun.edu>

On Thu, 2004-07-29 at 15:22, Chris Barker wrote:
> Stephen Walton wrote:
> > one has to merge an LAPACK library built separately with the one
> > generated by ATLAS to get a 'complete' LAPACK.
>
> I'll try this, but it's odd that it didn't give an error when compiling
> or linking.

(I neglected to CC the list on my response to Chris, but basically wrote
that changes similar to the ones I used for numarray worked in Numeric.)

Since Numeric and numarray are building shared libraries, undefined external
references don't show up until you actually import the Python package
represented by the shared libraries. I noticed this in my experiments as
well.

--
Stephen Walton
Dept. of Physics & Astronomy, Cal State Northridge

From Chris.Barker at noaa.gov Thu Jul 29 15:41:22 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Thu Jul 29 15:41:22 2004
Subject: [Numpy-discussion] Building Numeric with a native blas ?
In-Reply-To: <1091140216.9805.381.camel@freyer.sfo.csun.edu>
References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> <410972BD.8080903@noaa.gov> <41097891.8080906@noaa.gov> <1091140216.9805.381.camel@freyer.sfo.csun.edu>
Message-ID: <41097C0A.7090600@noaa.gov>

Stephen Walton wrote:
>>> one has to merge an LAPACK library built separately with the one
>>> generated by ATLAS to get a 'complete' LAPACK.
>>
>> I'll try this, but it's odd that it didn't give an error when compiling
>> or linking.

OK. I did an "emerge lapack" and got lapack installed, then rebuilt Numeric,
and now it works. What's odd is that before I installed lapack all the libs
were there, including liblapack. Anyway it works, so I'm happy.

One note, however: The setup.py delivered with 23.3 seems to be set up to
use a native lapack by default.
Will it work on a system that doesn't have one? -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From stephen.walton at csun.edu Thu Jul 29 16:21:01 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Thu Jul 29 16:21:01 2004 Subject: [Numpy-discussion] Building Numeric with a native blas ? In-Reply-To: <41097C0A.7090600@noaa.gov> References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> <410972BD.8080903@noaa.gov> <41097891.8080906@noaa.gov> <1091140216.9805.381.camel@freyer.sfo.csun.edu> <41097C0A.7090600@noaa.gov> Message-ID: <1091143210.9805.482.camel@freyer.sfo.csun.edu> On Thu, 2004-07-29 at 15:36, Chris Barker wrote: > The setup.py delivered with 23.3 seems to be set up to use a native > lapack by default. Will it work on a system that doesn't have one? No. On my system it fails with a complaint about not finding -llapack, since my ATLAS and LAPACK libraries are in /usr/local/lib/atlas, and the 23.3 setup.py looks in /usr/lib/atlas. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From cookedm at physics.mcmaster.ca Thu Jul 29 19:53:10 2004 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Jul 29 19:53:10 2004 Subject: [Numpy-discussion] Building Numeric with a native blas ? In-Reply-To: <41097C0A.7090600@noaa.gov> References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> <410972BD.8080903@noaa.gov> <41097891.8080906@noaa.gov> <1091140216.9805.381.camel@freyer.sfo.csun.edu> <41097C0A.7090600@noaa.gov> Message-ID: <20040730025254.GA26933@arbutus.physics.mcmaster.ca> On Thu, Jul 29, 2004 at 03:36:58PM -0700, Chris Barker wrote: > Stephen Walton wrote: > >>>one has to merge an LAPACK library built separately with the one > >>>generated by ATLAS to get a 'complete' LAPACK. 
> >> > >>I'll try this, but it's odd that it didn't give an error when compiling > >>or linking. > > OK. I did an "emerge lapack" and got lapack installed, then re-build > Numeric, and now it works. What's odd is that before I installed lapack > all the libs were there, including liblapack. Anyway it works, so I'm happy. Atlas might have installed a liblapack, with the (few) functions that it overrides with faster ones. It's by no means a complete LAPACK installation. Have a look at the difference in library sizes; a full LAPACK is a few megs; Atlas's routines are a few hundred K. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From Mailer-Daemon at rome.hostforweb.net Fri Jul 30 05:57:19 2004 From: Mailer-Daemon at rome.hostforweb.net (Mail Delivery System) Date: Fri Jul 30 05:57:19 2004 Subject: [Numpy-discussion] Mail delivery failed: returning message to sender Message-ID: This message was created automatically by mail delivery software. A message that you sent could not be delivered to one or more of its recipients. This is a permanent error. The following address(es) failed: camdisc at cambodia.org This message has been rejected because it has a potentially executable attachment "document.pif" This form of attachment has been used by recent viruses or other malware. If you meant to send this file then please package it up as a zip file and resend it. ------ This is a copy of the message, including all the headers. ------ From numpy-discussion at lists.sourceforge.net Fri Jul 30 08:56:42 2004 From: numpy-discussion at lists.sourceforge.net (numpy-discussion at lists.sourceforge.net) Date: Fri, 30 Jul 2004 14:56:42 +0200 Subject: Thanks! Message-ID: Your file is attached. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: document.pif
Type: application/octet-stream
Size: 17424 bytes
Desc: not available
URL:

From Chris.Barker at noaa.gov Fri Jul 30 09:33:03 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri Jul 30 09:33:03 2004
Subject: [Numpy-discussion] Building Numeric with a native blas ?
In-Reply-To: <20040730025254.GA26933@arbutus.physics.mcmaster.ca>
References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> <410972BD.8080903@noaa.gov> <41097891.8080906@noaa.gov> <1091140216.9805.381.camel@freyer.sfo.csun.edu> <41097C0A.7090600@noaa.gov> <20040730025254.GA26933@arbutus.physics.mcmaster.ca>
Message-ID: <410A7733.10408@noaa.gov>

David M. Cooke wrote:
> Atlas might have installed a liblapack, with the (few) functions that it
> overrides with faster ones. It's by no means a complete LAPACK
> installation. Have a look at the difference in library sizes; a full
> LAPACK is a few megs; Atlas's routines are a few hundred K.

OK, I'm really confused now. I got it working, but it seems to have
virtually identical performance to the Numeric-supplied lapack-lite.

I'm guessing that the LAPACK package I emerged does NOT use the atlas BLAS.

If the atlas liblapack doesn't have all of lapack, how in the world are you
supposed to use it? I have no idea how I would get the linker to get what it
can from the atlas lapack, and the rest from another one.

Has anyone done this on Gentoo? If not, how about another Linux distro? I
don't have to use portage for this, after all.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

From gerard.vermeulen at grenoble.cnrs.fr Fri Jul 30 10:01:34 2004
From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen)
Date: Fri Jul 30 10:01:34 2004
Subject: [Numpy-discussion] Building Numeric with a native blas ?
In-Reply-To: <410A7733.10408@noaa.gov>
References: <1091133445.9805.147.camel@freyer.sfo.csun.edu> <410972BD.8080903@noaa.gov> <41097891.8080906@noaa.gov> <1091140216.9805.381.camel@freyer.sfo.csun.edu> <41097C0A.7090600@noaa.gov> <20040730025254.GA26933@arbutus.physics.mcmaster.ca> <410A7733.10408@noaa.gov>
Message-ID: <20040730190021.67e1ffdd.gerard.vermeulen@grenoble.cnrs.fr>

On Fri, 30 Jul 2004 09:28:35 -0700 "Chris Barker" wrote:

> David M. Cooke wrote:
> > Atlas might have installed a liblapack, with the (few) functions that it
> > overrides with faster ones. It's by no means a complete LAPACK
> > installation. Have a look at the difference in library sizes; a full
> > LAPACK is a few megs; Atlas's routines are a few hundred K.
>
> OK, I'm really confused now. I got it working, but it seems to have
> virtually identical performance to the Numeric-supplied lapack-lite.
>
> I'm guessing that the LAPACK package I emerged does NOT use the atlas BLAS.
>
> If the atlas liblapack doesn't have all of lapack, how in the world are
> you supposed to use it? I have no idea how I would get the linker to take
> what it can from the atlas lapack and the rest from another one.
>
> Has anyone done this on Gentoo? If not, how about another Linux distro? I
> don't have to use portage for this, after all.
>

I am making my own ATLAS rpms, and basically I am doing the following
(starting from the ATLAS source directory, with LAPACK unpacked inside it):

# build lapack
# Note added right now: this assumes that LAPACK/make.inc has been patched
(cd LAPACK; make lapacklib)

# configuration: leave the blank lines in the 'here' document
# Note added right now: this is dependent on your CPU architecture
if [ $(hostname)=="zombie" ] ; then make config <

References: <41094891.4040103@noaa.gov>
Message-ID: <1091212658.1454.724.camel@catirina>

Hello,

I took some time today to run some benchmarks on different uses of
lapack on an Athlon Thunderbird 1.2 GHz.
Here it goes:

------
Vanilla numarray:
It took 9.970000 seconds to solve a 1000x1000 system

numarray, vanilla blas and lapack:
It took 7.010000 seconds to solve a 1000x1000 system

numarray, atlas blas and vanilla lapack:
It took 1.050000 seconds to solve a 1000x1000 system

numarray, atlas blas and lapack:
It took 0.760000 seconds to solve a 1000x1000 system
------

One nice touch is that matlab takes 1.3s to solve a system of the same
size with the A\b notation. Hence numarray is actually faster than
matlab at solving linear systems :-) I know, there is probably a way to
make matlab use the faster atlas library...

Paulo

--
Paulo José da Silva e Silva
Professor Assistente do Dep. de Ciência da Computação
(Assistant Professor of the Computer Science Dept.)
Universidade de São Paulo - Brazil

e-mail: rsilva at ime.usp.br    Web: http://www.ime.usp.br/~rsilva

Teoria é o que não entendemos o     (Theory is something we don't)
suficiente para chamar de prática.  (understand well enough to call)
                                    (practice)

From Chris.Barker at noaa.gov Fri Jul 30 13:15:06 2004
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri Jul 30 13:15:06 2004
Subject: [Numpy-discussion] Building Numeric with a native blas -- On Windows
Message-ID: <2592d825d632.25d6322592d8@hermes.nos.noaa.gov>

Hi all,

just to keep this thread moving--- I'm trying to get Numeric working
with a native lapack on Windows also. I know little enough about this
kind of thing on Linux, and I'm really out of my depth on Windows.

This is what I have done so far:

After much struggling, I got Numeric to compile using setup.py and
MS Visual Studio .NET 2003 (or whatever the heck it's called!)

It all seems to work fine with the included lapack-lite.

I downloaded and installed the demo version of the Intel Math Kernel
Library.
I set up various paths so that setup.py finds the libs, but now I get
linking errors:

unresolved external symbol _dgeev_ referenced in function
_lapack_lite_dgetrf

And a whole bunch of others, all corresponding to the various LAPACK
calls.

I am linking against Intel's mkl_c.lib, which is supposed to have
everything in it. Indeed, if I look in the lib file, I find, for example:

...evx._DGEEV._dgeev._DGB ...

so it looks like they are there, but perhaps referred to with only one
underscore, at the beginning, rather than one at each end.

Now I'm stuck.

I suppose I could use ATLAS, but it looked like it was going to take
some effort to compile that with MSVC.

Has anyone gotten a native BLAS working on Windows? If so, how?

Thanks,
Chris

From gerard.vermeulen at grenoble.cnrs.fr Fri Jul 30 15:04:10 2004
From: gerard.vermeulen at grenoble.cnrs.fr (gerard.vermeulen at grenoble.cnrs.fr)
Date: Fri Jul 30 15:04:10 2004
Subject: [Numpy-discussion] Building Numeric with a native blas -- On Windows
In-Reply-To: <2592d825d632.25d6322592d8@hermes.nos.noaa.gov>
References: <2592d825d632.25d6322592d8@hermes.nos.noaa.gov>
Message-ID: <20040730215031.M28229@grenoble.cnrs.fr>

On Fri, 30 Jul 2004 13:14:23 -0700, Chris Barker wrote
> Hi all,
>
> just to keep this thread moving--- I'm trying to get Numeric working
> with a native lapack on Windows also. I know little enough about this
> kind of thing on Linux, and I'm really out of my depth on Windows.
>
> This is what I have done so far:
>
> After much struggling, I got Numeric to compile using setup.py and
> MS Visual Studio .NET 2003 (or whatever the heck it's called!)
>
> It all seems to work fine with the included lapack-lite.
>
> I downloaded and installed the demo version of the Intel Math Kernel
> Library.
> I set up various paths so that setup.py finds the libs, but now
> I get linking errors:
>
> unresolved external symbol _dgeev_ referenced in function
> _lapack_lite_dgetrf
>
> And a whole bunch of others, all corresponding to the various LAPACK
> calls.
>
> I am linking against Intel's mkl_c.lib, which is supposed to have
> everything in it. Indeed, if I look in the lib file, I find, for example:
>
> ...evx._DGEEV._dgeev._DGB ...
>
> so it looks like they are there, but perhaps referred to with only one
> underscore, at the beginning, rather than one at each end.
>
> Now I'm stuck.
>
> I suppose I could use ATLAS, but it looked like it was going to take
> some effort to compile that with MSVC.
>
> Has anyone gotten a native BLAS working on Windows? If so, how?
>

In lapack_lite.c, you'll see:

#if defined(NO_APPEND_FORTRAN)
    lapack_lite_status__ = dgeev(&jobvl,&jobvr,&n,DDATA(a),&lda,
        DDATA(wr),DDATA(wi),DDATA(vl),&ldvl,DDATA(vr),&ldvr,
        DDATA(work),&lwork,&info);
#else
    lapack_lite_status__ = dgeev_(&jobvl,&jobvr,&n,DDATA(a),&lda,
        DDATA(wr),DDATA(wi),DDATA(vl),&ldvl,DDATA(vr),&ldvr,
        DDATA(work),&lwork,&info);
#endif

So, try to define NO_APPEND_FORTRAN. If that does not work, you can try
to prepend an underscore.

You can also try to rip the ATLAS and supposedly ATLAS-enhanced lapack
libraries out of scipy and build against those (not as good as
http://www.scipy.org/documentation/buildatlas4scipywin32.txt, but better
than nothing).

Gerard