[Numpy-discussion] Notes from meeting with Guido regarding inclusion of array package in Python core

Chris Barker Chris.Barker at noaa.gov
Thu Mar 10 11:26:29 EST 2005


Perry Greenfield wrote:
> So what about supporting arrays as an interchange format?

I'd like to see some kind of definition of what this means, or maybe a 
set of examples, to help clarify this discussion. I'll start with my 
personal example:

wxPython has a number of methods that can potentially deal with large 
datasets being passed between Python and C++. My personal example is 
drawing routines. For instance, drawing a large polyline or set of many 
points. When I need these, I invariably use NumPy arrays to store and 
manipulate the data in Python, then pass it in to wxPython to draw or 
whatever. Robin has created a set of functions, such as "wxPointListHelper",
that convert between Python sequences and the wxList of wxPoints that
are required by wx. Early on, only lists of tuples (for this example)
could be used. At some point, the Helper functions were extended (thanks 
to Tim Hochberg, I think) to use the generic sequence access methods so 
that Numeric arrays and other data structures could be used. This was 
fabulous, but at the moment, it is faster to pass in a list of tuples 
than it is to pass in an Nx2 Numeric array, and numarrays are much slower
still.

A long time ago I suggested that Robin add (with help from me and
others) Numeric-specific versions of wxPointListHelper and friends.
Robin declined, as he (quite reasonably) doesn't want a dependency on 
Numeric in wxPython. However, I still very much want wxPython to be able 
to work efficiently with numerix arrays.

I'm going to comment on the following in light of this example.

> a) So long as the extension package has access to the necessary array 
> include files, it can build the extension to use the arrays as a format 
> without actually having the array package installed. The
> extension would, when requested to use arrays would see if it could 
> import the array package, if not, then all use of arrays would result in 
> exceptions.

I'm not sure this is even necessary. In fact, in the above example, what 
would most likely happen is that the **Helper functions would check to 
see if the input object was an array, and then fork the code if it were. 
An array couldn't be passed in unless the package were there, so there 
would be no need for checking imports or raising exceptions.
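
The check-and-branch idea above can be sketched at the Python level. This is only an illustration, not how wxPython's helpers are written: `find_array_type` and `point_list_helper` are hypothetical names, and the modern `numpy` package with its `ndarray` type stands in for Numeric/numarray. The helper never imports the array package itself; it only looks in `sys.modules`, since an array cannot be passed in unless some caller has already imported the package.

```python
import sys


def find_array_type(package_name="numpy", type_name="ndarray"):
    """Return the array type if the package is already imported, else None.

    Nothing is imported here: if the package is absent from sys.modules,
    no object of its array type can exist in this process.
    """
    module = sys.modules.get(package_name)
    if module is None:
        return None
    candidate = getattr(module, type_name, None)
    return candidate if isinstance(candidate, type) else None


def point_list_helper(points, package_name="numpy"):
    """Convert a sequence of (x, y) pairs to a list of float tuples."""
    array_type = find_array_type(package_name)
    if array_type is not None and isinstance(points, array_type):
        # Array-specific branch: only reachable when the package is loaded.
        # tolist() converts the whole Nx2 array in one call.
        return [(float(x), float(y)) for x, y in points.tolist()]
    # Generic branch: plain sequence access, works for lists of tuples.
    return [(float(p[0]), float(p[1])) for p in points]
```

With no array package loaded, `point_list_helper([(1, 2), (3, 4)])` simply takes the generic branch, so the helper works the same whether or not the package is installed.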

> It could be built, and then later the array package could be 
> installed and no rebuilding would be necessary.

That is a great feature.

I'm concerned about the inclusion of all the headers in either the core 
or with the package, as that would lock you to a different upgrade cycle 
than the main numerix upgrade cycle. It's my experience that Numeric has 
not been binary compatible across versions.

> b) One could modify the extension build process to see if the package is 
> installed and the include files are available, if so, it is built with 
> the support, otherwise not. The disadvantage is that later adding the
> array package requires the extension to be rebuilt.

This is a very big deal as most users on Windows and OS-X (and maybe 
even Linux) don't build packages themselves.

A while back this was discussed on this very list, and it seemed like 
there was some idea about including not the whole numerix header 
package, but just the code for PyArray_Check or an equivalent. This 
would allow code to check if an input object was an array, and do 
something special if it was. That array-specific code would only get run 
if an array was passed in, so you'd know numerix was installed at run 
time. This would require numerix to be installed at build time, but it
would be optional at run time. I like this, because anyone capable of 
building wxPython (it can be tricky) is capable of installing Numeric, 
but folks that are using binaries don't need to know anything about it.

This would only really work for extensions that use arrays, but don't 
create them. We'd still have the version mismatch problem too.

> c) One could provide the support at the Python level by instead relying 
> on the use of buffer objects by the extension at the C level, thus 
> avoiding any dependence on the array C api.

This sounds great, but is a little beyond me technically.
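
For what it's worth, a rough Python-level picture of option c) might look like the following. It uses `memoryview`, which postdates this discussion, as a stand-in for buffer objects, and the standard-library `array` module as a stand-in for an array package. `points_from_buffer` is a hypothetical name, and the layout (contiguous native doubles ordered x0, y0, x1, y1, ...) is an assumption of this sketch: a raw buffer carries no shape information here, so the consumer has to assume it or be told.

```python
from array import array


def points_from_buffer(obj):
    """Read (x, y) pairs from any object that exports a contiguous buffer.

    Assumes the bytes are native C doubles laid out x0, y0, x1, y1, ...
    The consumer never imports or names the producer's array package.
    """
    # View the raw bytes, then reinterpret them as doubles without copying.
    doubles = memoryview(obj).cast("B").cast("d")
    if len(doubles) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    return [(doubles[i], doubles[i + 1]) for i in range(0, len(doubles), 2)]


coords = array("d", [0.0, 1.0, 2.5, 3.5])
print(points_from_buffer(coords))
```

The same function accepts a `bytes` object holding the same data, which is the appeal of the approach: the extension depends only on the buffer interface, not on any array C API.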

> c) return rank-0 array
> 

> Particularly with regard to ieee exception handling

major pro here for me!

> Guido was very receptive to 
> supporting a special method, __index__ which would allow any Python 
> object to be used as an index to a sequence or mapping object.

yeah!
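
A small sketch of what this allows: the special method was later added to Python as `__index__` (PEP 357, Python 2.5). `RankZero` below is a hypothetical stand-in for a rank-0 integer array, not a class from any array package, and the example only exercises sequence indexing and slicing.

```python
import operator


class RankZero:
    """Hypothetical stand-in for a rank-0 integer array."""

    def __init__(self, value):
        self.value = int(value)

    def __index__(self):
        # Called by Python whenever this object is used as an index.
        return self.value


names = ["x", "y", "z", "w"]
second = names[RankZero(1)]            # plain indexing
tail = names[RankZero(2):]             # slicing
as_int = operator.index(RankZero(3))   # explicit conversion
```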

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                     		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov



