[Numpy-discussion] Notes from meeting with Guido regarding inclusion of array package in Python core
Chris Barker
Chris.Barker at noaa.gov
Thu Mar 10 11:26:29 EST 2005
Perry Greenfield wrote:
> So what about supporting arrays as an interchange format?
I'd like to see some kind of definition of what this means, or maybe a
set of examples, to help clarify this discussion. I'll start with my
personal example:
wxPython has a number of methods that can potentially deal with large
datasets being passed between Python and C++. My personal example is
drawing routines: for instance, drawing a large polyline or a set of
many points. When I need these, I invariably use NumPy arrays to store
and manipulate the data in Python, then pass them in to wxPython to draw
or whatever. Robin has created a set of functions, such as
"wxPointListHelper", that convert between Python sequences and the
wxList of wxPoints that wx requires. Early on, only lists of tuples (for
this example) could be used. At some point, the Helper functions were
extended (thanks to Tim Hochberg, I think) to use the generic sequence
access methods so that Numeric arrays and other data structures could be
used. This was fabulous, but at the moment it is faster to pass in a
list of tuples than an Nx2 Numeric array, and numarrays are much slower
still.
A long time ago I suggested that Robin add (with help from me and
others) Numeric-specific versions of wxPointListHelper and friends.
Robin declined, as he (quite reasonably) doesn't want a dependency on
Numeric in wxPython. However, I still very much want wxPython to be able
to work efficiently with numerix arrays.
I'm going to comment on the following in light of this example.
> a) So long as the extension package has access to the necessary array
> include files, it can build the extension to use the arrays as a format
> without actually having the array package installed.
> The extension would, when requested to use arrays, see if it could
> import the array package; if not, then all use of arrays would result
> in exceptions.
I'm not sure this is even necessary. In fact, in the above example, what
would most likely happen is that the **Helper functions would check to
see if the input object was an array, and take a separate code path if
it was. An array couldn't be passed in unless the package were there, so
there would be no need for checking imports or raising exceptions.
> It could be built, and then later the array package could be
> installed and no rebuilding would be necessary.
That is a great feature.
I'm concerned about the inclusion of all the headers in either the core
or with the package, as that would lock you to a different upgrade cycle
than the main numerix upgrade cycle. It's my experience that Numeric has
not been binary compatible across versions.
> b) One could modify the extension build process to see if the package is
> installed and the include files are available; if so, it is built with
> the support, otherwise not. The disadvantage is that later adding the
> array package requires the extension to be rebuilt.
This is a very big deal as most users on Windows and OS-X (and maybe
even Linux) don't build packages themselves.
A while back this was discussed on this very list, and it seemed like
there was some idea about including not the whole numerix header
package, but just the code for PyArray_Check or an equivalent. This
would allow code to check if an input object was an array, and do
something special if it was. That array-specific code would only get run
if an array was passed in, so you'd know numerix was installed at run
time. This would require numerix to be installed at build time, but it
would be optional at run time. I like this, because anyone capable of
building wxPython (it can be tricky) is capable of installing Numeric,
but folks that are using binaries don't need to know anything about it.
This would only really work for extensions that use arrays, but don't
create them. We'd still have the version mismatch problem too.
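
A minimal sketch of that check-and-branch idea, written in Python rather
than the C++ the real helpers use. The names `is_array` and
`point_list_helper` are hypothetical, and modern `numpy` stands in for
Numeric/numarray; none of this is wxPython's actual code. The check looks
the array package up in `sys.modules` instead of importing it, which
mirrors the point above: an array can only be passed in if the caller has
already loaded the package.

```python
import sys


def is_array(obj):
    """Rough stand-in for PyArray_Check that never imports the array package."""
    # Assumption: numpy plays the role of Numeric here. If the caller has
    # not imported it, no array object can exist, so the answer is False.
    numpy = sys.modules.get("numpy")
    return numpy is not None and isinstance(obj, numpy.ndarray)


def point_list_helper(points):
    """Convert a sequence of (x, y) pairs into a list of 2-tuples."""
    if is_array(points):
        # Array-specific path: validate the shape once, convert in bulk.
        if points.ndim != 2 or points.shape[1] != 2:
            raise ValueError("expected an Nx2 array")
        return [tuple(row) for row in points.tolist()]
    # Generic path: works for any sequence of indexable pairs.
    return [(p[0], p[1]) for p in points]
```

With only lists of tuples in play, the array branch is never reached and
the array package is never touched, so it stays optional at run time.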
> c) One could provide the support at the Python level by instead relying
> on the use of buffer objects by the extension at the C level, thus
> avoiding any dependence on the array C api.
This sounds great, but is a little beyond me technically.
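
For what it's worth, here is a rough Python-level sketch of what option
(c) might look like from the consumer's side. It assumes the data arrives
as a flat, 1-D buffer of C doubles with x and y interleaved; that layout,
and the name `points_from_buffer`, are illustrative assumptions. The
modern `memoryview` type stands in for the buffer objects of the time,
and the stdlib `array` module plays the role of the producer, so no array
package is involved at all.

```python
from array import array


def points_from_buffer(obj):
    """Read interleaved (x, y) doubles from any object exposing a buffer."""
    with memoryview(obj) as view:
        # Only a flat buffer of C doubles is accepted in this sketch.
        if view.format != "d" or view.ndim != 1:
            raise TypeError("expected a 1-D buffer of C doubles")
        if len(view) % 2:
            raise ValueError("expected an even number of values (x, y pairs)")
        values = view.tolist()
    # Pair up x0, y0, x1, y1, ... into (x, y) tuples.
    return list(zip(values[0::2], values[1::2]))


# Any buffer provider works; here the stdlib array module supplies one.
polyline = points_from_buffer(array("d", [0.0, 1.0, 2.0, 3.0]))
```

The consumer depends only on the buffer interface, not on any particular
array type, which is the attraction of this option.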
> c) return rank-0 array
>
> Particularly with regard to ieee exception handling
major pro here for me!
> Guido was very receptive to
> supporting a special method, __index__ which would allow any Python
> object to be used as an index to a sequence or mapping object.
yeah!
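
A small sketch of what such a method would allow, assuming it behaves the
way `__index__` does in current Python. The `RankZero` class is a
hypothetical stand-in for a rank-0 integer array, not anything from
Numeric or numarray.

```python
class RankZero:
    """Hypothetical stand-in for a rank-0 integer array."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Tell Python which plain integer this object represents
        # when it is used as an index or slice bound.
        return int(self.value)


items = ["a", "b", "c", "d"]
picked = items[RankZero(2)]              # indexing with a non-int object
window = items[RankZero(1):RankZero(3)]  # slice bounds work the same way
```

Without `__index__`, both lookups would raise a TypeError, and the caller
would have to convert to int by hand.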
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov