[Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays

Guido van Rossum guido@python.org
Fri, 08 Dec 2000 11:10:50 -0500


> What is the status of PEP 209?  I see David Ascher is the champion of
> this PEP, but nothing has been written up.  Is the intention of this
> PEP to make the current Numeric a built-in feature of Python or to
> re-implement and replace the current Numeric module?

David has already explained why his name is on it -- basically,
David's name is on several PEPs but he doesn't currently have any time
to work on these, so other volunteers are most welcome to join.

It is my understanding that the current Numeric is sufficiently messy
in implementation and controversial in semantics that it would not be
a good basis to start from.

However, I do think that a basic multi-dimensional array object would
be a welcome addition to core Python.

> The reason that I ask these questions is because I'm working on a
> prototype of a new N-dimensional Array module which I call Numeric 2.
> This new module will be much more extensible than the current Numeric.
> For example, new array types and universal functions can be loaded or
> imported on demand.  We also intend to implement a record (or
> C-structure) type, because 1-D arrays or lists of records are a common
> data structure for storing photon events in astronomy and related
> fields.

I'm not familiar with the use of computers in astronomy and related
fields, so I'll take your word for that! :-)
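
For readers unfamiliar with the record layout being described, here is a
minimal sketch using the standard struct module. The field layout (arrival
time, x/y detector position, energy channel) and the names EVENT,
pack_events and unpack_events are assumptions made for illustration; they
do not come from Numeric 2. The point is only that a packed, non-native
byte order record can be decoded one record at a time rather than by
building byte-swapped temporary arrays first.

```python
import struct

# Hypothetical photon-event record: arrival time (float64), detector
# x and y position (int16 each), and energy channel (uint16).
# ">" selects big-endian byte order with no alignment padding.
EVENT = struct.Struct(">dhhH")


def pack_events(events):
    """Pack (time, x, y, channel) tuples into one contiguous bytes object."""
    return b"".join(EVENT.pack(*ev) for ev in events)


def unpack_events(raw):
    """Decode every record in raw; byte order is handled per record."""
    return list(EVENT.iter_unpack(raw))


events = [(0.5, 10, -3, 512), (1.25, 11, 7, 498)]
raw = pack_events(events)
decoded = unpack_events(raw)
```

Each record occupies EVENT.size bytes with no padding, so the unaligned
case mentioned in the quoted text shows up here as a float64 that starts
at an offset that is not a multiple of 8 for every record after the first.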

> The current Numeric does not handle record types efficiently,
> particularly when the data type is not aligned and is in non-native
> endian format.  To handle such data, temporary arrays must be created
> and alignment and byte-swapping done on them.  Numeric 2 does such
> pre- and post-processing inside the innermost loop, which is more
> efficient in both time and memory.  It also does type conversion at
> this level which is consistent with that proposed for PEP 208.
> 
> Since many scientific users would like direct access to the array data
> via C pointers, we have investigated using the buffer object.  We have
> not had much success with it, because of its implementation.  I have
> scanned the python-dev mailing list for discussions of this issue and
> found that it now appears to be deprecated.

Indeed.  I think it's best to leave the buffer object out of your
implementation plans.  There are several problems with it, and one of
the backburner projects is to redesign it to be much more to the point
(providing less, not more functionality).

> My opinion on this is that a new _fundamental_ built-in type should be
> created for memory allocation with features and an interface similar
> to the _mmap_ object.  I'll call this a _malloc_ object.  This would
> allow Numeric 2 to use either object interchangeably depending on the
> circumstance.  The _string_ type could also benefit from this new
> object by using a read-only version of it.  Since it's an object, its
> memory area should be safe from inadvertent deletion.

Interesting.  I'm actually not sufficiently familiar with mmap to
comment.  But would the existing array module's array object be at all
useful?  You can get to the raw bytes in C (using the C buffer API,
which is not deprecated) and it is extensible.
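
As a rough illustration of the array module suggestion, the sketch below
reaches an array object's raw bytes from Python. It assumes a modern
Python 3 interpreter and uses memoryview, which did not exist when this
message was written; from C, the equivalent access goes through the
buffer API mentioned above.

```python
from array import array

a = array("d", [1.0, 2.0, 3.0])

# memoryview exposes the array's storage through the buffer protocol
# without copying it.
view = memoryview(a)
view[1] = 20.0            # writes straight into the array's memory
raw = view.cast("B")      # the same memory viewed as unsigned bytes
nbytes = len(raw)

# bytes objects are immutable, so a view over them is read-only,
# much like the read-only string type discussed in this thread.
view_ro = memoryview(bytes(a))
```

The writable view plays the role of the mutable memory area asked for in
the quoted text, while the view over bytes corresponds to the read-only
case.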

> Because of these and other new features in Numeric 2, I have a keen
> interest in the status of PEPs 207, 208, 211, 225, and 228; and also
> in the proposed buffer object.  

Here are some quick comments on the mentioned PEPs.

207: Rich Comparisons.  This will go into Python 2.1.  (I just
finished the first draft of the PEP, please read it and comment.)
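
To show why rich comparisons matter for an array type, here is a minimal
sketch in which each comparison operator is overloaded separately and
returns one result per element instead of a single truth value. The Vec
class is a toy written for this illustration; it is not Numeric's API, and
the method names follow the per-operator hooks as they exist in current
Python.

```python
import operator


class Vec:
    """Toy 1-D array used only for illustration."""

    def __init__(self, values):
        self.values = list(values)

    def _compare(self, other, op):
        # Accept another Vec of equal length, or a scalar.
        if isinstance(other, Vec):
            if len(other.values) != len(self.values):
                raise ValueError("length mismatch")
            pairs = zip(self.values, other.values)
        else:
            pairs = ((v, other) for v in self.values)
        return [op(a, b) for a, b in pairs]

    def __lt__(self, other):
        return self._compare(other, operator.lt)

    def __eq__(self, other):
        return self._compare(other, operator.eq)


less_than_three = Vec([1, 5, 3]) < 3
same = Vec([1, 2]) == Vec([1, 3])
```

With a single three-way comparison hook, neither an elementwise result nor
a different implementation for each operator can be expressed.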

208: Reworking the Coercion Model.  This will go into Python 2.1.
Neil Schemenauer has mostly finished the patches already.  Please
comment.

211: Adding New Linear Algebra Operators (Greg Wilson).  This is
unlikely to go into Python 2.1.  I don't like the idea much.  If you
disagree, please let me know!  (Also, a choice has to be made between
211 and 225; I don't want to accept both, so until 225 is rejected,
211 is in limbo.)

225: Elementwise/Objectwise Operators (Zhu, Lielens).  This will
definitely not go into Python 2.1.  It adds too many new operators.

228: Reworking Python's Numeric Model.  This is a total pie-in-the-sky
PEP, and this kind of change is not likely to happen before Python
3000.

> I'm willing to implement this new _malloc_ object if members of the
> python-dev list are in agreement.  Actually I see no alternative,
> given the current design of Numeric 2, since the Array class will
> initially be written completely in Python and will need a mutable
> memory buffer, while the _string_ type is meant to be a read-only
> object.

Would you be willing to take over authorship of PEP 209?  David Ascher
and the Numeric Python community will thank you.

--Guido van Rossum (home page: http://www.python.org/~guido/)