[Python-Dev] Expose the array interface in Python 2.5?

Nick Coghlan ncoghlan at gmail.com
Fri Mar 17 10:58:32 CET 2006


Travis E. Oliphant wrote:
> Would it be possible to add at least the C-struct array interface to the 
> Python arrayobject in time for Python 2.5?

Do you mean simply adding an __array_shape__ attribute that consists of a 
tuple with the array length, and an __array_type__ attribute set to 'O'?

Or trying to expose the array object's data?

The former seems fairly pointless, and the latter difficult (since it has 
implications for moving the data store when the array gets resized).

> We would love any feedback from the Python community on the array 
> interface.  Especially because we'd like to see it in Python itself and 
> supported and used by every relevant Python package sooner rather than 
> later.

I've spent a fair bit of time looking at this interface, and while I'm a big
fan of the basic idea, I'm not convinced that it makes sense to
include the interface in the core without *also* adopting a common convention
for multi-dimensional fixed shape indexing (e.g. by introducing a simple
dimensioned array type as something like array.dimarray).

The fact that array.array is a mutable sequence rather than a fixed shape
array means that it doesn't mesh particularly well with the ideas behind the 
array interface. numpy arrays can have their shape changed via reshape, but 
they impose the rule that the total number of elements can't change so that 
the allocated memory doesn't need to be moved - the standard library's array 
type has no such limitation.
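A minimal sketch of the resizing point, assuming a current Python 3 interpreter (which postdates this message) and using only the standard library. It shows that array.array changes length freely, and that modern CPython refuses to resize an array while its buffer is exported through a memoryview. That refusal is one way of handling the "moving the data store" concern raised above. The variable names are illustrative only.

```python
from array import array

# array.array behaves as a mutable sequence: its length is not fixed.
a = array('d', [1.0, 2.0, 3.0])
length_before = len(a)

a.extend([4.0, 5.0])   # grows the array; the storage may be reallocated
del a[0]               # shrinks the array
length_after = len(a)

# Exposing the raw data pins the size: while a memoryview exports the
# buffer, CPython rejects any operation that would resize the array.
view = memoryview(a)
try:
    a.append(6.0)
    resize_blocked = False
except BufferError:
    resize_blocked = True
finally:
    view.release()

# Once the export is released, resizing works again.
a.append(6.0)
```

A fixed-shape array type avoids the problem entirely, since its storage never needs to move after creation.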

Aside from the obvious (the use of Ellipsis and permitting multiple
dimensions), there are a number of ways in which the semantics of numpy array
subscripts differ from normal sequence subscripts, and which of these should be
part of the common multi-dimensional indexing conventions needs to be thrashed
out in a PEP:

   - numpy array slices are views that permit mutation of the original object
     (slicing a sequence creates a copy of the sliced section)

   - assignment to slices is not allowed to change the shape of a numpy array
     (assigning to a slice of a normal sequence may change the total length)

   - deletion of slices is not permitted by numpy arrays
     (deleting a slice of a sequence changes the total length)

   - NewAxis is a novel use of subscript notation

   - there are sophisticated rules to try to align numpy array shapes

   - assignment of a sequence to a numpy array section is rather disconcerting,
     as the checks to determine what should and should not be repeated to fit
     into the available space are type based
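The first three differences in the list above can be sketched with the standard library alone. numpy is not imported here: a memoryview over a bytearray stands in as an object whose slices behave as fixed-size views, which is an assumption made to keep the example self-contained on a current Python 3. It is not a claim about how numpy implements its slicing.

```python
# Ordinary sequence semantics, shown with a list.
seq = [0, 1, 2, 3, 4, 5]
part = seq[1:3]            # slicing copies the section
part[0] = 99               # so writing to the copy leaves seq untouched
seq_after_copy_write = list(seq)

seq[1:3] = [7, 8, 9]       # slice assignment may change the length (6 -> 7)
del seq[0:2]               # slice deletion changes the length (7 -> 5)

# View semantics, shown with memoryview as a stand-in for an array view.
buf = bytearray(b"\x00\x01\x02\x03\x04\x05")
view = memoryview(buf)
window = view[1:3]         # slicing creates a view onto the same memory
window[0] = 99             # so the write is visible in buf

shape_change_rejected = False
try:
    view[1:3] = b"\x07\x08\x09"   # three bytes into a two-byte slot
except ValueError:
    shape_change_rejected = True

deletion_rejected = False
try:
    del view[0:2]          # views cannot shrink the underlying storage
except TypeError:
    deletion_rejected = True
```

Both halves use the same subscript notation, which is why a shared convention has to say explicitly which set of rules a standard library type would follow.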

For something in the standard library, much of the complexity should be
stripped out, with the clever bits of programmer convenience left for numpy to
provide. However, deciding which bits to remove and which to keep is a
non-trivial task.

Given that even the bytes type has been deferred to 2.6 to allow further 
consideration of the appropriate API, my vote is to do the same for an 
array.dimarray type and allow more time to figure out the appropriate *Python* 
interface.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

