[SciPy-dev] Notes from meeting with Guido regarding inclusion of array package in Python core

Thu Mar 10 14:41:58 EST 2005

Hi All,

On Thu, 2005-03-10 at 10:28 -0500, Perry Greenfield wrote:
<snip>
> c) One could provide the support at the Python level by instead relying 
> on the use of buffer objects by the extension at the C level, thus 
> avoiding any dependence on the array C api. So long as the extension 
> has the ability to return buffer objects containing the putative array 
> data to the Python level and the necessary meta information (in this 
> case, the shape, type, and other info, e.g., byteswapping, necessary to 
> properly interpret the array) to Python, the extension can provide its 
> own functions or methods to convert these buffer objects into arrays 
> without copying of the data in the buffer object. The extension can try 
> to import the array package, and if it is present, provide arrays as a 
> data format using this scheme. In many respects this is the most 
> attractive approach. It has no dependencies on include files, build 
> order, etc. This approach led to the suggestion that Python develop a 
> buffer object that could contain meta information, and a way of 
> supporting community conventions (e.g., a name attribute indicating 
> which convention was being used) to facilitate the interchange of any 
> sort of binary data, not just arrays. We also concluded that it would 
> be nice to be able to create buffer objects from Python with malloced 
> memory (currently one can only create buffer objects from other objects 
> that already have memory allocated; there is no way of creating newly 
> allocated, writable memory from Python within a buffer object; one can 
> create a buffer object from a string, but it is not writable).

I like this idea. It also provides a nice interface for writing classes
in Python that use C or C++ routines, where the buffer can be used to
store state information. I needed something like this for a Mersenne
Twister class and ended up using C++ with Boost.Python.
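
A minimal sketch of the buffer-plus-meta-information idea, written
against present-day Python rather than the 2005 buffer object. It
assumes a bytearray as the newly allocated writable memory and
memoryview.cast as the zero-copy reinterpretation. The names MetaBuffer,
new_meta_buffer, and as_array_view, and the keys of the meta dict, are
hypothetical and not taken from any proposal.

```python
import struct
from dataclasses import dataclass
from math import prod


@dataclass
class MetaBuffer:
    """Hypothetical pairing of raw writable memory with its meta information."""
    data: bytearray   # newly allocated, writable memory
    meta: dict        # shape, type, and the convention used to read `data`


def new_meta_buffer(shape, typecode="d"):
    """Allocate zero-filled memory for prod(shape) items of `typecode`."""
    nbytes = prod(shape) * struct.calcsize(typecode)
    meta = {"name": "array-sketch", "shape": tuple(shape), "typecode": typecode}
    return MetaBuffer(bytearray(nbytes), meta)


def as_array_view(buf):
    """Interpret the buffer as an N-d array of numbers without copying it."""
    return memoryview(buf.data).cast(buf.meta["typecode"], shape=buf.meta["shape"])


buf = new_meta_buffer((2, 3))
view = as_array_view(buf)

# Stand-in for an extension filling the memory: write element [1][2] directly
# into the underlying bytes (row-major offset = (1 * 3 + 2) * itemsize).
struct.pack_into("d", buf.data, (1 * 3 + 2) * struct.calcsize("d"), 3.5)

print(view.tolist())  # the view shares memory with buf.data, so it sees the write
```

Because the view is created before the write and still reflects it, the
data is shared rather than copied, which is the property the scheme
depends on.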

> 2) Scalar support, rank-0 and related. Travis and I agreed (we 
> certainly seek comments on this conclusion; we may have forgotten about 
> key arguments for one of the different approaches) that the 
> desirability of using rank-0 arrays as return values from single 
> element indexing depends on other factors, most importantly Python's 
> support for scalars in various aspects. This is a multifaceted issue 
> that will need to be determined by considering all the facets 
> simultaneously. The following tries to list the pros and cons 
> previously discussed for returning scalars (two cases previously 
> discussed) or rank-0 arrays (input welcomed).

I agree that this is a difficult problem and that it is hard to make a
decision. I think the simplest solution is to return rank-0 arrays. Once
people get used to this it should seem natural, and it would obviate all
the checking that currently goes on. Overhead is a concern, but I
suspect that the heavy computational lifting will be done by the array
routines, not Python code. However, it is important to make sure it is
easy to write extensions, so that things can be prototyped in Python and
then made efficient with an extension.

One further problem with this approach is that Python types need to be
cast to suitable types for the array. When working with, say, a
single-precision float array and multiplying by a Python 1.0, it would
be undesirable to return the result as a double. Using rank-0 arrays
preserves the type info, but for Python scalars perhaps we need some
simple typecast functions, so that one can write Int16(pythonScalar) and
get the appropriate scalar to use in a numeric routine. Perhaps a
special Numeric3 scalar type would be appropriate and would address the
efficiency problems.
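
The type-preservation concern can be sketched with the standard library
alone. The Float32 and Int16 classes below are hypothetical stand-ins
for typed scalars, not part of Numeric, numarray, or Numeric3. Rejecting
out-of-range values in Int16 is an assumption as well, since a cast that
wraps around would be an equally plausible design.

```python
import struct


class Float32:
    """Hypothetical single-precision scalar that keeps its type under arithmetic."""

    def __init__(self, value):
        # Round-trip through 4 bytes so the stored value has float32 precision.
        self.value = struct.unpack("f", struct.pack("f", float(value)))[0]

    def __mul__(self, other):
        # A plain Python float adopts the array-side type instead of
        # promoting the result to double.
        other_value = other.value if isinstance(other, Float32) else other
        return Float32(self.value * other_value)

    __rmul__ = __mul__

    def __repr__(self):
        return f"Float32({self.value!r})"


class Int16:
    """Hypothetical typecast: wrap a Python int as a 16-bit scalar."""

    def __init__(self, python_scalar):
        # struct.pack raises struct.error if the value does not fit in 16 bits.
        struct.pack("h", python_scalar)
        self.value = python_scalar


x = Float32(0.1)
y = x * 1.0   # Float32 * Python float
z = 1.0 * x   # Python float * Float32, handled by __rmul__
```

Both products stay Float32, so multiplying by a Python 1.0 does not
silently widen the result to a double.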

One other glitch I see with this approach is that numerix functions like
sin() currently take the place of the same functions in the math module.
It would be nice if the different behavior -- if there is any -- could
be made transparent.
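
One way to picture "transparent" here is a single sin() that behaves
like math.sin for scalars and maps over sequences otherwise. This is
only an illustration: the dispatch rule is an assumption, and a plain
list stands in for an array.

```python
import math
from collections.abc import Iterable


def sin(x):
    """Scalar in, float out; sequence in, list out (recursing into nesting)."""
    if isinstance(x, Iterable):
        return [sin(item) for item in x]
    return math.sin(x)


scalar_result = sin(0.0)                     # same result as math.sin(0.0)
sequence_result = sin([0.0, math.pi / 2])    # elementwise over the list
```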

> From the discussions it was clear that at least two Python PEPs need to 
> be written and implemented, but that these needed to wait until the 
> unification of the arrayobject takes place.
> 
> PEP 1:  Insertion of an __index__ special method and an as_index slot 
> (perhaps in the as_sequence methods) in the C-level typeobject  into 
> Python.
> 
> PEP 2:  Improvements on the buffer object and buffer builtin method so 
> that buffer objects can be Python-tracked wrappers around allocated 
> memory that extension packages can use and share.  Two extensions are 
> considered so far.  1) The buffer objects have a meta attribute so that 
> meta information can be passed around in a unified manner and 2) The 
> buffer builtin should take an integer giving the size of writeable 
> buffer object to create.
> 
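
The first proposal can be illustrated at the Python level with the
__index__ special method as it exists in current Python, which is an
assumption relative to this 2005 discussion. The Rank0Int class is a
hypothetical stand-in for any object that should be usable as an
integer index, such as a rank-0 integer array.

```python
class Rank0Int:
    """Hypothetical integer-like object, e.g. a rank-0 integer array."""

    def __init__(self, value):
        self.value = int(value)

    def __index__(self):
        # Called by Python whenever the object is used as a sequence index.
        return self.value


items = ["a", "b", "c", "d"]
picked = items[Rank0Int(2)]                 # plain indexing
sliced = items[Rank0Int(1):Rank0Int(3)]     # slice bounds also use __index__
```
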
Thanks for pursuing this. In the long run it will make Python a more
serviceable language for numerical work.

Chuck