[Numpy-discussion] __array_interface__ questions

Matthias Baas matthias.baas at gmail.com
Sun Apr 20 08:33:24 EDT 2008


Hi,

I would like to make use of the __array_interface__ to be able to get 
large data sets into and out of C/C++ functions without having to touch 
each item individually by Python code.
I was reading through the __array_interface__ description and now I have 
a couple of questions to clarify things:

- There are two attributes, __array_interface__ and __array_struct__, 
and the spec says that an array has to implement only one of the two and 
an object processing an array likewise only has to check for one of those
attributes. But this means both sides can be compliant with the spec and
still not be usable with each other. Or is this meant to be two
separate specs that just serve the same purpose?

- There is a version attribute specifying the version of the interface. 
If I support version 3, what am I supposed to do when I encounter a 
higher version number? Is it always guaranteed that higher versions are 
backwards compatible? (the last sentence in the description seems to 
suggest that but can this really be guaranteed?). Shouldn't there be 
something like a major and minor version where different major versions 
are incompatible with each other? Or maybe a current version number and a
compatibility version number or something like that?
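
To make the question concrete, this is roughly the check I would write
today. It is only a sketch: SUPPORTED_VERSION and check_interface_version
are names I made up, and whether a higher version may be accepted at all
(the allow_newer switch) is exactly the part I am unsure about.

```python
SUPPORTED_VERSION = 3  # hypothetical: the version this consumer was written for


def check_interface_version(obj, allow_newer=False):
    """Return obj.__array_interface__ if its version is acceptable.

    Raises ValueError when the version is missing, older than
    SUPPORTED_VERSION, or newer while allow_newer is False.
    """
    iface = obj.__array_interface__
    version = iface.get("version")
    if version is None:
        raise ValueError("__array_interface__ has no 'version' entry")
    if version < SUPPORTED_VERSION:
        raise ValueError(
            "version %d is older than %d" % (version, SUPPORTED_VERSION))
    if version > SUPPORTED_VERSION and not allow_newer:
        # Assumption: without a compatibility guarantee, refuse newer versions.
        raise ValueError(
            "version %d is newer than %d" % (version, SUPPORTED_VERSION))
    return iface
```

With allow_newer=True the consumer trusts that newer versions only add
optional entries; with the default it fails early instead.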

- There are several variants of what the "data" attribute can look like,
and in some cases it is necessary to use the Python buffer interface. Is
there an example of how to use that interface (as I haven't done this
before)? Or even better, is there a reference implementation for module
authors that they could just use and that provides a simple-to-use API
for obtaining the internal buffer pointer and layout information given 
an arbitrary Python object? I realized that writing a correct 
implementation of this myself is not really a trivial task (mainly 
because of all the variations that are allowed by the spec).

- Is there anything particular I have to know to make sure my module 
also works on 64-bit systems? What types should I use/assume when reading
the pointer from the "data" tuple?
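
For reference, this is how I would currently read the pointer from Python
with ctypes. It is a sketch under the assumption that "data" is the
(integer address, read-only flag) tuple; data_pointer is my own
hypothetical helper, and the dict below is a hand-built stand-in rather
than one taken from a real array.

```python
import ctypes


def data_pointer(iface):
    """Return (c_void_p, read_only) for a tuple-style "data" entry."""
    data = iface.get("data")
    if not isinstance(data, tuple) or len(data) != 2:
        raise TypeError('expected "data" to be an (address, read_only) tuple')
    address, read_only = data
    # c_void_p is as wide as a native pointer (8 bytes on 64-bit builds),
    # so the address is not truncated the way a C int could be.
    return ctypes.c_void_p(address), bool(read_only)


# Stand-in for a real array: four doubles owned by a ctypes buffer.
buf = (ctypes.c_double * 4)(1.0, 2.0, 3.0, 4.0)
iface = {"version": 3, "shape": (4,),
         "data": (ctypes.addressof(buf), False)}

ptr, read_only = data_pointer(iface)
# Reinterpret the untyped pointer as double* to read the items.
doubles = ctypes.cast(ptr, ctypes.POINTER(ctypes.c_double))
third = doubles[2]
```

On the C side I would assume the matching type is void* (or an integer
type as wide as a pointer), never int or long.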

- I had a look at PIL 1.1.6 and the __array_interface__ dict that the 
image object provides. The first thing I noticed is that it doesn't have 
a "version" attribute even though the spec says this one is required. 
The second thing is that the "data" attribute is a string, so I would 
have to use the buffer interface to access the data. Is there a way to 
get at the data pointer from Python so that I can pass it to a C 
function via ctypes?
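
The only approach I have found so far is sketched below. It assumes the
"data" entry is an immutable byte string (bytes on Python 3) and uses a
made-up stand-in instead of a real PIL image; address_of_bytes is a
hypothetical helper name. The string has to stay alive for as long as
the address is in use, and the memory must be treated as read-only.

```python
import ctypes


def address_of_bytes(data):
    """Return the address of the internal buffer of a bytes object."""
    if not isinstance(data, bytes):
        raise TypeError("expected a bytes object")
    # c_char_p points at the bytes object's own storage (no copy is made);
    # casting to c_void_p exposes that pointer as a plain integer.
    return ctypes.cast(ctypes.c_char_p(data), ctypes.c_void_p).value


pixels = b"\x00\x10\x20\x30"  # stand-in for the "data" string of an image
address = address_of_bytes(pixels)

# Read the bytes back through the raw address to check the round trip.
copy = ctypes.string_at(address, len(pixels))
```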

- Using the array interface for reading data is fine, but what is the 
recommended way of writing data into a buffer when the number of items 
is not known in advance? (so what would be the equivalent of the C++ 
vector::push_back() call?)
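
The best I have come up with is to collect the items in a growable
container first and only describe the memory afterwards. A rough sketch,
assuming the items are C doubles and using the standard array module;
GrowableDoubles and push_back are names I invented:

```python
import array
import ctypes
import sys


class GrowableDoubles:
    """Hypothetical append-only container of C doubles."""

    def __init__(self):
        self._items = array.array("d")

    def push_back(self, value):
        # The underlying storage may move when the array grows.
        self._items.append(value)

    @property
    def __array_interface__(self):
        # Built on demand, because the address can change after an append.
        address, count = self._items.buffer_info()
        order = "<" if sys.byteorder == "little" else ">"
        return {
            "version": 3,
            "shape": (count,),
            "typestr": "%sf%d" % (order, self._items.itemsize),
            "data": (address, False),
        }


values = GrowableDoubles()
for x in (1.5, 2.5, 3.5):
    values.push_back(x)

iface = values.__array_interface__
ptr = ctypes.c_void_p(iface["data"][0])
first = ctypes.cast(ptr, ctypes.POINTER(ctypes.c_double))[0]
```

The drawback is that any dict obtained before a later push_back may hold
a stale address, so it has to be fetched again after the last append.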

- What's the status on getting this interface into the Python core?


Thanks,

- Matthias -



