[Numpy-discussion] Questions about the array interface.
Chris Barker
Chris.Barker at noaa.gov
Wed Apr 6 23:36:36 EDT 2005
Travis Oliphant wrote:
> You should account for the '<' or '>' that might be present in
> __array_typestr__ (Numeric won't put it there, but scipy.base and
> numarray will---since they can have byteswapped arrays internally).
Good point, but a pain. Maybe they should be required, that way I don't
have to first check for the presence of '<' or '>', then check if they
have the right value.
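
A minimal sketch of the two-step check described above, in Python. The helper names (split_typestr, is_native) are hypothetical, and treating a missing marker as native byte order is an assumption of this sketch, not something the interface guarantees.

```python
import sys

# Byte-order marker that matches the machine running this code.
NATIVE = '<' if sys.byteorder == 'little' else '>'

def split_typestr(typestr):
    """Split a typestr such as 'i4' or '>f8' into (byteorder, kind, itemsize)."""
    if typestr[0] in ('<', '>'):
        order, rest = typestr[0], typestr[1:]
    else:
        # No marker present: assume native order (assumption of this sketch).
        order, rest = NATIVE, typestr
    return order, rest[0], int(rest[1:])

def is_native(typestr):
    """True if the data described by typestr needs no byte swapping here."""
    return split_typestr(typestr)[0] == NATIVE
```

If the marker were required, the first branch would always be taken and the fallback could go away.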
> A more generic interface would handle multiple integer types if possible
I'd like to support doubles as well...
> (but this is a good start...)
Right. I want to get _something_ working, before I try to make it universal!
> I think one idea here is that if __array_strides__ returns None, then
> C-style contiguousness is assumed. In fact, I like that idea so much
> that I just changed the interface. Thanks for the suggestion.
You're welcome. I like that too.
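
For what it's worth, here is a rough sketch of what "None means C-style contiguous" could look like on the consumer side. It assumes the object also exposes its shape as __array_shape__ (not mentioned above, so treat that as an assumption), and the function names are made up for illustration.

```python
def c_contiguous_strides(shape, itemsize):
    """Byte strides of a C-ordered (last index varies fastest) array."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))

def effective_strides(obj, itemsize):
    """Return the object's strides, filling in the C-contiguous default."""
    strides = getattr(obj, '__array_strides__', None)
    if strides is None:
        # Missing or None: assume C-style contiguous data.
        # __array_shape__ is assumed to exist on the object.
        return c_contiguous_strides(obj.__array_shape__, itemsize)
    return tuple(strides)
```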
> No, they won't always be there for SciPy arrays (currently 4 of them
> are). Only record-arrays will provide __array_descr__ for example and
> __array_offset__ is unnecessary for SciPy arrays. I actually don't much
> like the __array_offset__ parameter myself, but Scott convinced me that
> it could be useful for very complicated array classes.
I can see that it would, but then, we're stuck with checking for all
these optional attributes. If I don't bother to check for it, one day,
someone is going to pass a weird array in with an offset, and a strange
bug will show up.
> e.g. ndarray.cint (gives 'iX' on the correct platform).
> For now, I would check (__array_typestr__ == 'i%d' %
> array.array('i',[0]).itemsize)
I can see that that would work, but it does feel like a hack. Besides, I
might be doing this in C++ anyway, so it would probably be easier to use
sizeof().
> But, on most platforms these days an int is 4 bytes, but the above would
> be just to make sure.
Right. Making that assumption will just lead to weird bugs way down the
line. Of course, I wouldn't be surprised if wxWidgets and/or python
makes that assumption in other places anyway!
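
A sketch of the size check in Python, using struct.calcsize('i') as a stand-in for sizeof(int) instead of building a throwaway array. The function name holds_c_ints is hypothetical, and rejecting a non-native byte-order marker is a choice made for this sketch.

```python
import struct
import sys

# Byte-order marker that matches the machine running this code.
NATIVE = '<' if sys.byteorder == 'little' else '>'

def holds_c_ints(typestr):
    """True if typestr describes native-order integers the size of a C int."""
    if typestr[:1] in ('<', '>'):
        if typestr[0] != NATIVE:
            return False  # byteswapped data cannot be used as a plain int*
        typestr = typestr[1:]
    # struct.calcsize('i') is the size in bytes of a native C int.
    return typestr == 'i%d' % struct.calcsize('i')
```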
>> 5) Why is: __array_data__ optional? Isn't that the whole point of this?
>
> Because the object itself might expose the buffer interface. We could
> make __array_data__ required and prefer that it return a buffer object.
Couldn't it be required, and return a reference to itself if that works?
Maybe I'm just being lazy, but it feels clunky and prone to errors to
keep having to check if an attribute exists, then use it (or not).
> So, the correct consumer usage for grabbing the data is
>
> data = getattr(obj, '__array_data__', obj)
Ah! I hadn't noticed the default parameter to getattr(). That makes it
much easier. Is there an equivalent in C? It doesn't look like it to me,
but I'm kind of a newbie with the C API.
> int PyObject_AsReadBuffer(PyObject *obj, const void **buffer, int
> *buffer_len)
I'm starting to get this.
> Of course this approach has the 32-bit limit until we get this changed
> in Python.
That's the least of my worries!
>> 6) Should __array_offset__ be optional? I'd rather it were required,
>> but default to zero. This way I have to check for it, then use it.
>> Also, I assume it is an integer number of bytes, is that right?
>
> A consumer has to check for most of the optional stuff if they want to
> support all types of arrays.
That's not quite true. I'm happy to support only the simple types of
arrays (contiguous, single type elements, zero offset), but I have to
check all that stuff to make sure that I have a simple array. The
simplest arrays are the most common case, they should be as easy as
possible to support.
> Again a simple:
>
> getattr(obj, '__array_offset__', 0)
>
> works fine.
not too bad.
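
Putting the two getattr() defaults together, a consumer could look roughly like this. It is only a sketch: view_bytes is a made-up name, and it assumes __array_data__ (when present) is an object exposing the buffer interface and that __array_offset__ is a whole number of bytes.

```python
def view_bytes(obj):
    """Return a read-only view of the object's raw data, past any offset."""
    # Fall back to the object itself if it has no __array_data__.
    data = getattr(obj, '__array_data__', obj)
    # A missing offset is treated as zero (assumed to be counted in bytes).
    offset = getattr(obj, '__array_offset__', 0)
    return memoryview(data).toreadonly()[offset:]
```

For example, view_bytes(bytearray(b'abcd')) works without any of the attributes, because the bytearray itself exposes the buffer interface.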
Also, what if we find the need for another optional attribute later? Any
older code won't check for it. Or maybe I'm being paranoid....
>> 7) An alternative to the above: A __simple__ flag, that means the data
>> is a simple, C array of contiguous data of a single type. The most
>> common use, and it would be nice to just check that flag and not have
>> to take all other options into account.
> I think if __array_strides__ returns None (and if an object doesn't
> expose it you can assume it) it is probably good enough.
That and __array_typestr__
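
Something like the following could stand in for a "simple" flag. This is a sketch under assumptions: is_simple is a hypothetical helper, the typestr comparison is an exact string match (so a byte-order marker would have to be handled separately), and the offset check is added for safety.

```python
def is_simple(obj, wanted_typestr):
    """True if obj looks like a plain C array of the wanted element type."""
    # Missing or None strides means C-style contiguous.
    if getattr(obj, '__array_strides__', None) is not None:
        return False
    # Refuse anything that starts partway into its buffer.
    if getattr(obj, '__array_offset__', 0) != 0:
        return False
    # Exact match only; '<i4' and 'i4' are treated as different here.
    return getattr(obj, '__array_typestr__', None) == wanted_typestr
```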
Travis Oliphant wrote:
>
> At http://numeric.scipy.org/array_interface.py
>
> you will find the start of a set of helper functions for the array
> interface that can make it easier to deal with.
Ah! this may well address my concerns. Good idea.
Thanks for all your work on this Travis.
By the way, a quote from Robin Dunn about this:
"Sweet!"
Thought you might appreciate that.
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov