[Numpy-discussion] Re: Questions about the array interface.
Chris Barker
Chris.Barker at noaa.gov
Thu Apr 7 12:20:05 EDT 2005
James Carroll wrote:
>> def DrawPointList(self, points, pens=None):
>>     ...
>>     # (some checking code on the pens)
>>     ...
>>     if (hasattr(points, '__array_shape__') and
>>             hasattr(points, '__array_typestr__') and
>>             len(points.__array_shape__) == 2 and
>>             points.__array_shape__[1] == 2 and
>>             points.__array_typestr__ == 'i4'
>>             ):  # this means we have a compliant array
>>         # return the array protocol version
>>         return self._DrawPointArray(points.__array_data__, pens, [])
>>         # This needs to be written now!
>
>
> This means that whenever you have some complex multivalued
> multidimensional structure with the data you want to plot, you have to
> reshape it into the above 'compliant' array before passing it on. I'm
> a newbie, but is this reshape something where the data has to be
> copied and take up memory twice?
Probably. It depends on two things:
1) What structure the data is in at the moment
2) Whether we write the code to handle more "complex" arrangements of
data: discontiguous arrays, for instance.
But the idea is to require a data structure that makes sense for the
data. For example, a natural way to store a whole set of coordinates is
to use an N x 2 NumPy array of doubles. This is exactly the data structure
that I want the above function to accept. If the points are somehow a
subset of a larger array, then they will be in a discontiguous array,
and I'm not sure if I want to bother to try to handle that. You can
always use the generic sequence interface to access the data, but that
will be a lot slower. We're interfacing with a static language here, so we
can get optimum performance only by specifying a particular data structure.
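To make the contiguous/discontiguous distinction concrete, here is a minimal sketch. It assumes modern NumPy rather than Numeric, and the array names (big, xy_view, xy_copy) are made up for illustration. It shows that pulling two columns out of a larger array gives a view with no copy, while turning that view into the simple contiguous layout does copy the selected elements:

```python
import numpy as np

# A larger array holding four values per point; only the first two are x, y.
big = np.arange(20, dtype=np.float64).reshape(5, 4)

# Slicing out the x and y columns gives a view: nothing is copied,
# but consecutive rows are no longer adjacent in memory.
xy_view = big[:, :2]
is_view = np.shares_memory(xy_view, big)
view_contiguous = xy_view.flags.c_contiguous

# Getting a plain contiguous N x 2 block means copying the selected elements.
xy_copy = np.ascontiguousarray(xy_view)
copy_is_separate = not np.shares_memory(xy_copy, big)
```

So the extra memory is only the N x 2 subset, not a second copy of the whole original array.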
> If not, then great, you would
> painlessly reshape into something that had a different set of strides
> that just accessed the data that complied in the big blob of data. If
> the reshape is expensive, then maybe we need the array abstraction,
> and then a second 'thing' that described which parts of the array to
> use for the sequence of 2-tuples to use for plotting the x, y values of a
> scatter plot (or whatever).
The proposed array interface does provide a certain level of
abstraction; that's what
__array_shape__
__array_typestr__
__array_descr__
__array_strides__
__array_offset__
are all about. We could certainly write the wxPy_LIST_helper functions to
handle a larger variety of options than the simple contiguous C array,
but I want to start with the simple case, and I'm not sure directly
handling the more complex cases is worth it. I'm imagining that the user
will need to do something like:
dc.DrawPointList(asarray(points, Int))
It's easier to use the utility functions that Numeric provides than to
rewrite similar code in wxPython.
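As a rough sketch of what that user-side conversion might look like, assuming modern NumPy in place of Numeric: as_point_array is a hypothetical helper name, not anything in wxPython or Numeric. It coerces whatever it is given into a contiguous N x 2 array of 32-bit ints and rejects anything with the wrong shape:

```python
import numpy as np

def as_point_array(points):
    """Return points as a C-contiguous N x 2 int32 array (hypothetical helper)."""
    # Copies only if the input is not already contiguous int32 data.
    arr = np.ascontiguousarray(points, dtype=np.int32)
    if arr.ndim != 2 or arr.shape[1] != 2:
        raise ValueError("expected an N x 2 sequence of points")
    return arr

# A plain list of tuples is converted; an array that already has the
# right layout is passed through without a copy.
pts = as_point_array([(0, 0), (10, 20), (30, 5)])
```

The point is that the shape and type checking stays in a few lines of Python, and the C side only ever sees the one simple layout.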
> I do think we can accept more than just i4 for a datatype. Especially
> since a last-minute cast to i4 is inexpensive for almost every data
> type.
Sure, but we're interfacing with a static language, so for each data
type supported, we need to cast the data pointer to the right type, then
have code to convert it to the type needed by wx. It's not a big
deal, but I'd rather keep it simple. I do want to support at least
doubles and ints. Users can use Numeric's astype() method to convert if
need be.
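Something like the following is what I have in mind, sketched in Python with modern NumPy standing in for Numeric. The function name check_point_dtype and the returned labels are invented for illustration; the real dispatch would happen in the C helper code. Only 32-bit ints and doubles are accepted directly, and anything else is the caller's job to convert with astype():

```python
import numpy as np

def check_point_dtype(arr):
    """Name the (hypothetical) C code path that would handle this array."""
    if arr.dtype == np.int32:
        return "int32 path"
    if arr.dtype == np.float64:
        return "double path"
    raise TypeError("unsupported dtype %s; convert with astype() first" % arr.dtype)

# An int16 array is not handled directly, so the caller converts it.
small = np.array([[1, 2], [3, 4]], dtype=np.int16)
converted = small.astype(np.int32)
path = check_point_dtype(converted)
```

Each extra supported type means one more branch like these on the C side, which is why I would rather keep the list short.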
I've noticed that there is a wxRealPoint class that uses doubles, but it
doesn't look like it can be used as input to any of the wxDC methods.
Too bad.
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov