[Numpy-discussion] Re: Questions about the array interface.

Chris Barker Chris.Barker at noaa.gov
Thu Apr 7 12:20:05 EDT 2005


James Carroll wrote:

>>     def DrawPointList(self, points, pens=None):
>>         ...
>>         # some checking code on the pens
>>         ...
>>         if (hasattr(points, '__array_shape__') and
>>                 hasattr(points, '__array_typestr__') and
>>                 len(points.__array_shape__) == 2 and
>>                 points.__array_shape__[1] == 2 and
>>                 points.__array_typestr__ == 'i4'
>>                 ):  # this means we have a compliant array
>>             # return the array protocol version
>>             return self._DrawPointArray(points.__array_data__, pens, [])
>>             # This needs to be written now!
> 
> 
> This means that whenever you have some complex multivalued
> multidimensional structure with the data you want to plot, you have to
> reshape it into the above 'compliant' array before passing it on.  I'm
> a newbie, but is this reshape something where the data has to be
> copied and take up memory twice?

Probably. It depends on two things:
1) What structure the data is in at the moment
2) Whether we write the code to handle more "complex" arrangements of 
data: discontiguous arrays, for instance.

But the idea is to require a data structure that makes sense for the 
data. For example, a natural way to store a whole set of coordinates is 
to use an Nx2 NumPy array of doubles. This is exactly the data structure 
that I want the above function to accept. If the points are somehow a 
subset of a larger array, then they will be in a discontiguous array, 
and I'm not sure if I want to bother to try to handle that. You can 
always use the generic sequence interface to access the data, but that 
will be a lot slower. We're interfacing with a static language here, so we 
can get optimum performance only by specifying a particular data structure.
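
To make the contiguity point concrete, here is a minimal sketch. It uses modern NumPy rather than the Numeric package discussed in this thread (that substitution is an assumption, as are the variable names). It builds an Nx2 array of doubles, takes a subset of a larger array, and shows that the subset is a discontiguous view that has to be copied to become contiguous:

```python
import numpy as np

# A natural layout for N points: an N x 2 array of doubles.
points = np.array([[0.0, 0.0], [1.5, 2.0], [3.0, 4.5]])
points_is_contiguous = points.flags["C_CONTIGUOUS"]

# Points that are a subset of a larger array: columns 1 and 2 of an N x 4 block.
big = np.arange(12, dtype=np.float64).reshape(3, 4)
subset = big[:, 1:3]                      # a view; no data is copied
subset_is_contiguous = subset.flags["C_CONTIGUOUS"]
subset_shares_memory = np.shares_memory(subset, big)

# Packing the subset into a contiguous block copies the data.
packed = np.ascontiguousarray(subset)
packed_is_contiguous = packed.flags["C_CONTIGUOUS"]
packed_shares_memory = np.shares_memory(packed, big)

print(points_is_contiguous, subset_is_contiguous, packed_is_contiguous)
print(subset.strides, packed.strides)
```

The subset keeps the row stride of the larger array, which is why code expecting a plain C array of x,y pairs cannot walk it directly.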

> If not, then great, you would
> painlessly reshape into something that had a different set of strides
> that just accessed the data that complied in the big blob of data.  If
> the reshape is expensive, then maybe we need the array abstraction,
> and then a second 'thing' that described which parts of the array to
> use for the sequence of 2-tuples to use for plotting the x,y s of a
> scatter plot. (or whatever)

The proposed array interface does provide a certain level of 
abstraction; that's what these attributes are all about:

__array_shape__
__array_typestr__
__array_descr__
__array_strides__
__array_offset__

We could certainly write the wxPy_LIST_helper functions to 
handle a larger variety of options than the simple contiguous C array, 
but I want to start with the simple case, and I'm not sure directly 
handling the more complex cases is worth it. I'm imagining that the user 
will need to do something like:

dc.DrawPointList(asarray(points, Int))

It's easier to use the utility functions that Numeric provides than 
to rewrite similar code in wxPython.
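
As a rough sketch of what that user-side conversion could look like, here is a version written against modern NumPy instead of Numeric (an assumption; `Int` above is a Numeric typecode, and `np.int32` is used here only to match the 'i4' check in the quoted code). The helper name `to_point_array` is hypothetical and not part of wxPython:

```python
import numpy as np

def to_point_array(points):
    """Return points as a C-contiguous (N, 2) int32 array, copying only if needed."""
    arr = np.ascontiguousarray(points, dtype=np.int32)
    if arr.ndim != 2 or arr.shape[1] != 2:
        raise ValueError("expected an N x 2 sequence of points")
    return arr

# A plain list of tuples gets converted into a new array.
from_list = to_point_array([(0, 0), (10, 20), (30, 40)])

# An array that already has the right type and layout is passed through
# without duplicating its data.
already_ok = np.zeros((4, 2), dtype=np.int32)
same = to_point_array(already_ok)
print(from_list.shape, from_list.dtype, np.shares_memory(same, already_ok))
```

The point of the design is that the drawing code only ever sees one layout, and the library's own conversion routine decides whether a copy is necessary.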

> I do think we can accept more than just i4 for a datatype.  Especially
> since a last-minute cast to i4 is inexpensive for almost every data
> type.

Sure, but we're interfacing with a static language, so for each data 
type supported, we need to cast the data pointer to the right type, then 
have code to convert it to the type needed by wx. It's not a big 
deal, but I'd rather keep it simple. I do want to support at least 
doubles and ints. Users can use Numeric's astype() method to convert if 
need be.
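
A minimal sketch of what accepting both ints and doubles might look like on the Python side follows. It is pure Python and only illustrates the dispatch; the function `classify_points` and the class `FakePoints` are hypothetical names, and the typestring format (an optional byte-order character followed by a kind and a size, such as '<i4' or '<f8') is my reading of the proposed interface rather than something spelled out in this thread:

```python
def classify_points(obj):
    """Return 'int32', 'float64', or None for an object exposing the proposed attributes.

    Byte order is ignored here for simplicity; real C code would also need to
    check that the data is in native byte order before casting the pointer.
    """
    shape = getattr(obj, "__array_shape__", None)
    typestr = getattr(obj, "__array_typestr__", None)
    if shape is None or typestr is None:
        return None                    # not an array-protocol object
    if len(shape) != 2 or shape[1] != 2:
        return None                    # not an N x 2 set of points
    kind = typestr.lstrip("<>|")       # drop the byte-order character, if any
    return {"i4": "int32", "f8": "float64"}.get(kind)


class FakePoints:
    """Hypothetical stand-in for an array object, for illustration only."""
    def __init__(self, shape, typestr):
        self.__array_shape__ = shape
        self.__array_typestr__ = typestr


print(classify_points(FakePoints((5, 2), "<i4")))
print(classify_points(FakePoints((5, 2), "<f8")))
print(classify_points([(1, 2), (3, 4)]))
```

Anything that comes back as None would fall through to the generic sequence code, or the caller could convert it first.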

I've noticed that there is a wxRealPoint class that uses doubles, but it 
doesn't look like it can be used as input to any of the wxDC methods. 
Too bad.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                     		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov



