[Numpy-discussion] Array views

Sat Mar 26 14:31:10 EDT 2011

On 3/26/11 10:32 AM, srean wrote:
>   I am also interested in this. In my application there is a large 2d
> array, let's call it 'b' to keep the notation consistent in the thread.
> b's columns need to be recomputed often. Ideally this re-computation
> happens in a function. Let's call that function updater(b, col_index).
> The simplest example is where
> updater(b, col_index) is a matrix vector multiply, where the matrix or
> the vector changes.
>
>   Is there any way, apart from using ufuncs, that I can make updater()
> write the result directly in b and not create a new temporary column
> that is then copied into b?  Say for the matrix vector multiply example.

Probably not -- the trick is that when an array is a view of a slice of 
another array, it may not be laid out in memory in a way that other libs 
(like LAPACK, BLAS, etc) require, so the data needs to be copied to call 
those routines.
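
One possible workaround, as a rough sketch only: if b can be stored in
Fortran (column-major) order, each column view is contiguous, and
np.dot's `out` argument can then target the column directly. The
updater() signature below (with extra A and x arguments) is
hypothetical, and whether every internal temporary is avoided may depend
on the NumPy version and the BLAS it is linked against.

```python
import numpy as np

def updater(b, col_index, A, x):
    """Hypothetical updater: write A.dot(x) into column col_index of b."""
    col = b[:, col_index]  # a view onto b's memory, not a copy
    # np.dot requires `out` to be C-contiguous; a column of an
    # F-ordered 2-d array satisfies that, a column of a C-ordered one does not.
    if not col.flags.c_contiguous:
        raise ValueError("b must be laid out so that its columns are contiguous")
    np.dot(A, x, out=col)  # result lands in b's own buffer
    return b

A = np.arange(15.0).reshape(3, 5)   # example matrix (assumed data)
x = np.ones(5)                      # example vector (assumed data)
b = np.zeros((3, 4), order='F')     # column-major storage
updater(b, 1, A, x)
```

With a default C-ordered b, the same call raises the ValueError above,
which is the layout problem described in this reply.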

To understand all this, you'll need to study up a bit on how numpy 
arrays lay out and access the memory that they use: they use a concept 
of "strided" memory. It's very powerful and flexible, but most other 
numeric libs can't use those same data structures. I'm not sure what a 
good doc is to read to learn about this -- I learned it from messing 
with the C API. Take a look at any docs that talk about "strides", and 
maybe playing with the "stride tricks" tools will help.

A simple example:

In [3]: a = np.ones((3,4))

In [4]: a
Out[4]:
array([[ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.]])

In [5]: a.flags
Out[5]:
   C_CONTIGUOUS : True
   F_CONTIGUOUS : False
   OWNDATA : True
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

So a is a (3,4) array, stored in C_CONTIGUOUS fashion, just like a 
"regular old C array". A lib expecting data in this fashion could use 
the data pointer just like regular C code.

In [6]: a.strides
Out[6]: (32, 8)

This means it is 32 bytes from the start of one row to the next (4 
elements x 8 bytes), and 8 bytes from the start of one element to the 
next -- which makes sense for a 64-bit double.
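
As a small sketch of that arithmetic (the helper byte_offset() is made
up for illustration, not a numpy function): for a C-ordered 2-d array
the strides follow from the shape and the item size, and the byte
offset of an element is the sum of index times stride over the axes.

```python
import numpy as np

a = np.ones((3, 4))        # float64, default C order
itemsize = a.itemsize      # 8 bytes per float64

# Row stride = (elements per row) * itemsize; column stride = itemsize.
expected = (a.shape[1] * itemsize, itemsize)
strides_match = (a.strides == expected)

def byte_offset(arr, index):
    """Illustrative helper: byte offset of arr[index] from the first element."""
    return sum(i * s for i, s in zip(index, arr.strides))

offset = byte_offset(a, (2, 1))   # 2*32 + 1*8 bytes
```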

In [7]: b = a[:,1]

In [10]: b
Out[10]: array([ 1.,  1.,  1.])

so b is a 1-d array with three elements.

In [8]: b.flags
Out[8]:
   C_CONTIGUOUS : False
   F_CONTIGUOUS : False
   OWNDATA : False
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

but it is NOT C_CONTIGUOUS -- the data is laid out differently than a 
standard C array.

In [9]: b.strides
Out[9]: (32,)

So this means that it is 32 bytes from one element to the next -- for an 
8-byte data type. This is because each element of b sits in a different 
row of the a array -- they are not next to each other in memory. A 
regular C library generally won't be able to work with data laid out 
like this.
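
To see the view-versus-copy difference concretely, here is a minimal
sketch using np.ascontiguousarray and np.shares_memory. It only
illustrates the packed copy that a library expecting contiguous data
would need; it is not a statement about what any particular numpy
routine does internally.

```python
import numpy as np

a = np.ones((3, 4))
b = a[:, 1]                    # strided view into a, strides (32,)
b[0] = 5.0                     # writes through to a[0, 1]

c = np.ascontiguousarray(b)    # packed copy with strides (8,)
c[1] = 7.0                     # changes only the copy, not a

view_shares = np.shares_memory(a, b)   # b uses a's buffer
copy_shares = np.shares_memory(a, c)   # c has its own buffer
```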

HTH,

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov