[Numpy-discussion] Array views
Christopher Barker
Chris.Barker at noaa.gov
Sat Mar 26 14:31:10 EDT 2011
On 3/26/11 10:32 AM, srean wrote:
> I am also interested in this. In my application there is a large 2d
> array, let's call it 'b' to keep the notation consistent in the thread.
> b's columns need to be recomputed often. Ideally this re-computation
> happens in a function. Let's call that function updater(b, col_index):
> The simplest example is where
> updater(b, col_index) is a matrix vector multiply, where the matrix or
> the vector changes.
>
> Is there any way, apart from using ufuncs, that I can make updater()
> write the result directly in b and not create a new temporary column
> that is then copied into b? Say for the matrix vector multiply example.
Probably not -- the trick is that when an array is a view of a slice of
another array, it may not be laid out in memory in a way that other libs
(like LAPACK, BLAS, etc) require, so the data needs to be copied to call
those routines.
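To make the copy concrete, here is a minimal sketch. The array name `big` and its shape are made up for illustration; np.ascontiguousarray() stands in for whatever packing step is needed before strided data can be handed to a routine that expects a plain contiguous buffer:

```python
import numpy as np

big = np.arange(12, dtype=np.float64).reshape(3, 4)  # C-ordered 3x4 array
col = big[:, 1]                        # a view: no data is copied here

# The view shares memory with the parent array.
print(np.shares_memory(big, col))      # True

# Asking for a contiguous version of the column forces a copy.
packed = np.ascontiguousarray(col)
print(np.shares_memory(big, packed))   # False

# Writing to the copy leaves the parent untouched ...
packed[:] = -1.0
# ... so the result has to be copied back explicitly.
big[:, 1] = packed
```

The last assignment is exactly the "temporary column that is then copied into b" step from the question.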
To understand all this, you'll need to study up a bit on how numpy
arrays lay out and access the memory that they use: they use a concept
of "strided" memory. It's very powerful and flexible, but most other
numeric libs can't use those same data structures. I'm not sure what a
good doc is to read to learn about this -- I learned it from messing
with the C API. Take a look at any docs that talk about "strides", and
maybe playing with the "stride tricks" tools will help.
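As one way to play with the stride tricks tools, here is a small sketch using numpy.lib.stride_tricks.as_strided. The array `x` and the window shape are arbitrary choices for illustration, and the `writeable=False` argument assumes a reasonably recent NumPy:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(6, dtype=np.int64)   # 0..5, 8 bytes per element
print(x.strides)                   # (8,)

# View the same buffer as 4 overlapping windows of length 3.
# Each row starts one element (8 bytes) after the previous row,
# so the rows overlap and no data is copied.
step = x.itemsize
windows = as_strided(x, shape=(4, 3), strides=(step, step), writeable=False)
print(windows)
print(np.shares_memory(x, windows))  # True
```

Only the shape and strides change; the underlying bytes stay where they are. Note that as_strided does no bounds checking, so a wrong shape or stride can read memory outside the array.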
A simple example:
In [3]: a = np.ones((3,4))
In [4]: a
Out[4]:
array([[ 1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.]])
In [5]: a.flags
Out[5]:
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
So a is a (3,4) array, stored in C_contiguous fashion, just like a
"regular old C array". A lib expecting data in this fashion could use
the data pointer just like regular C code.
In [6]: a.strides
Out[6]: (32, 8)
This means it is 32 bytes from the start of one row to the next (4
elements of 8 bytes each), and 8 bytes from the start of one element to
the next -- which makes sense for a 64-bit double.
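The arithmetic behind those two numbers can be sketched as follows. The helper byte_offset() is a made-up name for illustration, not a NumPy function:

```python
import numpy as np

a = np.ones((3, 4))                  # float64, C order
row_stride, col_stride = a.strides   # (32, 8)

def byte_offset(i, j):
    """Offset in bytes of a[i, j] from the start of a's data buffer."""
    return i * row_stride + j * col_stride

print(byte_offset(0, 1))   # 8   (one element to the right)
print(byte_offset(1, 0))   # 32  (one row down)
print(byte_offset(2, 3))   # 88  (last element: 2*32 + 3*8)
```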
In [7]: b = a[:,1]
In [10]: b
Out[10]: array([ 1., 1., 1.])
so b is a 1-d array with three elements.
In [8]: b.flags
Out[8]:
C_CONTIGUOUS : False
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
but it is NOT C_Contiguous - the data is laid out differently than a
standard C array.
In [9]: b.strides
Out[9]: (32,)
so this means that it is 32 bytes from one element to the next -- for an
8-byte data type. This is because each element sits in a different row
of the a array -- they are not all next to each other. A regular C
library generally won't be able to work with data laid out like this.
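For contrast, here is a sketch of the layout where columns are contiguous: an array created in Fortran (column-major) order. The names `table`, `M` and `v` are made up for illustration. It assumes a NumPy version whose np.dot() accepts an `out` argument; whether any internal temporary is avoided in that call is not something this sketch verifies:

```python
import numpy as np

# Column-major order: each column occupies one contiguous run of memory.
table = np.zeros((3, 4), order='F')
col = table[:, 1]                   # still a view, not a copy
print(table.strides)                # (8, 24)
print(col.strides)                  # (8,)
print(col.flags['C_CONTIGUOUS'])    # True

M = np.arange(9, dtype=np.float64).reshape(3, 3)
v = np.array([1.0, 2.0, 3.0])

# A contiguous column of the right dtype is accepted as the output
# array, so the product is written into table's column in place.
np.dot(M, v, out=col)
print(table[:, 1])                  # values 8, 26, 44
```

The trade-off is that rows of `table` are now the strided direction, so this layout only helps if columns are what gets updated.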
HTH,
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov