[Psyco-devel] RE: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison
Francesc Alted
falted at openlc.org
Mon Feb 10 23:24:02 EST 2003
A Divendres 07 Febrer 2003 19:33, Chris Barker va escriure:
>
> Is Pyrex aware of Numeric Arrays?
Joachim Saul already answered that, it is.
More exactly, Pyrex is not aware of any special object outside the Python
standard types, but with a bit of cleverness and patience, you can map any
object you want to Pyrex.
The Numeric array object map just happens to be documented in the FAQ, but I
managed to access numarray objects as well. Here is the recipe:
First, define some enum types and headers:
# Structs and functions from numarray
cdef extern from "numarray/numarray.h":
ctypedef enum NumRequirements:
NUM_CONTIGUOUS
NUM_NOTSWAPPED
NUM_ALIGNED
NUM_WRITABLE
NUM_C_ARRAY
NUM_UNCONVERTED
ctypedef enum NumarrayByteOrder:
NUM_LITTLE_ENDIAN
NUM_BIG_ENDIAN
cdef enum:
UNCONVERTED
C_ARRAY
ctypedef enum NumarrayType:
tAny
tBool
tInt8
tUInt8
tInt16
tUInt16
tInt32
tUInt32
tInt64
tUInt64
tFloat32
tFloat64
tComplex32
tComplex64
tObject
tDefault
tLong
# Declaration for the PyArrayObject
struct PyArray_Descr:
int type_num, elsize
char type
ctypedef class PyArrayObject [type PyArray_Type]:
# Compatibility with Numeric
cdef char *data
cdef int nd
cdef int *dimensions, *strides
cdef object base
cdef PyArray_Descr *descr
cdef int flags
# New attributes for numarray objects
cdef object _data # object must meet buffer API */
cdef object _shadows # ill-behaved original array. */
cdef int nstrides # elements in strides array */
cdef long byteoffset # offset into buffer where array data begins */
cdef long bytestride # basic seperation of elements in bytes */
cdef long itemsize # length of 1 element in bytes */
cdef char byteorder # NUM_BIG_ENDIAN, NUM_LITTLE_ENDIAN */
cdef char _aligned # test override flag */
cdef char _contiguous # test override flag */
void import_array()
# The Numeric API requires this function to be called before
# using any Numeric facilities in an extension module.
import_array()
Then, declare the API routines you want to use:
cdef extern from "numarray/libnumarray.h":
PyArrayObject NA_InputArray (object, NumarrayType, int)
PyArrayObject NA_OutputArray (object, NumarrayType, int)
PyArrayObject NA_IoArray (object, NumarrayType, int)
PyArrayObject NA_Empty(int nd, int *d, NumarrayType type)
object PyArray_FromDims(int nd, int *d, NumarrayType type)
define now a couple of maps between C enum types and Python numarrar type
classes:
# Conversion tables from/to classes to the numarray enum types
toenum = {numarray.Int8:tInt8, numarray.UInt8:tUInt8,
numarray.Int16:tInt16, numarray.UInt16:tUInt16,
numarray.Int32:tInt32, numarray.UInt32:tUInt32,
numarray.Float32:tFloat32, numarray.Float64:tFloat64,
}
toclass = {}
for (key, value) in toenum.items():
toclass[value] = key
ok. you are on the way. We can finally define our user funtion; for example,
I will show here a function to multiply a matrix by a vector (C double
precision):
def multMatVec(object a, object b, object c):
cdef PyArrayObject carra, carrb, carrc
cdef double *da, *db, *dc
cdef int i, j
carra = NA_InputArray(a, toenum[a._type], C_ARRAY)
carrb = NA_InputArray(b, toenum[b._type], C_ARRAY)
carrc = NA_InputArray(c, toenum[c._type], C_ARRAY)
da = <double *>carra.data
db = <double *>carrb.data
dc = <double *>carrc.data
dim1 = carra.dimensions[0]
dim2 = carra.dimensions[1]
for i from 0<= i < dim1:
dc[i] = 0.
for j from 0<= j < dim2:
dc[i] = dc[i] + da[i*dim2+j] * db[j]
return carrc
where NA_InputArray is a high-level numarray API that ensures that the
object retrieved is a well-behaved array, and not mis-aligned, discontiguous
or whatever.
Maybe at first glance such a procedure would seem obscure, but it is not. I
find it to be quite elegant.
Look at the "for i from 0<= i < dim1:" construction. We could have used the
more pythonic form: "for i in range(dim1):", but by using the former, the
Pyrex compiler is able to produce a loop in plain C, so achieving C-speed on
this piece of code. Of course, you must be aware to not introduce Python
objects inside the loop, or all the potential speed-up improvement will
vanish. But, with a bit of practice, this is easy to avoid.
For me Pyrex is like having Python but with the speed of C. This is why I'm
so enthusiastic with it.
>
> I imagine it could use them just fine, using the generic Python sequence
> get item stuff, but that would be a whole lot lower performance than if
> it understood the Numeric API and could access the data array directly.
> Also, how does it deal with multiple dimension indexing ( array[3,6,2] )
> which the standard python sequence types do not support?
In general, you can access sequence objects like in Python (and I've just
checked that extended slicing *is* supported, I don't know why Joachim was
saying that not; perhaps he was meaning Pyrex C-arrays?), but at Python
speed. So, if you need speed, always use pointers to your data and use a bit
of pointer arithmetic to access the element you want (look at the example).
Of course, you can also define C arrays if you know the boundaries in
compilation time and let the compiler do the computations to access your
desired element, but you will need first to copy the data from your buffers
to the C-array, and perhaps this is a bit inconvenient in some situations.
> As I think about this, I think your suggestion is fabulous. Pyrex (or a
> Pyrex-like) language would be a fabulous way to write code for NumArray,
> if it really made use of the NumArray API.
There can be drawbacks, like the one stated by Perry related with how to
construct general Ufuncs that can handle many different combinations of
arrays and types, although I don't understand that very well because Numeric
and numarray crews already achieved to do that in C, so why it cannot be
possible with Pyrex?. Mmm, perhaps there is some pre-processor involved?.
Cheers,
--
Francesc Alted
More information about the NumPy-Discussion
mailing list