[Psyco-devel] RE: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison

Mon Feb 10 23:24:02 EST 2003

A Divendres 07 Febrer 2003 19:33, Chris Barker va escriure:
>
> Is Pyrex aware of Numeric Arrays?

Joachim Saul already answered that, it is.

More exactly, Pyrex is not aware of any special object outside the Python
standard types, but with a bit of cleverness and patience, you can map any
object you want to Pyrex.

The Numeric array object map just happens to be documented in the FAQ, but I
managed to access numarray objects as well. Here is the recipe:

First, define some enum types and headers:

# Structs and functions from numarray
cdef extern from "numarray/numarray.h":

  ctypedef enum NumRequirements:
    NUM_CONTIGUOUS
    NUM_NOTSWAPPED
    NUM_ALIGNED
    NUM_WRITABLE
    NUM_C_ARRAY
    NUM_UNCONVERTED

  ctypedef enum NumarrayByteOrder:
    NUM_LITTLE_ENDIAN
    NUM_BIG_ENDIAN

  cdef enum:
    UNCONVERTED
    C_ARRAY

  ctypedef enum NumarrayType:
    tAny
    tBool	
    tInt8
    tUInt8
    tInt16
    tUInt16
    tInt32
    tUInt32
    tInt64
    tUInt64
    tFloat32
    tFloat64
    tComplex32
    tComplex64
    tObject
    tDefault
    tLong

# Declaration for the PyArrayObject

  struct PyArray_Descr:
     int type_num, elsize
     char type

  ctypedef class PyArrayObject [type PyArray_Type]:
    # Compatibility with Numeric
    cdef char *data
    cdef int nd
    cdef int *dimensions, *strides
    cdef object base
    cdef PyArray_Descr *descr
    cdef int flags
    # New attributes for numarray objects
    cdef object _data         # object must meet buffer API */
    cdef object _shadows      # ill-behaved original array. */
    cdef int    nstrides      # elements in strides array */
    cdef long   byteoffset    # offset into buffer where array data begins */
    cdef long   bytestride    # basic seperation of elements in bytes */
    cdef long   itemsize      # length of 1 element in bytes */
    cdef char   byteorder     # NUM_BIG_ENDIAN, NUM_LITTLE_ENDIAN */
    cdef char   _aligned      # test override flag */
    cdef char   _contiguous   # test override flag */

  void import_array()

# The Numeric API requires this function to be called before
# using any Numeric facilities in an extension module.
import_array()

Then, declare the API routines you want to use:

cdef extern from "numarray/libnumarray.h":
  PyArrayObject NA_InputArray (object, NumarrayType, int)
  PyArrayObject NA_OutputArray (object, NumarrayType, int)
  PyArrayObject NA_IoArray (object, NumarrayType, int)
  PyArrayObject NA_Empty(int nd, int *d, NumarrayType type)
  object PyArray_FromDims(int nd, int *d, NumarrayType type)

define now a couple of maps between C enum types and Python numarrar type
classes:

# Conversion tables from/to classes to the numarray enum types
toenum = {numarray.Int8:tInt8,       numarray.UInt8:tUInt8,
          numarray.Int16:tInt16,     numarray.UInt16:tUInt16,
          numarray.Int32:tInt32,     numarray.UInt32:tUInt32,
          numarray.Float32:tFloat32, numarray.Float64:tFloat64,
          }

toclass = {}
for (key, value) in toenum.items():
  toclass[value] = key

ok. you are on the way. We can finally define our user funtion; for example,
I will show here a function to multiply a matrix by a vector (C double
precision):

def multMatVec(object a, object b, object c):
    cdef PyArrayObject carra, carrb, carrc
    cdef double *da, *db, *dc
    cdef int i, j

    carra = NA_InputArray(a, toenum[a._type], C_ARRAY)
    carrb = NA_InputArray(b, toenum[b._type], C_ARRAY)
    carrc = NA_InputArray(c, toenum[c._type], C_ARRAY)
    da = <double *>carra.data
    db = <double *>carrb.data
    dc = <double *>carrc.data
    dim1 = carra.dimensions[0]
    dim2 = carra.dimensions[1]
    for i from 0<= i < dim1:
      dc[i] = 0.
      for j from 0<= j < dim2:
        dc[i] = dc[i] + da[i*dim2+j] * db[j]

    return carrc

where NA_InputArray is a high-level numarray API that ensures that the
object retrieved is a well-behaved array, and not mis-aligned, discontiguous
or whatever. 

Maybe at first glance such a procedure would seem obscure, but it is not. I
find it to be quite elegant.

Look at the "for i from 0<= i < dim1:" construction. We could have used the
more pythonic form: "for i in range(dim1):", but by using the former, the
Pyrex compiler is able to produce a loop in plain C, so achieving C-speed on
this piece of code. Of course, you must be aware to not introduce Python
objects inside the loop, or all the potential speed-up improvement will
vanish. But, with a bit of practice, this is easy to avoid.

For me Pyrex is like having Python but with the speed of C. This is why I'm
so enthusiastic with it.

>
> I imagine it could use them just fine, using the generic Python sequence
> get item stuff, but that would be a whole lot lower performance than if
> it understood the Numeric API and could access the data array directly.
> Also, how does it deal with multiple dimension indexing ( array[3,6,2] )
> which the standard python sequence types do not support?

In general, you can access sequence objects like in Python (and I've just
checked that extended slicing *is* supported, I don't know why Joachim was
saying that not; perhaps he was meaning Pyrex C-arrays?), but at Python
speed. So, if you need speed, always use pointers to your data and use a bit
of pointer arithmetic to access the element you want (look at the example).

Of course, you can also define C arrays if you know the boundaries in
compilation time and let the compiler do the computations to access your
desired element, but you will need first to copy the data from your buffers
to the C-array, and perhaps this is a bit inconvenient in some situations.

> As I think about this, I think your suggestion is fabulous. Pyrex (or a
> Pyrex-like) language would be a fabulous way to write code for NumArray,
> if it really made use of the NumArray API.

There can be drawbacks, like the one stated by Perry related with how to
construct general Ufuncs that can handle many different combinations of
arrays and types, although I don't understand that very well because Numeric
and numarray crews already achieved to do that in C, so why it cannot be
possible with Pyrex?. Mmm, perhaps there is some pre-processor involved?.

Cheers,

-- 
Francesc Alted