[Numpy-discussion] ctypes and NumPy

Albert Strasheim fullung at gmail.com
Mon May 1 15:12:06 EDT 2006


Hello all

I've been working on wrapping a C library for use with NumPy for the past
week. After struggling to get it "just right" with SWIG and hand-written C
API code, I tried ctypes. In 4 hours with ctypes I was able to do what I had
been unable to do with SWIG or the C API in 5 days (probably mostly due to
incompetence on my part ;-)). So there's my ctypes testimonial.

I have a few questions about using ctypes with NumPy. I'd appreciate
any feedback on better ways of accomplishing what I've done so far.

1. Passing pointers to NumPy data to C functions

I would like to pass the data pointer of a NumPy array to a C function via
ctypes. Currently I'm doing the following in C:

#ifdef _DEBUG
#define DEBUG__
#undef _DEBUG
#endif
#include "Python.h"
#ifdef DEBUG__
#define _DEBUG
#undef DEBUG__
#endif

typedef struct PyArrayObject {
   PyObject_HEAD
   char* data;
   int nd;
   void* dimensions;
   void* strides;
   void* descr;
   int flags;
   void* weakreflist;
} PyArrayObject;

extern void* PyArray_DATA(PyObject* obj) {
   return (void*) (((PyArrayObject*)(obj))->data);
}

First some notes regarding the code above. The preprocessor goop undefines
_DEBUG, if it is set, while Python.h is included and restores it afterwards.
This prevents Python from trying to link against its debug library
(python24_d) on Windows, even when you do a debug build of your own code.
Including arrayobject.h seems to introduce some Python library symbols, which
is why I copied the definition of PyArrayObject from arrayobject.h instead of
including the header. Now I can build my code in debug or release mode
without having to worry about Python.

As a companion to this C function that allows me to get the data pointer of
a NumPy array I have on the Python side:

def c_arraydata(a, t):
    return cast(foo.PyArray_DATA(a), POINTER(t))

def arraydata_intp(a):
    return N.intp(foo.PyArray_DATA(a))

I use c_arraydata to get a typed pointer to a NumPy array's data for wrapped
functions expecting something like a double*. I use arraydata_intp when I
need to deal with something like a double**: I make the double** buffer as an
array of intp and then set each element to the address of an 'f8' array,
which I get from arraydata_intp.
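
The double** construction described above can be sketched in pure Python.
This is only an illustration, not the wrapper code itself: it assumes a NumPy
version that provides the ndarray.ctypes attribute and uses it in place of
foo.PyArray_DATA, and the names rows, row_ptrs and table are made up for the
example.

```python
import ctypes
import numpy as np

# Three separate rows of doubles, as a C function taking double** might expect.
rows = [np.arange(4, dtype=np.float64) + 10 * i for i in range(3)]

# One pointer-sized integer per row, each holding that row's data address.
# (Assumption: ndarray.ctypes.data stands in for arraydata_intp here.)
row_ptrs = np.array([r.ctypes.data for r in rows], dtype=np.intp)

# View the intp buffer as a double**. Both rows and row_ptrs must stay alive
# for as long as C code uses table.
table = row_ptrs.ctypes.data_as(ctypes.POINTER(ctypes.POINTER(ctypes.c_double)))

# Read back through both levels of indirection.
value = table[2][3]  # 23.0, i.e. rows[2][3]
```

Nothing is copied: row_ptrs only stores addresses, so the rows have to be kept
referenced on the Python side while the C library holds the pointer table.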

The reason I'm jumping through all these hoops is so that I can get at the
data pointer of a NumPy array. ndarray.data is a Python buffer object and I
didn't manage to find any other way to obtain this pointer. If it doesn't
exist already, it would be very useful if NumPy arrays exposed a way to get
this information, by calling something like ndarray.dataptr or
ndarray.dataintp.
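
For comparison, here is a minimal sketch of what such an accessor looks like
in NumPy versions that provide the ndarray.ctypes attribute and the
__array_interface__ dictionary. Whether these are available depends on the
NumPy release, so treat this as an assumption rather than a description of the
version discussed in this post.

```python
import ctypes
import numpy as np

a = np.array([1.5, 2.5, 3.5], dtype=np.float64)

# Integer address of the first element (roughly the "dataintp" idea).
addr = a.ctypes.data

# The same address as reported by the array interface dictionary.
addr_from_interface = a.__array_interface__["data"][0]

# A typed pointer to the data (roughly "dataptr" followed by a cast).
ptr = a.ctypes.data_as(ctypes.POINTER(ctypes.c_double))

# Writing through the pointer changes the array, so no copy was made.
ptr[1] = -1.0
```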

Once this is possible, there could be more integration with ctypes. See item
3.

2. Checking struct alignment

With the following ctypes struct:

class svm_node(Structure):
    _fields_ = [
        ('index', c_int),
        ('value', c_double)
        ]

I can do:

print(svm_node.index.offset)
print(svm_node.index.size)
print(svm_node.value.offset)
print(svm_node.value.size)

which prints out: 0, 4, 8, 8 on my system. The corresponding array
description is:

In [58]: dtype({'names' : ['index', 'value'], 'formats' : [intc, 'f8']},
align=1)
Out[58]: dtype([('index', '<i4'), ('', '|V4'), ('value', '<f8')])

I can get some information about the array layout:

In [47]: descr['index'].type
Out[47]: <type 'int32scalar'>
In [48]: descr['index'].alignment
Out[48]: 4
In [49]: descr['index'].itemsize
Out[49]: 4

However, there doesn't seem to be an equivalent in the array description to
the offset parameter that ctypes structs have. Is there a way to get this
information? It would be useful to have it, since then one could make sure
that the NumPy array and the ctypes struct line up in memory.
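
One possible way to do this check, sketched under the assumption that the
dtype object exposes a fields mapping of name to (dtype, offset), as current
NumPy versions do. The helper name layout_matches is hypothetical.

```python
import ctypes
import numpy as np

class svm_node(ctypes.Structure):
    _fields_ = [("index", ctypes.c_int), ("value", ctypes.c_double)]

svm_node_descr = np.dtype(
    {"names": ["index", "value"], "formats": [np.intc, np.float64]},
    align=True,
)

def layout_matches(struct, descr):
    """Return True if every field has the same offset and size in both."""
    if ctypes.sizeof(struct) != descr.itemsize:
        return False
    for field in struct._fields_:
        name = field[0]
        # descr.fields[name] is (dtype, offset) or (dtype, offset, title).
        field_dtype, offset = descr.fields[name][:2]
        cfield = getattr(struct, name)
        if (cfield.offset, cfield.size) != (offset, field_dtype.itemsize):
            return False
    return True

ok = layout_matches(svm_node, svm_node_descr)
```

The total size is compared as well, because trailing padding affects how
consecutive array elements line up even when the field offsets agree.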

3. Further integration with ctypes

From the ctypes tutorial:

"""You can also customize ctypes argument conversion to allow instances of
your own classes be used as function arguments. ctypes looks for an
_as_parameter_ attribute and uses this as the function argument. Of course,
it must be one of integer, string, or unicode: ...

If you don't want to store the instance's data in the _as_parameter_
instance variable, you could define a property which makes the data
available."""

If I understand correctly, you could also accomplish the same thing by
implementing the from_param class method.
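
A rough sketch of the from_param route, to make the idea concrete. The class
name Float64Array and the library function mentioned in the comment are
hypothetical, and the ndarray.ctypes attribute is assumed to be available.

```python
import ctypes
import numpy as np

class Float64Array:
    """Hypothetical argtypes entry accepting only contiguous float64 arrays."""

    @classmethod
    def from_param(cls, obj):
        # ctypes calls this for each argument whose argtypes entry is this class.
        if not isinstance(obj, np.ndarray):
            raise TypeError("expected a numpy.ndarray")
        if obj.dtype != np.float64 or not obj.flags.c_contiguous:
            raise TypeError("expected a C-contiguous float64 array")
        # The returned ctypes pointer is what gets passed to the C function.
        return obj.ctypes.data_as(ctypes.POINTER(ctypes.c_double))

# With a real library one would write, for a hypothetical function norm:
#     lib.norm.argtypes = [Float64Array, ctypes.c_int]
data = np.array([3.0, 4.0])
param = Float64Array.from_param(data)
```

Compared with _as_parameter_, this puts the conversion and the type checking
on the function signature rather than on the array object.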

I don't think it's well defined what _as_parameter_ (or from_param) should
do for an arbitrary NumPy array, so there are a few options.

1. Allow the user to add _as_parameter_ or from_param to an ndarray
instance. I don't know if this is possible at all (it doesn't seem to work
at the moment because ndarray is a "built-in" type).

2. Allow _as_parameter_ to be a property with the user being able to specify
the get method at construction time (or allow the user to specify the
from_param method). For svm_node I might do something like:

def svm_node_as_parameter(self):
    return cast(self.dataptr, POINTER(svm_node))

svm_node_descr = \
dtype({'names' : ['index', 'value'],
       'formats' : [N.intc, 'f8']},
      align=1)

node = array([...],
             dtype=svm_node_descr,
             ctypes_as_parameter=svm_node_as_parameter)

3. As a next step, provide defaults for _as_parameter_ where possible. The
scalar types could set it to the corresponding ctype (or None if ctypes
can't be imported). Arrays with "basic" data, such as 'f8' and friends could
set up a property that calls ctypes.cast(self.dataptr,
POINTER(corresponding_ctype)).
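
As an illustration of the _as_parameter_ idea, here is a sketch that uses an
ndarray subclass instead of changing ndarray itself. The class name CArray is
hypothetical, ndarray.ctypes is assumed to be available, and ctypes.memmove
stands in for a wrapped library function taking two pointers and a size.

```python
import ctypes
import numpy as np

class CArray(np.ndarray):
    """Hypothetical ndarray subclass that ctypes accepts as a pointer argument."""

    @property
    def _as_parameter_(self):
        # ctypes looks up this attribute when converting the argument.
        return self.ctypes.data_as(ctypes.c_void_p)

src = np.array([1.0, 2.0, 3.0]).view(CArray)
dst = np.zeros(3).view(CArray)

# ctypes.memmove wraps the C memmove function. Both arrays are passed
# directly and converted to void* through _as_parameter_.
ctypes.memmove(dst, src, src.nbytes)
```

This returns an untyped pointer for any array. A default along the lines of
option 3 would pick the pointer type from the array's dtype instead.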

Thanks for reading. :-) Comments would be appreciated. If some of my
suggestions seem implementation-worthy, I'd be willing to try to implement
them.

Regards,

Albert




