[Numpy-discussion] Access dtype kind from cython

Valentin Haenel valentin at haenel.co
Thu Jan 1 14:38:10 EST 2015


Hi,

* Nathaniel Smith <njs at pobox.com> [2014-12-31]:
> On Tue, Dec 30, 2014 at 11:03 PM, Valentin Haenel <valentin at haenel.co> wrote:
> > * Eric Moore <ewm at redtetrahedron.org> [2014-12-30]:
> >> On Monday, December 29, 2014, Valentin Haenel <valentin at haenel.co> wrote:
> >>
> >> > Hi,
> >> >
> >> > how do I access the kind of the data from cython, i.e. the single
> >> > character string:
> >> >
> >> > 'b' boolean
> >> > 'i' (signed) integer
> >> > 'u' unsigned integer
> >> > 'f' floating-point
> >> > 'c' complex-floating point
> >> > 'O' (Python) objects
> >> > 'S', 'a' (byte-)string
> >> > 'U' Unicode
> >> > 'V' raw data (void)
> >> >
> >> > In regular Python I can do:
> >> >
> >> > In [7]: d = np.dtype('S')
> >> >
> >> > In [8]: d.kind
> >> > Out[8]: 'S'
> >> >
> >> > Looking at the definition of dtype that comes with cython, I see:
> >> >
> >> >   ctypedef class numpy.dtype [object PyArray_Descr]:
> >> >       # Use PyDataType_* macros when possible, however there are no macros
> >> >       # for accessing some of the fields, so some are defined. Please
> >> >       # ask on cython-dev if you need more.
> >> >       cdef int type_num
> >> >       cdef int itemsize "elsize"
> >> >       cdef char byteorder
> >> >       cdef object fields
> >> >       cdef tuple names
> >> >
> >> > I.e. no kind.
> 
> The problem is just that whoever wrote numpy.pxd was feeling a bit
> lazy that day and only filled in the fields they felt were most
> important :-). There are a bunch of public fields in PyArray_Descr
> that are just being left out of the Cython file you quote:
> 
>    https://github.com/numpy/numpy/blob/master/numpy/core/include/numpy/ndarraytypes.h#L566
> 
> In particular, there's a 'char kind' field.
> 
> The quick workaround is
> 
> cdef extern from "*":
>     cdef struct my_numpy_dtype [object PyArray_Descr]:
>         cdef char kind
>         # ... whatever other fields you might need
> 
> and then cast to my_numpy_dtype when you need to get at the kind field
> from Cython.
> 
> If feeling generous, then submit a PR to Cython adding 'cdef char
> kind' to the definition above. If feeling extra generous, it would be
> awesome if someone systematically went through and added all the
> missing fields that are in the numpy header but not cython -- I've run
> into these missing field issues annoyingly often myself, and it's
> silly that we should all keep making our own individual workarounds
> for numpy.pxd's limitations...

Thanks for the suggestions; they got me thinking.

I also discovered another ugly workaround. It turns out that my dtype
instance does have a 'kind' attribute, but its value is a Python str
object. Hence I needed to do:

  ord(dtype_.kind[0])

to convert it to a Cython char. This is because (for reasons I don't
understand) when you declare a char in Cython and try to assign a
Python object to it, that object needs to be an integer. Otherwise you
get

  TypeError: an integer is required

at run time.
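
The same str-versus-integer distinction can be shown in plain Python,
without Cython. This is only a minimal sketch: it uses the stdlib struct
module as a stand-in for a C char, the helper name kind_to_char is made
up for illustration, and the literal "S" stands in for what
np.dtype('S').kind returns.

```python
import struct


def kind_to_char(kind):
    """Return the integer code of a one-character kind string.

    Hypothetical helper, mirroring ord(dtype_.kind[0]) from above.
    """
    return ord(kind[0])


kind = "S"  # stand-in for np.dtype('S').kind, which is a Python str
code = kind_to_char(kind)

# A C char is a small integer, so packing the integer code works.
packed = struct.pack("b", code)

# Packing the str itself is rejected, much like the Cython assignment.
try:
    struct.pack("b", kind)
    rejected = False
except struct.error:
    rejected = True
```

So the ord() call is what turns the one-character str into something
that fits into a char.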

With the hack above, my code now compiles and all the tests pass. I
would guess that it won't perform very well, though, since every access
goes through a Python attribute lookup and a Python str before ending
up in a C char.
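
One way to keep that overhead small would be to do the str-to-integer
conversion once per dtype and compare plain integers afterwards. The
following is an unmeasured sketch in plain Python; the constant names
and is_stringlike are hypothetical and only illustrate the idea.

```python
# Integer codes for the kinds of interest, computed once at import time.
KIND_BYTES = ord("S")
KIND_UNICODE = ord("U")


def is_stringlike(kind_code):
    """Check an integer kind code against the precomputed constants."""
    return kind_code in (KIND_BYTES, KIND_UNICODE)


# Convert once (here "S" stands in for dtype_.kind) and reuse the integer.
kind_code = ord("S"[0])
result = is_stringlike(kind_code)
```

In Cython, kind_code could then be typed as a char, so later
comparisons would not need to touch Python objects at all.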

V-

PS: Nonetheless, I may look into getting some patches into Cython as
suggested, since the solution above isn't exactly clean code.


