[Numpy-discussion] Access dtype kind from cython
Valentin Haenel
valentin at haenel.co
Thu Jan 1 14:38:10 EST 2015
Hi,
* Nathaniel Smith <njs at pobox.com> [2014-12-31]:
> On Tue, Dec 30, 2014 at 11:03 PM, Valentin Haenel <valentin at haenel.co> wrote:
> > * Eric Moore <ewm at redtetrahedron.org> [2014-12-30]:
> >> On Monday, December 29, 2014, Valentin Haenel <valentin at haenel.co> wrote:
> >>
> >> > Hi,
> >> >
> >> > how do I access the kind of the data from cython, i.e. the single
> >> > character string:
> >> >
> >> > 'b' boolean
> >> > 'i' (signed) integer
> >> > 'u' unsigned integer
> >> > 'f' floating-point
> >> > 'c' complex-floating point
> >> > 'O' (Python) objects
> >> > 'S', 'a' (byte-)string
> >> > 'U' Unicode
> >> > 'V' raw data (void)
> >> >
> >> > In regular Python I can do:
> >> >
> >> > In [7]: d = np.dtype('S')
> >> >
> >> > In [8]: d.kind
> >> > Out[8]: 'S'
> >> >
> >> > Looking at the definition of dtype that comes with cython, I see:
> >> >
> >> > ctypedef class numpy.dtype [object PyArray_Descr]:
> >> >     # Use PyDataType_* macros when possible, however there are no macros
> >> >     # for accessing some of the fields, so some are defined. Please
> >> >     # ask on cython-dev if you need more.
> >> >     cdef int type_num
> >> >     cdef int itemsize "elsize"
> >> >     cdef char byteorder
> >> >     cdef object fields
> >> >     cdef tuple names
> >> >
> >> > I.e. no kind.
>
> The problem is just that whoever wrote numpy.pxd was feeling a bit
> lazy that day and only filled in the fields they felt were most
> important :-). There are a bunch of public fields in PyArray_Descr
> that are just being left out of the Cython file you quote:
>
> https://github.com/numpy/numpy/blob/master/numpy/core/include/numpy/ndarraytypes.h#L566
>
> In particular, there's a 'char kind' field.
>
> The quick workaround is
>
> cdef extern from "*":
>     cdef struct my_numpy_dtype [object PyArray_Descr]:
>         cdef char kind
>         # ... whatever other fields you might need
>
> and then cast to my_numpy_dtype when you need to get at the kind field
> from Cython.
>
> If feeling generous, then submit a PR to Cython adding 'cdef char
> kind' to the definition above. If feeling extra generous, it would be
> awesome if someone systematically went through and added all the
> missing fields that are in the numpy header but not cython -- I've run
> into these missing field issues annoyingly often myself, and it's
> silly that we should all keep making our own individual workarounds
> for numpy.pxd's limitations...
Thanks for the suggestions, they got me thinking.

I actually discovered an additional, ugly workaround. It turns out that
my dtype instance does have a 'kind' attribute, but it is a Python str
object. Hence I needed to do:

    ord(dtype_.kind[0])

to convert it to a Cython char. This is because, for reasons I don't
understand, when you declare a char in Cython and try to assign a Python
object to it, that object needs to be an integer. Otherwise you get:

    TypeError: an integer is required

at run time.
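
As a rough illustration of that conversion, here is a plain-Python sketch
(not Cython). The helper name kind_to_char_code is hypothetical; it only
shows that 'kind' is a one-character str and that ord() yields the integer
a C char expects. The fallback to a literal 'S' when NumPy is not installed
is an assumption made so the snippet runs anywhere.

```python
def kind_to_char_code(kind):
    """Convert a one-character dtype kind string to its integer code.

    Hypothetical helper for illustration; not part of NumPy or Cython.
    """
    if not isinstance(kind, str) or len(kind) != 1:
        raise ValueError("expected a one-character string, got %r" % (kind,))
    return ord(kind)


try:
    import numpy as np
    kind = np.dtype("S").kind  # a Python str, here 'S'
except ImportError:
    kind = "S"  # assumption: stand-in value when NumPy is unavailable

# The integer code is what can be stored in a C char.
code = kind_to_char_code(kind)
print(kind, code)
```

In Cython, the resulting integer could then be assigned to a 'cdef char'
variable and compared against values such as ord('S').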

Using the hack above, my code now compiles and all the tests pass. I
suspect it won't perform very well, though, because of the back and
forth between Python and C.

V-
PS: Nonetheless, I may look into getting some patches into Cython as
suggested, since the solution above isn't exactly clean code...