[Cython] Upcoming issues with NumPy deprecated APIs and Cython's sizeof checks

mark florisson markflorisson88 at gmail.com
Tue Jan 31 21:53:10 CET 2012


On 31 January 2012 15:40, Dag Sverre Seljebotn
<d.s.seljebotn at astro.uio.no> wrote:
> On 01/31/2012 03:29 PM, mark florisson wrote:
>>
>> On 30 January 2012 21:03, Lisandro Dalcin<dalcinl at gmail.com>  wrote:
>>>
>>> I'm testing my code with numpy-dev. They are trying to discourage use
>>> of deprecated APIs, this includes direct access to the ndarray struct.
>>> In order to update your code, you have to pass -DNPY_NO_DEPRECATED_API
>>> to the C compiler (or #define it before including NumPy headers).
>>>
>>> However, they have implemented this feature by exposing the ndarray
>>> type with just the Python object header:
>>>
>>> https://github.com/numpy/numpy/blob/master/numpy/core/include/numpy/ndarraytypes.h#L695
>>>
>>> Obviously, this interacts badly with Cython's sizeof check; I'm getting
>>> this runtime warning:
>>>
>>> build/lib.linux-x86_64-2.7/petsc4py/lib/__init__.py:64:
>>> RuntimeWarning: numpy.ndarray size changed, may indicate binary
>>> incompatibility
>>>
>>> I think there is nothing Cython can do about this (other than
>>> special-casing NumPy to disable this VERY useful warning).
>
>
> Hmm...but you can still recompile the Cython module and then don't get the
> warning, right?
>
> We've already been through at least one such round. People tend to ignore
> it, or install warning filters...
>
> If one does want a workaround, we don't have to special case NumPy as such
> -- I think it is marginally cleaner to add new obscure syntax which we only
> use in numpy.pxd:
>
>    ctypedef class numpy.ndarray [object PyArrayObject nosizecheck]:
>
> Or, if anybody bothers, a way to register and automatically run the
> functions NumPy provides for checking ABI compatibility.
>
> I don't think any changes should be done on the NumPy end.
>

I really don't care about the warning, more about the possibility of
numpy rearranging/removing or adding to its private fields (I guess
they won't be doing that anytime soon, but prudence is a virtue). I
suppose we would notice any changes soon enough though, as it would
break all the code.
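
For reference, the check under discussion can be sketched roughly as
follows. This is a minimal Python sketch, not Cython's actual generated C
code: the name check_type_size is hypothetical, and it assumes the check
compares the struct size recorded at compile time with the type's runtime
__basicsize__, warning when the runtime type is larger and raising when it
is smaller. list stands in for numpy.ndarray so the snippet runs without
NumPy.

```python
import warnings


def check_type_size(cls, expected_size):
    """Compare a compile-time struct size with the type's runtime size.

    Hypothetical helper: `expected_size` plays the role of the
    sizeof(...) value baked into a compiled extension module.
    """
    actual = cls.__basicsize__
    if actual > expected_size:
        # The runtime struct is larger than the compiled module assumed
        # (e.g. the module only saw a header-only struct): warn, continue.
        warnings.warn(
            "%s.%s size changed, may indicate binary incompatibility"
            % (cls.__module__, cls.__name__),
            RuntimeWarning,
        )
        return "warning"
    if actual < expected_size:
        # The compiled module expects fields the runtime type lacks.
        raise ValueError(
            "%s.%s has the wrong size, try recompiling"
            % (cls.__module__, cls.__name__)
        )
    return "ok"
```

Calling check_type_size(list, object.__basicsize__) mimics a module that
was compiled against a struct exposing only the Python object header: the
runtime type is larger, so the sketch warns rather than raises.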

>>
>> Weird, shouldn't you be getting an error? Because the size of the
>> PyArrayObject should be less than what Cython expects.
>>
>>>  I've tried the patch below with success, but I'm not convinced...
>>> Does any of you have a suggestion for NumPy folks about how to improve
>>> this?
>>>
>>
>> I'm not sure this should be fixed in NumPy. Their entire point is that
>> people shouldn't use those attributes directly. I think numpy.pxd
>> should be fixed, but the problem is that some attributes might be used
>> in user code (especially shape), and we still want that to work in
>> nogil mode. As such, I'm not sure what the best way of fixing it is,
>> without special casing these attributes in the compiler directly.
>> Maybe Dag will have some thoughts about this.
>
>
> Well, we should definitely deprecate direct access to the PyArrayObject
> fields -- you can either use "cdef int[:]", or, if you use "cdef
> np.ndarray[int]", you should use "PyArray_SHAPE".

Yeah. However, PyArray_SHAPE seems to be new in numpy 1.7. I also see
PyArray_BASE and PyArray_DESCR are commented out (because apparently
they may be NULL). Should we, for the sake of consistency, rename
'get_array_base' to PyArray_BASE in numpy.pxd?

And maybe we could provide our own implementation of PyArray_SHAPE,
which would be portable across numpy versions.

> Problem is that a lot of tutorial material etc. encourages accessing the
> fields directly (my fault). But I think it just needs to happen in the user
> code.
>
>  - Do we just remove the fields from numpy.pxd; or do we put in a
> very-special-case in order to give deprecation warnings for a release? (It'd
> be a very special transform stage, but only for one release and then we
> simply remove both the transform stage and the fields from numpy.pxd)

That would be a good idea.

>  - Do we deprecate the whole "cdef np.ndarray[int]" syntax in favour of
> "cdef int[:]"? My hunch is against it, as that would render a lot of code
> using deprecated features, but it would "solve" the size warning issue.

I think that's more for the long run. Memoryviews still behave
differently, as they coerce to memoryview objects instead of to the
original numpy array. So users can't simply adjust their declarations
and expect things to work.

Maybe we could allow users to install a hook for object coercion (e.g.
cython.view.set_object_coercion_hook(numpy.asarray))? The only problem
with that is a potentially large additional overhead, as re-acquiring a
memoryview would have to go through the buffer interface and would
have to re-parse the format string. Although that is currently the
same situation for memoryviews, it would be an easy hack to optimize
that and just compare the type pointers. I'm not yet sure how to make
that work across modules, however. Maybe if pointers don't match it
could compare all the fields of the struct, which would probably still
be cheaper than parsing buffer format strings and obtaining a buffer
view.
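
To make the hook idea concrete, here is a rough pure-Python sketch.
Everything in it is hypothetical: cython.view has no
set_object_coercion_hook today, TypeInfo is a made-up stand-in for the
struct describing a buffer's element type, and list is used in place of
numpy.asarray so the snippet runs without NumPy. It shows a hook registry
plus the proposed fast path: compare the type pointers first, and only
compare the struct fields when the pointers differ (e.g. across modules).

```python
from dataclasses import dataclass

# Hypothetical module-level hook; None means "return the memoryview itself".
_object_coercion_hook = None


def set_object_coercion_hook(hook):
    """Register a callable applied whenever a view is coerced to an object."""
    global _object_coercion_hook
    _object_coercion_hook = hook


def coerce_to_object(view):
    """Coerce a memoryview to an object, going through the hook if one is set."""
    if _object_coercion_hook is None:
        return view
    return _object_coercion_hook(view)


@dataclass(frozen=True)
class TypeInfo:
    """Made-up stand-in for the struct describing a buffer's element type."""
    name: str
    itemsize: int
    kind: str


def same_type(a, b):
    """Fast path on identity, then fall back to comparing every field."""
    if a is b:      # the "type pointers match" case
        return True
    return a == b   # dataclass equality compares all fields
```

With the hook unset, coerce_to_object(view) hands back the view unchanged;
after set_object_coercion_hook(list), the same call returns whatever the
hook produces, which is where numpy.asarray would go in real use.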

> Dag Sverre
>
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel

