[Cython] Mitigating performance impact of NumPy API change

Matti Picus matti.picus at gmail.com
Fri Sep 28 04:25:50 EDT 2018


On 28/09/18 10:25, Matti Picus wrote:
> Breaking this into a number of sub-discussions, since we seem to be 
> branching. The original topic was
>
> Re: [Cython] Enhancing "ctypedef class numpy.ndarray" with getter 
> properties
>
> On 28/09/18 01:20, Robert Bradshaw wrote:
>>
>> Hmm...so in this case upgrading Cython would cause an 
>> unconditional switch from direct access to a function call without 
>> any code change (or choice) for users of numpy.pxd. I am curious what 
>> kind of a slowdown this would represent (though would assume this 
>> kind of analysis was done by the NumPy folks when choosing macro vs. 
>> function for the public API).
>>
>>     As I point out in the "experiment" comment referenced above,
>>     pandas has code that needs lvalue access to ndarray data, so
>>     they would be stuck with the old API, which is deprecated but
>>     still works for now. Scipy has no such code and could move
>>     forward to the newer API.
>>
>>
>> But if we upgraded Cython, how would they access the old API? I 
>> suppose they could create a setter macro of their own to use in the 
>> (presumably few) cases where they needed an lvalue.
>>
>> - Robert
>>
>>
>
> NumPy changed its recommended API to an opaque one via inline getter 
> functions in 2011, in this PR https://github.com/numpy/numpy/pull/116. 
> I could not find a discussion on performance impact, perhaps since the 
> functions are in the header files and marked inline. Hopefully the 
> compilers will properly deal with making them fast. However, it is 
> true that when people update to a new version of a library things 
> change. In this case, there are backward-compatibility macros that 
> revert the post-1.7 functions into pre-1.7 macros with the same name.
>
> Thus for the experiment I used a new numpy.pxd, defined the pre-1.7 
> api in the pandas build (experimental changeset 
> https://github.com/mattip/pandas/commit/9113bf7e55e1eddece3544c1ad3ef2a761b5210a), 
> and was still able to access ndarray.data as an lvalue.
>
> Matti

This means Cython and NumPy could provide an integration path based on 
NumPy starting to ship its own numpy.pxd:

- Cython would define the macro (if not already defined) to use the 
pre-1.7 NumPy API in the numpy.pxd it ships. This would still work 
(lvalues would be allowed) after direct access is replaced with the 
getter properties, since under the pre-1.7 API the getters are macros.

- NumPy would define the macro to use the post-1.7 API (if not already 
defined) in the numpy.pxd it ships, which as I understand it would take 
precedence over Cython's. Then projects like pandas could freely upgrade 
Cython without changing their codebase, but would encounter errors when 
updating NumPy.
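
As a rough illustration of the "define the macro if not already defined" 
idea and of why lvalue access only survives with the macro form, here is a 
minimal, self-contained C sketch. Everything in it (demo_array, 
DEMO_USE_OPAQUE_API, DEMO_DATA and the helper functions) is a hypothetical 
stand-in invented for this example; it does not use NumPy's headers or 
reproduce NumPy's real struct layout or macro names.

```c
#include <assert.h>

/* Hypothetical stand-in for an array struct; NOT NumPy's real layout. */
typedef struct {
    char *data;
    int   nd;
} demo_array;

/* Hypothetical selector macro (an assumption for this sketch). A shipped
 * header sets a default only if the project has not already chosen. */
#ifndef DEMO_USE_OPAQUE_API
#define DEMO_USE_OPAQUE_API 0   /* default here: old, macro-based access */
#endif

#if DEMO_USE_OPAQUE_API
/* New style: the inline getter returns a copy of the pointer, so
 * "DEMO_DATA(arr) = p;" would not compile. */
static inline char *DEMO_DATA(const demo_array *arr) { return arr->data; }
#else
/* Old style: the macro expands to the field itself, which is an lvalue. */
#define DEMO_DATA(arr) ((arr)->data)
#endif

/* Read-only access compiles under either setting. */
static char *demo_read(const demo_array *arr) { return DEMO_DATA(arr); }

#if !DEMO_USE_OPAQUE_API
/* Assigning through the accessor only compiles with the macro form. */
static void demo_repoint(demo_array *arr, char *buf) { DEMO_DATA(arr) = buf; }

/* Small usage example for the macro-based default. */
static void demo_usage(void)
{
    char first[] = "abc";
    char second[] = "xyz";
    demo_array arr = { first, 1 };

    assert(demo_read(&arr) == first);
    demo_repoint(&arr, second);      /* lvalue use of DEMO_DATA */
    assert(demo_read(&arr) == second);
}
#endif
```

In this sketch a project that needs lvalue access keeps the default, while 
a project that wants the opaque API defines DEMO_USE_OPAQUE_API to 1 before 
the header is included; whichever definition is seen first wins, which is 
the precedence question raised above.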

Matti

