[Cython] `cdef inline` and typed memory views

Stefan Behnel stefan_ml at behnel.de
Mon Apr 23 08:24:32 CEST 2012


mark florisson, 22.04.2012 22:20:
> On 21 April 2012 20:17, Dimitri Tcaciuc wrote:
>> Say I want to factor out the inner part of
>> some N^2 loops over a float array, I write something like
>>
>>  cdef inline float _inner(size_t i, size_t j, float[:] x):
>>     cdef float d = x[i] - x[j]
>>     return sqrtf(d * d)
>>
>> In 0.16, this actually compiles (as opposed to 0.15 with ndarray) and
>> function is declared as inline, which is great. However, the
>> memoryview structure is passed by value:
>>
>>  static CYTHON_INLINE float __pyx_f_3foo__inner(size_t __pyx_v_i,
>> size_t __pyx_v_j, __Pyx_memviewslice __pyx_v_x) {
>>     ...
>>
>> This seems to hinder compiler's (in my case, GCC 4.3.4) ability to
>> perform efficient inlining (although function does in fact get
>> inlined). If I manually inline that distance calculation, I get 3x
>> speedup. (in my case 0.324020147324 vs 1.43209195137 seconds for 10k
>> elements). When I manually modified generated .c file to pass memory
>> view slice by pointer, slowdown was eliminated completely.
> 
> Although it is neither documented nor tested, it works if you just
> take the address of the memoryview. You can then index it using
> memoryview_pointer[0][i].

Are you advertising this as an actual feature here? I'm just asking because
supporting hacks can be nasty in the long run. What if we ever want to make
a change to the internal way memoryviews work that would break this?
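
To make the by-value versus by-pointer difference in the generated code
concrete, here is a minimal C sketch. It is not Cython's actual output:
`slice_t`, `make_slice`, `inner_by_value` and `inner_by_pointer` are
hypothetical names, and `slice_t` is only a stand-in for
`__Pyx_memviewslice`, whose real layout is an internal detail. The absolute
value is written out by hand, which gives the same result as sqrtf(d * d)
without needing libm.

```c
#include <stddef.h>

#define MAX_DIMS 8  /* assumed dimension limit, for illustration only */

/* Hypothetical stand-in for __Pyx_memviewslice: a data pointer plus
   per-dimension metadata, which makes the struct fairly large. */
typedef struct {
    char *data;
    ptrdiff_t shape[MAX_DIMS];
    ptrdiff_t strides[MAX_DIMS];
    ptrdiff_t suboffsets[MAX_DIMS];
} slice_t;

/* Build a 1-D slice over a contiguous float buffer. */
static inline slice_t make_slice(float *buf, ptrdiff_t n)
{
    slice_t s = {0};
    s.data = (char *)buf;
    s.shape[0] = n;
    s.strides[0] = (ptrdiff_t)sizeof(float);
    return s;
}

/* By value: the whole struct is a call argument, so it may be copied
   on every call unless the optimizer removes the copy. */
static inline float inner_by_value(size_t i, size_t j, slice_t x)
{
    float xi = *(float *)(x.data + (ptrdiff_t)i * x.strides[0]);
    float xj = *(float *)(x.data + (ptrdiff_t)j * x.strides[0]);
    float d = xi - xj;
    return d < 0 ? -d : d;  /* same value as sqrtf(d * d) */
}

/* By pointer: only one address is passed, and the fields are read
   through it. */
static inline float inner_by_pointer(size_t i, size_t j, const slice_t *x)
{
    float xi = *(const float *)(x->data + (ptrdiff_t)i * x->strides[0]);
    float xj = *(const float *)(x->data + (ptrdiff_t)j * x->strides[0]);
    float d = xi - xj;
    return d < 0 ? -d : d;
}
```

Both functions compute the same result. Whether the by-value copy survives
inlining depends on the compiler version and optimization flags, so the
timings quoted above should be re-measured rather than assumed.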

Stefan

