[Cython] Wrong order of __Pyx_DECREF when calling a function with an implicit str → char* conversion.

Stefan Behnel stefan_ml at behnel.de
Fri May 30 15:01:18 CEST 2014


Emmanuel Gil Peyrot, 28.05.2014 13:25:
> I was testing my cython codebase on top of pypy/cpyext, and I found a
> memory corruption.  After investigating it with the pypy guys (thanks
> arigato!), we identified it as a cython bug:
> 
>  cdef void test(char *string):
>      print(string)
> 
>  def run(array):
>      test(array[0])
> 
> 
> This code works fine under cpython, but looking at the generated code
> for the last line, the call to test is done with a pointer (__pyx_t_2)
> to some potentially deallocated string (__pyx_t_1), which isn’t freed
> just yet on cpython but is on pypy:
> 
>  __pyx_t_1 = __Pyx_GetItemInt(__pyx_v_array, 0, …);
>  __Pyx_GOTREF(__pyx_t_1);
>  __pyx_t_2 = __Pyx_PyObject_AsString(__pyx_t_1);
>  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
>  __pyx_f_1a_test(__pyx_t_2);
> 
> 
> The obvious solution is to swap the last two lines, it then works fine
> on pypy (although not necessarily if the test function stores the
> pointer somewhere, but it’s not cython’s fault then).
> 
> This issue can also happen with an explicit cast:
> 
>  pointer = <char *>array[0]
>  test(pointer)
> 
> 
> I’m not sure if it should be addressed the same way, because that would
> mean keeping a reference to array[0] for all the lifetime of the
> current scope, but it could still prevent obscure bugs like the memory
> corruption I had.

Neither of these two examples should compile. Even in CPython, indexing can
not only return a safe reference but a new object, e.g.

    class Indy(object):
        def __getitem__(self, i): return "abc%s" % i

    run(Indy())

This should crash also in CPython, or at least show unpredictable results.

Cython has a mechanism to reject this kind of code, not sure why it
wouldn't strike here.

Stefan



More information about the cython-devel mailing list