[Cython] Wrong order of __Pyx_DECREF when calling a function with an implicit str → char* conversion.
Stefan Behnel
stefan_ml at behnel.de
Sat May 31 14:39:33 CEST 2014
Stefan Behnel, 30.05.2014 15:01:
> Emmanuel Gil Peyrot, 28.05.2014 13:25:
>> I was testing my cython codebase on top of pypy/cpyext, and I found a
>> memory corruption. After investigating it with the pypy guys (thanks
>> arigato!), we identified it as a cython bug:
>>
>> cdef void test(char *string):
>> print(string)
>>
>> def run(array):
>> test(array[0])
>>
>>
>> This code works fine under cpython, but looking at the generated code
>> for the last line, the call to test is done with a pointer (__pyx_t_2)
>> to some potentially deallocated string (__pyx_t_1), which isn’t freed
>> just yet on cpython but is on pypy:
>>
>> __pyx_t_1 = __Pyx_GetItemInt(__pyx_v_array, 0, …);
>> __Pyx_GOTREF(__pyx_t_1);
>> __pyx_t_2 = __Pyx_PyObject_AsString(__pyx_t_1);
>> __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
>> __pyx_f_1a_test(__pyx_t_2);
>>
>>
>> The obvious solution is to swap the last two lines, it then works fine
>> on pypy (although not necessarily if the test function stores the
>> pointer somewhere, but it’s not cython’s fault then).
>>
>> This issue can also happen with an explicit cast:
>>
>> pointer = <char *>array[0]
>> test(pointer)
>>
>>
>> I’m not sure if it should be addressed the same way, because that would
>> mean keeping a reference to array[0] for all the lifetime of the
>> current scope, but it could still prevent obscure bugs like the memory
>> corruption I had.
>
> Neither of these two examples should compile. Even in CPython, indexing can
> not only return a safe reference but a new object, e.g.
>
> class Indy(object):
> def __getitem__(self, i): return "abc%s" % i
>
> run(Indy())
>
> This should crash also in CPython, or at least show unpredictable results.
>
> Cython has a mechanism to reject this kind of code, not sure why it
> wouldn't strike here.
Here's a patch. However, while writing it up, I noticed that there are many
cases in CPython where these things are actually safe and nice, e.g.
cdef char* charptr
for charptr in list_of_byte_strings:
...
CPython would allow us to avoid some reference counting here, for example
(although I don't think we currently make use of it). And as long as the
list doesn't change (which is the user's concern, not Cython's), this is
also perfectly safe. Thus, we'd loose a nice feature, which speaks against
changing the current behaviour.
It's clear that at least emitting a warning in these cases would be
helpful, though. An option like -Wcpython could enable special warnings on
code that only works on CPython and not on other C-API implementations,
such as PyPy.
Stefan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: reject_charptr_of_indexed_object.patch
Type: text/x-patch
Size: 1938 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20140531/6cc6c522/attachment.bin>
More information about the cython-devel
mailing list