Returning a str to a callback raises a segfault if used in string formatting

Paul Moore p.f.moore at gmail.com
Fri Oct 13 08:22:07 EDT 2017


On 13 October 2017 at 12:18, Vincent Vande Vyvre
<vincent.vande.vyvre at telenet.be> wrote:
> On 13/10/17 at 12:39, Paul Moore wrote:
>>
>> As a specific suggestion, I assume the name of the created file is a
>> string object constructed in the C extension code, somehow. The fact
>> that you're getting the segfault with some uses of that string
>> (specifically, passing it to %-formatting) suggests that there's a bug
>> in the C code that constructs that string. That's where I'd start by
>> looking. Maybe something isn't zero-terminated that should be? Maybe
>> your code doesn't set up the character encoding information correctly?
>>
>> Paul
>
> That was my first idea, because I can verify that the PyUnraw instance is not
> destroyed when I use the file name, but what puzzled me was the use of the
> file name in string formatting.
>
> For example, I can use the file name in the slot, e.g. shutil.copy(fname,
> "path/renamed.tiff")
> The file is copied correctly.
>
> In fact, I can do anything with the file name except use it in string
> formatting, so your approach is probably a good one.
>
> In the CPython part I have a C string converted to a Python str by:
>     temp = self->outfname;
>     self->outfname = PyUnicode_FromString(ofname);
>     Py_XDECREF(temp);
>
> and exposed to Python with:
> static PyMemberDef PyUnraw_members[] = {
>     {"out_file", T_OBJECT_EX, offsetof(PyUnraw, outfname), 0,
>      "Path of the decoded file"},

OK. I presume ofname is UTF-8 encoded as required by
PyUnicode_FromString, and you're not accidentally getting a different
encoding?
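
One way to rule the encoding question out is to check the bytes before they
reach PyUnicode_FromString. As far as I know, PyUnicode_FromString decodes
strictly and returns NULL with an exception set on invalid UTF-8, so checking
its return value is the first step. The sketch below is a standalone check in
plain C that could be called on ofname while debugging. The function name
is_valid_utf8 is hypothetical and not part of the C API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical debugging helper, not part of the CPython API.
 * Returns true if the NUL-terminated byte string s is well-formed UTF-8.
 * Rejects overlong encodings, surrogates and code points above U+10FFFF. */
static bool is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p != 0) {
        unsigned int cp;   /* code point decoded so far */
        unsigned int min;  /* smallest code point allowed for this length */
        int extra;         /* continuation bytes still expected */

        if (*p < 0x80) {                    /* plain ASCII byte */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {   /* 2-byte sequence */
            cp = *p & 0x1F; extra = 1; min = 0x80;
        } else if ((*p & 0xF0) == 0xE0) {   /* 3-byte sequence */
            cp = *p & 0x0F; extra = 2; min = 0x800;
        } else if ((*p & 0xF8) == 0xF0) {   /* 4-byte sequence */
            cp = *p & 0x07; extra = 3; min = 0x10000;
        } else {
            return false;  /* stray continuation byte or invalid lead byte */
        }
        p++;

        while (extra-- > 0) {
            /* A NUL here also fails the test, so truncated input is caught. */
            if ((*p & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (*p & 0x3F);
            p++;
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
    }
    return true;
}
```

A file name such as "caf\xC3\xA9.tiff" (UTF-8) passes, while the Latin-1
spelling "caf\xE9.tiff" is rejected. If the name comes from the operating
system rather than from your own code, PyUnicode_DecodeFSDefault may be a
better fit than PyUnicode_FromString.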

I don't see any obvious issue - but it's still far more likely it's an
issue in the C code somewhere. Maybe a refcounting bug - those often
trigger peculiar errors as memory gets freed too soon, and something
then references freed memory.
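
To make the kind of refcounting bug I mean concrete, here is a toy model in
plain C. It is not the CPython API and not your code: ToyObject, toy_incref,
toy_decref and toy_consume are made-up names, and the "freed" flag stands in
for the memory actually being released. toy_consume models any call that
takes ownership of ("steals") the reference it is given, as PyTuple_SetItem
does in the real API.

```c
#include <stdbool.h>

/* Toy stand-in for a reference-counted object; NOT the CPython API. */
typedef struct {
    int refcnt;
    bool freed;  /* set instead of calling free(), so the state stays inspectable */
} ToyObject;

static void toy_incref(ToyObject *o)
{
    o->refcnt++;
}

static void toy_decref(ToyObject *o)
{
    if (--o->refcnt == 0)
        o->freed = true;  /* real code would release the memory here */
}

/* Models a consumer that owns the reference it receives and drops it. */
static void toy_consume(ToyObject *arg)
{
    toy_decref(arg);
}

/* Buggy: hands over the struct's only reference without taking a new one.
 * Returns true if the member was "freed" while the struct still points at it. */
static bool call_without_incref(ToyObject *member)
{
    toy_consume(member);
    return member->freed;
}

/* Correct: take a new reference for the consumer before handing it over. */
static bool call_with_incref(ToyObject *member)
{
    toy_incref(member);
    toy_consume(member);
    return member->freed;
}
```

Starting from a count of 1, the first variant leaves the member at 0 and
marked freed while the struct still holds a pointer to it; the second leaves
it at 1. In a real extension the freed memory may stay intact for a while,
which is why some uses of the object appear to work and others crash.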

Paul


