C API PyObject_Call segfaults with string

Jen Kris jenkris at tutanota.com
Wed Feb 9 20:15:20 EST 2022


Right you are. In that case, should I use Py_BuildValue and convert the result to a tuple (because it won't return a tuple for a single argument), or should I just convert pListStr to a tuple? Thanks for your help.


Feb 9, 2022, 17:08 by songofacandy at gmail.com:

> On Thu, Feb 10, 2022 at 10:05 AM Jen Kris <jenkris at tutanota.com> wrote:
>
>>
>> Thanks for your reply.
>>
>> I eliminated the DECREF and now it doesn't segfault, but it returns 0x0. The same happens when I substitute pListStrE for pListStr. pListStr contains the string representation of the fileid, so it seemed like the one to use. According to http://web.mit.edu/people/amliu/vrut/python/ext/buildValue.html, Py_BuildValue "builds a tuple only if its format string contains two or more format units", and that doc contains examples.
>>
>
> Yes, and PyObject_Call accepts a tuple, not a str.
>
>
> https://docs.python.org/3/c-api/call.html#c.PyObject_Call
>
>>
>> Feb 9, 2022, 16:52 by songofacandy at gmail.com:
>>
>> On Thu, Feb 10, 2022 at 9:42 AM Jen Kris via Python-list
>> <python-list at python.org> wrote:
>>
>>
>> I have everything finished down to the last line (sentences = gutenberg.sents(fileid)) where I use PyObject_Call to call gutenberg.sents, but it segfaults. The fileid is a string -- the first fileid in this corpus is "austen-emma.txt".
>>
>> pName = PyUnicode_FromString("nltk.corpus");
>> pModule = PyImport_Import(pName);
>>
>> pSubMod = PyObject_GetAttrString(pModule, "gutenberg");
>> pFidMod = PyObject_GetAttrString(pSubMod, "fileids");
>> pSentMod = PyObject_GetAttrString(pSubMod, "sents");
>>
>> pFileIds = PyObject_CallObject(pFidMod, 0);
>> pListItem = PyList_GetItem(pFileIds, listIndex);
>> pListStrE = PyUnicode_AsEncodedString(pListItem, "UTF-8", "strict");
>> pListStr = PyBytes_AS_STRING(pListStrE);
>> Py_DECREF(pListStrE);
>>
>>
>> HERE.
>> PyBytes_AS_STRING() returns a pointer into the pListStrE object's internal buffer.
>> So Py_DECREF(pListStrE) makes pListStr a dangling pointer.
>>
>>
>> // sentences = gutenberg.sents(fileid)
>> PyObject *c_args = Py_BuildValue("s", pListStr);
>>
>>
>> Why do you encode and decode pListStrE?
>> Why don't you just use pListStrE?
>>
>> PyObject *NullPtr = 0;
>> pSents = PyObject_Call(pSentMod, c_args, NullPtr);
>>
>>
>> c_args must be a tuple, but you passed a unicode object here.
>> Read https://docs.python.org/3/c-api/arg.html#c.Py_BuildValue
>>
>> The final line segfaults:
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x00007ffff6e4e8d5 in _PyEval_EvalCodeWithName ()
>> from /usr/lib/x86_64-linux-gnu/libpython3.8.so.1.0
>>
>> My guess is the problem is in Py_BuildValue, which returns a pointer but it may not be constructed correctly. I also tried it with "O" and it doesn't segfault but it returns 0x0.
>>
>> I'm new to using the C API. Thanks for any help.
>>
>> Jen
>>
>>
>> --
>> https://mail.python.org/mailman/listinfo/python-list
>>
>>
>> Bests,
>>
>> --
>> Inada Naoki <songofacandy at gmail.com>
>>
>
>
> -- 
> Inada Naoki  <songofacandy at gmail.com>
>


