C API PyObject_Call segfaults with string

Jen Kris jenkris at tutanota.com
Thu Feb 10 15:04:05 EST 2022


Hi and thanks very much for your comments on reference counting.  Since I'm new to the C API, that will help a lot.  I know that reference counting is one of the difficult issues with the C API.

I just posted a reply to Inada Naoki showing how I solved the problem I posted yesterday.  

Thanks much for your help.

Jen


Feb 9, 2022, 18:43 by python at mrabarnett.plus.com:

> On 2022-02-10 01:37, Jen Kris via Python-list wrote:
>
>> I'm using Python 3.8 so I tried your second choice:
>>
>> pSents = PyObject_CallFunctionObjArgs(pSentMod, pListItem);
>>
>> but pSents is 0x0.  pSentMod and pListItem are valid pointers.
>>
> 'PyObject_CallFunction' looks like a good one to use:
>
> """PyObject* PyObject_CallFunction(PyObject *callable, const char *format, ...)
>
> Call a callable Python object callable, with a variable number of C arguments. The C arguments are described using a Py_BuildValue() style format string. The format can be NULL, indicating that no arguments are provided.
> """
>
> [snip]
>
> What I do is add comments to keep track of what objects I have references to at each point and whether they are new references or could be NULL.
>
> For example:
>
>  pName = PyUnicode_FromString("nltk.corpus");
>  //> pName+?
>
> This means that 'pName' contains a reference, '+' means that it's a new reference, and '?' means that it could be NULL (usually due to an exception, but not always) so I need to check it.
>
> Continuing in this vein:
>
>  pModule = PyImport_Import(pName);
>  //> pName+? pModule+?
>
>  pSubMod = PyObject_GetAttrString(pModule, "gutenberg");
>  //> pName+? pModule+? pSubMod+?
>  pFidMod = PyObject_GetAttrString(pSubMod, "fileids");
>  //> pName+? pModule+? pSubMod+? pFidMod+?
>  pSentMod = PyObject_GetAttrString(pSubMod, "sents");
>  //> pName+? pModule+? pSubMod+? pFidMod+? pSentMod+?
>
>  pFileIds = PyObject_CallObject(pFidMod, 0);
>  //> pName+? pModule+? pSubMod+? pFidMod+? pSentMod+? pFileIds+?
>  pListItem = PyList_GetItem(pFileIds, listIndex);
>  //> pName+? pModule+? pSubMod+? pFidMod+? pSentMod+? pFileIds+? pListItem?
>  pListStrE = PyUnicode_AsEncodedString(pListItem, "UTF-8", "strict");
>  //> pName+? pModule+? pSubMod+? pFidMod+? pSentMod+? pFileIds+? pListItem? pListStrE+?
>
> As you can see, there are a lot of leaked references building up.
>
> Note how after:
>
>  pListItem = PyList_GetItem(pFileIds, listIndex);
>
> the addition is:
>
>  //> pListItem?
>
> This means that 'pListItem' contains a borrowed (not new) reference, but could be NULL.
>
> I find it easiest to DECREF as soon as I no longer need the reference, and to remove a name from the list as soon as I no longer need it (having DECREFed it where necessary).
>
> For example:
>
>  pName = PyUnicode_FromString("nltk.corpus");
>  //> pName+?
>  if (!pName)
>      goto error;
>  //> pName+
>  pModule = PyImport_Import(pName);
>  //> pName+ pModule+?
>  Py_DECREF(pName);
>  //> pModule+?
>  if (!pModule)
>      goto error;
>  //> pModule+
>
> I find that doing this greatly reduces the chances of getting the reference counting wrong, and I can remove the comments once I've finished the function I'm writing.


