[Numpy-discussion] Bug in numpy user-defined types mechanism causes segfault on multiple calls to a ufunc

Tom Denniston tom.denniston at alum.dartmouth.org
Wed Feb 7 18:16:07 EST 2007


I am trying to register a custom type with numpy.  Registration works
and the ufuncs work, but when I invoke any ufunc a second time my
Python interpreter segfaults.  I think I know what the problem is.  In
the select_types function in ufuncobject.c (in numpy/core/src/), numpy
creates the key for the loop lookup via a call to PyInt_FromLong:
key = PyInt_FromLong((long) userdef);
Then it gets the actual loop via a call to PyDict_GetItem:
obj = PyDict_GetItem(self->userloops, key);
It later proceeds to do a decref on key:
Py_DECREF(key);
and later a decref on obj:
Py_DECREF(obj);

None of this code runs unless the operation involves a user-defined
type, because it all sits inside the block guarded by
if (userdef > 0).

The Py_DECREF on key is correct because PyInt_FromLong returns a new
reference, per the Python C API docs:

PyObject* PyInt_FromLong( long ival)

Return value: New reference.
Create a new integer object with a value of ival.

However, the Py_DECREF on obj looks incorrect to me because
PyDict_GetItem returns a borrowed reference and the numpy code never
increments the reference count:

PyObject* PyDict_GetItem( PyObject *p, PyObject *key)

Return value: Borrowed reference.
Return the object from dictionary p which has a key key. Return NULL
if the key key is not present, but without setting an exception.


So what seems to happen is that the last reference gets decremented,
the reference count drops to zero, and obj (the ufunc loop) is freed
the first time through.  The second time through, the ufunc loop
points at freed memory and the interpreter segfaults.
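
To make the failure mode concrete, here is a small toy model in plain
C.  It does not use the real Python C API: ToyObject, toy_decref,
toy_dict_get_borrowed, buggy_lookup and freed_after_first_call are all
made-up names for illustration, and "freed" is only a flag rather than
an actual call to free().  It is a sketch of the pattern described
above, not the numpy code itself.

```c
#include <assert.h>

/* Toy stand-in for a reference-counted object; NOT CPython's PyObject. */
typedef struct {
    int refcount;   /* number of owners */
    int freed;      /* set to 1 where a real runtime would release memory */
} ToyObject;

/* Drop one reference; mark the object freed when the count reaches zero. */
void toy_decref(ToyObject *o) {
    if (--o->refcount == 0) {
        o->freed = 1;   /* a real runtime would deallocate here */
    }
}

/* Models a borrowed-reference lookup: the refcount is left unchanged. */
ToyObject *toy_dict_get_borrowed(ToyObject *stored) {
    return stored;
}

/* The reported pattern: borrow the loop object, then decref it anyway. */
void buggy_lookup(ToyObject *stored) {
    ToyObject *obj = toy_dict_get_borrowed(stored);
    toy_decref(obj);    /* bug: drops a reference this code never owned */
}

/* Returns 1 if a single buggy lookup already "frees" the stored object. */
int freed_after_first_call(void) {
    ToyObject loop = {1, 0};    /* the dict holds the only reference */
    buggy_lookup(&loop);
    return loop.freed;
}
```

In this model the dictionary still holds a pointer to the object after
the first lookup, but the object has already been marked freed, so a
second lookup would hand back a dangling pointer.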

If I comment out the Py_DECREF on obj it works.  I think one needs to
either do that or add a Py_INCREF on obj right after retrieving it.  I
think either will work, but the multithreading implications are
different.  I don't think it matters given that (I believe) numpy
doesn't release the GIL, but I thought someone on this list would be a
better judge than I (maybe Travis) of what the correct fix should be.
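
The two candidate fixes can be sketched with the same kind of toy
model in plain C.  Again, none of these names come from numpy or the
Python C API (RefObj, ref_incref, ref_decref, lookup_borrowed,
fix_no_decref, fix_incref_then_decref and refcount_after_calls are
hypothetical); in the real code the corresponding calls would be
Py_INCREF and Py_DECREF on obj.

```c
#include <assert.h>

/* Toy reference-counted object; NOT CPython's PyObject. */
typedef struct {
    int refcount;
} RefObj;

void ref_incref(RefObj *o) { o->refcount++; }
void ref_decref(RefObj *o) { o->refcount--; }

/* Models a borrowed-reference lookup: the refcount is left unchanged. */
RefObj *lookup_borrowed(RefObj *stored) { return stored; }

/* Fix A: treat the result as borrowed and never decref it. */
void fix_no_decref(RefObj *stored) {
    RefObj *obj = lookup_borrowed(stored);
    (void)obj;          /* ... use obj here ... */
}

/* Fix B: take our own reference right after the lookup, release it after. */
void fix_incref_then_decref(RefObj *stored) {
    RefObj *obj = lookup_borrowed(stored);
    ref_incref(obj);    /* now we own one reference */
    /* ... use obj here ... */
    ref_decref(obj);    /* give back exactly the reference we took */
}

/* Run a fix n times and report the stored object's final refcount. */
int refcount_after_calls(void (*fix)(RefObj *), int n) {
    RefObj loop = {1};  /* the dict holds the only reference */
    for (int i = 0; i < n; i++) {
        fix(&loop);
    }
    return loop.refcount;
}
```

Both variants leave the count balanced no matter how many times the
lookup runs.  The difference is that fix B holds its own reference
while obj is in use, so the object would stay alive even if the
dictionary entry were removed in the meantime.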

Am I correct in my analysis?

In either case it should be a one or two line fix.

--Tom


