gc assertion failure

Todd Miller jmiller at stsci.edu
Wed Oct 29 17:27:50 EST 2003


Todd Miller wrote:
> Tim Peters wrote:
> 
>> [Todd Miller]
>>
>>> I recently discovered an assertion failure in the Python garbage
>>> collection system when scripts using our C extension (numarray) exit.
>>> The assertion is activated for Pythons configured using
>>>  --with-pydebug. I have a feeling I may be doing something wrong
>>> with garbage collection support for some of our C types, but I'm not
>>> sure exactly what.
>>>
>>> Here is the assertion output:
>>>
>>> python: Modules/gcmodule.c:231: visit_decref: Assertion
>>> `gc->gc.gc_refs != 0' failed.
>>> Abort (core dumped)
>>
>>
>>
>> Looking at the source code should clarify:
>>
>>     assert(gc->gc.gc_refs != 0); /* else refcount was too small */
>>
>> That is, gc found more pointers to an object than that object's refcount
>> believes exist.  A missing Py_INCREF or an extra Py_DECREF are plausible
>> causes; so is a bad tp_traverse function that passes a single containee
>> multiple times (although I've only seen that once in real life).  A missing
>> Py_INCREF is (IME) the most common cause for this assertion.
>>
>>
>>> ...
>>> #5  0x080e9222 in visit_decref (op=0x405adc74, data=0x0) at
>>> Modules/gcmodule.c:231
>>> #6  0x0808cebf in tupletraverse (o=0x40a62f74, visit=0x80e9194
>>> <visit_decref>, arg=0x0) at Objects/tupleobject.c:398
>>
>>
>>
>> So it's complaining about an object that happens to be in a tuple.
>> Displaying more info about op would tell you more about the kind of
>> object it's complaining about.
>>
> 
> Thanks Tim!  It turns out to be one of the objects numarray uses to
> represent data types, Int64.  I also noticed that the problem goes away
> when I switch on the "Python prototype" for some C code, which is
> further evidence that the problem is a ref count error, since the code in
> question just touches type objects; it doesn't implement them.
> 
> I haven't found the bug yet,  but I'm out of wheel lock.  Definitely 
> makes my day...
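
As a rough illustration of the check Tim describes, here is a toy Python model of the collector's reference-subtraction pass. It is only a sketch: the name subtract_refs is borrowed from gcmodule.c, but the dictionaries and the object names ("Int64", "key1", "key2") are hypothetical stand-ins, and the real collector works on tracked C objects rather than names.

```python
def subtract_refs(refcounts, containers):
    """Toy model of the collector's reference-subtraction pass.

    refcounts:  object name -> reference count the object records
    containers: container name -> names it points to, i.e. what its
                tp_traverse would visit
    """
    gc_refs = dict(refcounts)  # start from a copy of every refcount
    for referents in containers.values():
        for name in referents:
            # Mirrors assert(gc->gc.gc_refs != 0) in visit_decref.
            if gc_refs[name] == 0:
                raise AssertionError("refcount of %r was too small" % name)
            gc_refs[name] -= 1
    return gc_refs


# Two tuples point at Int64 and its refcount accounts for both: fine.
ok = subtract_refs({"Int64": 2}, {"key1": ["Int64"], "key2": ["Int64"]})

# Same pointers, but one reference was never counted: the check trips.
try:
    subtract_refs({"Int64": 1}, {"key1": ["Int64"], "key2": ["Int64"]})
    tripped = False
except AssertionError:
    tripped = True
```

In the second call the model finds a second pointer to Int64 after its copied count has already reached zero, which is the situation the assertion reports.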

FWIW,  here's what my bug looked like:

<       key = Py_BuildValue("(NNsNN)", _digest(in1), _digest(out), cumop, thread_id, type
---
>       key = Py_BuildValue("(NNsNO)", _digest(in1), _digest(out), cumop, thread_id, type

Since I used "N" for type in the Py_BuildValue call, the resulting tuple
stole a reference to type that it shouldn't have.  With "O",
Py_BuildValue increments type's reference count itself, so the call is
reference count neutral for type, and the problem was solved.
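
The "N" versus "O" difference can be observed from Python itself. The following is a minimal sketch, assuming a standard CPython build where ctypes.pythonapi is usable and variadic calls through ctypes work on the platform; refcount_delta is a hypothetical helper written for this illustration, not part of numarray.

```python
import ctypes
import sys

# Py_BuildValue returns a new reference, which ctypes.py_object takes over.
build_value = ctypes.pythonapi.Py_BuildValue
build_value.restype = ctypes.py_object

inc_ref = ctypes.pythonapi.Py_IncRef
inc_ref.argtypes = [ctypes.py_object]
inc_ref.restype = None


def refcount_delta(fmt):
    """Return how much building a 1-tuple with `fmt` raises obj's refcount."""
    obj = object()
    before = sys.getrefcount(obj)
    tup = build_value(fmt, ctypes.py_object(obj))
    during = sys.getrefcount(obj)
    if fmt == b"(N)":
        # "N" did not count the tuple's pointer; add the missing reference
        # so that destroying the tuple does not over-decrement obj.
        inc_ref(obj)
    del tup
    return during - before


delta_o = refcount_delta(b"(O)")  # tuple's pointer is counted
delta_n = refcount_delta(b"(N)")  # tuple's pointer is not counted
```

With "N" the tuple ends up pointing at the object without the count going up, so unless the caller owned a reference to hand over, the collector later finds more pointers than the refcount admits.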

Thanks for the help,
Todd




