Debugging a difficult refcount issue.

buck workitharder at gmail.com
Mon Dec 19 03:09:19 EST 2011


This is what I came up with:
https://gist.github.com/1496028

We'll see tomorrow whether it helps.


On Sunday, December 18, 2011 6:01:50 PM UTC-8, buck wrote:
> Thanks Jack. I think printf is what it will come down to. I plan to put a little code into PyDict_New to print the id and the line at which it was allocated. Hopefully this will show me all the possible suspects and I can figure it out from there.
> 
> I hope figuring out the file and line-number from within that code isn't too hard.
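
On the file-and-line question above: a minimal Python-level sketch of the idea, not the contents of the gist. It assumes CPython, where sys._getframe() is available; the names allocation_site and new_traced_dict are hypothetical. Code patched into PyDict_New would have to read the same information from the current frame object in C (for example via PyEval_GetFrame()), but the fields it needs are the ones shown here.

```python
import sys


def allocation_site(depth=1):
    """Return (filename, line number) of the Python code `depth` frames up.

    depth=1 means the function that called allocation_site().
    """
    frame = sys._getframe(depth)  # CPython-specific frame access
    return frame.f_code.co_filename, frame.f_lineno


def new_traced_dict():
    """Create a dict and log its id together with the caller's file and line."""
    d = {}
    # depth=2 skips new_traced_dict itself and reports its caller.
    filename, lineno = allocation_site(depth=2)
    sys.stderr.write("dict id=%#x allocated at %s:%d\n"
                     % (id(d), filename, lineno))
    return d


d = new_traced_dict()
```

In CPython, id() is the object's memory address, so the logged id can be matched against the address of the bad dict seen in gdb.
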
> 
> 
> On Sunday, December 18, 2011 9:52:46 AM UTC-8, Jack Diederich wrote:
> > I don't have any great advice; that kind of issue is hard to pin down.
> > That said, do try a Python build configured with --with-pydebug.
> > With that you can turn your unit tests on and off to pinpoint where
> > the refcounts are getting messed up.  It also causes python to use
> > plain malloc()s so valgrind becomes useful.  Worst case, add assertions
> > and printf()s in the places you think are most janky.
> > 
> > -Jack
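
A rough illustration of the test-by-test approach described above, as a sketch only. It assumes a debug build of CPython (configured with --with-pydebug), where sys.gettotalrefcount() exists; on a regular build the helper returns None. The names refcount_deltas and leaky are hypothetical.

```python
import gc
import sys


def refcount_deltas(func, repeats=3):
    """Call func repeatedly and record the change in the interpreter-wide
    reference count across each call.

    Returns None when sys.gettotalrefcount is missing (non-debug build).
    """
    get_total = getattr(sys, "gettotalrefcount", None)
    if get_total is None:
        return None
    func()  # warm-up call so one-time caches do not show up as drift
    deltas = []
    for _ in range(repeats):
        gc.collect()
        before = get_total()
        func()
        gc.collect()
        deltas.append(get_total() - before)
    return deltas


_leaked = []


def leaky():
    # Hypothetical stand-in for a unit test that exercises the suspect C code.
    _leaked.append(object())


print(refcount_deltas(leaky))
```

A delta that stays nonzero across repeated runs of the same test is a hint that the code it exercises mishandles references. Small constant offsets can come from the measurement itself, so compare against a test known to be clean.
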
> > 
> > On Sat, Dec 17, 2011 at 11:17 PM, buck <work... at gmail.com> wrote:
> > > I'm getting a fatal python error "Fatal Python error: GC object already tracked"[1].
> > >
> > > Using gdb, I've pinpointed the place where the error is detected. It is an empty dictionary which is marked as in-use. This is somewhat helpful since I can reliably find the memory address of the dict, but it does not help me pinpoint the issue. I was able to find the piece of code that allocates the problematic dict via a malloc/LD_PRELOAD interposer, but that code was pure python. I don't think it was the cause.
> > >
> > > I believe that the dict was deallocated, cached, and re-allocated via PyDict_New to a C routine with bad refcount logic, then the above error manifests when the dict is again deallocated, cached, and re-allocated.
> > >
> > > I tried to pinpoint this intermediate allocation with a similar PyDict_New/LD_PRELOAD interposer, but that isn't working for me[2].
> > >
> > > How should I go about debugging this further? I've been completely stuck on this for two days now :(
> > >
> > > [1] http://hg.python.org/cpython/file/99af4b44e7e4/Include/objimpl.h#l267
> > > [2] http://stackoverflow.com/questions/8549671/cant-intercept-pydict-new-with-ld-preload



