interpreter crashes

Sun Oct 28 12:59:46 EST 2001

[Paul Rubin]
> ...
> I've also been wondering whether it could make sense to modify the gc
> to check the correctness of all ref counts when the gc runs.  However
> I haven't yet looked at the gc code enough to tell whether this
> is practical.

Sorry, not even in theory.  The cyclic gc in Python works by computing the
transitive closure of registered objects (basically objects of cooperative
container types -- objects which *can* be in cycles, and which choose to
participate in gc by registering themselves), then trashing those whose
refcounts are entirely accounted for by intra-closure pointers.  This isn't
like traditional mark-&-sweep, which computes the transitive closure of a
root set, then concludes that anything allocated not in the closure must be
trash:  Python doesn't assume it controls memory allocation, and has no idea
what the root set may be.  For example, if an extension module hostile to gc
allocates its own memory and stuffs pointers to Python objects in it, Python
can't find that memory, or chase those pointers.  But the refcounts *due* to
those pointers won't be accounted for by the transitive closure of the
objects Python *does* know about, so as far as cyclic gc is concerned those
objects are not trash (it can't find the pointers keeping them alive, but
*deduces* they must exist, from the refcount evidence).
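
To make the subtraction trick concrete, here is a toy model in pure Python.
It is only a sketch: the name find_cyclic_trash and the dict-based
representation of objects are made up for illustration, and the real
collector is C code walking the interpreter's own list of registered
containers.

```python
def find_cyclic_trash(refcounts, edges):
    """Toy model of the cyclic gc's reasoning (hypothetical helper).

    refcounts: maps each registered container's name to its true refcount.
    edges:     maps a container's name to the registered containers it
               points at.
    Returns the set of names the collector would treat as trash.
    """
    # Start from a copy of the real refcounts.
    gc_refs = dict(refcounts)

    # Subtract every pointer that originates inside the registered set.
    for src, targets in edges.items():
        for dst in targets:
            gc_refs[dst] -= 1

    # Anything with a count left over is held by a pointer we can't see
    # (a C stack variable, an extension's private memory, ...).
    alive = {name for name, count in gc_refs.items() if count > 0}

    # Whatever is reachable from a live object is live too.
    stack = list(alive)
    while stack:
        src = stack.pop()
        for dst in edges.get(src, ()):
            if dst not in alive:
                alive.add(dst)
                stack.append(dst)

    return set(refcounts) - alive


# a <-> b is an isolated cycle.  c <-> d is also a cycle, but c has one
# extra reference from somewhere outside the registered objects.
refcounts = {"a": 1, "b": 1, "c": 2, "d": 1}
edges = {"a": ["b"], "b": ["a"], "c": ["d"], "d": ["c"]}
print(sorted(find_cyclic_trash(refcounts, edges)))   # ['a', 'b']
```

Note that the model never asks where c's extra reference comes from; the
leftover count alone is enough to keep c, and therefore d, alive.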

In short, Python's cyclic gc relies on correct refcounts for correct
operation; it has no way to verify them.
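
You can watch this happen in a live interpreter.  The sketch below is
CPython-specific and assumes ctypes is available; it uses Py_IncRef and
Py_DecRef to play the part of an extension module holding a pointer Python
can't see.  Node, make_cycle and demo are hypothetical names invented for
the example.

```python
import ctypes
import gc
import weakref

# Thin wrappers around the C API's function forms of Py_INCREF/Py_DECREF.
incref = ctypes.pythonapi.Py_IncRef
incref.argtypes = [ctypes.py_object]
incref.restype = None
decref = ctypes.pythonapi.Py_DecRef
decref.argtypes = [ctypes.py_object]
decref.restype = None


class Node(object):
    pass


def make_cycle():
    # Two instances pointing at each other: a garbage cycle once dropped.
    a, b = Node(), Node()
    a.other, b.other = b, a
    return a


def demo():
    a = make_cycle()
    probe = weakref.ref(a)      # lets us observe whether a still exists

    incref(a)                   # stand-in for a pointer stashed by C code
    del a
    gc.collect()
    survived = probe() is not None   # gc can't account for the extra count

    node = probe()
    decref(node)                # the hidden "pointer" goes away
    del node
    gc.collect()
    freed = probe() is None     # now every reference is intra-cycle

    return survived, freed


print(demo())   # (True, True)
```

The collector never finds the extra reference; it just sees a count it
can't explain and leaves the cycle alone.  If that count were simply wrong,
gc would have no way to tell the difference.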

Building Python in debug mode helps in tracking down suspected refcount
bugs (e.g., in 2.2 a decref that drives a refcount below 0 dumps a message
to stderr in a debug build at the moment it goes bad).  Note that if a
refcount is too high, the symptom is usually memory leakage, not a crash.