[Python-Dev] Cyclic GC issues

Tim Peters tim.peters at gmail.com
Tue Oct 12 03:21:57 CEST 2004


[Jason Evans]
... 
> The low level tree code is implemented completely separately from the
> Python object wrapper code.  This means that, for example, the Python
> "Node" object does not actually store PyObject* pointers to Ring objects;
> instead it uses the low level tree code to find out what Ring objects are
> attached to it.

I think you're in for a rude awakening if you expect any language's gc
support to help you manage arbitrary blobs of memory obtained from the
system malloc().  If all the nodes in your object graph derived from
PyObject, Python's gc would do it all for you.  If you want to plant
pointers in and among arbitrary C structs, then you'll have to
implement your own gc, or hope that you can bash a general gc thingie
(like BDW -- and I'm not sure there is anything else of that sort
usefully available, so "like" is probably optimistic) into guessing
the layout of your structs.

> Crux was designed this way in order to be able to implement various low level
> algorithms such that they never have to call interpreter-related code.

I don't know why that was thought to be desirable, or even exactly
what it means.  One plausible meaning is that you don't want to tie
the "various low level algorithms" to Python, because you hope to
reuse them in other, non-Python contexts.  And, in addition to that,
you don't want, or don't know how, to abstract the necessary
operations into macros or functions that are pluggable for the
different contexts you have in mind.  As a very simple example, using
Py_INCREF in a function doesn't tie that function to Python -- if you
want to use it in a non-Python context,

#define Py_INCREF(x)

gets rid of such calls.  But I'm guessing too much about what you
might mean, so I'll stop there.

> As such, reference breaking code must actually tear down the low level tree in
> such a way that it is always "consistent" (simply setting pointers to NULL
> doesn't fit well with this way of doing things).

Martin had in mind that the pointers were to properly refcounted
PyObject*, in which case his advice was sound.

[Martin]
>> Hmmm. Even if a ring member holds only a single reference to another
>> ring member, would not calling visit() on this member in turn invoke
>> visit() on the other one, which would, in turn, invoke visit() on
>> the next one, and so on?

[Jason]
> Yes, this is true.

It's false, but I covered that one yesterday.  What's passed back to
visit() has nothing to do with which objects' tp_traverse()s get
called.  Registering ("tracking") PyObject*s of gc'able types is what
determines which objects' tp_traverse()s  get called.  Python's gc
doesn't try to determine what's reachable, it determines what isn't
reachable from pointers it *hasn't* been told about.  It infers
"reachable from outside" from the refcounts:  if all the refcounts on
an object are not accounted for by the objects gc *has* been told
about, then that object is directly reachable from something gc hasn't
been told about, so gc dare not touch it.  The transitive closure of
what's reachable from such objects is the set of all objects gc knows
about that are reachable from outside.  gc leaves those alone. 
Everything else gc knows about is necessarily cyclic trash, or
reachable only from cyclic trash.  It can't know anything about
objects that aren't tracked by gc, so in particular can't know
anything about pointers that don't point at a PyObject.

> The problem I ran into had to do with the node logically holding references to all
> the ring objects, but only reporting one of them.

If a ring object doesn't derive from PyObject, then visit() must not
be passed a pointer to a ring object.  Doing so would account for many
catastrophic meanings of "doesn't work" <wink>.

> That is, the node's tp_traverse function was not reporting all references, though
> it was reporting enough of them to guarantee that all reachable objects were
> visited.

No, the visit() callback has nothing to do with which objects are
traversed.  gc determines the set of objects to traverse, before any
object's tp_traverse() has been called, as a "generational" subset of
all objects that have registered with gc.  The pointers passed back to
visit() callbacks have no effect on that set:  they can't add anything
to, or remove anything from, that set.  The set of objects to traverse
is predetermined.

> At this point, it is clear to me that Python's cyclic GC does not do
> mark/sweep GC,

RIght, although we're only talking about CPython here (a particular
implementation of Python),

> and that this won't change,

In CPython that's a safe bet.

> so there's probably not much point in discussing this issue further.

We can help you use Python's gc, but not if you're unwilling or unable
to change anything about what you do.

> ...
> What if breaking cycles and cleanup are the same thing?  In the case of
> Crux, they are one and the same, since the low level tree must be torn down
> one piece at a time.

Again Martin had in mind that all your objects derived from PyObject. 
If that were so, then Python's gc would guarantee not to tear down
anything until every cycle it was part of became unreachable, in which
case internal invariants no longer matter (since they're no longer
reachable form anything except other unreachable trash).  Python's gc
is very nice to work with, for Python objects.  If you can't make your
objects PyObjects, then you get all the pain of implementing a form of
gc that does work with your objects.  You really should look at how
core objects implement tp_traverse() and tp_clear(), to see how easy
life *can* be, even in the presence of much hairier object graphs than
you're working with.

> I should have said tp_clear here.  In Modules/gcmodule.c:delete_garbage(),
> the clear() call does not pay attention to the return value.

Heh.  So true!  See

    http://mail.python.org/pipermail/python-dev/2003-April/034435.html

> It would be possible to check the return value, and defer deallocation if the
> return value were non-zero, and ob_refcnt were 1.  (However, it would probably
> have to be a separate list of objects, contrary to what I claimed before.)

I'm not sure, but I think your desire for this was still based on
misunderstanding of what Python's gc does.


More information about the Python-Dev mailing list