[Python-ideas] breaking cycles that include __del__

Scott Dial scott+python-ideas at scottdial.com
Tue Oct 20 19:59:05 CEST 2009


Adam Olsen wrote:
> Your weakref callback is a method of your object.  The callback
> requires that method still be alive, but isn't triggered until self is
> deleted.

This is made clear in the gcmodule.c comments, but it is severely lacking
from the actual documentation of the weakref module. Antoine's example
is flawed in the same manner, though. It works merely because the gc
module is left out of the issue (there are no cycles in it). If you
introduce a cycle, it falls apart, just like Daniel's:

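import weakref, gc
class Foo:
    def __init__(self):
        # r is computed now, so the callback holds no reference to self.
        def call_free(_, r=repr(self)):
            print(r)
        # The weakref is stored on the very instance it refers to.
        self._weakref = weakref.ref(self, call_free)
x,y = Foo(), Foo()
x.y, y.x = y, x  # reference cycle: only the gc can reclaim x and y
del x
del y
gc.collect()  # neither call_free runs: each weakref is trash as well
print(gc.garbage)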
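
For contrast, here is a minimal sketch of the acyclic case. It is not Antoine's actual code (which is not quoted in this message), only the same Foo class without the x.y/y.x cycle. The `fired` list is an addition for illustration, and the behaviour shown assumes CPython's reference counting:

```python
import weakref

fired = []  # illustrative only: records each callback invocation

class Foo:
    def __init__(self):
        # r is evaluated here, so call_free never references self.
        def call_free(_, r=repr(self)):
            fired.append(r)
        self._weakref = weakref.ref(self, call_free)

x = Foo()
del x  # last reference dropped; no cycle, so the gc is not involved
print(len(fired))  # 1 on CPython: the callback ran during deallocation
```

Here the weakref is still alive (held by the instance dict) at the moment the instance's weak references are cleared, so its callback is invoked. Nothing in this path goes through handle_weakrefs().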

The whole source of the confusion is documented in a comment in
handle_weakrefs() in Modules/gcmodule.c, at line 600:

/* Headache time.  `op` is going away, and is weakly referenced by
 * `wr`, which has a callback.  Should the callback be invoked?  If wr
 * is also trash, no:
 *
 * 1. There's no need to call it.  The object and the weakref are
 *    both going away, so it's legitimate to pretend the weakref is
 *    going away first.  The user has to ensure a weakref outlives its
 *    referent if they want a guarantee that the wr callback will get
 *    invoked.
 *
 * 2. It may be catastrophic to call it.  If the callback is also in
 *    cyclic trash (CT), then although the CT is unreachable from
 *    outside the current generation, CT may be reachable from the
 *    callback.  Then the callback could resurrect insane objects.
 *
 * Since the callback is never needed and may be unsafe in this case,
 * wr is simply left in the unreachable set.  Note that because we
 * already called _PyWeakref_ClearRef(wr), its callback will never
 * trigger.
 *
 * OTOH, if wr isn't part of CT, we should invoke the callback:  the
 * weakref outlived the trash.  Note that since wr isn't CT in this
 * case, its callback can't be CT either -- wr acted as an external
 * root to this generation, and therefore its callback did too.  So
 * nothing in CT is reachable from the callback either, so it's hard
 * to imagine how calling it later could create a problem for us.  wr
 * is moved to wrcb_to_call in this case.
 */
    if (IS_TENTATIVELY_UNREACHABLE(wr))
        continue;

The only way to guarantee that the callback runs is to attach the weakref
to some other object that will outlive your object, *but* that object must
also not become part of the unreachable trash found in the same gc pass;
otherwise the callback will *still not be called*. I believe this is the
source of your advice to store the weakref in some globally reachable set.
I believe that, given the way modules are currently deallocated, this is
guaranteed to work. Should modules ever be included in the gc, then
perhaps this would have to be revisited.

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu


