[Python-Dev] finalization again

Greg Stein gstein@lyra.org
Thu, 9 Mar 2000 12:18:06 -0800 (PST)


On Thu, 9 Mar 2000, Guido van Rossum wrote:
>...
> I don't think so.  While my poor wording ("finalizer-free garbage")
> didn't make this clear, my references to earlier algorithms were
> intended to imply that this is garbage that consists of truly
> unreachable objects.  I have three lists: let's call them T(rash),
> R(oot-reachable), and F(inalizer-reachable).  The Schemenauer
> c.s. algorithm moves all reachable nodes to R.  I then propose to move
> all finalizers to F, and to run another pass of Schemenauer c.s. to
> also move all finalizer-reachable (but not root-reachable) nodes to F.
>...
> [Tim Peters]
> > I see Marc-Andre already declined to get sucked into the magical part of
> > this <wink>.  Greg should speak for his scheme, and I haven't made time to
> > understand it fully; my best guess is to call x.__cleanup__ for every object
> > in the SCC (but there's no clear way to decide which order to call them in,
> > and unless they're more restricted than __del__ methods they can create all
> > the same problems __del__ methods can!).

My scheme was to identify the objects in F that themselves have a
finalizer (not the whole closure reachable from them). Then call
__cleanup__ on each of them, in arbitrary order. If any are left after the
sequence of __cleanup__ calls, then I call it an error.

[ note that my proposal defined checking for a finalizer by calling
  tp_clean(TPCLEAN_CARE_CHECK); this accounts for class instances and for
  extension types with "heavy" processing in tp_dealloc ]
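
[ a rough Python model of that pass, for illustration only. Only
  __cleanup__ comes from the proposal; Node, Careful, Sloppy, in_cycle and
  run_cleanup_pass are made-up names, and "left" is modelled here as "can
  still reach itself by following references", which is my assumption
  about how the check might be done ]

```python
# Illustrative model only: every name except __cleanup__ is hypothetical.

class Node:
    def __init__(self, name):
        self.name = name
        self.refs = []          # outgoing references to other nodes


class Careful(Node):
    """A node whose __cleanup__ breaks the cycles it participates in."""
    def __cleanup__(self):
        self.refs.clear()


class Sloppy(Node):
    """A node whose __cleanup__ forgets to drop its references."""
    def __cleanup__(self):
        pass


def in_cycle(node):
    """Return True if node can reach itself by following refs."""
    seen = set()
    stack = list(node.refs)
    while stack:
        cur = stack.pop()
        if cur is node:
            return True
        if id(cur) not in seen:
            seen.add(id(cur))
            stack.extend(cur.refs)
    return False


def run_cleanup_pass(f_list):
    """Call __cleanup__ on each object in F that has one (arbitrary order).

    Returns the finalizer objects that are still in a cycle afterwards;
    a non-empty result is what the scheme would treat as an error.
    """
    with_finalizer = [obj for obj in f_list if hasattr(obj, "__cleanup__")]
    for obj in with_finalizer:
        obj.__cleanup__()
    return [obj for obj in with_finalizer if in_cycle(obj)]
```

With a Careful node in a two-node cycle the pass returns an empty list;
swap in a Sloppy node and it comes back in the result, which is the case
that gets flagged.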

The third step was to use tp_clean to try to clean all other objects in a
safe fashion. Specifically: these objects have no finalizers, so no special
care is needed in finalizing them, and this third step should simply nuke
the references that are stored in each object. This means object pointers
are still valid (we haven't dealloc'd), but the insides have been emptied.
If the third step does not remove all cycles, then one of the PyType
objects did not remove all references during the tp_clean call.
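
A small Python model of that third step may make the failure mode easier
to see. This is a sketch under assumptions, not the proposed C code: the
method tp_clean_model stands in for the tp_clean slot, and Plain, Leaky,
outgoing, reaches_itself and clean_pass are names invented for the example.

```python
# Illustrative model only: all names here are hypothetical stand-ins.

class Plain:
    """Finalizer-free object; tp_clean_model stands in for tp_clean."""
    def __init__(self, name):
        self.name = name
        self.refs = []

    def tp_clean_model(self):
        self.refs.clear()       # empty the insides; the object stays valid


class Leaky(Plain):
    """Models a type whose clean step misses one of its references."""
    def __init__(self, name):
        super().__init__(name)
        self.cache = None       # extra reference the clean step forgets

    def tp_clean_model(self):
        self.refs.clear()       # bug: self.cache is left untouched


def outgoing(obj):
    """All references held by obj, including a forgotten cache slot."""
    out = list(obj.refs)
    extra = getattr(obj, "cache", None)
    if extra is not None:
        out.append(extra)
    return out


def reaches_itself(obj):
    seen, stack = set(), outgoing(obj)
    while stack:
        cur = stack.pop()
        if cur is obj:
            return True
        if id(cur) not in seen:
            seen.add(id(cur))
            stack.extend(outgoing(cur))
    return False


def clean_pass(objs):
    """Clean every object, then report those still stuck in a cycle."""
    for obj in objs:
        obj.tp_clean_model()
    return [obj for obj in objs if reaches_itself(obj)]
```

Two Plain objects referring to each other come out emptied but still
usable (their names are intact). Two Leaky objects that point at each
other through cache survive the pass, which is the "some type did not
remove all references" case.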

>...
> > If I *ever* have a trash cycle with a finalizer in my code (> 0 -- "exactly
> > 1" isn't special to me), I will consider it to be a bug.  So I want a way to
> > get it back from gc, so I can see what the heck it is, so I can fix my code
> > (or harass whoever did it to me).  __cleanup__ suffices for that, so the
> > very act of calling it is all I'm really after ("Python invoked __cleanup__
> > == Tim has a bug").

Agreed.

>...
> I suppose we can print some obnoxious message to stderr like

A valid alternative to raising an exception, but it falls into the whole
trap of "where does stderr go?"

>...
> But I'd still like to reclaim the memory.  If this is some
> long-running server process that is executing arbitrary Python
> commands sent to it by clients, it's not nice to leak, period.

If an exception is raised, the top-level server loop can catch it, log the
error, and keep going. But yes: it will leak.
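
Roughly like this, assuming the failure surfaces as an ordinary exception
in whatever code is handling the client's command (where exactly it would
be raised is not pinned down here). CleanupError, serve and handler are
hypothetical names for the sketch.

```python
import logging


class CleanupError(Exception):
    """Hypothetical: raised when a trash cycle survives cleanup."""


def serve(commands, handler, log=None):
    """Run each command; log cleanup failures and keep going.

    Returns the number of failures seen. The objects involved in each
    failure are not reclaimed here, so they still leak.
    """
    log = log or logging.getLogger("server")
    failures = 0
    for cmd in commands:
        try:
            handler(cmd)
        except CleanupError as exc:
            log.error("cleanup failed while running %r: %s", cmd, exc)
            failures += 1
    return failures
```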

> (Because of this, I will also need to trace functions, methods and
> modules -- these create massive cycles that currently require painful
> cleanup.  Of course I also need to track down all the roots
> then... :-)

Yes. It would be nice to have these participate in the "cleanup protocol"
that I've described. It should help a lot at Python finalization time,
effectively moving some special casing from import.c to the objects
themselves.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/