[Python-Dev] PEP 442: Safe object finalization

Maciej Fijalkowski fijall at gmail.com
Tue Jun 4 03:56:55 CEST 2013


On Sat, May 18, 2013 at 10:33 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Sat, 18 May 2013 16:22:55 +0200
> Armin Rigo <arigo at tunes.org> wrote:
>> Hi Antoine,
>>
>> On Sat, May 18, 2013 at 3:45 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> >> How is this done?  I don't see a clear way to determine it by looking
>> >> only at the objects in the CI, given that arbitrary modifications of
>> >> the object graph may have occurred.
>> >
>> > The same way a generation is traversed, but restricted to the CI.
>> >
>> > First the gc_refs field of each CI object is initialized to its
>> > ob_refcnt (again).
>> >
>> > Then, tp_traverse is called on each CI object, and each visited
>> > CI object has its gc_refs decremented. This subtracts CI-internal
>> > references from the gc_refs fields.
>> >
>> > At the end of the traversal, if all CI objects have their gc_refs equal
>> > to 0, then the CI has no external reference to it and can be cleared.
>> > If at least one CI object has non-zero gc_refs, the CI cannot be
>> > cleared.
>>
>> Ok, indeed.  Then you really should call finalizers only once: in case
>> one of the finalizers in a cycle did a trivial change like I
>> described, the algorithm above will conservatively assume the cycle
>> should be kept alive.  At the next GC collection we must not call the
>> finalizer again, because it's likely to just do a similar trivial
>> change.
>
> Well, the finalizer will only be called if the resurrected object is
> dereferenced again; otherwise the object won't be considered by the GC.
> So, this will only happen if someone keeps trying to destroy a
> resurrected object.
>
> Calling finalizers only once is fine with me, but it would be a change
> in behaviour; I don't know if it may break existing code.
>
> (for example, say someone is using __del__ to manage a freelist)
>
> Regards
>
> Antoine.
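
To make the quoted gc_refs check concrete, here is a toy model in plain
Python. It's only a sketch: the name ci_can_be_cleared and the two dicts
standing in for ob_refcnt and tp_traverse are made up for illustration,
and this is not the actual CPython code.

```python
def ci_can_be_cleared(refcounts, edges):
    """Toy model of the check on a cyclic isolate (CI).

    refcounts: dict mapping object name -> total reference count
               (hypothetical stand-in for ob_refcnt)
    edges:     dict mapping object name -> names it references
               (hypothetical stand-in for what tp_traverse visits)
    """
    # Step 1: initialise gc_refs of every CI object to its refcount.
    gc_refs = dict(refcounts)

    # Step 2: traverse each CI object and decrement gc_refs of every
    # CI object it references. References to objects outside the CI
    # are ignored.
    for obj in refcounts:
        for target in edges.get(obj, ()):
            if target in gc_refs:
                gc_refs[target] -= 1

    # Step 3: the CI can be cleared only if no object is left with a
    # reference coming from outside the CI.
    return all(count == 0 for count in gc_refs.values())


# A <-> B, C references A, C references itself; nothing external.
isolated = ci_can_be_cleared(
    {"A": 2, "B": 1, "C": 1},
    {"A": ["B"], "B": ["A"], "C": ["A", "C"]},
)

# Same graph, but something outside the CI now also references B
# (for example because a finalizer stored it somewhere).
resurrected = ci_can_be_cleared(
    {"A": 2, "B": 2, "C": 1},
    {"A": ["B"], "B": ["A"], "C": ["A", "C"]},
)
```

In the first case every gc_refs ends up at 0, so the CI can be cleared;
in the second B keeps a gc_refs of 1 and the whole CI is kept alive.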

PyPy already only ever calls finalizers once. If you resurrect an object,
it'll be alive, but its finalizer will not be called again. We
discussed a few changes a while ago and we decided (I think it was even
debated on python-dev) that even the following behavior is correct:

* you have a reference cycle A <-> B; C references A, and C also
references itself.

* you collect the stuff. We finalize in topological order, so C's
finalizer is called first (the order is only undefined inside a cycle)

* then the A and B finalizers are called in undefined order, even if
C's finalizer resurrects C.

* no more finalizers are called for those objects, ever
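
A minimal sketch of the call-once rule, assuming an interpreter that
implements it (PyPy, or CPython 3.4 and later, where PEP 442 landed).
The class name Resurrecting and the helper demo() are made up for the
example.

```python
import gc


class Resurrecting:
    calls = 0        # how many times __del__ has run
    survivors = []   # global list used to resurrect the object

    def __del__(self):
        Resurrecting.calls += 1
        # Storing self somewhere reachable resurrects the object.
        Resurrecting.survivors.append(self)


def demo():
    obj = Resurrecting()
    del obj                          # object becomes unreachable
    gc.collect()                     # needed on PyPy; harmless on CPython
    # The object is alive again, held only by the survivors list.
    Resurrecting.survivors.clear()   # make it unreachable a second time
    gc.collect()
    # With call-once semantics the finalizer does not run again.
    return Resurrecting.calls
```

With call-once semantics demo() returns 1: the second time the object
dies, it is simply freed without its finalizer running again.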

I'm not sure whether it's cool for CPython to make such changes.

Cheers,
fijal

