[Python-Dev] reference leaks, __del__, and annotations

Nick Coghlan ncoghlan at gmail.com
Fri Mar 31 15:27:35 CEST 2006


Duncan Booth wrote:
> Surely if you have a cycle what you want to do is to pick just *one* of the 
> objects in the cycle and break the link which makes it participate in the 
> cycle. That should be sufficient to cause the rest of the cycle to collapse 
> with __del__ methods being called from the normal refcounting mechanism.
> 
> So something like this:
> 
> for obj in cycle:
>     if hasattr(obj, "__breakcycle__"):
>         obj.__breakcycle__()
>         break
> 
> Every object which knows it can participate in a cycle then has the option 
> of defining a method which it can use to tear down the cycle. e.g. by 
> releasing the resource and then deleting all of its attributes, but no 
> guarantees are made over which obj has this method called. An object with a 
> __breakcycle__ method would have to be extra careful as its methods could 
> still be called after it has broken the cycle, but it does mean that the 
> responsibilities are in the right place (i.e. defining the method implies 
> taking that into account).

Unfortunately, there are two problems with that idea:
   a. it's broken, since we now have a partially torn down object at the tail 
end of our former cycle. What happens if the penultimate object's finaliser 
tries to access that broken one?
   b. it doesn't actually help in the case of generators (which are the ones 
causing all the grief). The generator object itself (which implements the 
__del__ method) knows nothing about what caused the cycle (the cycle is going 
to be due to the Python code in the body of the generator).
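
To make problem (a) concrete, here is a small simulation. It is only a sketch: 
the Node class is made up for illustration, and finalise() is an ordinary 
method standing in for __del__, so that the outcome does not depend on when 
the interpreter actually collects anything.

```python
class Node:
    """Hypothetical object that takes part in a two-element cycle."""

    def __init__(self, name):
        self.name = name
        self.peer = None

    def __breakcycle__(self):
        # Tear down by dropping every attribute, as the proposal suggests.
        self.__dict__.clear()

    def finalise(self):
        # Stand-in for __del__: this finaliser consults its peer.
        return "%s releasing after %s" % (self.name, self.peer.name)


def simulate():
    a, b = Node("a"), Node("b")
    a.peer, b.peer = b, a      # a <-> b form a cycle
    a.__breakcycle__()         # a is now partially torn down
    try:
        return b.finalise()    # b still expects a to be intact
    except AttributeError as exc:
        return "finaliser failed: %s" % type(exc).__name__
```

Here b's finaliser fails because a no longer has a name attribute, which is 
the "partially torn down object" situation described in (a).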

As PJE posted the other day, the problem is that the GC assumes that because 
the *type* has a __del__ method, the *instance* needs finalisation. And for 
objects with an explicit close method (like generators), context management 
semantics (like generator-based context managers), or the ability to be 
finalised in the normal course of events (like generator-iterators), most 
instances *don't* need finalisation by the GC, as they'll already have been 
finalised by the time they become garbage.

Generators are even more special, in that they only require finalisation in 
the first place if they're stopped on a yield statement inside a try-finally 
block.
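
A minimal sketch of that distinction, using only the close() method from 
PEP 342. The generator guarded() and the log list are made up for 
illustration; the point is just that cleanup is only pending while the 
generator is suspended inside the try block.

```python
def guarded(log):
    try:
        yield 1
        yield 2
    finally:
        log.append("cleanup")


def suspended_case():
    log = []
    g = guarded(log)
    next(g)                # now stopped on a yield inside try-finally
    before = list(log)     # cleanup has not run yet
    g.close()              # raises GeneratorExit at the yield; finally runs
    return before, log


def exhausted_case():
    log = []
    g = guarded(log)
    for _ in g:            # run to completion; finally runs on the way out
        pass
    before = list(log)
    g.close()              # nothing left to finalise
    return before, log
```

In the suspended case the finally block only runs once close() is called; in 
the exhausted case it has already run, and close() has nothing further to do.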

A simple Boolean attribute (e.g. __finalized__) should be enough. If the type 
has a __del__ method, then the GC would check the __finalized__ attribute. If 
it's both present and true, the GC can ignore the finaliser on that instance 
(i.e. never invokes it, and doesn't treat cycles as uncollectable because of it).
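
In pure Python, the check I have in mind would look roughly like the sketch 
below. This is not how the GC is implemented: needs_finalisation() and the 
Resource class are hypothetical names used only to show the intended logic, 
and __finalized__ is the proposed attribute, not an existing one.

```python
def needs_finalisation(obj):
    """Hypothetical predicate: must the collector run obj's finaliser?"""
    if not hasattr(type(obj), "__del__"):
        return False                      # the type has no finaliser at all
    # The type has __del__, so consult the proposed per-instance flag.
    return not getattr(obj, "__finalized__", False)


class Resource:
    """Hypothetical object with an explicit close method."""

    def __init__(self):
        self.__finalized__ = False

    def close(self):
        if not self.__finalized__:
            # ... release the underlying resource here ...
            self.__finalized__ = True

    def __del__(self):
        self.close()
```

An instance that has been closed explicitly reports that it no longer needs 
finalisation, so a cycle containing it would not have to be treated as 
uncollectable.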

I don't know the GC well enough to know how hard that would be to implement, 
but I suspect we need to do it (or something like it) if PEP 342 isn't going 
to cause annoying memory leaks in real applications.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

