CPython's cyclic garbage collector (was [Python-ideas] Automatic context managers)

Dave Angel davea at davea.name
Fri Apr 26 13:42:11 EDT 2013


On 04/26/2013 12:54 PM, Chris Angelico wrote:
> [[ Resending under a more appropriate subject line... sorry about
> that, ignore the other one as it'll only confuse matters ]]
>
> On Sat, Apr 27, 2013 at 1:54 AM, MRAB <python at mrabarnett.plus.com> wrote:
>> On 26/04/2013 14:02, anatoly techtonik wrote:
>>> This circular reference problem is interesting. In object space it
>>> probably looks like a stellar detached from the visible (attached)
>>> universe. Is the main problem in detecting it?
>>>
>> The problem is in knowing in which order the objects should be
>> collected.
>>
>> For example, if A refers to B and B refers to A, should you collect A
>> then B, or B then A? If you collect A first, then, for a time, B will
>> be referring to a non-existent object. That's not good if the objects
>> have destructors which need to be run.
>
> Spin-off thread from python-ideas to discuss a more general question
> of garbage collection of cyclic structures.
>
> Once it's been proven that there's an unreferenced cycle, why not
> simply dispose of one of the objects, and replace all references to it
> (probably only one - preferably pick an object with the fewest
> references) with a special temporary object? In fact, that could
> probably be done in CPython by wiping out the object in memory and
> replacing it with a special marker of some sort, which would then
> automatically "take over" all references to the old object. Any
> attempt to manipulate this object could simply pop back with a
> DestructedObject exception or something.
>
> Is this a plausible (never mind viable yet, just conceptually
> plausible) alternative to sticking them into gc.garbage and ignoring
> them? It'd allow a double-linked list/tree to function cleanly -
> imagine, for instance, something like the DOM facilities available to
> web browser scripts:
>
> class DOMObject:
>          def __init__(self,parent):
>                  self.parent=parent
>                  self.firstchild=self.sibling=None
>                  if not parent: return
>                  if not parent.firstchild:
>                          parent.firstchild=self
>                  else:
>                          child=parent.firstchild
>                          while child.sibling:
>                                  child=child.sibling
>                          child.sibling=self
>          def __del__(self):
>                  print("Disposing of id #%d"%id(self))
>
> document=DOMObject(None)
> body=DOMObject(document)
> p=DOMObject(body)
> p=DOMObject(body)
> p=DOMObject(body)
> del document,body,p
> gc.collect()
>
> The __del__ method would need to clean up the external resources used
> by this object, but wouldn't have to walk the tree. Yet, just because
> there is a reference loop and there are __del__ methods, the garbage
> collector gives up and leaves it to the program author to deal with.
>
> I can understand if this is considered too complicated and too unusual
> a circumstance to be worth bothering to support, but I'm curious as to
> whether it's at least conceptually reasonable to do something like
> this.
>
> ChrisA
>

I don't see what your "special" temporary object actually accomplishes. 
  Seems to me you need to declare that your __del__() methods promise 
not to reference each other, and the gc would then check all objects in 
the cycle, and do its present behavior if any of the destructors is not 
specially declared.

I'm not sure how often you'd have a non-trivial destructor that wouldn't 
reference any objects.  And doing a static analysis of what will happen 
during the destructor would be pretty messy.  So the best I and come up 
with is to keep the declaration, but require a try/catch to cleanly 
terminate each destructor if it ever references anything in the tree.

-- 
DaveA



More information about the Python-list mailing list