CPython's cyclic garbage collector (was [Python-ideas] Automatic context managers)

Chris Angelico rosuav at gmail.com
Fri Apr 26 21:56:24 EDT 2013


On Sat, Apr 27, 2013 at 9:45 AM, Dave Angel <davea at davea.name> wrote:
> I didn't know there was a callback that a user could hook into.  That's very
> interesting.
>

On Sat, Apr 27, 2013 at 10:22 AM, Skip Montanaro <skip at pobox.com> wrote:
>> Whenever the GC finds a cycle that is unreferenced but uncollectable,
>> it stores those objects in the list gc.garbage.  At that point, if the
>> user wishes to clean up those cycles, it is up to them to delve into
>> gc.garbage, untangle the objects contained within, break the cycles,
>> and remove them from the list so that they can be freed by the ref
>> counter.
>
> I wonder if it would be useful to provide a gc.garbagehook analogous
> to sys.excepthook?
> Users could assign a function of their choice to munch the cyclic
> garbage periodically.
>
> Just a thought, flying out of my fingers before my brain could stop it...

As far as I know, Dave, there isn't currently one; Skip, that's close
to what I'm talking about - it saves on the periodic check. But
burying it in gc.garbagehook implies having a separate piece of code
that knows how to break the reference cycles, whereas the __del__
method puts that logic right there in the class that has the problem.
Actually, *ANY* solution to this problem implies having __del__ able
to cope with the cycle being broken. Here's an example, perhaps a
silly one, but not far different in nature from some things I've done
in C++. (Granted, all the Python implementations of those same
algorithms have involved built-in types rather than linked lists, but
still.)

class DLCircList:
	def __init__(self,payload):
		self.payload=payload
		self.next=self.prev=self
		print("Creating node: %s"%self.payload)
	def __del__(self):
		print("Deleting node %s from cycle %s"%(self.payload,self.enum()))
		self.prev.next=self.next
		self.next.prev=self.prev
	def attach(self,other):
		assert(self.next==self) # Don't attach twice
		self.prev=other
		self.next=other.next
		other.next=self
		self.next.prev=self
		print("Adding node %s to cycle %s"%(self.payload,self.enum()))
	def enum(self):
		"""Return a list of all node payloads in this cycle."""
		ptr=self.next
		nodes=[self.payload]
		while ptr!=self:
			nodes.append(ptr.payload)
			ptr=ptr.next
		return nodes

lst=DLCircList("foo")
DLCircList("bar").attach(lst)
DLCircList("quux").attach(lst)
DLCircList("asdf").attach(lst)
DLCircList("qwer").attach(lst)
DLCircList("zxcv").attach(lst)
print("Enumerating list: %s"%lst.enum())

del lst
import gc
gbg=gc.collect()
print("And we have garbage: %s"%gbg)
print(gc.garbage)
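
For comparison, the periodic-check approach might look roughly like
this. It is only a sketch: Ring is a cut-down stand-in for DLCircList
(no __del__, no printing), and break_links() and drain_garbage() are
made-up names, not anything the gc module provides. It also assumes
the uncollectable nodes really have ended up in gc.garbage.

```python
import gc


class Ring:
    """Cut-down stand-in for DLCircList: a node in a circular list."""

    def __init__(self, payload):
        self.payload = payload
        self.next = self.prev = self


def break_links(node):
    """Hypothetical breaker: drop the links that hold the ring together."""
    node.next = node.prev = None


def drain_garbage(breaker, types):
    """Run breaker over matching objects in gc.garbage and remove them.

    Objects of other types are left in the list untouched.
    Returns the number of objects handed to breaker.
    """
    kept = []
    handled = 0
    for obj in gc.garbage:
        if isinstance(obj, types):
            breaker(obj)
            handled += 1
        else:
            kept.append(obj)
    # Slice-assign so the same list object gc.garbage refers to is updated.
    gc.garbage[:] = kept
    return handled
```

You would call gc.collect() and then drain_garbage(break_links, Ring)
every so often. Note that break_links() has to know the class's
internals, which is exactly the separation I'm complaining about.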



Suppose you did this many, many times and wanted decent garbage
collection. How would you write a __del__ method, or how would you
write something to clean up gc.garbage? One way or another, something
will have to deal with the possibility that the invariants have been
broken, so my theory is that handling that possibility should live
entirely within __del__. (Since __del__ calls enum(), it's possible
for enum() to throw DestructedObject or whatever, but standard
exception handling will deal with that.)
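
Something along these lines is what I have in mind. Treat it as a
rough sketch under assumptions: DestructedObject is an invented
exception, break_cycle() is a hypothetical stand-in for whatever ends
up clearing a node's links, and the events list is only there so the
behaviour can be inspected instead of printed.

```python
class DestructedObject(Exception):
    """Hypothetical: a traversal reached a node whose links were cleared."""


events = []  # record of what each finalizer saw, for inspection only


class Node:
    def __init__(self, payload):
        self.payload = payload
        self.next = self.prev = self

    def attach(self, other):
        """Insert this node into the ring immediately after other."""
        self.prev = other
        self.next = other.next
        other.next = self
        self.next.prev = self

    def enum(self):
        """Return all payloads in the ring, or raise if the ring is broken."""
        nodes = [self.payload]
        ptr = self.next
        while ptr is not self:
            if ptr is None:
                raise DestructedObject(self.payload)
            nodes.append(ptr.payload)
            ptr = ptr.next
        return nodes

    def __del__(self):
        try:
            ring = self.enum()
            # Invariants still hold: unlink this node from its neighbours.
            self.prev.next = self.next
            self.next.prev = self.prev
            events.append(("del", self.payload, ring))
        except (DestructedObject, AttributeError):
            # The cycle was already broken by someone else; nothing to unlink.
            events.append(("del-broken", self.payload))


def break_cycle(node):
    """Hypothetical cycle breaker: clear one node's links."""
    node.next = node.prev = None
```

The point is that all the "what if my neighbours are gone" logic sits
in the class itself; the breaker only has to clear links and needs no
knowledge of what a valid ring looks like.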

ChrisA



More information about the Python-list mailing list