[Python-ideas] [Python-Dev] GC Changes

Talin talin at acm.org
Tue Oct 2 08:06:46 CEST 2007


Adam Olsen wrote:
> Ahh, I forgot a major alternative.  You could instead work on an exact
> GC for PyPy.  I personally think it's more interesting to get
> cooperation from the compiler. ;)

The open source world sorely needs an exact garbage collector, which is 
why I am in the process of writing one. Right now its going very slowly, 
as I've been able to only commit a few hours a week to the work.

One difficulty in making a FOSS garbage collector is that exact 
collectors tend to be tightly coupled to the specific object model of 
the runtime environment. The design of the collector puts a lot of 
constraints on the design of the object model, and vice versa. This 
makes it difficult to create a collector that works with a lot of 
different languages.

As you already know, PyPy is currently targeting LLVM as a backend. LLVM 
provides an abstract interface for garbage collection that solves some 
of these problems. Because the collector interface is defined at the 
virtual instruction level, before optimization takes place, it means 
that some of the overhead that would be incurred by making a completely 
generic callback interface can be avoided - that is, the abstraction 
levels needed for a loosely-coupled collector can be optimized away by 
LLVM's optimizer.

Based on a study of the LLVM code and docs, I have come to the belief 
that it would be possible to create a generic collector implementation 
which would work with a fairly broad class of both object models and 
collection algorithms. In other words, while it might not support every 
possible language out there, it would be possible to support a fairly 
wide subset.

So far, what I've got is a general purpose heap implementation, which is 
  loosely inspired by (although not copied from) the popular dlmalloc 
implementation and a few others, and also incorporates a number of 
features which would be needed by a collector algorithm (such as 
reserved space for collector state bits and type tags in the object 
allocation header.) I've also got some utility classes that would be 
useful to a collector, such as a lock-free implementation of work 
queues. I don't have an actual collector working yet, and I'm not going 
to post a link to the code here because it's not ready for public 
viewing yet.

In any case, if anyone is interested in discussing this further, feel 
free to email me privately. Since this isn't directly related to CPython 
or PyPy, I don't want to clutter up the discussion lists here with 
issues related to garbage collection.

-- Talin




More information about the Python-ideas mailing list