[Python-Dev] GC Changes

Gustavo Carneiro gjcarneiro at gmail.com
Mon Oct 1 13:10:17 CEST 2007


On 01/10/2007, Justin Tulloss <tulloss2 at uiuc.edu> wrote:
>
> Hello,
>
> I've been doing some tests on removing the GIL, and it's becoming clear
> that some basic changes to the garbage collector may be needed in order for
> this to happen efficiently. Reference counting as it stands today is not
> very scalable.
>
> I've been looking into a few options, and I'm leaning towards
> implementing IBM's Recycler GC (http://www.research.ibm.com/people/d/dfb/recycler-publications.html )
> since it is very similar to what is in place now from the users'
> perspective. However, I haven't been around the list long enough to really
> understand the feeling in the community on GC in the future of the
> interpreter. It seems that a full GC might have a lot of benefits in terms
> of performance and scalability, and I think that the current gc module is of
> the mark-and-sweep variety. Is the trend going to be to move away from
> reference counting and towards the mark-and-sweep implementation that
> currently exists, or is reference counting a firmly ingrained tradition?
>
> On a more immediately relevant note, I'm not certain I understand the full
> extent of the gc module. From what I've read, it sounds like it's fairly
> close to a fully functional GC, yet it seems to exist only as a
> cycle-detecting backup to the reference counting mechanism. Would somebody
> care to give me a brief overview on how the current gc module interacts with
> the interpreter, or point me to a place where that is done? Why isn't the
> mark-and-sweep mechanism used for all memory management?


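The cyclic GC is just too slow to react, and that makes programmers mad: it
only runs periodically, so garbage can linger long after the last reference
to it is gone.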

For instance, in PyGtk we had a long-standing problem with gtk.gdk.Pixbuf,
which is basically an object that wraps a raw RGB image.  When users deleted
such an object, which could sometimes comprise tens or hundreds of
megabytes, the memory was not released until much later.  That kind of
code ended up having to call gc.collect() manually to fix what most
programmers perceived as a "memory leak", which rather defeats the purpose of
a garbage collector.  This happened because PyGtk used to rely on the cyclic
GC doing its work.  Thankfully we moved away from that, and now simple
reference counting can free a Pixbuf in most cases.
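
To make the difference concrete, here is a minimal sketch of the two
behaviours, assuming CPython.  The Image class and both function names are
made up for illustration; this is not PyGtk code.  A weak reference is used
only as a probe to see whether the object is still alive, and automatic
collection is switched off for the second test so the result does not depend
on when the collector happens to run.

```python
import gc
import weakref


class Image:
    """Hypothetical stand-in for a large object such as a pixel buffer."""

    def __init__(self, size):
        self.data = bytearray(size)


def freed_by_refcount():
    """Return True if an acyclic object is gone as soon as `del` runs."""
    img = Image(1024)
    probe = weakref.ref(img)  # does not keep the object alive
    del img                   # refcount drops to zero here
    return probe() is None


def freed_only_by_cyclic_gc():
    """Return (alive_after_del, alive_after_collect) for an object in a cycle."""
    was_enabled = gc.isenabled()
    gc.disable()              # keep automatic collection out of the picture
    try:
        img = Image(1024)
        img.self_ref = img    # the object now refers to itself
        probe = weakref.ref(img)
        del img               # refcount stays at one because of the cycle
        alive_after_del = probe() is not None
        gc.collect()          # explicit pass of the cycle detector
        alive_after_collect = probe() is not None
        return alive_after_del, alive_after_collect
    finally:
        if was_enabled:
            gc.enable()
```

In the first function the buffer is released at the `del`; in the second it
stays around until a collection pass runs, which is the behaviour users were
seeing with large images.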

The cyclic GC is a very useful system, but it should only be used in
addition to, not instead of, reference counting.  At least that's my
personal opinion...

-- 
Gustavo J. A. M. Carneiro
INESC Porto, Telecommunications and Multimedia Unit
"The universe is always one step beyond logic." -- Frank Herbert