Python's garbage collection was Re: Python reliability

Fri Oct 14 13:27:59 EDT 2005

Paul Rubin wrote:

> I haven't been keeping up with this stuff in recent years so I have a
> worse concern.  I don't know whether it's founded or not.  Basically
> in the past decade or so, memory has gotten 100x larger and cpu's have
> gotten 100x faster, but memory is less than 10x faster once you're out
> of the cpu cache.  The mark phase of mark/sweep tends to have a very
> random access pattern (at least for Lisp).  In the old days that
> wasn't so bad, since a random memory access took maybe a couple of cpu
> cycles, but today it takes hundreds of cycles.  So for applications
> that use a lot of memory, simple mark/sweep may be a much worse dog
> than it was in the Vax era, even if you don't mind the pauses.

You pay a price for GC one way or the other.  In Python the price is
spread out among a bunch of increment and decrement operations in the
code.  For mark and sweep the price is a big operation done less often.
My understanding from reading about GC implementations is that
reference counting can exhibit poor cache performance because
decrementing a reference can lead to a chain of decrements on objects
that can be laid out in memory in a fairly random fashion.  Mark and
sweep can be a better cache performer since its memory access will
tend to be more linear.
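
To make the "chain of decrements" concrete, here is a small sketch,
assuming CPython's reference-counting behavior (other implementations
such as PyPy or Jython reclaim objects differently).  The Node class,
build_chain() and release_order() are names made up for this
illustration, not part of any library.  Dropping the only reference to
the head of a linked list triggers one deallocation per node, each one
caused by the previous node releasing its "next" reference.

```python
class Node:
    """Hypothetical linked-list node, used only for this illustration."""

    freed = []  # labels of nodes, in the order they are reclaimed

    def __init__(self, label, next_node=None):
        self.label = label
        self.next = next_node

    def __del__(self):
        # Runs when the node's reference count reaches zero (CPython).
        Node.freed.append(self.label)


def build_chain(n):
    """Build the list 0 -> 1 -> ... -> n-1 and return its head."""
    head = None
    for label in reversed(range(n)):
        head = Node(label, head)
    return head


def release_order(n):
    """Drop the only reference to the head and report what got freed."""
    Node.freed = []
    head = build_chain(n)
    # One decrement here cascades: freeing node 0 releases node 1,
    # which releases node 2, and so on down the chain.
    del head
    return list(Node.freed)


if __name__ == "__main__":
    print(release_order(5))
```

Each node in the chain can live anywhere on the heap, so the cascade
touches memory in whatever order the links happen to dictate, which is
the access pattern described above.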