Reference counting garbage collection

Thu Aug 23 01:44:26 EDT 2001

[Paul Rubin]
> I don't see any conflict between portability and other GC methods not
> counting those which play with the machine memory protection hardware.

Or assume that the low-order bit of a pointer is usable to hold info, or
assume a pointer fits in a C long, or assume that the program doing GC has
control over all allocated memory, or etc.  If you want a generational
scheme, how are you going to identify "bad" cross-generational links
efficiently when they happen without playing VM-protection games?  If, e.g.,
every stinking pointer read or write needs to be a macro expansion checking
"by hand", performance goes down the tubes.  The history of effective
generational schemes is the history of exploiting platform gimmicks to
effect efficient read or write barriers.

> ...
> Yes, simple M/S isn't appropriate except for small systems.  A
> generational scheme is indicated here.  However, maybe some future
> implementation can avoid doing so much heap allocation.

That alone could have a bigger effect on speed than choice of GC gimmicks;
e.g., in

    i += 1

today, Python spends more time fiddling refcounts than doing the useful
increment; but the extreme dynamic throughput would provoke M&S schemes into
gobs of "useless" work too.

> At that point it should be possible to compile Python into pretty
> good machine code.

Well, there are many barriers to that.  Going back to i+=1, Python has no
idea at compile-time what type of value i is bound to, and fully resolving
that at runtime probably costs more than the refcount fiddling.

> ...
> By Lisp-style GC, I mostly just mean GC that handles circular
> structure properly.  At another level it means GC that doesn't require
> a lot of programmer attention when writing C extensions, as ref
> counting needs.

It always requires attention, and in my experience M&S gc catastrophes are
no easier to track down than refcount bugs; to the contrary, throw a
threaded collector into the mix and they're pure hell to track down.  So
long as extensions are coded in C, memory management is going to be a
headache.  There are approaches to writing extensions in C++ that manage to
hide almost all of the refcount hair, though.

[on http://www.hpl.hp.com/personal/Hans_Boehm/gc/complexity.html]

> Thanks, that's an interesting paper but I don't know what to conclude
> from it.

Good!  Then I'd say you grasped his main point, but perhaps don't want to
believe it:  it's too complicated for casual pronouncements to enjoy any
credibility.

> Zorn's measurements started in Berkeley in the 80's.  Back
> then, a typical workstation was a 16 mhz 68000 with 80 nsec memory
> access time.  These days, a typical workstation is a 1 Ghz Athlon with
> 50 nsec memory access time.  So today's CPU is > 100x faster (more
> parallelism in addition to higher frequency) but memory is only
> slightly faster.  So the cycle cost of a cache miss has gone up
> tremendously--you can execute 50 or more instructions in the time it
> takes to read something from DRAM on a cache miss.  This -has- to have
> some effect on the cost of memory fragmentation.

Sure, but on the other hand it means every cache miss you *don't* take
because RC managed to reuse memory instantly is that much bigger a win, too;
while zooming all over memory chasing pointers is the least cache-friendly
instruction sequence imaginable.  There's simply no killer argument here.

See http://www.cs.utexas.edu/users/oops/papers.html for more recent work,
esp. Johnstone and Wilson's 1998 "The Memory Fragmentation Problem:
Solved?".  They measured fragmentation rates averaging 1% across a number of
programs using non-relocating allocation, and conclude:

    ...
    While it is clear that there must exist some programs for which this
    is not true, it shows that the received view of the fragmentation
    problem is simply wrong, and fragmentation is not the kind of problem
    30 years of research generally assumed it was---typical program
    behavior is very favorable to several clearly excellent allocation
    policies, due to strong regularities in program allocation and
    deallocation behavior.

I don't believe them either <0.9 wink>.

>> ain't-nothin'-obvious-about-gc-strategies-ly y'rs  - tim

> That's for sure--there's a large literature of GC research that goes
> back for decades.  It's not appropriate to ignore everything that's
> been tried.

I agree, but why would you assume we're ignoring something (let alone
everything)?  We don't have the resource to *implement* every scheme that's
been suggested over the last 50 years, but I expect there are few I haven't
heard of myself, and none that *somebody* in the Python community hasn't
heard of and considered.  The best of what has been implemented and measured
is what we ship; there's a long string of complex patches we haven't
shipped, because they performed worse.  I don't know what else you want us
to do, but if that's not an empty set I predict disappointment <wink>.

sourceforge-remains-open-for-more-patches-ly y'rs  - tim