Reference counting garbage collection

Tim Peters tim.one at home.com
Thu Aug 23 23:13:40 EDT 2001


[plakal at nospam-cs.wisc.edu]
> 	Just wanted to point out that there's been
> 	some recent work from IBM Research on a
> 	concurrent reference-counting collector for Java
> 	that reclaims cycles but without doing any
> 	global tracing like mark&sweep or copying
>         (it does do some localized tracing but
> 	not the entire graph of accessible objects).
>
> 	This also seems to be one of the few papers
> 	that does a head-to-head performance comparison
> 	between ref-counting and mark&sweep (concurrent
> 	versions):
> 	  http://www.research.ibm.com/people/d/dfb/papers.html#Bacon01Java

Cool!  Thanks for the reference.  Python's cycle reclamation scheme actually
has a lot in common with this, although Python's is based more directly on a
scheme first implemented for Fortran (of all places!) than on the work of
Christopher and Lins and so on.

I'll note in passing that these "local scans" have a killer advantage for
Python that doesn't appear relevant for their Java work:  a foreign system
can easily get hold of a pointer to a Python object, and Python has no way
"even in theory" to chase memory owned by other subsystems.  Neither does it
have any way to *know* that a foreign subsystem is holding on to a pointer.
This creates what appear to be insurmountable problems for global M&S:  it
can't trace all the live objects because some live objects are outside its
vision.

This doesn't bother local scanning, because that's only trying to account
for all the refcounts due to pointers it *does* know about.  The effect of
an "invisible" pointer on a local scan is merely to guarantee that the
pointee stays alive (because we can't see the pointer, we can't account for
all the refcounts in a local scan, and so don't reclaim the object it points
to).

> ...
> 	They also claim that in this age of
> 	ultra-fast processors and slow memory,
> 	it might be a good tradeoff to spend
> 	more CPU time doing ref-counting operations
> 	while improving memory access time
> 	due to ref-counting's better overall locality
> 	(no periodic trashing of cache and VM
> 	system by global tracing of heap).

It's unclear.  For *distributed* systems, it is clear(er), and refcounting
has enjoyed a major resurgence in that context (having to travel across a
network to chase a pointer has a wonderfully clarifying effect <wink>).

still-waiting-for-bubblesort's-rehabilitation-ly y'rs  - tim





More information about the Python-list mailing list