[Python-3000] Is reference counting still needed?

Tim Peters tim.peters at gmail.com
Thu Apr 20 12:41:27 CEST 2006


[Erno Kuusela]
>> The refcounting vs generational GC reasoning I've heard argues that
>> refcounting is less cache-friendly: the cache line containing the
>> refcount field of the pointed-to object is dirtied (or at least
>> loaded) every time something is done with the reference,

[Greg Ewing]
> Has anyone actually measured this effect in a real
> system, or is it just theorising?

Of course people have tried it in real systems.  Then they write about
it, and everyone gets confused <0.5 wink>.  Efficiency of gc strategy
is deeply dependent, in highly complex ways, on _many_ aspects of the
system in question.  It's a tempting but basically idiotic mistake to
imagine, e.g., that a strategy that works well for LISP would also
work well for CPython (or vice versa).

> If it's a real effect, would this be helped at all if the
> refcounts weren't stored with the objects, but kept all
> together in one block of memory? Or would that just make
> things worse?

That's also been tried, and "it depends".  An obvious thing about
CPython is that you can't do anything non-trivial with an object O
without reading up O's type pointer first -- but as soon as you do
that, chances are high that you have O's refcount in L1 cache too,
since the refcount is adjacent to the type pointer in the base
PyObject struct, and that's true of _all_ objects in CPython. 
"Reading up the refcount" is essentially free then, given that you had
to read up the type pointer anyway (or "reading up the type pointer"
is essentially free, given that you already read up the refcount).

> With dynamic languages becoming increasingly important
> these days, I wonder whether anyone has thought about
> what sort of cache or other memory architecture modifications
> might make things like refcounting more efficient.

Yes, but I don't think anyone's offering to build a P3K chip for us ;-)
