RC (was Re: Real Problems with Python)

Wed Feb 16 04:45:58 EST 2000

[Vladmir & Tim have apparently lost everyone, so let's wrap it up]

[Tim]
>> Vlad, you're trying to get "deep" on what is really a "shallow" point:

[Vladimir Marangozov]
> Tim, you're trying to get "shallow" on what is really a "deep"
> point.

I suspect it's really a bit of both!  This started as a very brief Usenet
posting, and I don't feel bad about pointing people to LOR (locality of
reference) as a way to get a first handle on why RC (reference counting) has
run faster in Python than every "real GC" attempt to replace it.  To a first
approximation, LOR is the single most useful concept to get across to people
struggling with memory issues on VM (virtual memory) systems.  To a second
approximation, "cache effects" is more accurate, but also much harder to get
a grip on.  To a third approximation, "platform quirks" would have covered
my ass completely, yet left everyone without a clue.

So I'm definitely indulging in idealized fiction, that happens to be true in
some actual cases.  My track record of *predicting* how various gc
approaches would turn out in Python ain't half bad, you know <wink>.  OTOH,
your saying "it ain't necessarily so!" in six ways is six times true, but
doesn't leave people with even a sixth of a first approximation to trip
over.  If they're interested enough, they'll pursue it and fill in the
painful details themselves.

> Saying that "RC has excellent LOR", even when trash is recycled at
> high rates, is disturbing.

Take a vacation <wink>.

> ...
> Nope, not "whatever". Take a flat-out computation involving a
> container (say, a dict with ints) which gets resized repeatedly,
> and your argument is flawed.

Not so much flawed here as ridiculous.  So what?  I haven't claimed to give
a rigorous account, and one can't even be attempted without a sufficiently
precise model of a specific platform (both HW and system SW).  There's
nothing to argue about here short of that:  take it as it was intended, a
comforting brief bedtime story with elements of truth.

BTW, I expect you do undervalue the usefulness of LOR arguments "in
practice, in general".  Their utility is based on the simplicities that are
presumed usual rather than the pathologies that are presumed possible.  The
real world often enough plays along to make it worth the minimal effort.

> ...
> You're saying that the drops of an intense rain falling over the MIT
> building will *not* wet Cambridge and Boston, without knowing:
> (1) how long the rain will last,

Forever, for practical purposes.

 (2) the evolution of the rainy cloud and

Long periods of steady states, with brief periods of unpredictable change.

> (3) which parts of Boston/Cambridge are included in the calculus.
> <wink>

MIT people are quite certain Boston is a fiction <wink>.

> ...
> And that's why I don't buy your argument.

That's OK by me:  you're still trying to make it too deep <wink>.

> We still don't know anything about LOR in Python, so drop it from
> your vocabulary <wink>. It may be bad, it may be good, it may be
> acceptable.

I know about cache effects on Pentiums, because I've watched them under
Intel's analysis tools.  LOR looked like a darned good bet to me.  This is
time-consuming & tedious work, though, and I only looked at a few of my own
programs.  They happened to fit all the assumptions I made, so perhaps it's
not surprising that my conclusions appeared to fit <wink>.

>> I believe there's sufficient evidence that state-of-the-art "real gc"
>> can be made faster than rc.  Python isn't other languages, though, and
>> so long as everything in Python is always "boxed", other languages'
>> gc experiences aren't especially relevant.

> Agreed. With Toby/Neil's recent work, things look promising...
> BTW, any comparison of RC with state-of-the-art GC is fine by me.
>
> as-long-as-you-don't-mess-with-LOR-<wink>-ly y'rs

ok-ok-it's-due-to-the-"cacheothonic-principle"-ly y'rs  - tim