RC (was Re: Real Problems with Python)

Tue Feb 15 11:07:46 EST 2000

Tim Peters wrote:
> 
> [Tim]
> > ...
> > (RC has excellent locality of reference, especially when trash
> > is being created and recycled at high dynamic rates).
> 
> [Vladimir Marangozov]
> > That last one is a good one. So far, nobody has shown any evidence
> > that LOR is good or bad when different objects are involved (except,
> > I remember, some special cases like iterating over ints that happen
> > to be allocated close to each other). Better yet, nobody _can_ show
> > such evidence as far as one stays on top of a virtual memory manager.
> 
> Vlad, you're trying to get "deep" on what is really a "shallow" point:

Tim, you're trying to get "shallow" on what is really a "deep" point.
Saying that "RC has excellent LOR", even when trash is recycled at high
rates, is disturbing.

> a "high dynamic rate" is most likely to occur when a program is doing
> flat-out computation, so type homogeneity is likely.  Like ints or floats
> or whatever in Python.

Nope, not "whatever". Take a flat-out computation involving a container
(say, a dict with ints) which gets resized repeatedly, and your argument
is flawed. It's likely that the dict will walk through your memory.
(and more precisely, through your virtual memory).

> And those are *not* "on top of a virtual memory manager" in Python,
> but involve cycling & recycling via Python's type-specific internal
> "free lists" (some of which you added, so you should remember this <wink>).

That first one is a good one <wink>.

> So long as the free lists act like LIFOs, you get to reuse the same little
> chunks of memory over and over and over ... again.

Here, you're absolutely right.

> It's LOR in *both* time and space, and of course that's "good":
> it's the difference between page faults and, umm, not page faults <wink>.

And here, you're cheating.

You're saying that the drops of an intense rain falling over the MIT
building will *not* wet Cambridge and Boston, without knowing:
(1) how long the rain will last, (2) the evolution of the rainy cloud and
(3) which parts of Boston/Cambridge are included in the calculus. <wink>

IOW, you're qualifying LOR without considering (1) a time slice, (2) the
nature of the objects (which may be "moving"), and most importantly,
(3) a fixed "working set" of memory pages, which gives the basis for
any further reasoning about LOR.

However, your "educated guess" <wink> is based on assumptions & observations
of popular combos. You're assuming that we have enough memory so that
the working set covers all objects involved in the intense computation
(which is far from being granted, especially for combos with limited memory),
and that these objects are "static" for a long perod of time.

This may well be granted in most of the cases (a PC with 8+ MB of RAM,
without heavy loads, i.e single-user) and for most of the Python scripts
we're running. But note that this doesn't give us *any* evidence about LOR.

And that's why I don't buy your argument. We still don't know anything about
LOR in Python, so drop it from your vocabulary <wink>. It may be bad, it may
be good, it may be acceptable.

> 
> > make-that-simply-"RC-is-faster-than-the-alternatives"-<wink>-
> > -and-I'll-take-it-ly y'rs
> 
> I believe there's sufficient evidence that state-of-the-art "real gc"
> can be made faster than rc.  Python isn't other languages, though, and
> so long as everything in Python is always "boxed", other languages'
> gc experiences aren't especially relevant.

Agreed. With Toby/Neil's recent work, things look promising...
BTW, any comparison of RC with state-of-the-art GC is fine by me.

as-long-as-you-don't-mess-with-LOR-<wink>-ly y'rs
-- 
       Vladimir MARANGOZOV          | Vladimir.Marangozov at inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252