A couple garbage collector questions

Hannah Schroeter hannah at schlund.de
Thu Apr 5 09:16:32 EDT 2001


Hello!

In article <3AC94B6B.F1F4DDEB at cosc.canterbury.ac.nz>,
Greg Ewing  <greg at cosc.canterbury.ac.nz> wrote:
>Kragen Sitaker wrote:

>> Reference-counting exacts very heavy performance costs, no matter what
>> you back it up with.

Correct, *except* when the compiler heavily optimizes reference count
updates. For example, if you can prove that some basic block increments
an object's RC and later decrements it again, for a net effect of +-0,
you can drop both RC updates, and so on.
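
To make the idea concrete, here is a minimal sketch of such a pass over
one basic block. The instruction format (tuples like ("incref", "x")) and
the function name are invented for illustration and belong to no real
compiler. The sketch assumes that variables do not alias each other and
that the object is alive before the incref. It is deliberately
conservative: anything that might drop some other reference (a call, an
unmatched decref) is treated as a barrier.

```python
from collections import defaultdict

def elide_rc_pairs(block):
    """Drop incref/decref pairs on the same variable inside one basic block.

    `block` is a list of (op, name) tuples; this format is hypothetical.
    Only "incref", "decref" and "use" are understood. Every other op, and
    every decref without a pending incref, is a barrier, because it could
    release the reference that keeps a pending object alive.
    """
    keep = [True] * len(block)
    pending = defaultdict(list)  # variable -> indices of unmatched increfs
    for i, (op, var) in enumerate(block):
        if op == "incref":
            pending[var].append(i)
        elif op == "decref" and pending[var]:
            j = pending[var].pop()
            keep[i] = keep[j] = False  # net effect +-0: drop both updates
        elif op != "use":
            pending.clear()  # barrier: earlier increfs must stay
    return [ins for ins, k in zip(block, keep) if k]

block = [
    ("incref", "a"), ("use", "a"), ("decref", "a"),    # pair is removed
    ("incref", "b"), ("call", "f"), ("decref", "b"),   # call blocks removal
]
print(elide_rc_pairs(block))
```

A real optimizer would need alias and escape information to do better
than this; the sketch only shows the basic bookkeeping.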

>Something nobody has mentioned yet is that RC is cache-friendly,
>whereas pure M&S is quite cache-hostile. This is important
>now that most machine architectures are heavily reliant
>on cacheing for good performance, and I believe that it is
>one of the main reasons for retaining RC alongside the new
>GC mechanisms.

Pure M&S is ancient, anyway. Generational GCs are much more
cache-friendly and not *too* difficult to implement, either.
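
As a rough illustration of why that is, here is a toy two-generation
heap. All names (ToyHeap, Obj, minor_collect) are made up for this
sketch, and a real collector would copy or compact memory instead of
moving Python list entries. The point is only that a minor collection
traces the young objects and never walks the old generation, which is
where the cache friendliness comes from.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []    # outgoing references
        self.old = False  # True once promoted to the old generation

class ToyHeap:
    def __init__(self):
        self.nursery = []        # young objects
        self.old = []            # promoted objects
        self.roots = []          # interpreter roots
        self.remembered = set()  # old objects that point at young ones

    def new(self, name):
        obj = Obj(name)
        self.nursery.append(obj)
        return obj

    def write(self, src, dst):
        src.refs.append(dst)
        if src.old and not dst.old:
            self.remembered.add(src)  # write barrier

    def minor_collect(self):
        """Collect only the nursery; return how many objects were traced."""
        live = set()
        stack = [o for o in self.roots if not o.old]
        for src in self.remembered:
            stack.extend(r for r in src.refs if not r.old)
        while stack:
            obj = stack.pop()
            if obj in live:
                continue
            live.add(obj)
            stack.extend(r for r in obj.refs if not r.old)
        for obj in self.nursery:
            if obj in live:  # survivors are promoted, the rest is dropped
                obj.old = True
                self.old.append(obj)
        self.nursery = []
        self.remembered.clear()  # no old-to-young pointers remain
        return len(live)
```

After the first collection, later minor collections trace only what was
allocated since, plus whatever the write barrier recorded.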

>The other reason is backwards compatibility with existing
>extension modules.

This *is* a valid reason from a pragmatic POV. However, I don't think
it is insurmountable. Current extension modules presumably have to
maintain the reference count semantics somehow, e.g. for Python heap
objects which are contained in objects created by the extension lib.
I.e. the extension API must offer something like "add reference to
this Python object" as well as "drop reference to this Python object".
Now, you can just add those objects with nonzero reference counts
*from an extension* to some special root set (they may not be freed
*and may not be moved*).

In the opposite direction, you can wrap any extension object (which
could be represented as a C pointer) into a Python object consisting
of a descriptor ("extension object belonging to *this* extension
module", i.e. this finalizer function) and the pointer itself. When
the GC finds that wrapper to be garbage, it calls the C finalizer,
which perhaps calls "drop reference to Python object" if the C object
"contains" some Python object. That in turn might remove that Python
object from the special root set, and so on.
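
The scheme can be modelled in a few lines. This is a toy simulation,
not the real extension API: ToyGC, add_ref, drop_ref and the finalizer
hook are names invented here, the "extension" is just a Python callable,
and pinning (not moving) is trivial because nothing moves in this model.

```python
class PyObj:
    def __init__(self, name, refs=(), finalizer=None):
        self.name = name
        self.refs = list(refs)      # references the GC traces
        self.finalizer = finalizer  # set only on extension-object wrappers

class ToyGC:
    def __init__(self):
        self.heap = []
        self.roots = []      # ordinary interpreter roots
        self.ext_roots = {}  # object -> number of references held by extensions

    def alloc(self, name, refs=(), finalizer=None):
        obj = PyObj(name, refs, finalizer)
        self.heap.append(obj)
        return obj

    def add_ref(self, obj):   # "add reference to this Python object"
        self.ext_roots[obj] = self.ext_roots.get(obj, 0) + 1

    def drop_ref(self, obj):  # "drop reference to this Python object"
        count = self.ext_roots[obj] - 1
        if count:
            self.ext_roots[obj] = count
        else:
            del self.ext_roots[obj]  # leaves the special root set

    def collect(self):
        """Mark from both root sets, free the rest, return freed names."""
        live = set()
        stack = list(self.roots) + list(self.ext_roots)
        while stack:
            obj = stack.pop()
            if obj not in live:
                live.add(obj)
                stack.extend(obj.refs)
        dead = [o for o in self.heap if o not in live]
        self.heap = [o for o in self.heap if o in live]
        for obj in dead:
            if obj.finalizer:
                obj.finalizer()  # may drop references held by the extension
        return [o.name for o in dead]

gc = ToyGC()
payload = gc.alloc("payload")
gc.add_ref(payload)  # the extension's C struct now "contains" payload
wrapper = gc.alloc("wrapper", finalizer=lambda: gc.drop_ref(payload))
gc.roots.append(wrapper)
gc.collect()               # nothing is freed
gc.roots.remove(wrapper)
gc.collect()               # frees the wrapper and runs its finalizer
gc.collect()               # only now is payload freed
```

Note that the payload outlives the wrapper by one collection cycle in
this model, since the finalizer runs after marking has finished.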

Not completely easy, but certainly doable and perhaps faster than RC
(except if you do those heavy RC optimizations, possibly even
interprocedurally and cross-module, perhaps partly using JIT techniques).

Kind regards,

Hannah.


