[Python-Dev] GC and Vladimir's obmalloc

nas@arctrix.com nas@arctrix.com
Mon, 26 Feb 2001 07:42:34 -0800


Executive Summary: obmalloc will allow more efficient GC and we
should try hard to get it into 2.1.

I've finally spent some time looking at obmalloc and thinking
about how to iterate the GC. The advantage would be that objects
managed by obmalloc would no longer have to kept in a linked
list.  That saves both time and memory.

I think the right way to do this is to have obmalloc kept track
of two separate heaps. One would be for "atomic" objects, the
other for "container" objects. I haven't yet figured out how to
implement this yet. A lower level malloc interface that takes a
heap structure as an argument is an obvious solution.

When the GC runs it needs to find container objects. Since
obmalloc only deals with blocks of 256 bytes or less, large
containers would still have to be stored in a linked list.  The
rest can be found by searching the obmalloc container heap.

Searching the heap is fairly easy. The heap is an array of
pointers to pool lists. The only trick is figuring out which
parts of the pools are allocated.  I think adding the invariant
ob_type = NULL means object not allocated is a good solution.
That pointer could be set to NULL when the object is deallocated
which would also be good for catching bugs. If we pay attention
to pool->ref.count we don't even have to set those pointers for a
newly allocated pool.  Some type of GC locking will probably
have to be added (we don't want the collector running when
objects are in inconsistent states).  

I think the GC state (an int for each object) for obmalloc
objects should be stored separately. Each pool header could have
a pointer to an array of ints. This array could be allocated
lazily when the GC runs. The advantages would be better cache
behavior and less memory use if GC is disabled.

Crude generational collection could be done by doing something
like treating the first partially used pool in each size class as
generation 0, other partially used pools and the first used pool
as generation 1, and all other non-free pools as generation 2.

Is the only issue with obmalloc treading? If so, what do we do to
resolve this?  


  Neil