Python Memory Manager

greg greg at cosc.canterbury.ac.nz
Mon Feb 18 02:14:18 EST 2008


Paul Rubin wrote:

> As I understand it, Python primarily uses reference counting, with a
> mark and sweep scheme for cycle breaking tacked on as an afterthought.

It's not mark-and-sweep, it's a cycle detector. It goes through
all allocated objects of certain types, and all objects reachable
from those objects, trying to find sets of objects whose reference
counts are all accounted for by references within the set. Then
it picks an arbitrary object in each set and clears all the
references from that object. This breaks the cycle, and allows all
the objects in it to be reclaimed by their reference counts
dropping to zero.

(There's some kind of generational scheme in there as well,
I think, but I don't know the details.)

In a sense, this is the opposite of mark-and sweep. A mark-and-
sweep collector assumes that everything is garbage unless it
can be proven that it's not. Python's cycle detector, on the other
hand, assumes that nothing is garbage unless it can be proven
that it is.

An advantage of this refcount/cycle detection scheme is that
there is no "root set" that must be maintained at all costs.
This makes it fairly easy to interface Python with external
libraries that have their own notions of how to manage memory.

> It doesn't move objects in memory during GC so you can get
> fragmentation.

It never moves PyObject structs, but variable-sized objects
such as lists have their contents stored in a separate block
of memory, that can be moved when its size changes.

-- 
Greg



More information about the Python-list mailing list