python gc performance in large apps

Tim Peters tim.peters at gmail.com
Sun Nov 6 21:57:18 EST 2005


[Robby Dermody]
> ...
> -len(gc.get_objects()) will show linearly increasing counts over time
> with the director (that match the curve of the director's memory usage
> exactly), but with the harvester, the object count doesn't increase over
> time (although memory usage does). This might mean that we are dealing
> with two separate problems on each component....

You meant "at least two" <0.5 wink>.

> uncollected object count growth on the director, and something else on
> the harvester. ...OR the harvester may have an object count growth
> problem as well, it might just be in a C module in a place not visible to
> gc.get_objects() ?

Note that gc.get_objects() only knows about container objects that
elect to participate in cyclic gc.  In particular, it doesn't know
anything about "scalar" types, like strings or floats (or any other
type that can't be involved in a cycle).  For example, this little
program grows about a megabyte per second on my box, but
len(gc.get_objects()) never changes:

"""
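import gc
from random import choice

letters = "abcdefghijklmnop"

def build(n):
    # Build a random n-character string.  Strings can't be part of a
    # cycle, so gc never tracks them.
    return "".join(choice(letters) for dummy in range(n))

d = {}  # the only container here; it just keeps growing
i = 0
while True:
    # Each pass adds two new strings to d, so memory use climbs ...
    d[build(10)] = build(5)
    i += 1
    if i % 1000 == 0:
        # ... but the count of gc-tracked objects stays flat.
        print(i, len(gc.get_objects()))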
"""

To learn about non-gc-container-object growth, use a debug build and
sys.getobjects() instead.  This is described in Misc/SpecialBuilds.txt
in the source distribution (see the Py_TRACE_REFS section).
sys.getobjects() tries to keep track of _all_ objects, not just gc
containers, but it exists only in a debug build.
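
Here is a rough sketch of how a per-type census could be taken with sys.getobjects(). The helper name count_all_objects_by_type is made up for illustration. The function is looked up defensively because an ordinary release build doesn't have it, and, as far as I know, newer CPython versions make Py_TRACE_REFS a separate configure option rather than part of every debug build.

```python
import sys
from collections import Counter

def count_all_objects_by_type():
    """Return a Counter of type names for all live objects, or None.

    Hypothetical helper: sys.getobjects() exists only in special
    builds (Py_TRACE_REFS), so look it up instead of assuming it.
    """
    getobjects = getattr(sys, "getobjects", None)
    if getobjects is None:
        return None
    # Per SpecialBuilds.txt, a max of 0 asks for all live objects
    # rather than just the first N.
    return Counter(type(obj).__name__ for obj in getobjects(0))

counts = count_all_objects_by_type()
if counts is None:
    print("sys.getobjects() is not available in this build")
else:
    # Show the ten most common types; run this periodically and
    # compare snapshots to see which type is growing.
    for name, n in counts.most_common(10):
        print(name, n)
```

Comparing two such snapshots taken some minutes apart should point at the type whose count keeps climbing, including strings and floats that gc.get_objects() never sees.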
