python gc performance in large apps

Fredrik Lundh fredrik at pythonware.com
Tue Oct 25 05:30:49 EDT 2005


Robby Dermody wrote:

> In the diagrams above, one can see the night-day separation clearly. At
> night, the memory usage growth seemed to all but stop, but with the
> increased call volume of the day, it started shooting off again. When I
> first started gathering this data, I was hoping for a logarithmic curve,
> but at least after 48 hours, it looks like the usage increase is almost
> linear. (Although logarithmic may still be the case after it exceeds a
> gig or two of used memory. :) I'm not sure if this is something that I
> should expect from the current gc, and when it would stop.

I don't think the GC has much to do with this; it's a lot more likely that you
have growing memory structures (in-memory logs, memo caches, etc) somewhere
in your application.
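
to make that concrete, a typical culprit looks something like this (a made-up
sketch, not code from your app; expensive_lookup is just a stand-in):

_cache = {}

def expensive_lookup(key):
    return str(key) * 100   # stand-in for a real backend call

def lookup(key):
    # memoization with no eviction policy: every distinct key ever
    # seen stays in _cache forever, so memory grows with call volume
    if key not in _cache:
        _cache[key] = expensive_lookup(key)
    return _cache[key]

a cache like that grows roughly linearly with the number of distinct requests,
which matches the day/night pattern you're describing.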

adding some code that calls gc.get_objects() and dumps the N largest lists
and dictionaries might help.  something like (untested):

import gc, repr, sys

def gc_check(objects):

    def check(objects):
        # collect (size, object) pairs for all lists and dicts
        lists = []
        dicts = []
        for i in objects:
            if isinstance(i, list): lists.append((len(i), i))
            elif isinstance(i, dict): dicts.append((len(i), i))
        # print the 20 largest of each, biggest first
        lists.sort(); lists.reverse()
        for n, i in lists[:20]:
            print "LIST", n, repr.repr(i)
        dicts.sort(); dicts.reverse()
        for n, i in dicts[:20]:
            print "DICT", n, repr.repr(i)

    try:
        check(objects)
    except:
        print "error in gc_check", sys.exc_info()
        try:
            # clear exception state (even if exc_clear doesn't exist ;-)
            sys.exc_clear()
        except AttributeError:
            pass

(tweak as necessary)

and call this like

    gc_check(gc.get_objects())

at regular intervals.  if you have any constantly growing containers in your
application, they will most likely appear in the output rather quickly.
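
if you want this to happen automatically, one way (untested; the 60-second
interval is just a placeholder) is to re-arm a daemon timer thread:

import threading

def start_gc_monitor(interval=60):
    # run gc_check now, and again every `interval` seconds
    def run():
        gc_check(gc.get_objects())
        timer = threading.Timer(interval, run)
        timer.setDaemon(True)   # don't keep the process alive for this
        timer.start()
    run()

start_gc_monitor()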

</F>