How do you debug memory usage?

Noah noah at noah.org
Tue May 6 16:21:30 EDT 2008


On May 6, 6:27 am, David <wizza... at gmail.com> wrote:
> Hi list.
> What is the best way to debug memory usage in a Python script?
> ...
> Are there any tools/modules/etc I can use like this?
> David.

You need a debug build of Python to get exact numbers,
but there are a few tricks you can use with the standard build
to detect memory leaks. The simplest is to watch the RSS column
in the output of `ps aux` (I'm assuming you are using UNIX).
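
The same kind of number can also be sampled from inside the script.
A minimal sketch, assuming a UNIX system where the standard resource
module is available. Note that ru_maxrss is the peak resident set
size, not the current one, and that its unit depends on the platform
(kilobytes on Linux, bytes on Mac OS X). The helper name peak_rss is
made up for this example.

```python
import resource

def peak_rss():
    """Return this process's peak resident set size from getrusage().

    The unit is platform dependent: kilobytes on Linux, bytes on Mac OS X.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = peak_rss()
data = [str(i) for i in range(200000)]  # allocate something noticeable
after = peak_rss()
print("peak RSS before: %d, after: %d" % (before, after))
```

Because this is a peak value it never goes down, so it is only useful
for spotting growth, not for confirming that memory was released.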

The other trick I got from Chris Siebenmann:
http://utcc.utoronto.ca/~cks/space/blog/python/GetAllObjects
I modified his example a little bit. This does not tell you how
many bytes of memory your running code is using, but it will
show you the number of objects. When looking for memory leaks,
counting objects is usually enough: if the count keeps growing
from one call to the next, something is holding on to references.
For example, say you suspect a function is leaking memory.
You could call it in a loop like this and watch the count of
objects before and after each call.

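  # Runs until interrupted; watch whether the counts keep growing.
  # The % formatting prints the same text on Python 2 and Python 3.
  while True:
      print("Number objects before: %d" % len(get_all_objects()))
      suspect_function()
      print("Number objects after: %d" % len(get_all_objects()))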

Here is my modified version of Chris' get_all_objects() function.
All I did was force garbage collection using gc.collect().
This makes sure that you are not counting unreachable objects
(reference cycles, for example) that Python has left in memory
but will free at the next collection.

  import gc
  # Recursively expand slist's objects
  # into olist, using seen to track
  # already processed objects.
  def _getr(slist, olist, seen):
      for e in slist:
          if id(e) in seen:
              continue
          seen[id(e)] = None
          olist.append(e)
          tl = gc.get_referents(e)
          if tl:
              _getr(tl, olist, seen)
  # The public function.
  def get_all_objects():
      """Return a list of all live Python objects, not including the
      list itself."""
      gc.collect()
      gcl = gc.get_objects()
      olist = []
      seen = {}
      # Just in case:
      seen[id(gcl)] = None
      seen[id(olist)] = None
      seen[id(seen)] = None
      # _getr does the real work.
      _getr(gcl, olist, seen)
      return olist

--
Noah


