Memory Leaks and Heapy

Andrew MacIntyre andymac at bullseye.apana.org.au
Sat Jan 5 05:47:49 EST 2008


Yaakov Nemoy wrote:

> A couple of developers have mentioned that python might be fragmenting
> its memory space, and is unable to free up those pages.  How can I go
> about testing for this, and are there any known problems like this?
> If not, what else can I do to look for leaks?

Marc-Andre brought up pymalloc, but it is worth clarifying a few points
about its use:
- pymalloc only manages allocations up to (and including) 256 bytes;
   larger allocations are passed to the platform malloc.
- the work that was put in to allow return of empty arenas (in Python
   2.5) was geared to handling the general case of applications that
   create huge volumes of objects (usually at start up) and then destroy
   most of them.  There is no support that I'm aware of for any form of
   arena rationalisation in the case of sparsely occupied arenas.
- it has been my experience that pymalloc is a significant benefit over
   the platform malloc for the Python interpreter, both in terms of
   performance and gross memory consumption.  Prior to defaulting to
   using pymalloc (as of 2.3) CPython had run into issues with the
   platform malloc of just about every platform it had been ported to,
   heap fragmentation being particularly notable on Windows (though other
   platforms have also been subject to this).
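As a rough illustration of the size cut-off in the first point, the
sketch below guesses which side of it an object falls on.  The helper
name and the constant are my own (hypothetical); the 256 byte figure is
the one quoted above and is a build-time setting, so treat it as an
assumption for your interpreter.  sys.getsizeof() only reports the
object's own footprint, so this approximates the request size at best.

```python
import sys

# Cut-off quoted above; an assumption here, since the value is a
# build-time constant and may differ in your interpreter.
SMALL_REQUEST_THRESHOLD = 256

def likely_small_request(obj, threshold=SMALL_REQUEST_THRESHOLD):
    """Guess whether obj's own memory block fits under the cut-off.

    sys.getsizeof() reports the object's own footprint, not anything it
    refers to, so this only approximates the allocation request size.
    """
    return sys.getsizeof(obj) <= threshold

small = bytes(10)       # a few dozen bytes including the object header
large = bytes(10_000)   # well over the cut-off
print(likely_small_request(small), likely_small_request(large))
```

Containers such as lists and dicts make further allocations of their
own, so a single size check like this says little about them.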

While pymalloc is highly tuned for the general case behaviour of the
Python interpreter, just as platform malloc implementations have corner
cases so does pymalloc.

Be aware that ints and floats are managed via free lists whose memory
is allocated directly by the platform malloc() - these objects are
never seen by pymalloc, and neither type has support for relinquishing
surplus memory.  Also be aware that many C extensions don't use
pymalloc even when they could.

In addition to Marc-Andre's suggestions, I would suggest paying 
particular attention to the creation and retention of objects in your 
code - if something's no longer required, explicitly delete it.  It is
all too easy to lose sight of references to objects that hang around in
ways that defeat the gc support.  Watch out, for example, for things
that might be sensitive to thread-ids.
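One way to check whether a deletion really released an object is to
keep a weak reference to it and see if that reference goes dead.  The
following is only a sketch in current Python 3: Payload, per_thread and
is_released are hypothetical names, and the registry keyed by thread id
is just one example of a reference that is easy to forget.

```python
import gc
import threading
import weakref

class Payload:
    """Stand-in for a large object (hypothetical)."""
    def __init__(self, size):
        self.data = bytearray(size)

def is_released(probe):
    """True once the weakly referenced object has been freed."""
    gc.collect()  # also clears unreachable reference cycles
    return probe() is None

# Hypothetical per-thread registry: entries outlive the thread's work
# unless something removes them.
per_thread = {}

obj = Payload(1024 * 1024)
per_thread[threading.get_ident()] = obj
probe = weakref.ref(obj)   # does not keep obj alive

del obj
alive_after_del = not is_released(probe)   # registry still holds it

del per_thread[threading.get_ident()]
freed_after_cleanup = is_released(probe)

print(alive_after_del, freed_after_cleanup)
```

If the probe is still alive after you believe everything has been
deleted, gc.get_referrers() can help track down what is holding on.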

Careful algorithm planning can also be useful, leveraging object 
references to minimise duplicated data (and possibly get better 
performance).
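A small sketch of what sharing references can look like: equal values
that arrive as separate objects are replaced by a single shared
instance.  The function name and the sample data are hypothetical, and
this only makes sense for immutable, hashable values.

```python
def share_equal_values(values, pool=None):
    """Return a list where equal values refer to one shared object.

    pool maps each value to the first instance seen; later equal values
    are replaced by a reference to that instance.
    """
    pool = {} if pool is None else pool
    return [pool.setdefault(v, v) for v in values]

# Strings built at run time are separate objects even when equal.
raw = ["-".join(["region", str(i % 3)]) for i in range(9)]
shared = share_equal_values(raw)

# Count how many distinct objects the shared list actually refers to.
distinct_objects = len({id(v) for v in shared})
print(distinct_objects)
```

Once the original list is dropped, the duplicate objects can be freed
and only one copy of each distinct value remains.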


-- 
-------------------------------------------------------------------------
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac at bullseye.apana.org.au  (pref) | Snail: PO Box 370
        andymac at pcug.org.au             (alt) |        Belconnen ACT 2616
Web:    http://www.andymac.org/               |        Australia


