Python Memory Usage

malkarouri at gmail.com
Sat Jun 30 11:30:43 EDT 2007


On Jun 20, 4:48 am, "greg.no... at gmail.com" <greg.no... at gmail.com>
wrote:
> I am using Python to process particle data from a physics simulation.
> There are about 15 MB of data associated with each simulation, but
> there are many simulations.  I read the data from each simulation into
> Numpy arrays and do a simple calculation on them that involves a few
> eigenvalues of small matrices and quite a number of temporary
> arrays.  I had assumed that generating lots of temporary arrays
> would make my program run slowly, but I didn't think that it would
> cause the program to consume all of the computer's memory, because I'm
> only dealing with 10-20 MB at a time.
>
> So, I have a function that reliably increases the virtual memory usage
> by ~40 MB each time it's run.  I'm measuring memory usage by looking
> at the VmSize and VmRSS lines in the /proc/[pid]/status file on an
> Ubuntu (edgy) system.  This seems strange because I only have 15 MB of
> data.
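>
> For concreteness, the measurement boils down to something like the
> sketch below (Linux-only; the helper name is made up):
>
> import os
>
> def vm_usage():
>     # Pull VmSize and VmRSS (both in kB) for the current process out
>     # of /proc/[pid]/status, the same numbers quoted above.
>     usage = {}
>     for line in open('/proc/%d/status' % os.getpid()):
>         if line.startswith('VmSize:') or line.startswith('VmRSS:'):
>             key, value = line.split(':', 1)
>             usage[key] = int(value.split()[0])  # value looks like "  40960 kB"
>     return usage['VmSize'], usage['VmRSS']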
>
> I started looking at the difference between what gc.get_objects()
> returns before and after my function.  I expected to see zillions of
> temporary Numpy arrays that I was somehow unintentionally maintaining
> references to.  However, I found that only 27 additional objects were
> in the list that comes from get_objects(), and all of them look
> small.  A few strings, a few small tuples, a few small dicts, and a
> Frame object.
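>
> The comparison itself is nothing fancy; roughly the following sketch
> (helper name made up):
>
> import gc
>
> def objects_created_by(func, *args, **kwargs):
>     # Snapshot gc.get_objects() before and after the call and report
>     # whatever appeared in between.  The snapshot containers themselves
>     # show up in the result too, but they are easy to recognise.
>     before = set(id(obj) for obj in gc.get_objects())
>     result = func(*args, **kwargs)
>     new = [obj for obj in gc.get_objects() if id(obj) not in before]
>     return result, new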
>
> I also found a tool called heapy (http://guppy-pe.sourceforge.net/)
> which seems to be able to give useful information about memory usage
> in Python.  This seemed to confirm what I found from manual
> inspection: only a few new objects are allocated by my function, and
> they're small.
>
> I found Evan Jones' article about the Python 2.4 memory allocator never
> freeing memory in certain circumstances:  http://evanjones.ca/python-memory.html.
> This sounds a lot like what's happening to me.  However, his patch was
> applied in Python 2.5, which is the version I'm using, so that fix
> should already be in place.  Nevertheless, it
> looks an awful lot like Python doesn't think it's holding on to the
> memory, but doesn't give it back to the operating system, either.  Nor
> does Python reuse the memory, since each successive call to my
> function consumes an additional 40 MB.  This continues until finally
> the VM usage is gigabytes and I get a MemoryError.
>
> I'm using Python 2.5 on an Ubuntu edgy box, and numpy 1.0.3.  I'm also
> using a few routines from scipy 0.5.2, but for this part of the code
> it's just the eigenvalue routines.
>
> It seems that the standard advice when someone has a bit of Python
> code that progressively consumes all memory is to fork a process.  I
> guess that's not the worst thing in the world, but it certainly is
> annoying.  Given that others seem to have had this problem, is there a
> slick package to do this?  I envision:
> value = call_in_separate_process(my_func, my_args)
>
> Suggestions about how to proceed are welcome.  Ideally I'd like to
> know why this is going on and fix it.  Short of that workarounds that
> are more clever than the "separate process" one are also welcome.
>
> Thanks,
> Greg

I had almost the same problem. Will this do?

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/511474

Any comments are welcome (I wrote the recipe with Pythonistas' help).
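
The gist of the recipe is to fork a child process, run the function
there, pickle the result back through a pipe, and let the child exit so
the operating system reclaims everything it allocated.  Stripped to a
bare sketch (the recipe proper also takes care of exceptions and exit
codes), it is something like:

import os
import cPickle

def call_in_separate_process(func, *args, **kwargs):
    # Bare sketch of the fork-and-pipe pattern: the child does the work
    # and exits, so every byte it allocated goes back to the OS.
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: compute, send the pickled result down the pipe, exit.
        os.close(read_end)
        result = func(*args, **kwargs)
        os.write(write_end, cPickle.dumps(result, cPickle.HIGHEST_PROTOCOL))
        os.close(write_end)
        os._exit(0)
    # Parent: read until EOF, reap the child, unpickle the result.
    os.close(write_end)
    chunks = []
    while True:
        chunk = os.read(read_end, 65536)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_end)
    os.waitpid(pid, 0)
    return cPickle.loads(''.join(chunks))

That gives you exactly the value = call_in_separate_process(my_func,
my_args) interface you describe, and whatever memory the child chews
through goes away when it exits.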

Regards,
Muhammad Alkarouri



