Python Memory Usage
Muhammad Alkarouri
malkarouri at gmail.com
Sat Jun 30 11:30:43 EDT 2007
On Jun 20, 4:48 am, "greg.no... at gmail.com" <greg.no... at gmail.com>
wrote:
> I am using Python to process particle data from a physics simulation.
> There are about 15 MB of data associated with each simulation, but
> there are many simulations. I read the data from each simulation into
> Numpy arrays and do a simple calculation on them that involves a few
> eigenvalues of small matrices and quite a number of temporary
> arrays. I had assumed that generating lots of temporary arrays
> would make my program run slowly, but I didn't think that it would
> cause the program to consume all of the computer's memory, because I'm
> only dealing with 10-20 MB at a time.
>
> So, I have a function that reliably increases the virtual memory usage
> by ~40 MB each time it's run. I'm measuring memory usage by looking
> at the VmSize and VmRSS lines in the /proc/[pid]/status file on an
> Ubuntu (edgy) system. This seems strange because I only have 15 MB of
> data.
>
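As an aside, that check is easy to script; an untested sketch
(Linux-only, reading the same fields from /proc/self/status):

# Report VmSize and VmRSS for the current process, in the
# units /proc gives them (kB).
def vm_usage():
    usage = {}
    for line in open('/proc/self/status'):
        if line.startswith(('VmSize:', 'VmRSS:')):
            key, value = line.split(':', 1)
            usage[key] = value.strip()   # e.g. '40128 kB'
    return usage
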
> I started looking at the difference between what gc.get_objects()
> returns before and after my function. I expected to see zillions of
> temporary Numpy arrays that I was somehow unintentionally maintaining
> references to. However, I found that only 27 additional objects were
> in the list that comes from get_objects(), and all of them look
> small. A few strings, a few small tuples, a few small dicts, and a
> Frame object.
>
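For reference, a sketch of that kind of before/after comparison
(just the idea; note that id() values can be recycled, and
gc.get_objects() only lists objects tracked by the cyclic
collector, so it is not a complete picture of the heap):

import gc

def new_objects(func, *args):
    """Run func and return the tracked objects that are new afterwards."""
    gc.collect()
    before = set(id(o) for o in gc.get_objects())
    func(*args)
    gc.collect()
    return [o for o in gc.get_objects() if id(o) not in before]
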
> I also found a tool called heapy (http://guppy-pe.sourceforge.net/)
> which seems to be able to give useful information about memory usage
> in Python. This seemed to confirm what I found from manual
> inspection: only a few new objects are allocated by my function, and
> they're small.
>
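The heapy idiom for exactly this measurement, as far as I know, is
setrelheap(): mark the heap before the call, and heap() afterwards
shows only what is new (suspect_function is a stand-in for the
function being investigated):

from guppy import hpy

h = hpy()
h.setrelheap()      # everything allocated so far becomes the baseline
suspect_function()  # stand-in for the function under suspicion
print h.heap()      # only objects allocated since setrelheap()
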
> I found Evan Jones' article about the Python 2.4 memory allocator never
> freeing memory in certain circumstances: http://evanjones.ca/python-memory.html.
> This sounds a lot like what's happening to me. However, his patch was
> applied in Python 2.5 and I'm using Python 2.5. Nevertheless, it
> looks an awful lot like Python doesn't think it's holding on to the
> memory, but doesn't give it back to the operating system, either. Nor
> does Python reuse the memory, since each successive call to my
> function consumes an additional 40 MB. This continues until finally
> the VM usage is gigabytes and I get a MemoryError.
>
> I'm using Python 2.5 on an Ubuntu edgy box, and numpy 1.0.3. I'm also
> using a few routines from scipy 0.5.2, but for this part of the code
> it's just the eigenvalue routines.
>
> It seems that the standard advice when someone has a bit of Python
> code that progressively consumes all memory is to fork a process. I
> guess that's not the worst thing in the world, but it certainly is
> annoying. Given that others seem to have had this problem, is there a
> slick package to do this? I envision:
> value = call_in_separate_process(my_func, my_args)
>
> Suggestions about how to proceed are welcome. Ideally I'd like to
> know why this is going on and fix it. Short of that workarounds that
> are more clever than the "separate process" one are also welcome.
>
> Thanks,
> Greg
I had almost the same problem. Will this do?
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/511474
Any comments are welcome (I wrote the recipe with Pythonistas' help).
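
In case the link goes away, the core of the idea is roughly this --
a stripped-down, POSIX-only sketch, not the recipe verbatim (the
recipe is more careful and also propagates exceptions from the
child back to the caller):

import os
import cPickle

def call_in_separate_process(func, *args, **kwds):
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: run the function, pickle the result back through
        # the pipe, and exit without cleanup so every byte the
        # child allocated is returned to the operating system.
        os.close(r)
        result = func(*args, **kwds)
        os.write(w, cPickle.dumps(result, -1))
        os.close(w)
        os._exit(0)
    # Parent: read the pickled result until EOF, then reap the child.
    os.close(w)
    chunks = []
    while True:
        chunk = os.read(r, 65536)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(r)
    os.waitpid(pid, 0)
    return cPickle.loads(''.join(chunks))

Used exactly as you envisioned: value = call_in_separate_process(my_func, my_args).
Since the child exits after each call, its memory goes back to the
OS no matter what Python's allocator is holding on to.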
Regards,
Muhammad Alkarouri