Memory usage per top 10x usage per heapy

Dave Angel d at davea.name
Mon Sep 24 21:14:23 EDT 2012


On 09/24/2012 05:59 PM, MrsEntity wrote:
> Hi all,
>
> I'm working on some code that parses a 500kb, 2M line file 

Just curious;  which is it, two million lines, or half a million bytes?

> line by line and saves, per line, some derived strings into various data structures. I thus expect that memory use should monotonically increase. Currently, the program is taking up so much memory - even on 1/2 sized files - that on 2GB machine 

which machine is 2gb, the Windows machine, or the VM?  You could get
thrashing at either level.

> I'm thrashing swap. What's strange is that heapy (http://guppy-pe.sourceforge.net/) is showing that the code uses about 10x less memory than reported by top, and the heapy data seems consistent with what I was expecting based on the objects the code stores. I tried using memory_profiler (http://pypi.python.org/pypi/memory_profiler) but it didn't really provide any illuminating information. The code does create and discard a number of objects per line of the file, but they should not be stored anywhere, and heapy seems to confirm that. So, my questions are:
>
> 1) For those of you kind enough to help me figure out what's going on, what additional data would you like? I didn't want swamp everyone with the code and heapy/memory_profiler output but I can do so if it's valuable.
> 2) How can I diagnose (and hopefully fix) what's causing the massive memory usage when it appears, from heapy, that the code is performing reasonably?
>
> Specs: Ubuntu 12.04 in Virtualbox on Win7/64, Python 2.7/64
>
> Thanks very much.

Tim raised most of my concerns, but I would point out that just because
you free up the memory from the Python doesn't mean it gets released
back to the system.  The C runtime manages its own heap, and is pretty
persistent about hanging onto memory once obtained.  It's not normally a
problem, since most small blocks are reused.  But it can get
fragmented.  And i have no idea how well Virtual Box maps the Linux
memory map into the Windows one.



-- 

DaveA




More information about the Python-list mailing list