Python memory handling
Paul Melis
paul at science.uva.nl
Thu May 31 10:22:20 EDT 2007
Hello,
frederic.pica at gmail.com wrote:
> I've some troubles getting my memory freed by python, how can I force
> it to release the memory ?
> I've tried del and gc.collect() with no success.
[...]
> The same problem here with a simple file.readlines()
> #Python interpreter memory usage : 1.1 Mb private, 1.4 Mb shared
> import gc #no memory change
> f=open('primary.xml') #no memory change
> data=f.readlines() #meminfo: 12 Mb private, 1.4 Mb shared
> del data #meminfo: 11.5 Mb private, 1.4 Mb shared
> gc.collect() # no memory change
>
> But works great with file.read() :
> #Python interpreter memory usage : 1.1 Mb private, 1.4 Mb shared
> import gc #no memory change
> f=open('primary.xml') #no memory change
> data=f.read() #meminfo: 7.3Mb private, 1.4 Mb shared
> del data #meminfo: 1.1 Mb private, 1.4 Mb shared
> gc.collect() # no memory change
>
> So as I can see, python maintain a memory pool for lists.
> In my first example, if I reparse the xml file, the memory doesn't
> grow very much (0.1 Mb precisely)
> So I think I'm right with the memory pool.
>
> But is there a way to force python to release this memory ?!
This is from the 2.5 series release notes
(http://www.python.org/download/releases/2.5.1/NEWS.txt):
"[...]
- Patch #1123430: Python's small-object allocator now returns an arena to
the system ``free()`` when all memory within an arena becomes unused
again. Prior to Python 2.5, arenas (256KB chunks of memory) were never
freed. Some applications will see a drop in virtual memory size now,
especially long-running applications that, from time to time, temporarily
use a large number of small objects. Note that when Python returns an
arena to the platform C's ``free()``, there's no guarantee that the
platform C library will in turn return that memory to the operating
system.
The effect of the patch is to stop making that impossible, and in tests it
appears to be effective at least on Microsoft C and gcc-based systems.
Thanks to Evan Jones for hard work and patience.
[...]"
So with 2.4 under Linux (as you tested), memory that was used by lots of
small objects will indeed not always be given back to the system, even
after those objects have been collected.
I think the difference you see between f.read() and f.readlines() is
therefore that the former reads the whole file into one large string
object (i.e. not a small object, so it is not served by the small-object
allocator), while the latter returns a list of lines in which each line
is a separate, small Python object.
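To make the "one large object versus many small objects" point concrete, here is a rough sketch in current Python 3 syntax (not the 2.4 you tested with). The generated text is a hypothetical stand-in for primary.xml, and the 512-byte threshold is an assumption about recent CPython 3.x releases (the 2.x series used a smaller limit, 256 bytes as far as I know). It is an implementation detail, not a documented API.

```python
import sys

# Assumed limit below which CPython's small-object allocator serves a request.
# Implementation detail; the value differs between CPython versions.
SMALL_OBJECT_THRESHOLD = 512

def summarize(objects):
    """Return (object count, count of small objects, total bytes)."""
    sizes = [sys.getsizeof(obj) for obj in objects]
    small = sum(1 for size in sizes if size <= SMALL_OBJECT_THRESHOLD)
    return len(sizes), small, sum(sizes)

# Hypothetical stand-in for the contents of primary.xml: 10,000 short lines.
text = "".join("<entry id='%d'>value</entry>\n" % i for i in range(10000))

whole = [text]                  # like f.read(): a single large string
lines = text.splitlines(True)   # like f.readlines(): one small string per line

print("read():      %d object(s), %d small, %d bytes" % summarize(whole))
print("readlines(): %d object(s), %d small, %d bytes" % summarize(lines))
```

The read() case ends up as a single string far above the threshold, while every line in the readlines() case falls below it, which is the situation the release note describes.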
I wonder how 2.5 would work out on linux in this situation for you.
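If you want to repeat your measurement from inside the interpreter, something like the following might do. It is only a sketch: rss_kb() and measure() are names I made up, it assumes a Linux-style /proc/self/status with a VmRSS line (it returns None elsewhere), and the list comprehension is a hypothetical workload standing in for f.readlines() on your file.

```python
import gc

def rss_kb():
    """Return the resident set size in kB, or None if it cannot be read."""
    try:
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None

def measure(make_data):
    """Return RSS before allocating, while holding the data, and after freeing it."""
    before = rss_kb()
    data = make_data()
    held = rss_kb()
    del data
    gc.collect()
    after = rss_kb()
    return before, held, after

# Hypothetical workload: many small string objects, as readlines() would produce.
print(measure(lambda: ["line %d\n" % i for i in range(200000)]))
```

Whether the last number drops back towards the first depends on the Python version and on what the platform C library does with freed memory, so treat the output as an observation rather than a guarantee.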
Paul