Python memory handling

Paul Melis paul at science.uva.nl
Thu May 31 10:22:20 EDT 2007


Hello,

frederic.pica at gmail.com wrote:
> I've some troubles getting my memory freed by python, how can I force
> it to release the memory ?
> I've tried del and gc.collect() with no success.

[...]

> The same problem here with a simple file.readlines()
> #Python interpreter memory usage : 1.1 Mb private, 1.4 Mb shared
> import gc #no memory change
> f=open('primary.xml') #no memory change
> data=f.readlines() #meminfo: 12 Mb private, 1.4 Mb shared
> del data #meminfo: 11.5 Mb private, 1.4 Mb shared
> gc.collect() # no memory change
> 
> But works great with file.read() :
> #Python interpreter memory usage : 1.1 Mb private, 1.4 Mb shared
> import gc #no memory change
> f=open('primary.xml') #no memory change
> data=f.read() #meminfo: 7.3Mb private, 1.4 Mb shared
> del data #meminfo: 1.1 Mb private, 1.4 Mb shared
> gc.collect() # no memory change
> 
> So as I can see, python maintain a memory pool for lists.
> In my first example, if I reparse the xml file, the memory doesn't
> grow very much (0.1 Mb precisely)
> So I think I'm right with the memory pool.
> 
> But is there a way to force python to release this memory ?!

This is from the 2.5 series release notes 
(http://www.python.org/download/releases/2.5.1/NEWS.txt):

"[...]

- Patch #1123430: Python's small-object allocator now returns an arena to
   the system ``free()`` when all memory within an arena becomes unused
   again.  Prior to Python 2.5, arenas (256KB chunks of memory) were never
   freed.  Some applications will see a drop in virtual memory size now,
   especially long-running applications that, from time to time, temporarily
   use a large number of small objects.  Note that when Python returns an
   arena to the platform C's ``free()``, there's no guarantee that the
   platform C library will in turn return that memory to the operating
   system.  The effect of the patch is to stop making that impossible, and in
   tests it appears to be effective at least on Microsoft C and gcc-based
   systems.  Thanks to Evan Jones for hard work and patience.

[...]"

So with 2.4 under Linux (as you tested) you will indeed not always get 
the memory back after lots of small objects have been collected.

So the difference you see between f.read() and f.readlines() is (I think) 
that the former reads the whole file into one large string object (i.e. 
not a small object), while the latter returns a list of lines, where each 
line is a separate small Python string object. Large objects bypass the 
small-object allocator, so their memory goes straight back to the C 
library when they are freed.
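To make that difference concrete, here is a minimal sketch (written for a current Python 3, not the 2.4 used above) that counts the objects each call creates. The names summarize, make_sample and SMALL_OBJECT_LIMIT are made up for illustration, and the 512-byte limit is an assumption about recent CPython builds; the limit was lower in the 2.x series.

```python
import os
import sys
import tempfile

# Assumed size limit below which CPython uses its small-object allocator.
# 512 bytes is an assumption for recent CPython 3 builds.
SMALL_OBJECT_LIMIT = 512


def summarize(path):
    """Compare the objects created by read() and readlines() for one file."""
    with open(path) as f:
        whole = f.read()        # one large string object
    with open(path) as f:
        lines = f.readlines()   # a list holding one string per line

    # Count how many of the per-line strings fall under the assumed limit.
    small = sum(1 for line in lines
                if sys.getsizeof(line) <= SMALL_OBJECT_LIMIT)
    return {
        "read_size": sys.getsizeof(whole),
        "readlines_objects": len(lines),
        "readlines_small": small,
    }


def make_sample(n_lines=10000):
    """Write a throwaway file of short XML-like lines and return its path."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    with os.fdopen(fd, "w") as f:
        for i in range(n_lines):
            f.write("<entry id='%d'/>\n" % i)
    return path


if __name__ == "__main__":
    path = make_sample()
    try:
        print(summarize(path))
    finally:
        os.remove(path)
```

With short lines, practically every string from readlines() should land under the limit, while the single string from read() is far above it.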

I wonder how 2.5 would work out on Linux in this situation for you.
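If you want to try it, something along these lines could be run under both versions and compared. It is only a sketch, written for a current Python 3: rss_kb and measure are hypothetical helper names, it assumes a Linux-style /proc/self/status with a VmRSS line, and the list of generated strings merely stands in for your readlines() result.

```python
import gc


def rss_kb():
    """Return this process's resident set size in kB, or None if unavailable.

    Assumes a Linux-style /proc/self/status containing a VmRSS line.
    """
    try:
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None


def measure(n_lines=200000):
    """Record RSS before, while, and after holding many small strings."""
    readings = {"start": rss_kb()}
    # Many small string objects, standing in for f.readlines().
    data = ["line %d\n" % i for i in range(n_lines)]
    readings["allocated"] = rss_kb()
    del data
    gc.collect()
    readings["freed"] = rss_kb()
    return readings


if __name__ == "__main__":
    for stage, kb in measure().items():
        print(stage, kb)
```

Whether the "freed" figure drops back towards the "start" figure depends on the Python version and on what the C library does with the freed arenas, as the release note says.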

Paul


