when does python free memory?

Wed Feb 6 15:46:20 EST 2002

Robert,

	you'll need a more sophisticated garbage collection implementation to
actually return memory to the O/S.  The problem with the C allocator
malloc/free is that, in many implementations, it simply maintains a pool
that grows on demand and rarely, if ever, returns memory to the O/S.  The
garbage collector instead needs to
obtain memory via memory mapping (e.g. VirtualAlloc on Windows, mmap on
unix) and arrange that regions so obtained are unmapped when they become
vacant.  Vacating these segments is a significant problem that typically
requires a compacting garbage collector, as it is unlikely that just by
chance all objects within a memory-mapped segment would be freed at
the same time.  Any object still in use within a segment prevents the
whole segment from being unmapped.
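
A minimal sketch of the idea above, for illustration only.  It is not how
Python or VisualWorks actually manages memory.  It assumes a modern
Python 3 and its standard mmap module; Segment, SEGMENT_SIZE, slot_size
and demo are hypothetical names.  Each segment is an anonymous
memory-mapped region, and it can be handed back to the O/S only once no
live object remains in it:

```python
import mmap

# Hypothetical segment size: 64 pages per memory-mapped region.
SEGMENT_SIZE = 64 * mmap.PAGESIZE


class Segment:
    """One anonymous memory-mapped region divided into fixed-size slots."""

    def __init__(self, slot_size=64):
        # fileno -1 requests an anonymous mapping (works on Unix and Windows).
        self.region = mmap.mmap(-1, SEGMENT_SIZE)
        self.slot_size = slot_size
        self.free_slots = list(range(SEGMENT_SIZE // slot_size))
        self.live = set()  # slots currently holding a live object

    def allocate(self):
        """Reserve one slot and return its index."""
        slot = self.free_slots.pop()
        self.live.add(slot)
        return slot

    def release(self, slot):
        """Mark a slot as free; the memory stays mapped for reuse."""
        self.live.remove(slot)
        self.free_slots.append(slot)

    def unmap_if_vacant(self):
        """Unmap the region only if no live object remains in it."""
        if self.live:
            return False  # a single survivor pins the whole segment
        self.region.close()  # releases the mapping back to the O/S
        return True


def demo():
    seg = Segment()
    slots = [seg.allocate() for _ in range(100)]
    for slot in slots[:-1]:
        seg.release(slot)  # free all but one object
    first_try = seg.unmap_if_vacant()  # one object is still live
    seg.release(slots[-1])
    second_try = seg.unmap_if_vacant()  # segment is now vacant
    return first_try, second_try


if __name__ == "__main__":
    print(demo())
```

The first attempt to unmap fails because one object out of a hundred is
still live; only after it is released can the segment go back to the
O/S.  A compacting collector avoids waiting for that to happen by chance:
it moves the survivors into other segments so that this one becomes
vacant.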

You might take a look at the Boehm garbage collector for C/C++ to see if
it addresses these issues.  Also interesting is the Harlequin GC
toolkit.  I'm speaking from experience with VisualWorks, the Smalltalk
environment, as I did the simple part of the work (the memory mapping)
while Barry Hayes did the hard part (the cross-segment compactor) in
allowing the VisualWorks VM to return memory to the O/S.

HTH

Robert Eanes wrote:
> 
> Hi-
> Reading the docs and searching the list archives, it seemed that
> python should release memory as soon as the last reference to an
> object is deleted.  However, I can't seem to prove this in practice.
> Look at this example:
> 
> Python 1.5.2 (#1, May 11 2000, 11:19:54)  [GCC 2.8.1] on sunos5
> Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
> >>> import os
> >>> blah = os.popen("ps -ef -o pid -o vsz | grep %s" % os.getpid())
> >>> memsize = blah.readlines()
> >>> print memsize
> ['23240 2360\012']
> >>> z = ["LSDKFJLSKDFJSDF"]*10000000 #allocate a large list
> >>> blah = os.popen("ps -ef -o pid -o vsz | grep %s" % os.getpid())
> >>> memsize = blah.readlines()
> >>> print memsize
> ['23240 41424\012']
> >>> del z #get rid of the large list
> >>> blah = os.popen("ps -ef -o pid -o vsz | grep %s" % os.getpid())
> >>> memsize = blah.readlines()
> >>> print memsize
> ['23240 41424\012']
> 
> The first number that comes out of the "print memsize" statement is
> the pid, the second is the amount of memory used.  It doesn't seem to
> be actually freeing the memory that was allocated to variable z.  Now
> I played around with it a little more and if I reallocate another list
> it does re-use that memory:
> 
> >>> x = ["LSDKFJLSKDFJSDF"]*10000000
> >>> blah = os.popen("ps -ef -o pid -o vsz | grep %s" % os.getpid())
> >>> memsize = blah.readlines()
> >>> print memsize
> ['23240 41424\012']
> 
> So apparently python knows that the memory formerly allocated to z is
> available, but it doesn't tell the OS.  Is this the case, and if so,
> is there any way around it?  I have an application that needs to run
> 80-150 python processes that can stay around for 15min to an hour, so
> I'd like them not to hold onto the maximum amount of memory that they
> use throughout the whole life of the process.
> 
> Thanks for any help,
> -Robert

-- 
_______________,,,^..^,,,____________________________
Eliot Miranda              Smalltalk - Scene not herd