Garbage collection

Steve Holden steve at holdenweb.com
Wed Mar 21 12:07:07 EDT 2007


Tom Wright wrote:
> skip at pobox.com wrote:
>>     Tom> ...and then I allocate a lot of memory in another process
>>     Tom> (eg. open a load of files in the GIMP), then the computer
>>     Tom> swaps the Python process out to disk to free up the necessary
>>     Tom> space.  Python's memory use is still reported as 953 MB, even
>>     Tom> though nothing like that amount of space is needed.  From what
>>     Tom> you said above, the problem is in the underlying C libraries,
>>     Tom> but is there anything I can do to get that memory back without
>>     Tom> closing Python?
>>
>> Not really.  I suspect the unused pages of your Python process are paged
>> out, but that Python has just what it needs to keep going.
> 
> Yes, that's what's happening.
> 
>> Memory contention would be a problem if your Python process wanted to keep
>> that memory active at the same time as you were running GIMP.
> 
> True, but why does Python hang on to the memory at all?  As I understand it,
> it's keeping a big lump of memory on the int free list in order to make
> future allocations of large numbers of integers faster.  If that memory is
> about to be paged out, then surely future allocations of integers will be
> *slower*, as the system will have to:
> 
> 1) page out something to make room for the new integers
> 2) page in the relevant chunk of the int free list
> 3) zero all of this memory and do any other formatting required by Python
> 
> If Python freed (most of) the memory when it had finished with it, then all
> the system would have to do is:
> 
> 1) page out something to make room for the new integers
> 2) zero all of this memory and do any other formatting required by Python
> 
> Surely Python should free the memory if it's not been used for a certain
> amount of time (say a few seconds), as allocation times are not going to be
> the limiting factor if it's gone unused for that long.  Alternatively, it
> could mark the memory as some sort of cache, so that if it needed to be
> paged out, it would instead be de-allocated (thus saving the time taken to
> page it back in again when it's next needed).
> 
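The effect Tom describes is easy enough to reproduce, by the way. Here's
a minimal sketch, assuming a Linux box (it reads /proc/self/status, which
other platforms don't provide); the vm_figures() helper is just for
illustration. On the CPython releases under discussion, the resident
figure barely drops after the del, because the freed ints go back onto
the interpreter's free list rather than to the OS:

def vm_figures():
    # Current virtual and resident sizes for this process, straight
    # from the kernel's per-process status file.
    figures = {}
    for line in open('/proc/self/status'):
        if line.startswith(('VmSize', 'VmRSS')):
            key, value = line.split(':')
            figures[key] = value.strip()
    return figures

print('before:      %s' % vm_figures())
big = list(range(10 * 1000 * 1000))   # roughly ten million ints
print('after alloc: %s' % vm_figures())
del big   # the list dies, but the ints stay cached inside the interpreter
print('after del:   %s' % vm_figures())
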
Easy to say. How do you know the memory that's not in use is in a 
contiguous block suitable for return to the operating system? I can 
pretty much guarantee it won't be. CPython doesn't use a relocating 
garbage collection scheme, so objects always stay at the same place in 
the process's virtual memory unless they have to be grown to accommodate 
additional data.
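
To make that concrete: in CPython, id() is the object's address, and it
stays fixed for the object's lifetime. A small illustration (this is an
observation about the CPython implementation, not a language guarantee):

import gc

class Node(object):
    pass

n = Node()
before = id(n)                           # in CPython, the object's address
junk = [Node() for _ in range(100000)]   # churn the heap a little
del junk
gc.collect()                             # force a full collection pass
assert id(n) == before                   # the object never moved
print('object stayed at %#x' % before)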
> 
>> I think the process's resident size is more important here than virtual
>> memory size (as long as you don't exhaust swap space). 
> 
> True in theory, but the computer does tend to go rather sluggish when paging
> large amounts out to disk and back.  Surely the use of virtual memory
> should be avoided where possible, as it is so slow?  This is especially
> true when the contents of the blocks paged out to disk will never be read
> again.
> 
Right. So all we have to do is identify those portions of memory that 
will never be read again and return them to the OS. That should be easy. 
Not.
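
If you want to put a number on that sluggishness, and assuming you're on
a POSIX system, the resource module exposes the kernel's count of major
page faults, i.e. the times the process stalled waiting for a page to
come back from disk:

import resource

# ru_majflt counts "major" page faults: memory accesses that had to
# wait for a page to be read back in from disk.
faults = resource.getrusage(resource.RUSAGE_SELF).ru_majflt
print('major page faults so far: %d' % faults)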
> 
> I've also tested similar situations on Python under Windows XP, and it shows
> the same behaviour, so I think this is a Python and/or GCC/libc issue,
> rather than an OS issue (assuming Python for Linux and Python for Windows
> are both compiled with GCC).
> 
It's probably a dynamic memory issue. Of course if you'd like to provide 
a patch to switch it over to a relocating garbage collection scheme 
we'll all await it with bated breath :)

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb     http://del.icio.us/steve.holden
Recent Ramblings       http://holdenweb.blogspot.com



