Garbage collection
Steven D'Aprano
steve at REMOVE.THIS.cybersource.com.au
Wed Mar 21 12:59:38 EDT 2007
On Wed, 21 Mar 2007 15:32:17 +0000, Tom Wright wrote:
>> Memory contention would be a problem if your Python process wanted to keep
>> that memory active at the same time as you were running GIMP.
>
> True, but why does Python hang on to the memory at all? As I understand it,
> it's keeping a big lump of memory on the int free list in order to make
> future allocations of large numbers of integers faster. If that memory is
> about to be paged out, then surely future allocations of integers will be
> *slower*, as the system will have to:
>
> 1) page out something to make room for the new integers
> 2) page in the relevant chunk of the int free list
> 3) zero all of this memory and do any other formatting required by Python
>
> If Python freed (most of) the memory when it had finished with it, then all
> the system would have to do is:
>
> 1) page out something to make room for the new integers
> 2) zero all of this memory and do any other formatting required by Python
>
> Surely Python should free the memory if it's not been used for a certain
> amount of time (say a few seconds), as allocation times are not going to be
> the limiting factor if it's gone unused for that long. Alternatively, it
> could mark the memory as some sort of cache, so that if it needed to be
> paged out, it would instead be de-allocated (thus saving the time taken to
> page it back in again when it's next needed)
And increasing the time it takes to re-create the objects in the cache
subsequently.
Maybe this extra effort is worthwhile when the free int list holds 10**7
ints, but is it worthwhile when it holds 10**6 ints? How about 10**5 ints?
10**3 ints?
How many free ints is "typical" or even "common" in practice?
The lesson I take from this: instead of creating such an enormous list of
integers with range() in the first place, use xrange().
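(An aside for anyone reading this on Python 3: range() there is already
lazy, essentially what xrange() was. A quick sketch with sys.getsizeof
shows why the lazy form stays small -- note getsizeof only counts the
object itself, so for the list it measures the pointer array, not the
int objects it refers to:)

```python
import sys

n = 10**7
lazy = range(n)              # Python 3's range is lazy, like 2.x's xrange
eager = list(range(10**5))   # materialises every element (smaller n here)

print(sys.getsizeof(lazy))   # a few dozen bytes, independent of n
print(sys.getsizeof(eager))  # grows with the element count (pointer array only)
```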
Fresh running instance of Python 2.5:
$ ps up 9579
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
steve 9579 0.0 0.2 6500 2752 pts/7 S+ 03:42 0:00 python2.5
Run from within Python:
>>> n = 0
>>> for i in xrange(int(1e7)):
... # create lots of ints, one at a time
... # instead of all at once
... n += i # make sure the int is used
...
>>> n
49999995000000L
And the output of ps again:
$ ps up 9579
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
steve 9579 4.2 0.2 6500 2852 pts/7 S+ 03:42 0:11 python2.5
Barely moved a smidgen.
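(The printed total is also easy to sanity-check against the closed form
n(n-1)/2. A quick check, written for Python 3 where summing a lazy
range() does the same one-int-at-a-time work as the xrange loop above:)

```python
n = 10**7
total = sum(range(n))            # lazy iteration, one int at a time
assert total == n * (n - 1) // 2
print(total)                     # 49999995000000, matching the session above
```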
For comparison, here's what ps reports after I create a single list with
range(int(1e7)), and again after I delete the list:
$ ps up 9579 # after creating list with range(int(1e7))
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
steve 9579 1.9 15.4 163708 160056 pts/7 S+ 03:42 0:11 python2.5
$ ps up 9579 # after deleting list
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
steve 9579 1.7 11.6 124632 120992 pts/7 S+ 03:42 0:12 python2.5
So there is another clear advantage to using xrange instead of range,
unless you specifically need all ten million ints all at once.
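(For what it's worth, a similar before/after comparison can be done from
inside the interpreter these days. tracemalloc is a Python 3.4+ feature,
not something available in 2.5, and it tracks Python's own allocations
rather than what the OS sees in ps -- which is exactly the distinction
this thread is about. A sketch:)

```python
import tracemalloc

tracemalloc.start()
big = list(range(10**6))                    # a smaller list, same idea
current, peak = tracemalloc.get_traced_memory()
del big                                     # return the memory to the allocator
after, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

# The peak far exceeds what remains traced after deletion, even though
# the OS-level RSS (as ps reports it) may not shrink by as much.
print(peak, after)
```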
--
Steven.