[Python-Dev] Fwd: Re: last minute GC questions

Neil Schemenauer nascheme@enme.ucalgary.ca
Fri, 30 Jun 2000 15:30:08 -0600


Oops, should have cc'd python-dev.

----- Forwarded message from Neil Schemenauer <nascheme@enme.ucalgary.ca> -----

Date: Fri, 30 Jun 2000 15:27:48 -0600
From: Neil Schemenauer <nascheme@enme.ucalgary.ca>
Subject: Re: last minute GC questions
To: Jeremy Hylton <jeremy@beopen.com>
X-Url: http://www.enme.ucalgary.ca/~nascheme/

On Fri, Jun 30, 2000 at 04:57:44PM -0400, Jeremy Hylton wrote:
> Might we change the strategy for deciding when to collect?

We might. :)

> It seems to me that counting deallocations only would be more
> effective.  It is only the deallocations that cause a live object to
> become garbage.

You can easily run out of memory with that strategy though:

    N = 10000
    while 1:
        l = []
        for i in xrange(N):
            l.append([])
        l[0] = l    # self-reference: l is now part of a cycle

You only get a couple of deallocations while a large amount of
garbage is created.  Think of large cyclic structures like graphs
being created and then becoming garbage due to one deallocation.
By counting the net new objects we guarantee that this doesn't
happen.
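
To make the difference concrete, here is a toy model of the two triggers
(a sketch only, not the collector's actual bookkeeping). It is written
for a current Python 3 rather than the xrange-era interpreter above.
The names count_triggers and cyclic_workload, the event strings, and
resetting the counters to zero after each trigger are all assumptions
made for illustration.

```python
THRESHOLD = 100  # hypothetical trigger threshold, matching the figure quoted in this thread


def count_triggers(events, threshold=THRESHOLD):
    """Return (deallocation-only triggers, net-allocation triggers)."""
    deallocs = 0          # strategy A: count deallocations only
    net = 0               # strategy B: allocations minus deallocations
    dealloc_triggers = 0
    net_triggers = 0
    for event in events:
        if event == "alloc":
            net += 1
        else:
            net -= 1
            deallocs += 1
        if deallocs >= threshold:
            dealloc_triggers += 1
            deallocs = 0  # assumed: counter resets after a collection
        if net >= threshold:
            net_triggers += 1
            net = 0       # assumed: counter resets after a collection
    return dealloc_triggers, net_triggers


def cyclic_workload(iterations, n):
    """Event stream for the loop above: n + 1 allocations, one deallocation per pass."""
    for _ in range(iterations):
        yield "alloc"         # l = []
        for _ in range(n):
            yield "alloc"     # l.append([])
        yield "dealloc"       # l[0] = l drops the old first element


dealloc_hits, net_hits = count_triggers(cyclic_workload(iterations=10, n=1000))
print(dealloc_hits, net_hits)  # -> 0 100
```

Under these assumptions the deallocation-only counter never reaches the
threshold while about ten thousand objects pile up, whereas the net
counter fires once per hundred new objects.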

> The other part of the strategy that might be changed is the collection
> frequency.  Right now, the threshold is 100 net allocations &
> deallocations.  On the compiler benchmark, this leads to some 2600
> collections, which seems like a lot.  (I have no idea why it seems
> like a lot, but it does.)

Try setting the threshold to zero.  The major part of the GC
overhead does not seem to be running the collector.  OTOH, the
frequency could probably be decreased without the risk of running
out of memory.  No Python applications currently exist that
create that amount of garbage anyhow.
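
For anyone who wants to try the experiment on a current CPython, the
threshold is exposed through the gc module. The sketch below assumes
that interface (gc.get_threshold, gc.set_threshold, gc.collect), which
may differ from the patch under discussion here. It turns automatic
collection off by setting the first threshold to zero, builds some
cyclic garbage, and then collects it by hand.

```python
import gc

saved = gc.get_threshold()   # current thresholds, returned as a tuple
gc.set_threshold(0)          # a first threshold of zero disables automatic collection
try:
    # Build cyclic garbage: each list refers to itself, so reference
    # counting alone can never free it.
    for _ in range(1000):
        cycle = []
        cycle.append(cycle)
    del cycle
    found = gc.collect()     # explicit full collection; returns the number of unreachable objects found
finally:
    gc.set_threshold(*saved) # restore the previous settings

print(found >= 1000)
```

Timing a benchmark with and without the set_threshold(0) call is one
way to see how much of the overhead comes from running the collector
as opposed to the rest of the GC bookkeeping.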

  Neil

----- End forwarded message -----