[Patches] [ python-Patches-551915 ] GC collection frequency bug

noreply@sourceforge.net
Fri, 03 May 2002 19:27:30 -0700


Patches item #551915, was opened at 2002-05-03 12:24
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=551915&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Daniel Dunbar (danieldunbar)
>Assigned to: Neil Schemenauer (nascheme)
Summary: GC collection frequency bug

Initial Comment:
 - Bug fix: the garbage collection frequency was incorrect
    in the presence of empty generations.

The generational cycle collection algorithm skipped
calling collect() on empty generations. However,
collect() has the side effect of resetting the
'allocated' global, on which the collection frequency
is based.

The result was that in pathological cases every
call to PyObject_GC_Malloc could trigger a collection,
for an extended number of calls to _GC_Malloc.
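
A toy model of the failure mode described above, as a sketch only.
This is not the gcmodule code: the class ToyGC, its attribute names,
and the threshold value of 700 are assumptions made for illustration.
It shows the threshold check firing on every allocation once the
counter passes the threshold, because the skipped collection never
resets it.

```python
THRESHOLD0 = 700  # assumed gen0 threshold for this sketch


class ToyGC:
    """Toy model of the allocation counter (not CPython code)."""

    def __init__(self):
        self.allocated = 0   # allocations since the counter was last reset
        self.gen0 = []       # tracked objects in the youngest generation
        self.attempts = 0    # times the threshold check fired

    def collect(self):
        # Survivors would move to an older generation (not modelled).
        # The side effect that matters here: the counter is reset.
        self.gen0.clear()
        self.allocated = 0

    def maybe_collect(self):
        self.attempts += 1
        if self.gen0:        # an empty generation is skipped...
            self.collect()   # ...so the reset is skipped with it

    def gc_malloc(self):
        if self.allocated > THRESHOLD0:
            self.maybe_collect()
        self.allocated += 1
        return object()


toy = ToyGC()
for _ in range(2000):
    toy.gc_malloc()          # never tracked, so gen0 stays empty

print(toy.attempts)          # 1299: every call from the 702nd onward
```

In this model the first 701 calls are quiet, and each of the
remaining 1299 calls enters the collection path and finds nothing
to collect.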

Note that this bug only has a noticeable effect
when many calls to _GC_Malloc are made
without tracking any of the returned objects.

It also raises the question of whether 'allocated'
should be updated from within _GC_Malloc at all, as it
is the _GC_Track function that actually causes objects
to be exposed to the collector.

The patch consolidates the empty-generation tests at
the top of gcmodule:collect(), and sets 'allocated'
to 0 if it skips the collection. Perhaps this
should instead be fixed by changing _GC_Track to
control the collection frequency.
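
The effect of resetting the counter on a skipped collection can be
sketched with a small simulation. The function run(), its
reset_on_skip flag, and the threshold of 700 are hypothetical names
and values chosen for this example; they do not come from the patch.

```python
THRESHOLD0 = 700  # assumed gen0 threshold for this sketch


def run(n_allocs, reset_on_skip):
    """Count how often the threshold check fires when nothing is tracked."""
    allocated = 0
    attempts = 0
    gen0_size = 0                      # no object is ever tracked
    for _ in range(n_allocs):
        if allocated > THRESHOLD0:
            attempts += 1
            if gen0_size > 0:
                allocated = 0          # a real collection resets the counter
            elif reset_on_skip:
                allocated = 0          # patched behaviour: reset on skip too
        allocated += 1
    return attempts


print(run(2000, reset_on_skip=False))  # 1299
print(run(2000, reset_on_skip=True))   # 2
```

With the reset in place, the check fires once per 701 allocations in
this model instead of on every allocation past the threshold.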

The collection frequency will still be incorrect if,
for large N, N objects are allocated without being
tracked (allocated=~N), a collection is run
(allocated=0), and then all N objects become tracked
(gen0 is now very large, but no collection runs, nor
does _GC_Malloc think one is needed).

In practice I believe this is unlikely to be 
important, as a new collection will probably
run very soon anyway.
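
The three steps of that remaining case can be traced by hand. In the
sketch below, the variable names, N = 100000, and the threshold of
700 are assumptions for illustration, and the threshold check during
step 1 is left out so that the three steps stay visible.

```python
THRESHOLD0 = 700   # assumed gen0 threshold for this sketch
N = 100000         # arbitrary "large N"

allocated = 0      # counter bumped at allocation time
gen0_size = 0      # objects actually visible to the collector

# Step 1: allocate N objects without tracking any of them.
for _ in range(N):
    allocated += 1
after_alloc = allocated

# Step 2: a collection runs and resets the counter.
allocated = 0

# Step 3: all N objects become tracked; the counter does not move.
gen0_size += N

wants_collection = allocated > THRESHOLD0
print(after_alloc, gen0_size, wants_collection)  # 100000 100000 False
```

gen0 holds N objects at the end, yet the counter reads 0, so an
allocation-based check sees no reason to collect.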


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2002-05-03 22:27

Message:

Oddly enough, this came up on Python-Dev this week, but in 
the context of deliberately untracking tracked items with 
no intent of deallocating them for a long time.  Same 
glitch:  pretty soon generation 0 is empty, and a useless 
gen0 collection gets run on nearly every GC malloc; worse, 
a gen2 collection gets run about 700x too often.

I view "allocated" as trying to guess the size of gen0, and 
agree the net excess of tracked less untracked is a better 
guess in edge cases.
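
One way to picture the "tracked less untracked" estimate is a counter
driven by track and untrack events rather than by allocation. This is
a sketch of the idea only: the class NetCounter, its method names,
the clamp at zero, and the threshold of 700 are all assumptions of
this example, not a proposed implementation.

```python
THRESHOLD0 = 700  # assumed gen0 threshold for this sketch


class NetCounter:
    """Estimate gen0 size as tracked minus untracked (illustrative only)."""

    def __init__(self):
        self.count = 0

    def track(self):
        self.count += 1

    def untrack(self):
        # Clamping at zero is a choice of this sketch: an untracked
        # object may have left gen0 in an earlier collection.
        self.count = max(0, self.count - 1)

    def collected(self):
        self.count = 0

    def should_collect(self):
        return self.count > THRESHOLD0


# Track and then untrack 1000 objects: gen0 is empty again, no collection.
a = NetCounter()
for _ in range(1000):
    a.track()
for _ in range(1000):
    a.untrack()
print(a.should_collect())   # False

# A collection runs, then 100000 objects become tracked: collection is due.
b = NetCounter()
b.collected()
for _ in range(100000):
    b.track()
print(b.should_collect())   # True
```

In this model, allocations that are never tracked do not move the
counter at all, and late tracking of many objects is noticed.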

Assigned to Neil for cogitation.

----------------------------------------------------------------------
