[Python-Dev] iterzip()

Neil Schemenauer nas@python.ca
Mon, 29 Apr 2002 15:52:47 -0700


Jeremy Hylton wrote:
> I'm not sure what your trick is, since you've only described it as a
> "decref counter."

Sorry.  I'm keeping track of the number of calls to PyObject_GC_Del
since the last collection.  While it's zero, collection doesn't happen.
That makes the justzip function run fast but doesn't seem to help
anywhere else.
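
For illustration, the heuristic can be modelled in a few lines of Python.
This is only a toy sketch, not the actual change to the collector: the
name count_collections, the "alloc"/"free" event strings and the
require_free flag are made up for this example, and it assumes the
generation 0 counter tracks allocations minus deallocations.

```python
def count_collections(events, threshold=700, require_free=True):
    """Count generation 0 collections for a stream of "alloc"/"free" events.

    With require_free=True, a collection is skipped while no object has
    been freed since the last collection (the trick described above).
    """
    net = 0          # allocations minus deallocations since last collection
    frees = 0        # deallocations since last collection
    collections = 0
    for ev in events:
        if ev == "alloc":
            net += 1
        elif ev == "free":
            net -= 1
            frees += 1
        else:
            raise ValueError("unknown event: %r" % (ev,))
        if net > threshold and (frees > 0 or not require_free):
            collections += 1
            net = 0
            frees = 0
    return collections


# A pure allocation burst, roughly what building one big list of tuples does.
burst = ["alloc"] * 10000
# A mixed workload where frees are interleaved with allocations.
mixed = ["alloc", "alloc", "free"] * 5000

print(count_collections(burst, require_free=False))
print(count_collections(burst, require_free=True))
print(count_collections(mixed, require_free=False))
print(count_collections(mixed, require_free=True))
```

In this model the gate only matters for the pure burst.  As soon as frees
are interleaved with allocations, both policies collect equally often,
which matches the observation that the trick doesn't help elsewhere.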

> I was imagining a scheme like this:  Count increfs and decrefs.  Set
> two thresholds.  A collection occurs when both thresholds are
> exceeded.  Perhaps 100 decrefs and 1000 increfs.

That would cut down on the collections more but I'm not sure how much in
practice.  In real code it seems like allocations and deallocations are
pretty mixed up.
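
A rough sketch of the two-threshold scheme, again as a toy model in
Python.  It counts allocations and deallocations rather than literal
increfs and decrefs, which is an assumption about what was meant; the
function name and the event strings are hypothetical.

```python
def count_collections_two_thresholds(events, alloc_threshold=1000,
                                     free_threshold=100):
    """Collect only when BOTH counters exceed their thresholds."""
    allocs = 0       # allocations since last collection
    frees = 0        # deallocations since last collection
    collections = 0
    for ev in events:
        if ev == "alloc":
            allocs += 1
        elif ev == "free":
            frees += 1
        else:
            raise ValueError("unknown event: %r" % (ev,))
        if allocs > alloc_threshold and frees > free_threshold:
            collections += 1
            allocs = 0
            frees = 0
    return collections


# Pure allocation never reaches the free threshold, so nothing is collected.
print(count_collections_two_thresholds(["alloc"] * 10000))
# With allocations and frees mixed, both thresholds are soon exceeded.
print(count_collections_two_thresholds(["alloc", "free"] * 5000))
```

When frees are mixed in evenly, the free threshold is always satisfied by
the time the allocation threshold is, so the scheme behaves like a single
allocation threshold.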

> How does this come into play in the benchmark in question?  It seems
> like we should have gotten a lot of quick collections, but it was
> still quite slow.

The GC cost is paid early and the objects get put in an older
generation.  Obviously that's a waste of time if they are deallocated in
the near future.  justpush deallocates as it goes so the GC is never
triggered.

I just tried measuring the time spent in the GC while loading some nasty
web pages in our system (stuff that looks at thousands of objects).  I
used the Pentium cycle counter since clock(3) seems to have very low
resolution.  Setting threshold0 to 7500 makes the GC take twice as much
time as with the default setting (700).  That surprised me.
I thought it wouldn't make much difference.  Maybe I screwed up. :-)
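
A comparable measurement can be sketched in Python on interpreters that
provide gc.callbacks.  This is not the method used above: it swaps the
cycle counter for time.perf_counter(), and time_in_gc and
build_many_containers are hypothetical names, the latter a stand-in for
a real workload.

```python
import gc
import time


def time_in_gc(workload, threshold0=None):
    """Run workload() and return (seconds spent in GC, number of collections)."""
    old = gc.get_threshold()
    total = 0.0
    runs = 0
    started = None

    def callback(phase, info):
        # The collector calls this with phase "start" and "stop".
        nonlocal total, runs, started
        if phase == "start":
            started = time.perf_counter()
        elif phase == "stop" and started is not None:
            total += time.perf_counter() - started
            runs += 1
            started = None

    if threshold0 is not None:
        # Change only threshold0; keep the other two thresholds as they were.
        gc.set_threshold(threshold0, *old[1:])
    gc.callbacks.append(callback)
    try:
        workload()
    finally:
        gc.callbacks.remove(callback)
        gc.set_threshold(*old)
    return total, runs


def build_many_containers(n=200000):
    # Hypothetical workload: many small container objects kept alive.
    return [[i] for i in range(n)]


if __name__ == "__main__":
    for t0 in (None, 7500):
        seconds, runs = time_in_gc(build_many_containers, threshold0=t0)
        print("threshold0=%r: %d collections, %.6f s in GC" % (t0, runs, seconds))
```

The numbers depend heavily on the workload and the interpreter version,
so a single run like this shouldn't be read as confirming or refuting
the result above.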

  Neil