[Python-Dev] Billions of gc's

Guido van Rossum guido@python.org
Tue, 30 Apr 2002 08:52:51 -0400


> Here's a question: suppose we've got a database result with 10K rows (I'd
> say that is fairly common), and we're processing each row with a regex
> (something that can't be done in SQL).  What's a ballpark for gc overhead
> before and after your fix?  (I'm still not set up to compile CVS, so I
> can't do it myself.)

For 10K objects, the GC overhead is negligible.  Jeremy did 100K
tuples and still found that 60% of the time was in malloc.  You only
get in trouble when you approach a million tuples.  Remember, it's
quadratic: the cost grows as the square of the number of tracked
objects, so it gets 100x worse with every factor-of-10 increase --
but also 100x better with every factor-of-10 decrease (until you run
into other effects, of course).
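
(A rough sketch for anyone who wants a ballpark on their own build --
this isn't from the thread, and the helper name time_alloc is mine.
It times bulk tuple allocation with the cyclic collector enabled vs.
disabled.  It's written for modern Python, where the fix discussed
here has long since landed, so the gap should stay small even at a
million tuples.)

    import gc
    import time

    def time_alloc(n, use_gc):
        """Build n gc-tracked tuples; return elapsed seconds."""
        (gc.enable if use_gc else gc.disable)()
        try:
            start = time.perf_counter()
            data = [(i,) for i in range(n)]  # each new tuple is tracked
            return time.perf_counter() - start
        finally:
            gc.enable()  # restore the default collector state

    for n in (10**4, 10**5, 10**6):
        print("%8d tuples: gc on %.3fs / gc off %.3fs"
              % (n, time_alloc(n, True), time_alloc(n, False)))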

--Guido van Rossum (home page: http://www.python.org/~guido/)