Program slowing down with greater memory use

Dan Stromberg strombrg at dcs.nac.uci.edu
Thu Jun 22 19:36:00 EDT 2006


I have two different Python programs that are slowing down quite a bit as
their memory use goes up.

I'm not really sure whether this is some sort of CPU cache effect, or
something about Python's garbage collector taking more time, or what.

One of the programs is one of those "how fast is data moving through the
pipe" measurement tools, called reblock.  I'd initially expected that the
program would run faster with large block sizes, because then you waste
less time in the C library, the system call interface, and context
switches.

But that turned out to be false: with a blocksize of about 2**18, it
runs much faster than with a blocksize of 2**22.
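
To make the comparison concrete, here's a minimal sketch of the sort of
timing loop I mean (not the actual reblock code; the names and byte
counts are just for illustration):

    import os
    import time

    def time_blocksize(blocksize, total=2**28):
        # Read `total` bytes from stdin (fd 0) in chunks of `blocksize`
        # and report elapsed wall-clock time and throughput.
        remaining = total
        start = time.time()
        while remaining > 0:
            block = os.read(0, min(blocksize, remaining))
            if not block:
                break  # EOF before `total` bytes arrived
            remaining -= len(block)
        elapsed = max(time.time() - start, 1e-9)
        done = total - remaining
        print('blocksize %d: %.2f s, %.1f MB/s'
              % (blocksize, elapsed, done / (1e6 * elapsed)))

    if __name__ == '__main__':
        # e.g.:  cat /dev/zero | python blocktest.py
        for power in (16, 18, 20, 22):
            time_blocksize(2 ** power)

Using os.read() sidesteps stdio buffering, so the blocksize given is the
size actually handed to each read() system call.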

Then another program, a search engine, slows way down if I run it
against too many search keywords.  The program inhales a bunch of
filenames at the beginning.  If I run it against n files and separately
against m files, with n > m, the n-file run is quite a bit slower even
at the point where it has handled only m files.  The program implements
a memory cache to speed up database operations, but even when the size
of my cache is nowhere near the size of the system's physical memory,
it still slows way down in this enigmatic way.
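
By "memory cache" I mean something along these lines (a simplified,
hypothetical sketch, not the actual code - the BoundedCache name and
its cap are made up for this example):

    class BoundedCache(object):
        # A plain dict in front of the database, capped at
        # `maxentries` entries so it can't grow without bound.
        def __init__(self, maxentries=100000):
            self.maxentries = maxentries
            self.data = {}

        def get(self, key, default=None):
            return self.data.get(key, default)

        def put(self, key, value):
            if key not in self.data and len(self.data) >= self.maxentries:
                # Evict an arbitrary entry to stay under the cap; a
                # real cache would evict least-recently-used entries.
                self.data.popitem()
            self.data[key] = value

Even with maxentries set low enough that the cache stays well under
physical memory, the slowdown shows up.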

This makes me wonder.

What's the deal here?  Is the garbage collector working overtime on
these programs because they have a lot of objects?  One of the programs
above (reblock) doesn't really have a large number of objects, though -
just one huge string.
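
One experiment that should answer at least part of this (a sketch -
timed is just an illustrative name): run the hot loop with the cyclic
collector switched off and see whether the slowdown goes away:

    import gc
    import time

    def timed(work, disable_gc=False):
        # Run `work()` and return the elapsed time, optionally with
        # the cyclic garbage collector disabled.  Reference counting
        # still frees most garbage; gc.disable() only stops the cycle
        # detector, whose passes take longer as the number of tracked
        # objects grows.
        if disable_gc:
            gc.disable()
        try:
            start = time.time()
            work()
            return time.time() - start
        finally:
            gc.enable()

If the runs stop slowing down with the collector off, the collector
really is working overtime, and gc.set_threshold() could be tuned
instead of disabling it outright.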

Thanks!



