something Evil happens when large hashes are destroyed

Skip Montanaro skip at pobox.com
Sun Nov 18 12:37:19 EST 2001


    guuge> I was trying to sort about 100,000 items by splitting them into
    guuge> groups (using a python dictionary type) and then successively
    guuge> splitting these groups until they were small enough to use a
    guuge> brute force method.

    guuge> My test program, which used only two or three keys to create the
    guuge> first split, worked fine.  When I tried the real thing, using 256
    guuge> keys, the program slowed to a crawl.  The python interpreter took
    guuge> forever to destroy the dictionaries.
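
The splitting scheme described sounds like an MSD radix sort on strings.  A
rough sketch of the idea (my reconstruction, not guuge's actual code):

    def split_sort(strings, depth=0, threshold=50):
        # small groups are cheap to finish with a brute-force sort
        if len(strings) <= threshold:
            strings.sort()
            return strings
        buckets = {}   # one bucket per character at this depth - up to 256 keys
        done = []      # strings exhausted at this depth; within a bucket
                       # they are all identical, so no further sorting needed
        for s in strings:
            if len(s) <= depth:
                done.append(s)
            else:
                buckets.setdefault(s[depth], []).append(s)
        result = done
        keys = buckets.keys()
        keys.sort()
        for k in keys:
            result.extend(split_sort(buckets[k], depth + 1, threshold))
        return result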

I believe this topic has been discussed recently.  When a dictionary is
deleted, the keys are traversed in their "natural order", that is, as they
are laid out in the dict.  However, the values stored in the dict are
scattered all over the place, so lots of small chunks of memory are freed.
Your underlying malloc library is probably spending lots of time trying to
coalesce small chunks of freed memory into bigger chunks.
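
A quick way to confirm that teardown itself is the bottleneck is to time the
deletion directly.  A rough sketch:

    import time

    d = {}
    for i in xrange(100000):
        d[i] = [i] * 10   # each value is a separate small heap allocation

    t = time.time()
    del d                 # dealloc walks the table, freeing every key and value
    print "teardown took %.2f seconds" % (time.time() - t)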

One workaround seems to be to compile Python with pymalloc enabled (the
--with-pymalloc configure option).  Another is to use os._exit to exit your
program (assuming your big hashes are still alive at program exit, so their
storage can simply be abandoned to the operating system rather than freed
chunk by chunk).
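
For the latter, something along these lines.  Note that os._exit skips all
normal interpreter cleanup, including stdio buffer flushing, so flush
anything you care about first:

    import os
    import sys

    # ... build and use the big dictionaries ...

    sys.stdout.flush()   # os._exit does not flush stdio buffers
    os._exit(0)          # exit immediately, skipping dict teardown entirely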

On the other hand, are you sure you're not just running out of memory?

-- 
Skip Montanaro (skip at pobox.com - http://www.mojam.com/)



