Reducing memory overhead for dictionaries by removing precomputed hash

kirat.singh at gmail.com
Mon Apr 17 22:45:05 EDT 2006


Forgive me if this has already been discussed, but it seems to me that
one could reduce the per-entry memory usage of dictionaries by about a
third by removing the precomputed hash stored in each bucket (each
bucket holds a hash plus a key pointer and a value pointer, so the hash
is one field of three).
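
To make the arithmetic concrete, here's a toy Python model of one
bucket (the real thing is a small C struct holding a long hash and two
PyObject pointers, but the proportions are the same):

    # Toy model of one dict bucket: three fields, one of which is
    # the cached hash.  Dropping the hash leaves two of the three
    # fields, hence roughly a one-third saving per bucket.
    class Bucket:
        __slots__ = ('hash', 'key', 'value')

        def __init__(self, h, key, value):
            self.hash = h        # precomputed hash(key)
            self.key = key
            self.value = value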

Since dictionary keys must be hashable (in practice almost always
immutable objects), one could move the precomputed hash into the key
objects themselves.

* Strings are probably the most popular keys for dictionaries, and they
already cache the hash (hmm, that almost rhymes).
* Numbers are trivial to hash.
* For tuples one could add a member to cache the hash; a sketch of the
idea follows this list.
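
Here's a rough Python-level sketch of what a hash-caching key could
look like (a hypothetical class for illustration, not CPython's C
implementation; strings do essentially this with their cached hash
field):

    class HashCachingKey:
        """Hypothetical key that computes its hash once and then
        reuses it, the way str objects cache theirs."""

        def __init__(self, payload):
            self.payload = payload
            self._hash = None          # not computed yet

        def __hash__(self):
            if self._hash is None:
                self._hash = hash(self.payload)  # compute once
            return self._hash

        def __eq__(self, other):
            return (isinstance(other, HashCachingKey)
                    and self.payload == other.payload)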

So why store the hash in the table at all? I imagine it makes rebuilding
the dictionary a bit quicker: the hash of each entry can be reused
without recomputing it, and since you know the old table contains no
duplicate keys, you can avoid the equality comparisons entirely. I
haven't looked at the code to see whether it actually does that, though.
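
Rebuilding with stored hashes would look roughly like this (a
simplified sketch using linear probing, not CPython's actual probe
sequence): every surviving entry already carries its hash, so there
are no hash() calls, and since the old table holds no duplicates, no
key comparisons are needed either.

    def resize(old_buckets, new_size):
        """Sketch: rebuild into a table of new_size slots (a power
        of two).  old_buckets holds (hash, key, value) triples, or
        None for empty slots."""
        mask = new_size - 1
        new_buckets = [None] * new_size
        for entry in old_buckets:
            if entry is None:
                continue
            h, key, value = entry
            i = h & mask                       # stored hash: no hash(key)
            while new_buckets[i] is not None:  # no duplicates exist, so
                i = (i + 1) & mask             # no key comparison needed
            new_buckets[i] = entry
        return new_buckets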

Collision chains are possibly also a bit quicker to traverse, since a
stored hash lets you skip the full key comparison whenever the hashes
differ. On the other hand, I think Python derives the slot index by
masking the hash with the table size rather than taking it mod a prime,
so hashes that share the same low bits will collide, and collisions
might be more common. That's possibly fixable by mixing the bits with
something like a linear congruential step, but that's all black magic
to me.
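
For example, since the slot index is just hash & (table_size - 1), two
hashes that agree in their low bits land in the same slot however
different their high bits are (I believe CPython's probing folds the
higher bits into the probe sequence once a collision occurs, which
mitigates this):

    # Two hashes that share their low bits collide under the mask,
    # however different their high bits are.
    table_size = 8                  # power of two, so mask is 0b111
    mask = table_size - 1
    h1 = 0b10101101                 # low three bits: 101
    h2 = 0b00110101                 # same low three bits
    assert h1 & mask == h2 & mask   # both map to slot 5
    print(h1 & mask, h2 & mask)     # -> 5 5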

-Kirat



