don't need dictionary's keys - hash table?

Paul Rubin http
Wed Jul 12 18:46:56 EDT 2006


kdotsky at gmail.com writes:
> do it this way am I going to get the memory savings I am after?  Will
> the hash function always generate unique keys?  Also, would the same
> technique work for a set?

A hash function that always generates unique hashes is called a
perfect hash.  They tend to be slow, and of course, to generate one,
you need to know all the keys in advance.

If you use a long cryptographic hash like md5 or sha, the chances of 
collision is very small.  That's probably your best bet.

In general, if your hash is N bits long, you will probably get
collisions once you have around 2**(N/2) keys.  sha is a 160
bit hash so getting collisions by accident is extremely unlikely.



More information about the Python-list mailing list