Determining number of dict key collisions in a dictionary
Roger Binns
rogerb at rogerbinns.com
Tue Dec 2 14:38:49 EST 2008
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
python at bdurham.com wrote:
> Background: I'm working on a project using very large dictionaries (64
> bit Python) and question from my client is how effective is Python's
> default hash technique for our data set?
Python hash functions return a long which in a 64 bit process is 32 bits
on Windows and 64 bits on pretty much every other 64 bit environment.
> Their concern is based on the
> belief that Python's default dictionary hash scheme is optimized for 32
> bit vs. 64 bit environments and may not have anticipated the additional
> range of keys that can be generated in a 64 bit environment. Our keys
> are based on 20 to 44 byte ASCII (7-bit) alpha-numeric strings.
Why not have them look at the source code? It is well commented and
there is another file with various notes. Look at Objects/dictobject.c
and Objects/dictnotes.txt
A teaser comment for you:
Most hash schemes depend on having a "good" hash function, in
the sense of simulating randomness. Python doesn't.
Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
iEYEARECAAYFAkk1jsUACgkQmOOfHg372QTeEQCeJwkRphiPeDefkANg1IdG3HH1
oocAoICJk6NGxVmtZTZtLOL4Sv4aCw1n
=IqsO
-----END PGP SIGNATURE-----
More information about the Python-list
mailing list