[Python-Dev] Status of the fix for the hash collision vulnerability

Stephen J. Turnbull stephen at xemacs.org
Tue Jan 17 12:10:38 CET 2012


martin at v.loewis.de writes:
 > >> It doesn't change anything, you will still get collisions.
 > >
 > >
 > > That depends right? If the collision is because they all have the same
 > > hash(), yes. It might be different if it is because the secondary hashing
 > > (or whatever it's called :-) causes collisions.
 > 
 > But Python deals with the latter case just fine already. The open hashing
 > approach relies on the dict resizing "enough" to prevent collisions after
 > the dictionary has grown. Unless somebody can demonstrate a counter example,
 > I believe this discussion is a red herring.
 >
 > Plus: if an attacker could craft keys that deliberately cause collisions
 > because of the dictionary size, they could likely also craft keys in the same
 > number that collide on actual hash values, bringing us back to the original
 > problem.

I thought that the original problem was that with N insertions in the
dictionary, by repeatedly inserting different keys generating the same
hash value an attacker could arrange that the cost of finding an open
slot is O(N), and thus the cost of N insertions is O(N^2).

If so, frequent resizing could make the attacker's problem much more
difficult, as the distribution of secondary probes should change with
each resize.


More information about the Python-Dev mailing list