[issue34751] Hash collisions for tuples
Jeroen Demeyer
report at bugs.python.org
Mon Sep 24 05:45:54 EDT 2018
Jeroen Demeyer <J.Demeyer at UGent.be> added the comment:
> Has anyone figured out the real source of the degeneration when mixing in negative integers?
The underlying reason for the collisions is the following mathematical relation:
x ^ -x = -2 << i
where i is the index of the smallest set bit of x. In particular, x ^ -x = -2 for odd x.
Now consider two consecutive hash iterations:
y = (x ^ a) * m1
z = (y ^ b) * m2
and suppose that x ^ a is odd. Now if we replace a by a ^ -2, then x ^ a will be replaced by -(x ^ a) and y will be replaced by -y. If we also replace b by b ^ -2, then y ^ b will be replaced by y ^ b. In other words, we have a collision.
This kind of bad interaction between ^ and * only happens with the FNV-style hashing. The Bernstein hash using + instead of ^ does not suffer from this problem. That makes it a better choice in my opinion.
----------
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue34751>
_______________________________________
More information about the Python-bugs-list
mailing list