[issue34751] Hash collisions for tuples

Jeroen Demeyer report at bugs.python.org
Mon Sep 24 05:45:54 EDT 2018


Jeroen Demeyer <J.Demeyer at UGent.be> added the comment:

> Has anyone figured out the real source of the degeneration when mixing in negative integers?

The underlying reason for the collisions is the following mathematical relation:

x ^ -x = -2 << i

where i is the index of the smallest set bit of x. In particular, x ^ -x = -2 for odd x.

Now consider two consecutive hash iterations:

y = (x ^ a) * m1
z = (y ^ b) * m2

and suppose that x ^ a is odd. Now if we replace a by a ^ -2, then x ^ a will be replaced by -(x ^ a) and y will be replaced by -y. If we also replace b by b ^ -2, then y ^ b will be replaced by y ^ b. In other words, we have a collision.

This kind of bad interaction between ^ and * only happens with the FNV-style hashing. The Bernstein hash using + instead of ^ does not suffer from this problem. That makes it a better choice in my opinion.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue34751>
_______________________________________


More information about the Python-bugs-list mailing list