[issue34751] Hash collisions for tuples
Jeroen Demeyer
report at bugs.python.org
Tue Oct 2 08:38:23 EDT 2018
Jeroen Demeyer <J.Demeyer at UGent.be> added the comment:
SeaHash seems to be designed for 64 bits. I'm guessing that replacing the shifts by
    x ^= ((x >> 16) >> (x >> 29))
would be what you'd do for a 32-bit hash. Alternatively, we could always compute the hash with 64 bits (using uint64_t) and then truncate at the end if needed.
However, when testing the hash function
    for t in INPUT:
        x ^= hash(t)
        x *= MULTIPLIER
        x ^= ((x >> 16) >> (x >> 29))
        x *= MULTIPLIER
it fails horribly on both the original test suite and my new one. I'm guessing the problem is that the line x ^= ((x >> 16) >> (x >> 29)) ignores the low-order bits of x, so the result is too close to pure FNV, which is known to have problems. When the first line of the loop above is replaced by x += hash(t) (DJB-style), the result becomes too close to pure DJB and also fails horribly because of nested tuples.
So it doesn't seem that the line x ^= ((x >> 16) >> (x >> 29)) (which is what makes SeaHash special) really helps much to solve the known problems with DJB or FNV.
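For anyone who wants to play with this, here is a minimal Python sketch of the 32-bit loop described above. The names MASK32, START, shift_mix32 and tuple_hash32 are made up for illustration, and the MULTIPLIER and START values are stand-ins (the 32-bit FNV prime and offset basis), not necessarily the constants used in the actual tests. Python integers are unbounded, so the sketch masks to 32 bits explicitly.

```python
MASK32 = 0xFFFFFFFF
MULTIPLIER = 0x01000193   # assumed stand-in: 32-bit FNV prime
START = 0x811C9DC5        # assumed stand-in: 32-bit FNV offset basis


def shift_mix32(x):
    """Apply x ^= ((x >> 16) >> (x >> 29)) to a 32-bit value."""
    return x ^ ((x >> 16) >> (x >> 29))


def tuple_hash32(items, djb_style=False):
    """Combine item hashes with the loop under test, truncated to 32 bits."""
    x = START
    for t in items:
        h = hash(t) & MASK32
        # First line of the loop: XOR (FNV-style) or addition (DJB-style).
        x = (x + h) & MASK32 if djb_style else x ^ h
        x = (x * MULTIPLIER) & MASK32
        x = shift_mix32(x)
        x = (x * MULTIPLIER) & MASK32
    return x


# With a 32-bit x, the value XORed in by shift_mix32 depends only on the
# upper 16 bits of x, so two inputs that differ only in their low 16 bits
# get exactly the same mask applied.
a, b = 0x12340000, 0x1234FFFF
same_mask = (shift_mix32(a) ^ a) == (shift_mix32(b) ^ b)
```

Under these assumptions same_mask is true, which illustrates the sense in which the shift line ignores the low-order bits of x. Passing djb_style=True switches the first line of the loop to the additive variant.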
----------
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue34751>
_______________________________________