[issue34751] Hash collisions for tuples
Tim Peters
report at bugs.python.org
Tue Oct 2 17:41:39 EDT 2018
Tim Peters <tim at python.org> added the comment:
> >>> from itertools import product
> >>> len(set(map(hash, product([0.5, 0.25], repeat=20))))
> 32
> Good catch! Would you like me to add this to the testsuite?
It's in mine already ;-) I've added all the "bad examples" in all the messages here. Sooner or later they'll get folded into Python's test suite.
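For anyone who wants to rerun a "bad example" like the quoted one, here is a minimal sketch of a collision counter. The helper name collision_report is made up for illustration and is not part of any test suite. I shrank repeat from 20 to 12 so it runs quickly; the counts you get depend on which tuple hash your Python build uses.

```python
from itertools import product


def collision_report(tuples):
    """Return (number of distinct tuples, number of distinct hash values).

    Hypothetical helper for illustration only.
    """
    distinct = set(tuples)
    return len(distinct), len({hash(t) for t in distinct})


# Same inputs as the quoted example, but with a smaller repeat (an assumption
# made here purely to keep the run short).
n_tuples, n_hashes = collision_report(product([0.5, 0.25], repeat=12))
print(n_tuples, "tuples,", n_hashes, "distinct hashes")
```

If the two numbers differ, the difference is the number of tuples lost to collisions.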
BTW, that example produced no collisions under whatever 64-bit Python I last compiled. That was a SeaHash variant. I'm not certain, but I believe it had "t ^= t << 1" at the start and the first multiply commented out.
Having learned _something_ about why SeaHash does what it does, I'm not convinced the first multiply is of much value. As a standalone bit-scrambler for a single 64-bit input, it does matter. But in the tuple hash context, we're running it in a loop. Strictly alternating "propagate left" and "propagate right" seems to me almost as good - although that's just intuition.
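To make the "alternating" idea concrete, here is a rough sketch in pure Python. It is NOT the variant described above, which was C code I can't reproduce exactly. The names scramble, toy_tuple_hash and count_distinct, the seed, and the right-shift step are all assumptions for illustration; the multiplier is the one I recall from SeaHash's diffuse step, but any odd 64-bit constant would do for the sketch.

```python
from itertools import product

MASK = (1 << 64) - 1          # emulate 64-bit unsigned arithmetic
MULT = 0x6EED0E9DA4D94A4F     # assumed odd multiplier (recalled from SeaHash)


def scramble(t):
    """One round: propagate left, propagate right, then multiply (left again)."""
    t &= MASK
    t ^= (t << 1) & MASK          # propagate left: each bit affects its left neighbor
    t ^= (t >> 32) >> (t >> 60)   # propagate right: high bits affect low bits
    t = (t * MULT) & MASK         # multiply: carries push information toward high bits
    return t


def toy_tuple_hash(items, seed=0x345678):
    """Hypothetical tuple hash: fold each item's hash in, scrambling every step."""
    acc = seed
    for item in items:
        acc = scramble(acc ^ (hash(item) & MASK))
    return acc


def count_distinct(repeat):
    """Number of distinct toy hashes over all tuples from product([0.5, 0.25])."""
    tuples = product([0.5, 0.25], repeat=repeat)
    return len({toy_tuple_hash(t) for t in tuples})
```

Because the loop runs scramble once per item, a multiply from one iteration is followed by the left and right shifts of the next, so the directions keep alternating even with only one multiply per round. Calling count_distinct(10) and comparing the result with 2**10 gives a quick feel for how well a given round function separates these inputs.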
----------
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue34751>
_______________________________________