[issue38105] hash collision when hash(x) == -2 causes many calls to __eq__

Tim Peters report at bugs.python.org
Thu Sep 12 18:23:23 EDT 2019


Tim Peters <tim at python.org> added the comment:

A more principled change would be to replace instances of this:

    i = (i*5 + perturb + 1) & mask;

with this:

    i = (i*5 + (perturb << 1) + 1) & mask;

The latter spelling has no fixed points.  That's easy to see:  `(perturb << 1) + 1` is necessarily odd, and then `i*5 + ODD` is even when `i` is odd, and vice versa.

I had hoped that the time for a new shift could overlap with the multiply latency, but no such luck.  At least Visual Studio didn't multiply to begin with:  it first computes `i*4 + perturb` cheaply with a single LEA instruction, then adds 1 (INC), then adds in `i` again.

I expect it would be able to fold in a "free" shifted add of perturb with another LEA, so the latter spelling isn't necessarily more expensive.  But I'm generally loathe to increase operation counts relying on clever compilers to exploit processor-specific tricks.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue38105>
_______________________________________


More information about the Python-bugs-list mailing list