[issue14621] Hash function is not randomized properly
Mark Dickinson
report at bugs.python.org
Wed Nov 7 12:55:11 CET 2012
Mark Dickinson added the comment:
[MAL]
> I don't understand why we are only trying to fix the string problem
> and completely ignore other key types.
[Armin]
> estimating the risks of giving up on a valid query for a truly random
> hash, at an overestimated one billion queries per second ...
That's fine in principle, but if this gets extended to integers, note that our current integer hash is about as far from 'truly random' as you can get:
Python 3.4.0a0 (default:f02555353544, Nov 4 2012, 11:50:12)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> [hash(i) for i in range(20)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
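To make the predictability concrete, here is a minimal sketch of how colliding integer keys could be constructed. It assumes CPython 3.2 or later, where non-negative ints hash to their value reduced modulo sys.hash_info.modulus; the helper name colliding_ints is hypothetical and used only for illustration.

```python
import sys

# Assumption: CPython 3.2+, where hash(n) == n % M for non-negative ints.
M = sys.hash_info.modulus  # 2**61 - 1 on typical 64-bit builds


def colliding_ints(target, count):
    """Return `count` distinct non-negative ints expected to share hash(target).

    Hypothetical helper for illustration; it relies on the modular
    reduction described above rather than on any documented guarantee
    about collisions.
    """
    return [target + k * M for k in range(count)]


keys = colliding_ints(7, 5)
hashes = {hash(k) for k in keys}
print(len(set(keys)), "distinct keys,", len(hashes), "distinct hash value(s)")
```

No search is needed to find such keys, which is the sense in which the int hash is far from random.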
Moreover, it's going to be *very* hard to change the int hash while preserving the `x == y implies hash(x) == hash(y)` invariant across all the numeric types (int, float, complex, Decimal, Fraction, 3rd-party types that need to remain compatible).
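As a rough illustration of that invariant, the sketch below checks that equal values of different numeric types hash alike and therefore act as a single dict key. It uses only standard-library types and assumes Python 3.2 or later, where comparisons between float and Decimal are supported.

```python
from decimal import Decimal
from fractions import Fraction

# Values that all compare equal to the int 1 across numeric types.
ones = [1, 1.0, complex(1, 0), Decimal(1), Fraction(1, 1), True]
ones_equal = all(v == 1 for v in ones)
ones_hashes = {hash(v) for v in ones}

# The same must hold for non-integral values such as one half.
halves = [0.5, Fraction(1, 2), Decimal("0.5")]
halves_equal = all(v == 0.5 for v in halves)
halves_hashes = {hash(v) for v in halves}

# Because equal keys must hash equally, they collapse to one dict entry.
d = {}
for v in ones:
    d[v] = type(v).__name__

print(ones_equal, len(ones_hashes), halves_equal, len(halves_hashes), len(d))
```

Any randomized int hash would have to keep all of these checks passing, including for third-party numeric types that implement the same scheme.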
----------
nosy: +mark.dickinson
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14621>
_______________________________________