[issue13703] Hash collision security issue
STINNER Victor
report at bugs.python.org
Wed Jan 11 10:56:12 CET 2012
STINNER Victor <victor.stinner at haypocalc.com> added the comment:
> * it is exceedingly complex
Which part exactly? For hash(str), it just add two extra XOR.
> * the method would need to be implemented for all hashable Python types
It was already discussed, and it was said that only hash(str) need to
be modified.
> * it causes startup time to increase (you need urandom data for
> every single hashable Python data type)
My patch reads 8 or 16 bytes from /dev/urandom which doesn't block. Do
you have a benchmark showing a difference?
I didn't try my patch on Windows yet.
> * it causes run-time to increase due to changes in the hash
> algorithm (more operations in the tight loop)
I posted a micro-benchmark on hash(str) on python-dev: the overhead is
nul. Did you have numbers showing that the overhead is not nul?
> * causes different processes in a multi-process setup to use different
> hashes for the same object
Correct. If you need to get the same hash, you can disable the
randomized hash (PYTHONHASHSEED=0) or use a fixed seed (e.g.
PYTHONHASHSEED=42).
> * doesn't appear to work well in embedded interpreters that
> regularly restarted interpreters (AFAIK, some objects persist across
> restarts and those will have wrong hash values in the newly started
> instances)
test_capi runs _testembed which restarts a embedded interpreters 3
times, and the test pass (with my patch version 5). Can you write a
script showing the problem if there is a real problem?
In an older version of my patch, the hash secret was recreated at each
initiliazation. I changed my patch to only generate the secret once.
> The most important issue, though, is that it doesn't really
> protect Python against the attack - it only makes it less
> likely that an adversary will find the init vector (or a way
> around having to find it via crypt analysis).
I agree that the patch is not perfect. As written in the patch, it
just makes the attack more complex. I consider that it is enough.
Perl has a simpler protection than the one proposed in my patch. Is
Perl vulnerable to the hash collision vulnerability?
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13703>
_______________________________________
More information about the Python-bugs-list
mailing list