[issue13703] Hash collision security issue
STINNER Victor
report at bugs.python.org
Wed Jan 4 02:54:53 CET 2012
STINNER Victor <victor.stinner at haypocalc.com> added the comment:
> https://gist.github.com/0a91e52efa74f61858b5
Please, attach directly a file to the issue, or copy/paste the code in your comment. Interesting part the code:
---
#Proposed replacement
#--------------------------------------
import os, array
size_exponent = 14 #adjust as a memory/security tradeoff
r = array.array('l', os.urandom(2**size_exponent))
len_r = len(r)
def _hash_string2(s):
"""The algorithm behind compute_hash() for a string or a unicode."""
length = len(s)
#print s
if length == 0:
return -1
x = (ord(s[0]) << 7) ^ r[length % len_r]
i = 0
while i < length:
x = intmask((1000003*x) ^ ord(s[i]))
x ^= r[x % len_r]
i += 1
x ^= length
return intmask(x)
---
> r = array.array('l', os.urandom(2**size_exponent))
> len_r = len(r)
r size should not depend on the size of a long. You should write something like:
sizeof_long = ctypes.sizeof(ctypes.c_long)
r_bits = 8
r = array.array('l', os.urandom((2**r_bits) * sizeof_long))
r_mask = 2**r_bits-1
and then replace "% len_r" by "& r_mask".
What is the minimum value of r_bits? For example, would it be safe to use a single long integer? (r_bits=1)
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13703>
_______________________________________
More information about the Python-bugs-list
mailing list