[Python-Dev] Hash collision security issue (now public)

Christian Heimes lists at cheimes.de
Sun Jan 1 16:27:51 CET 2012


On 01.01.2012 16:13, Guido van Rossum wrote:
> Different concern. What if someone were to have code implementing an
> external, persistent hash table, using Python's hash() function? They
> might have a way to rehash everything when a new version of Python comes
> along, but they would not be happy if hash() is different in each
> process. I somehow vaguely remember possibly having seen such code, or
> something else where a bit of random data was needed and hash() was used
> since it's so easily available.

I had the same concern as you and was worried that projects like ZODB
might require a stable hash function. Fred already stated that ZODB
doesn't use the hash in its btree structures.

Possible solutions:

 * make it possible to provide the seed as an env var

 * disable randomization by default, or at least add an option to
disable it
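
A rough sketch of how the env var option could look from the user's
side. It assumes the interpreter reads its seed from an environment
variable called PYTHONHASHSEED, with 0 meaning "no randomization"; that
is the name later CPython releases use, but nothing in this thread has
settled on it. hash_in_subprocess is a hypothetical helper for
illustration only.

```python
import os
import subprocess
import sys


def hash_in_subprocess(text, seed):
    """Return hash(text) as computed by a fresh interpreter.

    Assumption: the interpreter honours PYTHONHASHSEED (an integer
    seed, where 0 switches randomization off).
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
    )
    return int(out.decode().strip())


# Two separate processes with the same fixed seed should agree ...
fixed_a = hash_in_subprocess("spam", 42)
fixed_b = hash_in_subprocess("spam", 42)

# ... and so should two processes with randomization disabled.
plain_a = hash_in_subprocess("spam", 0)
plain_b = hash_in_subprocess("spam", 0)

print(fixed_a == fixed_b, plain_a == plain_b)
```

With a fixed seed, an external persistent hash table would only need
rehashing when the seed or the Python version changes, not on every
process start.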

IMHO this needs a PEP that explains the issue, shows all possible
solutions and describes how we solved it. I'm willing to start the
PEP. Who would like to co-author it?

Christian

