Hash stability

Stefan Behnel stefan_ml at behnel.de
Sun Jan 15 12:20:02 EST 2012


Chris Angelico, 15.01.2012 17:13:
> Of course, it's still dodgy to depend on the stability of something
> that isn't proclaimed stable, and would be far better to use some
> other hashing algorithm (MD5 or SHA for uberreliability).

I've seen things like MD5 or SHA* being used quite commonly for file caches
(or file storage in general, e.g. for related files referenced in a text
document). Given that these algorithms are right there in the stdlib, I
find them a rather obvious choice.

However, note that they may also be subject to complexity attacks at some
point, although likely requiring substantially more input data. In the
specific case of a cache, an attacker may only need an arbitrary set of
colliding hashes. Those can be calculated in advance for a given hash
function. For example, Wikipedia currently presents MD5 with a collision
complexity of ~2^20, that sounds a bit weak. Something like SHA256 should
be substantially more robust.

https://en.wikipedia.org/wiki/Cryptographic_hash_function#Cryptographic_hash_algorithms

Stefan




More information about the Python-list mailing list