Hash stability

Bryan bryanjugglercryptographer at yahoo.com
Sun Jan 15 07:03:53 EST 2012


Chris Angelico wrote:
> Suggestion: Create a subclass of dict, the SecureDict or something,
> which could either perturb the hashes or even use a proper
> cryptographic hash function; normal dictionaries can continue to use
> the current algorithm. The description in Objects/dictnotes.txt
> suggests that it's still well worth keeping the current system for
> programmer-controlled dictionaries, and only change user-controlled
> ones (such as POST data etc).

I have to disagree; that's not how the world works, at least not
anymore. Competent, skilled, dedicated programmers have over and over
again failed to appreciate the importance and the difficulty of
maintaining proper function in an adversarial environment. The tactic
of ignoring security issues unless and until they are proven
problematic stands utterly discredited.

> It would then be up to the individual framework and module authors to
> make use of this, but it would not impose any cost on the myriad other
> uses of dictionaries - there's no point adding extra load to every
> name lookup just because of a security issue in an extremely narrow
> situation. It would also mean that code relying on hash(str) stability
> wouldn't be broken.

That seemingly "extremely narrow situation" turns out to be wide as
Montana. Maybe Siberia. Does your program take input? Does it accept a
format that could possibly be downloaded from a malicious site on the
Internet? Does your market include users who occasionally make
mistakes? If not, enjoy your utter irrelevance. If so,
congratulations: you write Internet software.

Varying the hash function is just the first step. Plausible attacks
dynamically infer how to induce degenerate behavior. Replacing the
dictionary hash function with a "proper cryptographic hash function"
is a naive non-solution; all things considered it's somewhat worse
than useless. An old and interesting and relevant exercise is to
implement a dictionary with O(1) insert, look-up, and delete in the
average non-adversarial case; and O(lg n) insert, look-up, and delete
in the worse case.




More information about the Python-list mailing list