Defensive programming

Sun Jun 1 20:11:09 EDT 2003

Tim Peters <tim.one at comcast.net> writes:
> Sure.  All dict implementations (hash tables) have O(N**2) worst-case
> behavior, over a sequence of N inserts and/or lookups, provoked by a
> sufficiently nasty set of N keys.  If an app can't bear that, then it
> shouldn't use dicts (or hash tables in Perl, or any similar facility in any
> other language).  

I don't buy that; it's like saying don't use cryptography to protect
network traffic, because the attacker might get lucky and guess the
key; or don't count on being able to breathe while you type, because
the laws of thermodynamics make it possible for every air molecule in
the room to suddenly move to a tiny region in the corner.  Unless the
data is has a particularly nasty interaction with the hash function,
the chance of O(N**2) behavior in a properly sized table should be
exponentially small and can therefore be made arbitrarily smaller than
the chance of (say) hardware failure or obliteration by meteor impact
or supernova.  At that point, programs shouldn't have to worry about
O(N**2) behavior happening by accident.

The attack in the paper is to use knowledge about the hash function to
deliberately construct data with such a nasty interaction.  If such
knowledge isn't available (e.g. because the hash function is chosen at
random and not disclosed to the attacker), the attacker can't
construct such data and so that attack stops being a problem.

> This isn't news.  What's interesting in the paper is that
> some apps that should know better were caught doing this.

The paper proposed a good solution, which is use a hash function that
the attacker doesn't have enough information to manipulate.