Generating Unique Keys

Wed Jan 29 15:44:14 EST 2003

Paul Rubin <phr-n2003b at NOSPAMnightsong.com> wrote in message news:<7xptqgbirl.fsf at ruckus.brouhaha.com>...
> trevp at trevp.net (Trevor Perrin) writes:
> > you're right, with normal integers I don't think there's a
> > vulnerability, there would only be one if the appended values could
> > somehow have a prefix that matched a previous value's SHA1 padding. 
> > Still, hmac's cheap and easy, it's a good habit to always use it for
> > keyed hashing, so you don't have to worry about this sort of nuance
> > (I'm sure you know this, just for anyone following along)..
> 
> Yeah, I guess so, there's not exactly a big performance bottleneck to
> worry about in this situation.  

And hmac-sha1 is only a tiny, fixed (relative to input size) amount
slower than sha1.

> I think the appending attack is
> irrelevant anyway though, since the token is not being used as a MAC.
> A token isn't automatically valid just because it was made with the
> right key.  It has to belong to an actual session currently active in
> the server.

I think the attack is irrelevant because of how you chose the values
you append to the key.  But the construction isn't "generically
secure".  Consider the general construction

token1 = sha(key+value1)
token2 = sha(key+value2)
token3 = sha(key+value3)
.
.

where the attacker knows the set of possible values, doesn't know the
key, knows some of the tokens (and which values they correspond to),
and succeeds if he can guess the token for any other value.

The alternate construction

token1 = hmac-sha(key, value1)
token2 = hmac-sha(key, value2)
.

is generically secure: for any key (that has enough entropy) and *any*
set of values, the attacker will fail.  hmac guarantees that knowing
the keyed hash for one input won't let you guess the keyed hash for
any other input.

The first construction is not generically secure - for some key and
some set of values it's secure, for others it isn't.  To elaborate,
sha1 pads its input to a multiple of 64 bytes, with first a one bit,
then a string of zero bits, then a 64 bit big-endian representation of
the original input length.  Call the appended padding bytes for some
input pad(input).

sha1 then processes each 64 byte block sequentially, mixing the output
into an intermediate value (this is kind of rough).  After processing
the last block, the intermediate value becomes the return value.  Thus
the length extension property, where you can take any SHA1 return
value, treat it as an intermediate value, and hash further stuff onto
the end of it.

So if, for some valueX and valueY, valueX+pad(key+valueX) is a prefix
of valueY, then an attacker who knows valueX, tokenX, and valueY can
compute tokenY by hashing the remainder of valueY onto the end of
tokenX.

Now the condition above is unlikely in any reasonable scheme for
generating the input values, like just calling str() on 32-bit
integers like you mentioned.  But suppose the values could be
controlled by the attacker, or were generated in some odd way that
tended to produce a 1 followed by several bytes worth of zeros
followed by a few other random bytes..

Unlikely, but that's stuff you shouldn't have to worry about.  hmac is
simple, cheap, easy to use, provably secure, it should be used without
a second thought here.  In my opinion.

[I hope I got all the details right, you might want to double-check..]

> You might also look at the Yarrow paper on counterpane.com.  But this
> stuff is extremely tricky to get right.  It's better to just use OS
> services, which are now provided on most platforms where it matters.

probably true, nowadays.  

Trevor