Is this secure?

Wed Feb 24 14:31:51 EST 2010

mk <mrkafk at gmail.com> writes:
> So I have little in the way of limitations of password length ...
>
> The main application will access the data using HTTP (probably), so
> the main point is that an attacker is not able to guess passwords
> using brute force.

If it's HTTP instead of HTTPS and you're sending the password in the
clear, then a serious attacker can simply eavesdrop on the connection
and pick up the password.  Again, if the application is a web forum or
something like that, the security requirements probably aren't terribly
high.  If it's (say) a financial application with potentially motivated
attackers, you've got to be a lot more careful than I think you're being
right now, and you should really get security specialists involved.

> Using A-z with 10-char password seems to provide 3 orders of magnitude
> more combinations than a-z:

Yes: 57/25 is a bit more than 2, and 2**10 = 1024, so (57/25)**10 is
somewhat more than that.
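
As a quick check of that arithmetic, here is a small sketch.  The
figures 57 and 25 are simply the ones used in the reply above, and the
variable name is only illustrative.

```python
import math

# Figures taken from the reply above; a different alphabet size
# would change the result.
ratio = (57 / 25) ** 10

print(ratio)              # compare against 2**10 == 1024
print(math.log10(ratio))  # orders of magnitude gained
```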

> Even then I'm not getting completely uniform distribution for some reason:

Exact equality of the counts would be surprising and a sign that
something was wrong with the generation process.  It would be like
flipping a coin 10000 times and getting exactly 5000 heads.  The
binomial distribution tells you that the number should be close to 5000,
but that it's unlikely to be -exactly- 5000.
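
To put rough numbers on the coin example, the sketch below computes
the exact binomial probabilities with integer arithmetic.  It assumes
Python 3.8 or later for math.comb, and the 4900..5100 window is an
arbitrary choice made for illustration.

```python
from math import comb

flips = 10000
heads = 5000

# Probability of exactly `heads` heads in `flips` fair coin flips:
# C(flips, heads) / 2**flips.
p_exact = comb(flips, heads) / 2 ** flips

# Probability of landing anywhere in 4900..5100 heads (inclusive).
p_near = sum(comb(flips, k) for k in range(4900, 5101)) / 2 ** flips

print(p_exact)  # a bit under one percent
print(p_near)   # the large majority of runs land in this window
```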

Also, as Michael Rudolf mentioned, getting a letter by taking n%26 where
n is drawn uniformly from [0..255] doesn't give a uniform distribution
because 256 is not a multiple of 26.  I had thought about making an
adjustment for that when I posted, but it didn't seem worth cluttering
up the code.  Uniformity for its own sake doesn't gain you anything;
what matters is entropy.  If you compute the entropy difference between
the slightly nonuniform distribution and a uniform one, it's very small.
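
The entropy comparison can be checked directly.  This is a minimal
sketch, assuming one letter is produced as n % 26 with n uniform on
0..255; the helper name entropy_bits is made up for the example.

```python
import math
from collections import Counter

# Count how often each residue 0..25 occurs as n runs over 0..255.
# Because 256 = 9*26 + 22, some residues occur 10 times and some 9.
counts = Counter(n % 26 for n in range(256))

def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

h_biased = entropy_bits(c / 256 for c in counts.values())
h_uniform = math.log2(26)

# The last value is the entropy lost per letter to the modulo bias.
print(h_biased, h_uniform, h_uniform - h_biased)
```

The loss comes out well under a hundredth of a bit per letter.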

To get a more uniform distribution I usually just take a larger n,
rather than conditionalizing the draws.  For example, in the
diceware-like code I posted, I read 10 random bytes (giving a uniform
random number on [0..2**80-1]) from urandom for each word.  Reducing
that number to a word index is still not perfectly uniform, but the
bias is small enough that it would be very hard to detect.
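
A minimal sketch of the "larger n" approach follows.  It is not the
code posted earlier in the thread: the function name pick_index and
the tiny word list are hypothetical stand-ins, and it assumes Python 3.

```python
import os

def pick_index(k, nbytes=10):
    """Return an index in range(k) from nbytes of OS randomness, mod k."""
    # n is uniform on [0, 2**(8*nbytes) - 1].
    n = int.from_bytes(os.urandom(nbytes), "big")
    return n % k

# Hypothetical word list, standing in for a real diceware-style list.
words = ["apple", "brick", "cloud", "delta", "ember", "flint", "grove"]
phrase = " ".join(words[pick_index(len(words))] for _ in range(4))
print(phrase)

# Each residue is hit either floor(2**80 / k) or one more time, so the
# relative bias is on the order of k / 2**80.
print(len(words) / 2 ** 80)
```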

> Aw shucks when will I learn to do the stuff in 3 lines well instead of
> 20, poorly. :-/

Well, that's partly a matter of practice, but I'll mention one way I
simplified the code, which was by reading more bytes from /dev/urandom
than was really necessary.  I read one byte for each random letter
(i.e. throwing away about 3 random bits for each letter) while you tried
to encode the urandom data cleverly and map 4 random bytes to 5
alphabetic letters.  /dev/urandom uses a cryptographic PRNG and it's
pretty fast, so reading a few extra bytes from it to simplify your code
doesn't really cost you anything.
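
For illustration, the one-byte-per-letter idea can be sketched as
below.  This is a reconstruction under assumptions, not the original
code: it uses os.urandom in place of reading /dev/urandom directly,
assumes Python 3, and the function name random_letters is made up.

```python
import os
import string

def random_letters(length=10):
    """Build a lowercase password using one urandom byte per letter.

    Each byte is reduced mod 26, which wastes about 3 bits per letter
    and is very slightly nonuniform, as discussed above.
    """
    alphabet = string.ascii_lowercase
    return "".join(alphabet[b % 26] for b in os.urandom(length))

print(random_letters())
```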