Compression of random binary data

Chris Angelico rosuav at gmail.com
Sat Oct 28 22:34:16 EDT 2017


On Sun, Oct 29, 2017 at 1:32 PM, Chris Angelico <rosuav at gmail.com> wrote:
> On Sun, Oct 29, 2017 at 1:18 PM, Gregory Ewing
> <greg.ewing at canterbury.ac.nz> wrote:
>> You're missing something fundamental about what
>> entropy is in information theory.
>>
>> It's meaningless to talk about the entropy of a single
>> message. Entropy is a function of the probability
>> distribution of *all* the messages you might want to
>> send.
>
> Which is where a lot of "password strength" confusion comes from. How
> much entropy is there in the password "U3ZINVp3PT0="? Strong password
> or weak? What about "dark-next-sure-secret"?
> "with-about-want-really-going"? They were generated by, respectively:
> double-MIME-encoding four bytes from /dev/random (32 bits of entropy),
> picking four words from the top 1024 (40 bits), and picking five
> words from the top 64 (30 bits). But just by looking at the passwords
> themselves, you can't tell that.

To clarify: The "top 1024 words" concept is XKCD 936-style password
generation, using my tabletop gaming room's resident parrot. So the
word list is based on the frequency of words used by D&D players.
YMMV if you use a different corpus :)
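
For the curious, here's a minimal sketch of all three generators in
Python. The word list below is a tiny stand-in (pretend the real one
has at least 1024 entries, ordered by corpus frequency), so the entropy
figures in the comments describe the full-size pools, not this sample:

import base64
import math
import os
import secrets

# Method 1: double-MIME-encode four bytes from the OS's CSPRNG.
# 4 bytes * 8 bits = 32 bits of entropy, however the result looks.
pw1 = base64.b64encode(base64.b64encode(os.urandom(4))).decode()

# Tiny stand-in word list -- pretend it has >= 1024 entries,
# ordered by how often the words came up at the table.
words = ["dark", "next", "sure", "secret", "with", "about",
         "want", "really", "going", "think"]

# Method 2: four words from the top 1024 -> 4 * log2(1024) = 40 bits.
pw2 = "-".join(secrets.choice(words[:1024]) for _ in range(4))

# Method 3: five words from the top 64 -> 5 * log2(64) = 30 bits.
pw3 = "-".join(secrets.choice(words[:64]) for _ in range(5))

# Entropy is a property of the selection process, not of the string:
def entropy_bits(pool_size, picks):
    return picks * math.log2(pool_size)

print(pw1, entropy_bits(256, 4))   # 32.0
print(pw2, entropy_bits(1024, 4))  # 40.0 (with a real 1024-word list)
print(pw3, entropy_bits(64, 5))    # 30.0 (with a real 64-word list)

Three strings come out, and nothing in the strings themselves tells
you which pool each one came from -- which is the whole point.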

ChrisA


