Compression of random binary data

Gregory Ewing greg.ewing at canterbury.ac.nz
Tue Oct 24 02:18:21 EDT 2017


danceswithnumbers at gmail.com wrote:
> 1234999988884321
> 
> It only takes seven 8 bit bytes to represent this

This is not surprising. The theoretical minimum size
for 16 arbitrary decimal digits is:

log2(10) * 16 = 53.15 bits = 6.64 bytes

I think you misunderstand what is meant by the phrase
"random data cannot be compressed".

A string of decimal digits represented in ASCII is
*not* completely random. There are 256 possible values
for each byte, and you're only using 10 of them.

What you *can't* do is compress 16 random decimal
digits to less than 6.64 bytes.

If you have something that you think can do better
than that, we would like to see it.

-- 
Greg



More information about the Python-list mailing list