Compression of random binary data

Gregory Ewing greg.ewing at canterbury.ac.nz
Wed Oct 25 04:11:25 EDT 2017


Ben Bacarisse wrote:
> The trouble is a pedagogic one.  Saying "you can't compress random data"
> inevitably leads (though, again, this is just my experience) to endless
> attempts to define random data.

It's more about using terms without making sure everyone agrees
on the definitions being used.

In this context, "random data" really means "uniformly distributed
data", i.e. every bit sequence of a given length is equally likely to
be presented as input. *That's* what information theory says can't be
compressed: no lossless scheme can make such input shorter on average.
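
A minimal sketch of the counting argument behind that claim. The
function names are made up for illustration, and it assumes inputs of
a fixed length n_bits and a lossless compressor (distinct inputs must
map to distinct outputs):

```python
from fractions import Fraction


def count_inputs(n_bits):
    """Number of distinct bit strings of exactly n_bits bits."""
    return 2 ** n_bits


def count_shorter_outputs(n_bits):
    """Number of distinct bit strings strictly shorter than n_bits,
    counting the empty string."""
    # 2**0 + 2**1 + ... + 2**(n_bits - 1) == 2**n_bits - 1
    return sum(2 ** length for length in range(n_bits))


def max_fraction_saving(n_bits, k):
    """Upper bound on the fraction of n_bits-bit inputs that any
    lossless code can shrink by at least k bits (1 <= k <= n_bits)."""
    # Only strings of length 0 .. n_bits - k are available as outputs.
    outputs = sum(2 ** length for length in range(n_bits - k + 1))
    return Fraction(outputs, count_inputs(n_bits))


n = 16
# There is always one fewer shorter string than there are inputs,
# so at least one input cannot get shorter.
print(count_inputs(n), count_shorter_outputs(n))
# Fraction of 16-bit inputs that could possibly lose a whole byte.
print(float(max_fraction_saving(n, 8)))
```

When every input is equally likely, that fraction is also the
probability of saving k bits, and it is below 2**(1 - k) for any n.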

> I think "arbitrary data" (thereby including the results of compression
> by said algorithm) is the best way to make progress.

I'm not sure that's much better, because it doesn't home in
on the most important thing, which is the probability
distribution.
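
A rough way to see the effect of the distribution, as a sketch only.
It uses zlib as a stand-in for "some general-purpose compressor", and
the helper names, sample size and seed are arbitrary choices made for
this example:

```python
import math
import random
import zlib


def bit_entropy(p_one):
    """Shannon entropy, in bits per bit, of a source whose bits are
    independently 1 with probability p_one."""
    if p_one in (0.0, 1.0):
        return 0.0
    return -(p_one * math.log2(p_one)
             + (1 - p_one) * math.log2(1 - p_one))


def sample_bytes(n_bytes, p_one, seed=0):
    """Draw n_bytes bytes whose bits are independently 1 with
    probability p_one (seeded so runs are repeatable)."""
    rng = random.Random(seed)
    out = bytearray()
    for _ in range(n_bytes):
        byte = 0
        for _bit in range(8):
            byte = (byte << 1) | (rng.random() < p_one)
        out.append(byte)
    return bytes(out)


def zlib_ratio(data):
    """Compressed size divided by original size."""
    return len(zlib.compress(data, 9)) / len(data)


for p in (0.5, 0.9):
    data = sample_bytes(20000, p)
    print(p, round(bit_entropy(p), 3), round(zlib_ratio(data), 3))
```

With p = 0.5 every bit sequence is equally likely and zlib gains
nothing; with p = 0.9 the same compressor shrinks the data, although
both inputs are "random" in the everyday sense.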

-- 
Greg


