Compression of random binary data

Ben Bacarisse ben.usenet at bsb.me.uk
Fri Oct 27 05:53:25 EDT 2017


Marko Rauhamaa <marko at pacujo.net> writes:

> Ben Bacarisse <ben.usenet at bsb.me.uk>:
>
>>> In this context, "random data" really means "uniformly distributed
>>> data", i.e. any bit sequence is equally likely to be presented as
>>> input. *That's* what information theory says can't be compressed.
>>
>> But that has to be about the process that gives rise to the data, not
>> the data themselves.  No finite collection of bits has the property you
>> describe.
>
> Correct. Randomness is meaningful only in reference to future events.
> Once the events take place, they cease to be random.
>
> A finite, randomly generated collection of bits can be tested against a
> probabilistic hypothesis using statistical methods.

But beware of parlour tricks.  You can't reliably test for random
looking data that are, in fact, carefully crafted.  If the claim is
about compressing arbitrary data, then the claimant won't mind testing
inputs chosen by me!  The only reason this matters is that such people
usually won't reveal the algorithm.

-- 
Ben.



More information about the Python-list mailing list