Compression of random binary data

jonas.thornvall at gmail.com jonas.thornvall at gmail.com
Tue Jul 12 20:43:22 EDT 2016


Den onsdag 13 juli 2016 kl. 00:24:23 UTC+2 skrev Ian:
> On Tue, Jul 12, 2016 at 11:35 AM,  <jonas.thornvall at gmail.com> wrote:
> >
> > No it is only compressible down to a limit given by the algorithm.
> 
> Then your algorithm does not compress random data as you claimed.
> 
> For some input, determine the limiting output that it ultimately
> compresses down to. Take that output and feed it through your
> algorithm as if it were the original input. If the data are to be
> considered random, then this input is just as probable as the
> original. What output does the algorithm now create? If it just
> returns the input unchanged, then how do you discern the original
> input from this input when decompressing? If it returns a different
> output of the same size, then repeat the process with the new output.
> Now there are *two* outputs of that size that can't be repeated. There
> are only finitely many possible outputs of that size, so eventually
> you're going to have to get to one that either repeats an output -- in
> which case your algorithm produces the same output for two different
> inputs and is therefore incorrect -- or you will get to an input that
> produces an output *larger* in size than the original.

The dataset must have a certain size that is the only requirment, of course you can not compress something into nothing, at least the arithmetic ruleset need to be encoded but what would be the point to just compress something less than a couple of bytes. And of course the number of rounds you applied the algorithm must be stored. But that is no problem for small datasets.




More information about the Python-list mailing list