Compression of random binary data

Paul Moore p.f.moore at gmail.com
Tue Oct 24 06:50:48 EDT 2017


On 24 October 2017 at 11:23, Ben Bacarisse <ben.usenet at bsb.me.uk> wrote:
> For example, run the complete works of Shakespeare through your program.
> The result is very much not random data, but that's the sort of data
> people want to compress.  If you can compress the output of your
> compressor you have made a good start.  Of course what you really want
> to be able to do is to compress the output that results from compressing
> your compressed out.  And, of course, you should not stop there.  Since
> you can compress *any* data (not just the boring random stuff) you can
> keep going -- compressing the compressed output again and again until
> you end up with a zero-length file.

Oh, and just for fun, if you are able to guarantee compressing
arbitrary data, then

1. Take a document you want to compress.
2. Compress it using your magic algorithm. The result is smaller.
3. Compress the compressed data. The result is still smaller.
4. Repeat until you hit 0 bytes.

Congratulations - apparently you have a reversible algorithm that
compresses every data set to an empty file. (Caveat - there's actually
"hidden data" here, as you need to know how many compressions it takes
to hit 0 bytes. Because you decrease the size every time, though, that
number must be no greater than the size of the original file).

Paul



More information about the Python-list mailing list