Compression of random binary data

Joonas Liik liik.joonas at gmail.com
Mon Jul 11 14:09:18 EDT 2016


On 11 July 2016 at 20:52,  <jonas.thornvall at gmail.com> wrote:
> What kind of statistical law or mathematical conjecture, or is it even a physical law, is violated by compression of random binary data?
>
> I only know that Shannon theorised it could not be done, but was there any proof?

Compression relies on some items in the dataset being more frequent
than others. If you have a dataset that is completely random, it is
hard to compress because most items occur with very similar
frequency.
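You can see this with zlib as a stand-in for a general-purpose compressor (a rough sketch; the exact ratios will vary from run to run):

```python
import os
import zlib

random_data = os.urandom(100_000)       # uniformly random bytes
repetitive_data = b"abc" * 33_000       # highly repetitive bytes

compressed_random = zlib.compress(random_data, 9)
compressed_repetitive = zlib.compress(repetitive_data, 9)

# Random input: no symbol is more frequent than any other, so the
# "compressed" output is typically slightly *larger* (format overhead).
print(len(compressed_random) / len(random_data))

# Repetitive input: a tiny fraction of the original size.
print(len(compressed_repetitive) / len(repetitive_data))
```

The repetitive case shrinks dramatically, while the random case gains nothing, which is exactly the point above.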

> What is to say that you can not do it if the symbolic representation is richer than the symbolic representation of the dataset.
>
> Isn't it a fact that the set of square roots actually depicts numbers in a shorter way than their actual representation.

A square root may be numerically smaller than the number it comes
from, but it is definitely not smaller in terms of entropy.

Let's try to compress the number 2, for instance, using square roots.
sqrt(2) = 1.4142...
The square root actually takes more space in this case even though it
is a smaller number, so storing the square root would give negative
compression here.
With some rounding back and forth we can probably get around the fact
that sqrt(2) would take an infinite amount of memory to represent
exactly, but that necessarily means restricting the set of values we
are capable of encoding.

For sqrt(2) not to have worse space consumption than the number 2
itself, we basically have to throw away precision, so sqrt(2) ~= 1.
Now I challenge you to get that 2 back out of that 1...
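In Python terms, the lossy round-trip looks like this (a sketch of the argument above, using round() as the precision-throwing step):

```python
import math

x = 2
encoded = round(math.sqrt(x))   # round(1.4142...) -> 1: precision thrown away
decoded = encoded ** 2          # try to invert: 1 ** 2 == 1, not 2

print(encoded, decoded)         # 1 1 -- the original 2 is unrecoverable

# Worse, distinct inputs collide on the same code word:
# round(math.sqrt(1)) == round(math.sqrt(2)) == 1,
# so the mapping is not invertible at all.
```

Once two different inputs map to the same encoded value, no decoder can tell them apart, which is why rounding cannot rescue the scheme.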

