Algorithm that makes maximum compression of completely diffused data.

jonas.thornvall at gmail.com
Wed Oct 30 15:49:00 EDT 2013


On Wednesday, 30 October 2013 at 20:46:57 UTC+1, Modulok wrote:
> On Wed, Oct 30, 2013 at 12:21 PM,  <jonas.t... at gmail.com> wrote:
>
> >> I am searching for the program or algorithm that makes the best possible
> >> compression of completely diffused data (random noise), and I wonder what
> >> the state-of-the-art compression is.
> >>
> >> I understand this is not the correct forum, but since I think I have an
> >> algorithm that can do this very well and do not know where to turn with
> >> such a question, I was thinking of starting here.
> >>
> >> It is, of course, lossless compression I am speaking of.
>
> >> I am searching for the program or algorithm that makes the best possible
> >> compression of completely diffused data (random noise), and I wonder what
> >> the state-of-the-art compression is.
>
> None. If the data to be compressed is truly homogeneous random noise as you
> describe (for example, a 100 MB file read from a cryptographically secure
> random bit generator such as /dev/random on *nix systems), the
> state-of-the-art lossless compression achieves zero reduction, and that will
> remain true for the foreseeable future.
>
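A quick way to sanity-check that claim from Python 3 is to pull a block of bytes from a cryptographic random source and run it through the general-purpose codecs in the standard library. The sketch below uses os.urandom as a convenient stand-in for /dev/random; the exact output sizes vary from run to run, but on random input each codec typically emits slightly more than it was given, because of header overhead.

import os
import zlib, bz2, lzma

# One megabyte of cryptographically random bytes (stand-in for /dev/random).
data = os.urandom(1024 * 1024)

for name, compress in (("zlib", zlib.compress),
                       ("bz2", bz2.compress),
                       ("lzma", lzma.compress)):
    out = compress(data)
    # On truly random input the "compressed" form is not smaller.
    print("{}: {} -> {} bytes (ratio {:.4f})".format(
        name, len(data), len(out), len(out) / len(data)))
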
> There is no lossless algorithm that will reduce truly random (high-entropy)
> data by any significant margin. In classical information theory, such an
> algorithm can never be invented. See: Kolmogorov complexity.
>
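For a concrete version of the counting argument behind that statement: there are 2**n distinct n-bit inputs but only 2**n - 1 bit strings strictly shorter than n bits, so no injective (i.e. lossless) encoding can map every input to a shorter output. A minimal sketch that checks the arithmetic for small n:

# Pigeonhole argument: inputs of length n versus all strictly shorter outputs.
for n in range(1, 21):
    inputs = 2 ** n                          # distinct n-bit messages
    shorter = sum(2 ** k for k in range(n))  # strings of length 0 .. n-1
    # There is always one fewer short string than there are inputs, so at
    # least one input must map to an output of equal or greater length.
    assert shorter == inputs - 1
    print("n={:2d}: {} inputs, {} shorter outputs".format(n, inputs, shorter))
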
> Real-world data is rarely completely random. You would have to test various
> algorithms on the data set in question. Small things such as non-obvious
> statistical clumping can make a big difference in the compression ratio from
> one algorithm to another. Data that might look "random" might not actually be
> random in the entropy sense of the word.
>
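To see how much the ratio can swing between algorithms once there is hidden structure, one can compress a byte stream that looks noisy but is not. The sketch below builds a made-up "sensor-like" signal (a slow sine wave plus small random jitter, chosen purely for illustration) and runs it through the same standard-library codecs; the particular ratios it prints depend entirely on this invented data.

import math
import random
import zlib, bz2, lzma

random.seed(0)

# Superficially noisy bytes with strong underlying structure: a slow sine
# wave plus a little jitter, quantised to the 0..255 range.
structured = bytes(
    (int(128 + 100 * math.sin(i / 50.0)) + random.randint(-3, 3)) & 0xFF
    for i in range(100000)
)

for name, compress in (("zlib", zlib.compress),
                       ("bz2", bz2.compress),
                       ("lzma", lzma.compress)):
    out = compress(structured)
    print("{}: {} -> {} bytes (ratio {:.3f})".format(
        name, len(structured), len(out), len(out) / len(structured)))
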
> >> I understand this is not the correct forum, but since I think I have an
> >> algorithm that can do this very well and do not know where to turn with
> >> such a question, I was thinking of starting here.
>
> Not to sound like a downer, but I would wager that the data you're testing
> your algorithm on is not as truly random as you imply, or is not a large
> enough body of test data to draw such conclusions from. It's akin to
> inventing a perpetual motion machine, an inertial propulsion engine, or any
> other classically impossible solution. (This only applies to truly random
> data.)
>
> -Modulok-

My algorithm will compress data from any random data source.


