Algorithm that makes maximum compression of completely diffused data.

Gene Heskett gheskett at wdtv.com
Wed Oct 30 16:32:38 EDT 2013


On Wednesday 30 October 2013 16:29:12 jonas.thornvall at gmail.com did opine:

> On Wednesday, 30 October 2013 at 20:46:57 UTC+1, Modulok wrote:
> > On Wed, Oct 30, 2013 at 12:21 PM,  <jonas.t... at gmail.com> wrote:
> > 
> > I am searching for the program or algorithm that achieves the best
> > possible compression of completely diffused data (random noise), and
> > I wonder what the state of the art in compression is.
> > 
> > I understand this is not the correct forum, but since I think I have
> > an algorithm that can do this very well, and do not know where to turn
> > with such a question, I thought I would start here.
> > 
> > It is of course lossless compression I am speaking of.
> > 
> > --
> > https://mail.python.org/mailman/listinfo/python-list
> > 
> > >> I am searching for the program or algorithm that achieves the best
> > >> possible compression of completely diffused data (random noise), and
> > >> I wonder what the state of the art in compression is.
> > 
> > None. If the data to be compressed is truly homogeneous random noise,
> > as you describe (for example, a 100 MB file read from a
> > cryptographically secure random bit generator such as /dev/random on
> > *nix systems), the state-of-the-art lossless compression is zero and
> > will remain that way for the foreseeable future.
> > 
> > 
> > There is no lossless algorithm that will reduce truly random (high
> > entropy) data by any significant margin. In classical information
> > theory, such an algorithm can never be invented. See: Kolmogorov
> > complexity.
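
The impossibility follows from a counting (pigeonhole) argument, sketched
here in Python purely as an illustration. The function names are made up
for this example and do not come from the thread. There are 2**n bit
strings of length n but only 2**n - 1 bit strings that are strictly
shorter, so no lossless scheme can map every n-bit input to a shorter
output without two inputs colliding.

```python
def count_strings_of_length(n):
    """Number of distinct bit strings that are exactly n bits long."""
    return 2 ** n


def count_shorter_strings(n):
    """Number of distinct bit strings of length 0 .. n-1 (empty string included)."""
    return sum(2 ** k for k in range(n))


n = 16
inputs = count_strings_of_length(n)
outputs = count_shorter_strings(n)

# There is always exactly one fewer shorter string than there are inputs,
# so at least two inputs would have to share a compressed form.
print("n-bit inputs:        ", inputs)
print("shorter bit strings: ", outputs)
print("shortfall:           ", inputs - outputs)
```

A compressor can therefore only shrink some inputs by making others
longer; it wins on real data because real data is concentrated in a small,
structured subset of all possible inputs.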
> > 
> > 
> > Real-world data is rarely completely random. You would have to test
> > various algorithms on the data set in question. Small things such as
> > non-obvious statistical clumping can make a big difference in the
> > compression ratio from one algorithm to another. Data that might look
> > "random" might not actually be random in the entropy sense of the word.
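
One rough way to run such a test, assuming only Python's standard-library
zlib, bz2 and lzma modules (lzma needs Python 3.3 or later). The helper
name compression_ratios and the two sample inputs are illustrative and
not from the thread; os.urandom stands in here for a cryptographically
secure random source.

```python
import bz2
import lzma
import os
import zlib


def compression_ratios(data):
    """Return {codec name: compressed size / original size} for data."""
    codecs = {
        "zlib": lambda d: zlib.compress(d, 9),
        "bz2": lambda d: bz2.compress(d, 9),
        "lzma": lzma.compress,
    }
    return {name: len(fn(data)) / len(data) for name, fn in codecs.items()}


# High-entropy input: bytes from the operating system's CSPRNG.
random_data = os.urandom(10 ** 6)

# Highly structured input: one sentence repeated many times.
structured_data = b"The quick brown fox jumps over the lazy dog. " * 20000

for label, data in (("random", random_data), ("structured", structured_data)):
    for name, ratio in sorted(compression_ratios(data).items()):
        print("%-10s %-5s ratio = %.4f" % (label, name, ratio))
```

The ratios for the random input should come out at or slightly above 1.0
(the container overhead makes the output a little larger), while the
repeated sentence should shrink to a small fraction of its original size.
Running the same helper on the data set in question is a quick check of
whether it is as random as it looks.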
> > 
> > >> I understand this is not the correct forum, but since I think I have
> > >> an algorithm that can do this very well, and do not know where to
> > >> turn with such a question, I thought I would start here.
> > 
> > Not to sound like a downer, but I would wager that the data you're
> > testing your algorithm on is not as truly random as you imply, or is
> > not a large enough body of test data to draw such conclusions from.
> > It's akin to inventing a perpetual motion machine, an inertial
> > propulsion engine, or any other classically impossible solution.
> > (This only applies to truly random data.)
> > 
> > 
> > 
> > -Modulok-
> 
> Well then, I have news for you.

Congratulations, Jonas.  My kill file for this list used to have only one
name, but now has two.  Unfortunately, I will still see the backscatter of
others trying to tell you how to post and interact with this list.  But for
now, this person with, since we are quoting IQs here, a tested 147 says
goodbye.

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)

BOFH excuse #275:

Bit rot
A pen in the hand of this president is far more
dangerous than 200 million guns in the hands of
         law-abiding citizens.


