Compression

Peter Otten __peter__ at web.de
Thu Jul 14 04:59:05 EDT 2016


Steven D'Aprano wrote:

> How about some really random data?
> 
> py> import string
> py> data = ''.join(random.choice(string.ascii_letters) for i in range(21000))
> py> len(codecs.encode(data, 'bz2'))
> 15220
> 
> That's actually better than I expected: it's found some redundancy and
> saved about a quarter of the space. 

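It didn't find any redundancy; it found the roughly two unused bits in every
byte. With only 52 possible letters, each character carries about 5.7 bits of
information but is stored in 8: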

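>>> import math, string
>>> math.log(len(string.ascii_letters), 2)  # bits of information per letter
5.700439718141093
>>> 21000./8*_  # minimum size in bytes for 21000 such letters
14963.654260120367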
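
For anyone who wants to repeat the comparison, here is a minimal sketch for Python 3. It assumes the bz2 module is called directly instead of going through codecs.encode(), and the helper names entropy_bound_bytes and compressed_size are made up for illustration. The exact compressed size depends on the random data, so it will not match the 15220 quoted above.

```python
import bz2
import math
import random
import string


def entropy_bound_bytes(n_chars, alphabet_size):
    """Fewest bytes that can hold n_chars symbols drawn uniformly from the alphabet."""
    return n_chars * math.log2(alphabet_size) / 8


def compressed_size(n_chars, seed=0):
    """Size in bytes of bz2-compressed random ASCII letters (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    data = "".join(rng.choice(string.ascii_letters) for _ in range(n_chars))
    return len(bz2.compress(data.encode("ascii")))


bound = entropy_bound_bytes(21000, len(string.ascii_letters))
actual = compressed_size(21000)
print(f"entropy bound: {bound:.0f} bytes, bz2 output: {actual} bytes")
```

The bz2 output should land a little above the bound and well below the original 21000 bytes, which is what you would expect if the only saving comes from the unused bits.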




