[SciPy-user] read/write compressed files

Francesc Altet faltet at carabos.com
Thu Jun 21 07:30:41 EDT 2007


On Thu, 21 Jun 2007 at 12:57 +0200, Dominik Szczerba wrote:
> Hi,
> 
> I preferred bz2 over zlib for its higher compression, despite its slower
> performance. This common belief usually matched my experience. However, a
> simple test below, made with fresh data from this morning, clearly
> undermines this thinking:
> 
> 
> 
> > du -hsc test9*.dat
> 
> 428M    total
> 
> > time gzip test9*.dat
> 
> real    0m31.663s
> user    0m28.946s
> sys     0m1.612s
> 
> > du -hsc test9*.dat.gz
> 
> 215M    total
> 
> > time gunzip test9*.dat.gz
> 
> real    0m7.447s
> user    0m6.036s
> sys     0m1.264s
> 
> > time bzip2 test9*.dat
> 
> real    2m1.696s
> user    1m54.527s
> sys     0m4.008s
> 
> > du -hsc test9*.dat.bz2
> 
> 219M    total
> 
> > time bunzip2 test9*.dat.bz2
> 
> real    0m43.252s
> user    0m39.926s
> sys     0m2.792s
> 
> 
> I am surprised, as I well remember cases where I could gain 20%.

Yeah, there should be cases where bzip2 is clearly better than zlib, and
one of these could be images.  My teammate Ivan has come up with this
example:

-rw------- 1 ivan ivan 733373 2007-06-21 13:02 lena1.tif.gz
-rw------- 1 ivan ivan 584478 2007-06-21 13:02 lena2.tif.bz2

(you should already know where the source is: www.lenna.org )

But when it comes to general binary data for scientific uses, the
compression advantages of bzip2 over zlib are less clear.
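
To repeat this kind of comparison on your own data from Python, something
like the following minimal sketch could be used. It relies only on the
standard zlib and bz2 modules; the helper name compare_codecs and the
synthetic sample buffer are illustrative assumptions, not part of the test
above, and the numbers will depend heavily on the data you feed it.

```python
import bz2
import math
import struct
import time
import zlib


def compare_codecs(data: bytes) -> dict:
    """Return compressed size and wall-clock timings for zlib and bz2.

    Both codecs run at their module defaults (zlib level 6, bz2 level 9),
    which correspond to the defaults of the gzip and bzip2 command-line tools.
    """
    results = {}
    for name, module in (("zlib", zlib), ("bz2", bz2)):
        t0 = time.perf_counter()
        packed = module.compress(data)
        t1 = time.perf_counter()
        unpacked = module.decompress(packed)
        t2 = time.perf_counter()
        if unpacked != data:  # the round trip must be lossless
            raise ValueError(f"{name} round trip failed")
        results[name] = {
            "size": len(packed),
            "compress_s": t1 - t0,
            "decompress_s": t2 - t1,
        }
    return results


if __name__ == "__main__":
    # Hypothetical stand-in for a scientific data file: 100000 float64
    # samples of a sine wave. Replace with open(path, "rb").read().
    sample = b"".join(
        struct.pack("<d", math.sin(i * 0.001)) for i in range(100_000)
    )
    for name, stats in compare_codecs(sample).items():
        ratio = stats["size"] / len(sample)
        print(
            f"{name}: {stats['size']} bytes ({ratio:.1%} of original), "
            f"compress {stats['compress_s']:.3f}s, "
            f"decompress {stats['decompress_s']:.3f}s"
        )
```

Timing compression and decompression separately matters here, because in
the figures quoted above the gap between the two tools is even larger when
decompressing than when compressing.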

>  But
> indeed, given the much slower performance, you have me convinced to use
> zlib over bz2.
> 
> thanks for forcing me to do this test,

You are welcome ;)

-- 
Francesc Altet    |  Be careful about using the following code --
Carabos Coop. V.  |  I've only proven that it works, 
www.carabos.com   |  I haven't tested it. -- Donald Knuth



