[SciPy-user] read/write compressed files
Francesc Altet
faltet at carabos.com
Thu Jun 21 07:30:41 EDT 2007
On Thu, 21 Jun 2007 at 12:57 +0200, Dominik Szczerba wrote:
> Hi,
>
> I preferred bz2 over zlib for its higher compression, despite its slower
> performance. This common belief usually matched my experience. However, a
> simple test below, made with fresh data from this morning, clearly
> undermines this thinking:
>
>
>
> > du -hsc test9*.dat
>
> 428M total
>
> > time gzip test9*.dat
>
> real 0m31.663s
> user 0m28.946s
> sys 0m1.612s
>
> > du -hsc test9*.dat.gz
>
> 215M total
>
> > time gunzip test9*.dat.gz
>
> real 0m7.447s
> user 0m6.036s
> sys 0m1.264s
>
> > time bzip2 test9*.dat
>
> real 2m1.696s
> user 1m54.527s
> sys 0m4.008s
>
> > du -hsc test9*.dat.bz2
>
> 219M total
>
> > time bunzip2 test9*.dat.bz2
>
> real 0m43.252s
> user 0m39.926s
> sys 0m2.792s
>
>
> I am surprised, as I well remember cases where I could gain 20%.
Yeah, there should be cases where bzip2 is clearly better than zlib, and
images could be one of them. My teammate Ivan has come up with this
example:
-rw------- 1 ivan ivan 733373 2007-06-21 13:02 lena1.tif.gz
-rw------- 1 ivan ivan 584478 2007-06-21 13:02 lena2.tif.bz2
(you probably already know where the source image comes from: www.lenna.org)
But when it comes to general binary data for scientific uses, the
compression advantages of bzip2 over zlib are less clear.
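If you want to repeat this kind of comparison from within Python rather
than from the shell, here is a minimal sketch using only the standard
zlib and bz2 modules. The function name compare_codecs and the sample
buffer are made up for illustration; both codecs run at their default
levels, and the numbers will depend entirely on your own data.

```python
import bz2
import time
import zlib


def compare_codecs(data):
    """Compress and decompress `data` with zlib and bz2.

    Returns a dict mapping codec name to a tuple
    (compressed_size_in_bytes, compress_seconds, decompress_seconds).
    """
    results = {}
    for name, codec in (("zlib", zlib), ("bz2", bz2)):
        t0 = time.perf_counter()
        packed = codec.compress(data)      # default compression level
        t1 = time.perf_counter()
        unpacked = codec.decompress(packed)
        t2 = time.perf_counter()
        if unpacked != data:               # sanity check on the round trip
            raise ValueError(name + " round trip failed")
        results[name] = (len(packed), t1 - t0, t2 - t1)
    return results


if __name__ == "__main__":
    # Hypothetical sample buffer; replace it with the bytes of a real file.
    sample = bytes(range(256)) * 4096
    for name, (size, ctime, dtime) in compare_codecs(sample).items():
        ratio = size / len(sample)
        print("%-4s size=%d (%.1f%%) compress=%.3fs decompress=%.3fs"
              % (name, size, 100 * ratio, ctime, dtime))
```

To try it on one of your own files, read the file in binary mode and
pass the resulting bytes to compare_codecs().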
> But
> indeed, given the much slower performance, you have me convinced to use
> zlib over bz2.
>
> thanks for forcing me to do this test,
You are welcome ;)
--
Francesc Altet | Be careful about using the following code --
Carabos Coop. V. | I've only proven that it works,
www.carabos.com | I haven't tested it. -- Donald Knuth