[SciPy-user] read/write compressed files

Dominik Szczerba domi at vision.ee.ethz.ch
Sat Jun 23 05:35:06 EDT 2007


I remember that even for my binary data bzip2 was about 20% better, but
significantly slower. Best of course would be to have both (and more)
compressors and choose whichever suits the case best. But in the real world
zlib is probably the more general choice, if only one compressor is intended.
PS. Yes, it's a very bad idea to keep real numbers as ascii.
- Dominik
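
For anyone who wants to check this on their own data, here is a rough
sketch using only the Python standard library (zlib and bz2 modules). It is
not the benchmark from this thread: the random test data, the "%.6e" text
format and the helper name compare() are assumptions made purely for
illustration, and real measurements will give different ratios.

```python
import bz2
import random
import struct
import zlib


def compare(payload):
    """Return the raw size and the zlib/bz2 compressed sizes, in bytes."""
    return {
        "raw": len(payload),
        "zlib": len(zlib.compress(payload, 9)),  # level 9, like `gzip -9`
        "bz2": len(bz2.compress(payload, 9)),    # level 9 is the bzip2 default
    }


# Hypothetical test data: 20000 pseudo-random floats (not the lena.txt file).
random.seed(0)
values = [random.gauss(0.0, 1.0) for _ in range(20000)]

# The same numbers stored as ascii text and as raw little-endian doubles.
ascii_payload = "\n".join("%.6e" % v for v in values).encode("ascii")
binary_payload = struct.pack("<%dd" % len(values), *values)

ascii_sizes = compare(ascii_payload)
binary_sizes = compare(binary_payload)

for label, sizes in (("ascii", ascii_sizes), ("binary", binary_sizes)):
    print(label, sizes)
```

Note that the ascii version written with "%.6e" keeps only 7 significant
digits, so it is both larger on disk and less precise than the binary one.
Timing can be added with time.perf_counter() around each compress() call.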

Antonino Ingargiola wrote:
> Hi,
> 
> 2007/6/21, Francesc Altet <faltet at carabos.com>:
> 
> <snip>
> 
>> Ok, that's fine. In any case, I'm interested in knowing the reasons
>> why you are using bzip2 instead of zlib.  Have you detected some data
>> pattern where you get significantly more compression than by using zlib,
>> for example?
>>
>> I'm asking this because, in my experience with numerical data, I was
>> unable to detect important compression level differences between bzip2
>> and zlib. See:
>>
>> http://www.pytables.org/docs/manual/ch05.html#compressionIssues
>>
>> for some experiments in that regard.
>>
>> I'd appreciate any input on this subject (bzip2 vs zlib).
> 
> Probably not very meaningful, but with ascii data (floats as ascii)
> bzip2 seems to have a clear advantage (in both compression speed and
> compression ratio):
> 
>   $ du -h lena.txt
>   3,1M    lena.txt
> 
>   $ time gzip  -9 lena.txt
> 
>   real    0m4.937s        <=
>   user    0m4.758s
>   sys     0m0.018s
> 
>   $ du -h lena.txt.gz
>   316K    lena.txt.gz
> 
>   $ time gunzip lena.txt.gz
> 
>   real    0m0.092s
>   user    0m0.038s
>   sys     0m0.020s
> 
>   $ time bzip2 lena.txt
> 
>   real    0m2.524s        <=
>   user    0m2.396s
>   sys     0m0.027s
> 
>   $ du -h lena.txt.bz2
>   188K    lena.txt.bz2
> 
>   $ time bunzip2 lena.txt.bz2
> 
>   real    0m0.868s
>   user    0m0.775s
>   sys     0m0.040s
> 
> 
> Even if it's usually a bad idea to put numerical data in ascii format,
> it may sometimes be handy.
> 
> Regards,
> 
>     ~ Antonio

-- 
Dominik Szczerba, Ph.D.
Computer Vision Lab CH-8092 Zurich
http://www.vision.ee.ethz.ch/~domi


