smaller zip-like format?

pirx pirx at mail.com
Sat Feb 8 18:07:55 EST 2003


On 08 Feb 2003 00:22:30 -0600, Ian Bicking <ianb at colorstudy.com> wrote:

> On Sat, 2003-02-08 at 00:13, David Garamond wrote:
>> Does anyone know of an open format for compressed archive like zip, but 
>> with greater compression ratio for lots of small (1K-4K) files, while 
>> still maintaining relative ease for extracting member files from the 
>> archive? I'm looking for something like Microsoft's CHM format. Python 
>> binding is not a must, I'll probably write the binding if it doesn't 
>> exist.
>
> A tar archive compressed with bzip2, perhaps?  I haven't tested them,
> but .tar.gz seems to consistently come out better than zip, and bzip2 is
> consistently better than gzip.  And there's Python bindings!
>

CAB (from MS) has that - you can test that it is very effective in these 
situations. As far as I understand it, the reason is that it compresses 
_all_ the files as one stream, unlike WinZip etc., which compress each 
file separately. Thus the LZ77-style algorithm used by almost all of these 
tools can find common strings across _all_ files. The price to pay for 
this approach is that extracting file no. 10 requires decompressing 
files 1-9 first.
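
To make the trade-off concrete, here is a rough sketch in Python using the 
standard zlib module. It is not CAB itself: the sample files, the function 
names and the "archive" layout (plain concatenation plus a list of sizes) 
are all made up for illustration. It compares compressing each small file 
on its own with compressing them all as one stream, and shows that pulling 
one member out of the single stream means decompressing everything before 
it.

```python
import zlib


def make_small_files(count=200):
    """Build hypothetical ~1K text files that share a lot of boilerplate."""
    template = (
        "<html><head><title>Page {n}</title></head>"
        "<body><p>Entry number {n} of the manual.</p></body></html>\n"
    )
    return [(template.format(n=n) * 10).encode("ascii") for n in range(count)]


def per_file_size(files):
    """Total size when every file is compressed separately (zip-like)."""
    return sum(len(zlib.compress(data, 9)) for data in files)


def solid_blob(files):
    """Compress all files as one stream (the approach described above)."""
    return zlib.compress(b"".join(files), 9)


def extract_from_solid(blob, sizes, index):
    """Return member `index`; everything before it must be decompressed too."""
    start = sum(sizes[:index])
    end = start + sizes[index]
    # Decompress from the beginning, stopping once the member is complete.
    prefix = zlib.decompressobj().decompress(blob, end)
    return prefix[start:end]


if __name__ == "__main__":
    files = make_small_files()
    sizes = [len(data) for data in files]
    blob = solid_blob(files)
    print("separately compressed:", per_file_size(files), "bytes")
    print("compressed as one    :", len(blob), "bytes")
    assert extract_from_solid(blob, sizes, 9) == files[9]
```

On highly similar small files the single stream should come out much 
smaller, since matches can refer back to earlier files. Note that zlib's 
window is only 32K, so the benefit here comes from nearby files; 
compressors with a larger window can reach further back.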


-- 
pirx



