smaller zip-like format?

Andrew MacIntyre andymac at bullseye.apana.org.au
Sat Feb 8 20:48:28 EST 2003


On Sat, 8 Feb 2003, David Garamond wrote:

> Well, as a last resort, if no other archive format exists, I think I
> will use zip but instead of putting in individual files, I'll put in
> tar.gz-compressed directories or groups of files. So this is a middle
> ground between a single stream (.tar.(gz|bz2)) and individual files as
> members (zip).

If you're going down this route, I suggest you try several variations to
test out the trade-off between size, speed and complexity:
 - ZIP (no compression) with individual files gzipped/bzipped;
 - ZIP (no compression) with groups of tar.[gz|bz2] files (as above);
 - ZIP (compression) with groups of files in ZIPs w/o compression.

Most compression techniques work better with more source material (if it's
compressible) to work with, so aggregating the compressible files before
compression yields better results, as evidenced by the tar.[gz|bz2]
results.
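
A quick way to see the effect of aggregation is to compress some small,
similar chunks one at a time and then again as a single stream. This is only
a sketch: the sample data and the function names are made up for
illustration, and zlib stands in for whichever compressor you end up using.

```python
import zlib

def individual_size(chunks):
    """Total size when every chunk is compressed on its own."""
    return sum(len(zlib.compress(chunk)) for chunk in chunks)

def aggregated_size(chunks):
    """Size when all chunks are joined first and compressed as one stream."""
    return len(zlib.compress(b"".join(chunks)))

# Made-up sample data: 200 small, similar "files".
sample = [("record %d: status=ok owner=nobody\n" % i).encode("ascii") * 4
          for i in range(200)]

print("individual:", individual_size(sample))
print("aggregated:", aggregated_size(sample))
```

On data like this the aggregated figure should come out well below the
individual one; real files will vary, so measure with your own material.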

The first and third suggestions may prove a bit easier to implement and
may produce acceptable compression.
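
For what it's worth, the first suggestion could look roughly like the
following, assuming a current Python with the standard zipfile and gzip
modules. The function names and the ".gz" member suffix are my own choices
for illustration, not part of any format.

```python
import gzip
import io
import zipfile

def write_gzipped_members(files):
    """Build a ZIP (as bytes) whose members are stored without ZIP
    compression, each member having been gzipped beforehand.

    `files` maps member names to their raw bytes.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            # The ".gz" suffix is just a convention chosen for this sketch.
            zf.writestr(name + ".gz", gzip.compress(data))
    return buf.getvalue()

def read_gzipped_member(archive_bytes, name):
    """Fetch one member by name and undo its gzip compression."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return gzip.decompress(zf.read(name + ".gz"))
```

This keeps ZIP's random access to individual members while leaving the
choice of compressor up to you (bz2 would slot in the same way).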

There is a tarfile module available (recently imported into Python's CVS,
IIRC), and a bz2 module is also available (also now in Python CVS).
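
As a rough sketch of the second suggestion (groups of files as tar.bz2
members inside an uncompressed ZIP), here is how the pieces might fit
together with the tarfile and zipfile modules as they exist in a current
Python. The helper names and the ".tar.bz2" member naming are assumptions
made for the example.

```python
import io
import tarfile
import zipfile

def pack_group(files):
    """Return a tar.bz2 archive (as bytes) holding `files` (name -> bytes)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def pack_archive(groups):
    """Return a ZIP (as bytes), stored without ZIP compression, with one
    tar.bz2 member per group. `groups` maps group names to file dicts."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for group_name, files in groups.items():
            zf.writestr(group_name + ".tar.bz2", pack_group(files))
    return buf.getvalue()

def read_file(archive_bytes, group_name, name):
    """Pull a single file back out of one group."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        group = zf.read(group_name + ".tar.bz2")
    with tarfile.open(fileobj=io.BytesIO(group), mode="r:bz2") as tf:
        return tf.extractfile(name).read()
```

Reading one file means decompressing its whole group, so the group size is
the knob that trades compression ratio against access speed.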

--
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac at bullseye.apana.org.au  | Snail: PO Box 370
        andymac at pcug.org.au            |        Belconnen  ACT  2616
Web:    http://www.andymac.org/        |        Australia





