[Python-Dev] Re: [Distutils] Questions about distutils strategy

James C. Ahlstrom jim@interet.com
Thu, 09 Dec 1999 12:43:57 -0500


Jean-Claude Wippler wrote:

> Ouch - what's wrong with zip archives?

Thanks very much for looking over the format.

In general Zip archives store whole branches of a file
system.  A Python ./Lib zip archive would contain:

  N:/python/Python-1.5.2/Lib/string.pyc
  N:/python/Python-1.5.2/Lib/os.pyc
  N:/python/Python-1.5.2/Lib/copy.pyc
  N:/python/Python-1.5.2/Lib/test/testall.pyc

Because a zip archive mirrors a branch of a file system, each
zip archive file would need its own sys.path.
How would this be specified?

My archive format instead stores modules under their dotted names,
just as they appear in the import statement.  The search path is "."
in every archive file by definition.  The statement "import foo"
results in a single dictionary lookup for the key "foo", not a search
through a zip directory along a local search path for "foo.something",
where "something" can be pyc, pyo, py, etc.
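
A minimal sketch of the lookup described above, written in current
Python.  It does not reproduce the actual pylib format; the names
build_toc and load_module_namespace and the in-memory dict are
assumptions made only for illustration.

```python
import marshal

# Hypothetical table of contents: dotted module name -> marshalled code.
# This is an illustration, not the on-disk layout used by pylib.py.
def build_toc(sources):
    """Compile each source string and key it by its dotted module name."""
    toc = {}
    for dotted_name, text in sources.items():
        code = compile(text, dotted_name, "exec")
        toc[dotted_name] = marshal.dumps(code)
    return toc

def load_module_namespace(toc, dotted_name):
    """Fetch a module with one dictionary lookup; no path or suffix search."""
    data = toc[dotted_name]  # KeyError means the module is not in this archive
    namespace = {"__name__": dotted_name}
    exec(marshal.loads(data), namespace)
    return namespace

toc = build_toc({
    "foo": "VALUE = 1\n",
    "pkg.sub": "VALUE = 2\n",
})
ns = load_module_namespace(toc, "pkg.sub")
```

The key is the full dotted name, so a submodule such as "pkg.sub"
is found the same way as a top-level module, without walking
directories.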

The intent was to link the archives to the import statement, not to
re-create a directory tree.  The format borrowed this feature from
the archive formats of Greg and Gordon.

> There are utilities to convert to/from zip, to re-pack, to mount zip
> transparently so its entries look like regular files, FTP servers, etc.

Basic operations (to, from, repack) are easy in Python.

> Both Java (jar) and Tcl (Jan Nijtman's "Wrap") have adopted this format.

Hmmm....
 
> Your format has no checksum, which for deployment and long-term storage
> can be important.

Actually the pylib.py "dir()" method reads all *.pyc with marshal,
and I am relying on marshal to reject bad data and also
out-of-date magic numbers.  But this is a good point.
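
To illustrate why a checksum adds something beyond marshal's own
checks: marshal can load a damaged record without complaint when the
damage falls inside a payload.  The sketch below is an assumption
about how a per-entry CRC-32 might be stored and checked; pack and
verify are hypothetical helpers, not part of pylib.py.

```python
import marshal
import zlib

def pack(obj):
    """Return (marshalled bytes, CRC-32 of those bytes)."""
    data = marshal.dumps(obj)
    return data, zlib.crc32(data)

def verify(data, expected_crc):
    """True if the stored bytes still match the recorded CRC-32."""
    return zlib.crc32(data) == expected_crc

good, crc = pack(b"hello world")

# Simulate storage damage: overwrite the final payload byte.
damaged = good[:-1] + b"e"

# marshal accepts the damaged record and returns different content...
loaded = marshal.loads(damaged)

# ...while the checksum comparison flags it.
is_good_ok = verify(good, crc)
is_damaged_ok = verify(damaged, crc)
```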

> If you want a marshalled TOC, then why not add a manifest entry for it,
> sort of like what ranlib does with ar?

Sorry, I don't understand.  Please explain.

> You designed the format so archives can be concatenated without any tool
> (other than "cat"), but this works just as well with zip files, as the
> Tcl Wrap approach demonstrates.

Are you saying that cat zip1.zip zip2.zip > myzip.zip works?

An important feature is the ability to concatenate to a binary:
  cat python.exe zip1.zip > myapp.exe
Locating the archive inside such a file isn't fast unless the
magic numbers are at the end.  Are zip files recognizable from
the end (I don't know)?
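
On the question of recognizing a zip file from the end: the zip
format closes with an end-of-central-directory record whose
signature is the bytes PK\x05\x06, followed only by an optional
comment, so a reader can find it by scanning backwards from the end
of the file.  The sketch below uses the zipfile module from current
Python; the stub bytes stand in for python.exe, and make_zip and
find_eocd are hypothetical helper names.

```python
import io
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"  # end-of-central-directory record
EOCD_MIN_SIZE = 22              # size of the record without a comment
MAX_COMMENT = 0xFFFF            # the trailing comment is at most 65535 bytes

def make_zip(files):
    """Build a zip archive in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()

def find_eocd(blob):
    """Search only the tail of the data for the end record's signature."""
    tail_start = max(0, len(blob) - (EOCD_MIN_SIZE + MAX_COMMENT))
    return blob.rfind(EOCD_SIGNATURE, tail_start)

# Stand-in for "cat python.exe zip1.zip > myapp.exe".
stub = b"\x7fFAKE-EXECUTABLE" * 100
archive = make_zip({"string.pyc": b"not real bytecode"})
combined = stub + archive

eocd_offset = find_eocd(combined)

# zipfile locates the directory from the end, so the leading stub is ignored.
with zipfile.ZipFile(io.BytesIO(combined)) as zf:
    names = zf.namelist()
    payload = zf.read("string.pyc")
```

Only the last 64K or so of the file needs to be examined, whatever
the size of the binary placed in front of the archive.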

> Allow me to very, very loosely paraphrase Guido here: sure, everyone can
> design an archive format, but they are likely to make the same mistakes
> all over again - so why not adopt a format which is tried and tested?
> 
> With all due respect - I sincerely hope you will reconsider and alter
> your code to work with zip files.  It's probably a small adjustment?
> 
> Unless your *intent* is to create a diverging standard, of course...

The intent is to create a standard but not a diverging standard.

Are there any zip experts out there?  Can zip files satisfy all the
design requirements I listed in pylib.html?  Is there zip code
available?  All my code is in Python.

JimA