[Python-Dev] Python 2.3a1 release -- Dec 31

Just van Rossum just@letterror.com
Wed, 25 Dec 2002 10:03:45 +0100


From: "Samuele Pedroni" <pedronis@bluewin.ch>
> >    zipimporter.c
> >    - removed the subdir feature, which allowed the path to the
> >    zip archive to be extended with a subdirectory. PEP 273
> >    stated this was needed for package support (and only for
> >    that). However, with the new import hooks this is no longer
> >    true: a path item containing the plain zip archive path can
> >    also deal with submodules (find_module receives the full
> >    module name after all). Therefore a pkg.__path__ from a
> >    package loaded from a zip archive will contain the *plain*
> >    zip archive path.
> >    - as a consequence I could simplify and clean up lots of
> >    things (esp. zipimporter_init: eg. it no longer needs to
> >    check sys.path_importer_cache; yay). Getting rid of the
> >    zipimporter.prefix attribute altogether helped a lot in
> >    other places.
> >    - this change additionally enabled me to get rid of the
> >    restriction that zip paths must end in .ZIP or .zip; any
> >    extension (or even no extension) will now work.
>
> will not this break __path__ manipulations?

If the particular manipulation did work for zip files at all before, yes
:-(. (It wouldn't have worked with a Zip archive that was packed by a
freeze-like tool, unless the *results* of the manipulations were
explicitly flattened during packaging.)

> further this change is backward incompatible with what is allowed by
> Jython 2.1,
> 
> it was considered a feature to be able to put
> 
> path/to/a.zip/python
> 
> in sys.path, so that the a.b package would be looked up under
> 
> path/to/a.zip/python/a/b.

PEP 273 doesn't document it as such; it only says it's needed for
package imports. Also, to be honest, my implementation had some issues
with that usage: it would look for the plain .zip archive in
sys.path_importer_cache, which would obviously not be found, causing the
zip file index to be read again for every package directory. The only
fix I could think of is for the zipimporter object to *add* entries to
sys.path_importer_cache itself, and I found it bad enough already that
the previous version _read_ from the cache. It's a messy feature :-(.
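
The cache problem described above can be sketched roughly as follows. This is an illustration only, not the actual zipimporter code: a plain dict stands in for sys.path_importer_cache, and split_entry, read_index, make_importer and count_reads are hypothetical names made up for the example.

```python
# Illustrative model only; none of these helpers exist in zipimport.

def split_entry(entry):
    """Split 'path/to/a.zip/python' into ('path/to/a.zip', 'python').

    Assumes a lowercase '.zip' extension, purely to keep the sketch short.
    """
    head, sep, tail = entry.partition(".zip")
    return head + sep, tail.lstrip("/")


def count_reads(entries):
    """Return how often the archive index is read for the given path entries."""
    reads = []
    cache = {}  # stand-in for sys.path_importer_cache, keyed by path entry

    def read_index(archive):
        reads.append(archive)  # pretend this is the expensive directory read
        return {"archive": archive}

    def make_importer(entry):
        archive, prefix = split_entry(entry)
        cached = cache.get(archive)  # look for the *plain* archive path
        index = cached["index"] if cached else read_index(archive)
        return {"index": index, "prefix": prefix}

    for entry in entries:
        # The import machinery stores importers under the full path entry.
        cache[entry] = make_importer(entry)
    return len(reads)


subdir_entries = [
    "path/to/a.zip/python",
    "path/to/a.zip/python/a",
    "path/to/a.zip/python/a/b",
]
print(count_reads(subdir_entries))
print(count_reads(["path/to/a.zip", "path/to/a.zip/a"]))
```

In this model the plain archive path never becomes a cache key when only extended entries are used, so every package directory triggers another index read; once the plain path is itself an entry, the index is read once and reused.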

I personally don't care about this feature; it's easy enough to package
the archive so that it's not needed.
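
A minimal sketch of that kind of packaging, assuming the packages are simply written at the root of the archive so that a full dotted module name maps straight onto an archive entry. find_entry is a hypothetical helper for illustration and is not how zipimporter is actually implemented.

```python
import io
import zipfile


def find_entry(names, fullname):
    """Map a dotted module name to an archive entry name, or None."""
    base = fullname.replace(".", "/")
    # Try the package form first, then the plain module form.
    for candidate in (base + "/__init__.py", base + ".py"):
        if candidate in names:
            return candidate
    return None


# Build a small archive in memory with package 'a' at the archive root,
# instead of under a 'python/' subdirectory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a/__init__.py", "")
    zf.writestr("a/b.py", "VALUE = 1\n")

with zipfile.ZipFile(buf) as zf:
    names = set(zf.namelist())

print(find_entry(names, "a"))
print(find_entry(names, "a.b"))
print(find_entry(names, "a.c"))
```

Because the lookup starts from the full module name, the plain archive path is enough to locate both the package and its submodule; no subdirectory needs to be appended to the path entry.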

Regarding the __path__ manipulations: they assume file-system
properties and can't work for importers in *general*, and it feels like a
hack to specially allow them for Zip archives (it definitely was a hack in
my implementation, so I'm happy to get rid of it ;-). In many
other respects Zip archives won't be able to be compatible with a
real file system anyway, e.g. why should __path__ manipulations work and
not __file__ manipulations? (Now if we had a virtual file system with
Zip file support, things would be different!)
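
To illustrate the file-system assumption, here is a small sketch. The archive layout and the pkg/plugins names are invented for the example. It builds a path the way a package might when extending its own __path__, and then checks that path both through the file system and through the archive.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "a.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("pkg/__init__.py", "")
        zf.writestr("pkg/plugins/__init__.py", "")

    # What a package might compute from its own location in the archive.
    extra = os.path.join(archive, "pkg", "plugins")

    # File-system view: the archive is a regular file, so nothing below
    # it is a directory as far as os.path is concerned.
    inside_is_dir = os.path.isdir(extra)

    # Archive view: the entry is there, but only the zip reader sees it.
    with zipfile.ZipFile(archive) as zf:
        inside_in_zip = "pkg/plugins/__init__.py" in zf.namelist()

print(inside_is_dir, inside_in_zip)
```

Any code that checks or lists such a path with ordinary os calls sees nothing there, which is why this kind of manipulation only works if the importer makes a special case of it.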

I still think that __path__ manipulations are evil, as would be
module-specific sys.path manipulation. To me, sys.path is the domain of
*applications*, which implies that pkg.__path__ should be left alone
also (at least by the package itself). It seems Guido is going in the
opposite direction with pkgutil.py :-(.

Just