[Python-Dev] Zipping the standard library.

Thomas Wouters thomas at python.org
Sat Mar 10 23:49:24 CET 2012


Since Python 2.3 (with the introduction of the zipimport module) it's been
sort-of possible to zip up the standard library.
Modules/getpath.c:calculate_path even adds a specific location
($prefix/lib/python33.zip) to sys.path if it exists to facilitate that. Or
you can include the zipfile alongside an application that embeds Python, or
even embed the zipfile in the same application. Actually setting things up
is not quite that simple, though, at least on non-Windows: you need to know
what to include in the zipfile (only .py, .pyc and .pyo files, no .so
files), what to leave in the old location (os.py, or at least *some file
called os.py*, needs to stay in $prefix/lib/python3.3) and how to deal with
tests and modules that don't like the stdlib living in zipfiles -- and
there are more of those than I expected. Also, depending on what else you
want to put in the zipfile, you may have to be aware of zipimport's limited
zipfile implementation, which has 32k-filecount and 2Gb-filesize limits.
(And in case you're wondering, yes, we are doing this
with Python 2.7 at Google to save space. And yes, hitting the 2Gb limit is
quite possible for us.)
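
To make the "what to include" part concrete, here is a minimal sketch of
building such a zipfile and importing from it. It is not the setup we use;
the names zip_pure_python, demo, hello_zipped, native.so and lib.zip are
made up for illustration, and it works on a throwaway directory rather than
the real stdlib (so the os.py landmark question doesn't come up).

```python
import importlib
import os
import sys
import tempfile
import zipfile


def zip_pure_python(src_dir, zip_path, keep=(".py", ".pyc", ".pyo")):
    """Archive only files zipimport can load; extension modules are skipped."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                if name.endswith(keep):
                    full = os.path.join(root, name)
                    # Store paths relative to src_dir so imports resolve.
                    zf.write(full, os.path.relpath(full, src_dir))
    return zip_path


def demo():
    """Build a tiny archive (hypothetical file names) and import from it."""
    work = tempfile.mkdtemp()
    src = os.path.join(work, "lib")
    os.makedirs(src)
    with open(os.path.join(src, "hello_zipped.py"), "w") as f:
        f.write("GREETING = 'hi from the zip'\n")
    # Stand-in for an extension module; it must not end up in the archive.
    with open(os.path.join(src, "native.so"), "wb") as f:
        f.write(b"\x7fELF")

    archive = zip_pure_python(src, os.path.join(work, "lib.zip"))
    sys.path.insert(0, archive)  # a zipfile on sys.path is handled by zipimport
    importlib.invalidate_caches()
    mod = importlib.import_module("hello_zipped")
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
    return mod, names
```

After demo(), the module's __file__ is a path *inside* lib.zip, so
os.path.exists(mod.__file__) is false. That is the kind of assumption that
trips up tests and modules once the stdlib lives in a zipfile.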

So with importlib going in, should we do something with zipimport as well?
Its deficiencies can easily be fixed by reimplementing it in Python instead
-- the zipfile module has long since fixed the same 32k/2Gb issues (reading
signed instead of unsigned numbers) and actually supports zip64 extensions
(to break the 64k-filecount and 4Gb-filesize limits in "normal" zipfiles).
Actually supporting zipping the stdlib then becomes a bit harder: the
importlib bootstrapping would need to include the zipfile module. If we do
that, it would be nice to actually support zipping the stdlib in the Python
build: making a build target that actually does that, and runs the tests
with it. However, this requires modification of a whole bunch of tests, for
example ones that assume that stdlib modules (and the tests themselves!)
have actual files you can open() as their __file__ attribute, and we'd need
to run the testsuite with the stdlib as a zip to prevent new ones from
sneaking in. Also, at that point the question becomes whether we need a
transparent interface for opening module source files or arbitrary files
living in packages, that could grab things out of zipfiles (like setuptools
has in... one of the modules) -- or other archives of course.
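
For reference, one existing stdlib call that already goes in this direction
is pkgutil.get_data(), which asks the package's loader for the bytes instead
of calling open() on a path derived from __file__. The sketch below is only
an illustration of that idea, not a proposal for the interface; demo_zpkg,
data.txt, bundle.zip and the two helper functions are hypothetical names.

```python
import importlib
import os
import pkgutil
import sys
import tempfile
import zipfile


def make_zipped_package(zip_path):
    """Write a one-package archive (hypothetical names) holding a data file."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("demo_zpkg/__init__.py", "")
        zf.writestr("demo_zpkg/data.txt", "payload stored next to the code\n")
    return zip_path


def read_resource(package, resource):
    """Return the resource's bytes via the package's loader.

    pkgutil.get_data() delegates to loader.get_data(), so the caller does
    not need to know whether the package lives on disk or inside a zipfile.
    """
    return pkgutil.get_data(package, resource)


archive = make_zipped_package(os.path.join(tempfile.mkdtemp(), "bundle.zip"))
sys.path.insert(0, archive)
importlib.invalidate_caches()
text = read_resource("demo_zpkg", "data.txt").decode("ascii")
```

Here open() on a path built from demo_zpkg.__file__ would fail, because
that path points inside bundle.zip, while the loader-based read succeeds.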

(And, yes, I'm zipping up the stdlib for Python 2.7 at Google, to reduce
the impact on the aforementioned millions of machines :)
-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!