[Distutils] Parallel installation of incompatible versions

Nick Coghlan ncoghlan at gmail.com
Mon Mar 18 23:04:06 CET 2013


pkg_resources.requires() is currently our only solution for parallel
installation of incompatible versions. It can be made to work, and it
is a lot better than the nothing we had before it was created, but it
also has quite a few issues (and it can be a nightmare to debug when
it goes wrong).

Based on the exchanges with Mark McLoughlin the other week, and
chatting to Matthias Klose here at the PyCon US sprints, I think I
have a design that will let us support parallel installs in a way that
builds on existing standards, while behaving more consistently in edge
cases and without making sys.path ridiculously long even in systems
with large numbers of potentially incompatible dependencies.

The core of this proposal is to create an updated version of the
installation database format that defines semantics for *.pth files
inside .dist-info directories.

Specifically, whereas *.pth files directly in site-packages are
processed automatically when Python starts up, those inside dist-info
directories would be processed only when explicitly requested
(probably through a new distlib API). Processing the *.pth file would
insert the directories it lists into sys.path immediately before the
path entry containing the .dist-info directory. This avoids an issue
with the pkg_resources insert-at-the-front-of-sys.path behaviour,
where system packages can end up shadowing those from a local source
checkout, without running into the issue with
append-to-the-end-of-sys.path, where a specifically requested version
is shadowed by a globally installed version.
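
As a rough sketch of the insertion rule described above (nothing here
is an existing distlib API: the activate_pth name and its signature are
hypothetical, and resolving relative entries against the directory that
holds the .dist-info directory is my assumption, since that detail is
not settled yet):

```python
import os


def activate_pth(pth_file, path):
    """Insert the directories listed in pth_file into path.

    Hypothetical helper: new entries go immediately before the path
    entry that contains the .dist-info directory holding pth_file.
    Returns the list of entries that were actually inserted.
    """
    dist_info = os.path.dirname(os.path.abspath(pth_file))
    site_dir = os.path.dirname(dist_info)
    with open(pth_file) as f:
        lines = [line.strip() for line in f]
    # Assumption: only plain directory lines matter here. Blank lines
    # and comments are skipped, and the "import ..." lines that site.py
    # would execute are ignored.
    entries = [
        os.path.normpath(os.path.join(site_dir, line))
        for line in lines
        if line and not line.startswith(("#", "import ", "import\t"))
    ]
    # Locate the path entry that contains the .dist-info directory.
    for index, entry in enumerate(path):
        if os.path.abspath(entry) == site_dir:
            break
    else:
        raise ValueError("%s is not on the given path" % site_dir)
    # Skip entries already present so repeated requests are harmless.
    new_entries = [e for e in entries if e not in path]
    path[index:index] = new_entries
    return new_entries
```

In real use the second argument would be sys.path; taking it as a
parameter just makes the ordering easy to inspect.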

Taking CherryPy2 and CherryPy3 on Fedora as an example, this would
allow CherryPy3 to be installed normally (i.e. directly in
site-packages), while CherryPy2 would be installed as a split
install, with the .dist-info going into site-packages and the actual
package going somewhere else (more on that below). A cherrypy2.pth
file inside the dist-info directory would reference the external
location where CherryPy 2.x can be found.

To use this at runtime, you would do something like:

    distlib.some_new_requires_api("CherryPy (2.2)")
    import cherrypy
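
The lookup half of such an API might look roughly like the following.
This is only a sketch: parse_requirement and find_dist_info are
hypothetical names, the NAME-VERSION.dist-info directory naming comes
from the existing installation database format, and treating "2.2" as
matching either 2.2 exactly or any 2.2.x release is a simplifying
assumption rather than a settled rule.

```python
import os
import re


def parse_requirement(requirement):
    """Split a string such as 'CherryPy (2.2)' into (name, version)."""
    match = re.match(
        r"^\s*([A-Za-z0-9._-]+)\s*\(\s*([^)\s]+)\s*\)\s*$", requirement
    )
    if match is None:
        raise ValueError("unsupported requirement: %r" % requirement)
    return match.group(1), match.group(2)


def find_dist_info(requirement, path):
    """Return the first matching .dist-info directory on path, or None."""
    name, version = parse_requirement(requirement)
    prefix = name.lower() + "-"
    suffix = ".dist-info"
    for entry in path:
        if not os.path.isdir(entry):
            continue
        for candidate in sorted(os.listdir(entry)):
            lowered = candidate.lower()
            if not (lowered.startswith(prefix) and lowered.endswith(suffix)):
                continue
            # Whatever sits between "name-" and ".dist-info" is the version.
            found = candidate[len(prefix):-len(suffix)]
            # Assumption: "2.2" matches 2.2 itself and any 2.2.x release.
            if found == version or found.startswith(version + "."):
                return os.path.join(entry, candidate)
    return None
```

The requires call would then process any *.pth files found in the
returned directory before the import runs.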

The other part of this question is how to avoid the potential
explosion of one sys.path entry per dependency. The first part of that
is that for cases where there is no incompatible version installed,
there won't be a *.pth file, and hence no extra sys.path entry (the
module/package will just be installed directly into site-packages as
usual).

The second part has to do with a possible way to organise the
versioned installs: group them by the initial fragment of the version
number according to semantic versioning. For example, define a
"versioned-packages" directory that sits adjacent to "site-packages".
When doing the parallel install of CherryPy2 the actual *code* would
be installed into "versioned-packages/2/", with the cherrypy2.pth file
pointing to that directory. For 0.x releases, there would be a
directory per minor version, while for higher releases, there would
only be a directory per major version.
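
To make the grouping rule concrete, here is a small sketch. The
function names are hypothetical, the "0.N" spelling for the per-minor
directories is my assumption (only the "2" style directory is spelled
out above), and it only handles plain dotted version strings:

```python
import os


def version_bucket(version):
    """Return the directory fragment a release is grouped under."""
    parts = version.split(".")
    major = parts[0]
    if major == "0":
        # 0.x releases get one directory per minor version.
        minor = parts[1] if len(parts) > 1 else "0"
        return "0." + minor
    # 1.0 and later get one directory per major version.
    return major


def versioned_install_dir(base, version):
    """Return the versioned-packages subdirectory for a release.

    base is the directory that contains both "site-packages" and the
    adjacent "versioned-packages" directory.
    """
    return os.path.join(base, "versioned-packages", version_bucket(version))
```

So CherryPy 2.2 would land in "versioned-packages/2", while a
hypothetical 0.9.3 release of some other project would land in
"versioned-packages/0.9".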

The nice thing, though, is that Python wouldn't care about the actual
layout of the installed versions, so long as the *.pth files in the
dist-info directories described the mapping correctly.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

