[Distutils] Library instability on PyPI and impact on OpenStack

Mark McLoughlin markmc at redhat.com
Mon Mar 4 18:36:45 CET 2013


Hey,

On parallel installs ...

On Mon, 2013-03-04 at 18:11 +1000, Nick Coghlan wrote:
> On Mon, Mar 4, 2013 at 1:54 AM, Mark McLoughlin <markmc at redhat.com> wrote:
> >   - Incompatible versions of the same library are routinely installed
> >     in parallel. Does PEP426 help here, or is all the work to be done in
> >     tools like PyPI, pip, setuptools, etc.? Apps somehow specify which
> >     version they want to use since the incompatible versions of the
> >     library use the same namespace.
> 
> PEP 426 doesn't help here, and isn't intended to. virtualenv (and the
> integrated venv in 3.3+) are the main solution being offered in this
> space (the primary difference with simple bundling is that you still
> keep track of your dependencies, so you have some hope of properly
> rolling out security fixes). Fedora (at least) is experimenting with
> similar capabilities through software collections. IMO, the success of
> iOS, Android (and Windows) as deployment targets means the ISVs have
> spoken: bundling is the preferred option once you get above the core
> OS level. That means it's on us to support bundling in a way that
> doesn't make rolling out security updates a nightmare for system
> administrators, rather than trying to tell ISVs that bundling isn't
> supported.

The adoption of semantic versioning without parallel installs really
worries me. It says that incompatible API changes are ok if you bump
your major version, but the incompatible change is no less painful for
users and distros.
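
To make the rule I'm worried about concrete, here is a minimal sketch of
the compatibility check that semantic versioning implies. The function
names are my own invention, it only handles plain X.Y.Z strings (no
pre-release tags), and it treats 0.x releases as requiring an exact pin,
as I suggested for API unstable libraries:

```python
def parse(version):
    """Split 'X.Y.Z' into a tuple of ints (pre-release tags not handled)."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(installed, required):
    """True if `installed` should satisfy code written against `required`."""
    inst, req = parse(installed), parse(required)
    if req[0] == 0:
        # 0.x releases promise nothing, so only an exact match is safe.
        return inst == req
    # Same major version, and at least as new as what was required.
    return inst[0] == req[0] and inst >= req
```

Under this rule a distro shipping 2.0.0 cannot serve an app written
against 1.2.0, which is exactly the case where only one of the two can
be installed system-wide.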

Your "bundling is the preferred option once you get above core OS level"
is exactly the kind of clarity I hoped for from this thread. However, it
does mean that OpenStack and distros that include OpenStack need to
figure out how to bundle OpenStack and its required libraries as a
single stack.

You're right that Fedora has been experimenting with Software
Collections, but that doesn't mean it's a solved problem:

  http://lists.fedoraproject.org/pipermail/devel/2012-December/thread.html#174872

> > How do parallel installs work in
> > practice? etc.
> 
> At the moment, they really don't. setuptools/distribute do allow it to
> some degree, but it can go wrong if you're not sufficiently careful
> with it (e.g. http://git.beaker-project.org/cgit/beaker/commit/?h=develop&id=d4077a118627b947a3c814cd3ff9280afeeecd73).

We do something similar in Fedora right now for sqlalchemy 0.7 and
migrate 0.5. It's not pretty.

Is there any work going on to make this more usable?

> > == Versioning ==
> >
> > Semantic versioning is appealing here because, assuming all libraries
> > adopt it, it becomes very easy for us to predict which versions will
> > be incompatible.
> >
> > For any API unstable library (0.x in semantic versioning), we need to
> > pin to a very specific version and require distributions to package
> > that exact version in order to run OpenStack. When moving to a newer
> > version of the library, we need to move all OpenStack projects at
> > once. Ideally, we'd just avoid such libraries.
> >
> > Implied in semantic versioning, though, is that it's possible for
> > distributions to include both version X.y.z and X+N.y.z
> 
> My point of view is that the system Python is there primarily to run
> system utilities and user scripts, rather than arbitrary Python
> applications. Users can install alternate versions of software into
> their user site directories, or into virtual environments. Projects
> are, of course, also free to include part of their version number in
> the project name.

You mentioned Software Collections - that means bundling all OpenStack's
Python requirements in e.g.

  /opt/openstack-grizzly/

> The challenge of dynamic linking different on-disk versions of a
> module into a process is that:
> - the import system simply isn't set up to work that way
> (setuptools/distribute try to fake it by adjusting sys.path, but that
> can go quite wrong at times)
> - it's confusing for users, since it isn't always clear which version
> they're going to see
> - errors can appear arbitrarily late, since module loading is truly dynamic
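
The sys.path point can be illustrated with a small sketch. It assumes
nothing about setuptools itself: it just fakes two "installs" of a
made-up module called demo_lib in temporary directories and shows that
a process only ever sees one of them:

```python
import importlib
import os
import sys
import tempfile


def make_fake_install(root, version):
    """Write a one-line module named demo_lib into its own directory."""
    path = os.path.join(root, "demo_lib-" + version)
    os.makedirs(path)
    with open(os.path.join(path, "demo_lib.py"), "w") as f:
        f.write("__version__ = %r\n" % version)
    return path


root = tempfile.mkdtemp()
old = make_fake_install(root, "1.0")
new = make_fake_install(root, "2.0")

# Both "installs" are on sys.path; the earlier entry wins.
sys.path[:0] = [old, new]
importlib.invalidate_caches()
first = importlib.import_module("demo_lib")

# Reordering sys.path afterwards changes nothing: the module is cached
# in sys.modules, so every later importer in the process sees 1.0 too.
sys.path.remove(old)
second = importlib.import_module("demo_lib")

print(first.__version__, second.__version__)
```

Both names end up bound to the same 1.0 module object, so a second
library in the same process that needs 2.0 only finds out when it hits
the missing API.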

If parallel installs of incompatible versions are a hopeless problem in
Python, then why push semantic versioning rather than say that
incompatible API changes should mean a name change?

Thanks,
Mark.


