[Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)

Tarek Ziadé ziade.tarek at gmail.com
Thu Oct 8 10:31:40 CEST 2009


Here's a quick summary of the main things that are going to happen in
Distutils, and Distribute, and a few words on virtualenv and pip.
(there is much much more work going on, but I don't want to drown
people with details)

= Distutils =

Distutils is a package manager and competes with OS package managers.
This is a good thing because, unless you are developing a library or
an application that will only run on one specific system that has its
own packaging system, like Debian, you will be able to reach many more
people. Of course the goal is to avoid making the work of a Debian
packager (or a packager for any other OS that has a package manager)
too hard. In other words, re-packaging a Distutils-based project
should be easy and Distutils should not get in their way (or as little
as possible).

But right now Distutils is incomplete in many ways and we are trying
to fix them.

== What's installed? What's the installation format? How to uninstall? ==

First, it's an incomplete package manager: you can install a
distribution using it, but there's no way to list installed
distributions. Worse, you can't uninstall a distribution.

PEP 376 resolves this, and once it's finished, the goal is to include
the APIs described there into Distutils itself and into the pkgutil
module in stdlib. Notice that there's an implementation at
http://bitbucket.org/tarek/pep376 that is kept up to date with PEP 376
so people can see what we are talking about.

Another problem that popped up over the last few years is the fact
that, in the same site-packages, depending on the tool that was used
to install a distribution, and depending on whether this distribution
uses Distutils or Setuptools, you can have different installation
formats.

End-users end up with zipped eggs (one file), unzipped eggs (one
self-contained format in a directory) and regular Distutils (packages
and modules in site-packages). And the Metadata are also located in
many different places depending on the installation format used.

That can't be. There's no point in keeping various installation
formats in the *same* site-packages directory.

PEP 376 also resolves this by describing a *unique* format that works
for all. Once this is finished, Distutils will implement it by
changing the install command accordingly.

- Work left to do in PEP 376 : restrict its scope to a disk-based,
file-based site-packages.
- Goal: 2.7 / 3.2
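
To make the "what's installed?" question concrete, here is a minimal
sketch of listing distributions by scanning a site-packages-like
directory. It assumes each distribution leaves behind either a
"*.egg-info" file or a "*.egg-info" directory containing a PKG-INFO
file. The function name list_installed is made up for this
illustration; it is not the API that PEP 376 describes.

```python
import os
from email.parser import Parser


def list_installed(site_dir):
    """Return (name, version) pairs found in site_dir.

    Assumption: metadata lives in "<something>.egg-info", either as a
    single PKG-INFO-style file or as a directory holding PKG-INFO.
    """
    found = []
    for entry in sorted(os.listdir(site_dir)):
        if not entry.endswith(".egg-info"):
            continue
        path = os.path.join(site_dir, entry)
        # Directory form keeps the metadata in a PKG-INFO file inside.
        pkg_info = os.path.join(path, "PKG-INFO") if os.path.isdir(path) else path
        if not os.path.isfile(pkg_info):
            continue
        with open(pkg_info, encoding="utf-8") as handle:
            # PKG-INFO uses RFC 822 style "Key: value" headers.
            metadata = Parser().parse(handle, headersonly=True)
        found.append((metadata["Name"], metadata["Version"]))
    return found
```

Zipped eggs and any other layout are ignored here, which is exactly
the kind of special-casing a single installation format would remove.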

== Dependencies ==

The other feature that makes a packaging system nice is dependencies,
i.e. a way to list in a distribution the distributions it requires to
run. As a matter of fact, PEP 314 introduced new Metadata fields for
this purpose ("Requires", "Provides" and "Obsoletes"). So, you can
write things like "Requires: lxml >= 2.2.1", meaning that your
distribution requires lxml 2.2.1 or a newer version to run. But these
were just descriptive fields and Distutils was not providing any
feature based on these new fields.
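
As an illustration of what a feature built on such a field could look
like, here is a minimal sketch that parses a value written in the form
quoted above ("lxml >= 2.2.1") and checks it against an installed
version. The names parse_requirement and is_satisfied are hypothetical,
and the comparison only understands purely numeric dotted versions. The
parenthesised spelling "lxml (>=2.2.1)" is tolerated as well.

```python
import operator
import re

# name, then an optional "<op> <version>", optionally in parentheses
_REQUIREMENT_RE = re.compile(
    r"^\s*([A-Za-z0-9_.\-]+)\s*"
    r"(?:\(?\s*(<=|>=|==|!=|<|>)\s*([0-9]+(?:\.[0-9]+)*)\s*\)?)?\s*$"
)

_OPERATORS = {
    "<": operator.lt, "<=": operator.le, "==": operator.eq,
    "!=": operator.ne, ">=": operator.ge, ">": operator.gt,
}


def parse_requirement(text):
    """Split 'lxml >= 2.2.1' into ('lxml', '>=', '2.2.1')."""
    match = _REQUIREMENT_RE.match(text)
    if match is None:
        raise ValueError("cannot parse requirement: %r" % text)
    return match.groups()


def is_satisfied(requirement, installed_version):
    """Check a numeric dotted version against a requirement string."""
    _name, op, wanted = parse_requirement(requirement)
    if op is None:
        return True  # no version constraint at all

    def as_tuple(version):
        return tuple(int(part) for part in version.split("."))

    return _OPERATORS[op](as_tuple(installed_version), as_tuple(wanted))
```

With this sketch, is_satisfied("lxml >= 2.2.1", "2.3") is true and
is_satisfied("lxml >= 2.2.1", "2.2.0") is false.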

In fact, no third-party tool provided a feature based on those fields
either. Setuptools provided "easy_install", a script that looks for
the dependencies and installs them by querying the Python Package
Index (PyPI). But this feature was implemented with its own metadata:
you can add an "install_requires" option in the setup() call in
setup.py, and it will end up in a "requires.txt" file at installation
time that is located alongside the Metadata for your distribution.

So the goal is to review PEP 314 and update the Metadata w.r.t. the
setuptools feedback and community usage. Once it's done, Distutils
will implement this new metadata version and promote its usage.
Promoting its usage means that Distutils will provide some APIs to
work with these fields, like a version comparison algorithm.

And while we're at it, we need to work out some inconsistency with the
"Author" and "Maintainer" fields. (The latter doesn't exist in the
Metadata but exists on the setup.py side.)

- Work left to do in PEP 314 : finish PEP 386, finish the discussion
on the "maintainer" field.
- Goal: 2.7 / 3.2

== Version comparison ==

Once you provide dependency fields in the metadata, you need to
provide a version scheme: a way to compare two versions. Distutils has
two version comparison algorithms, one "strict" and one "loose".
Neither is used in Distutils' own code, and they are used in only one
place in the stdlib, where they could be removed with no pain.
Setuptools has another one, which is more heuristic (it will deal with
any version string and compare it, whether it's wrong or not).
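
To show what "a way to compare two versions" means in practice, here
is a deliberately simplified sketch of a sort key for version strings.
It is not the scheme of PEP 386, nor either of the existing Distutils
algorithms: it only assumes dotted numeric releases with an optional
a/b/c/rc pre-release tag, and the name version_key is made up for the
example.

```python
import re

_VERSION_RE = re.compile(r"^(\d+(?:\.\d+)*)(?:(a|b|c|rc)(\d+))?$")

# "c" and "rc" are treated as the same stage in this sketch
_STAGES = {"a": 0, "b": 1, "c": 2, "rc": 2}


def version_key(version):
    """Return a tuple that sorts the same way the versions should."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError("not a recognised version: %r" % version)
    release = tuple(int(part) for part in match.group(1).split("."))
    # Drop trailing zeros so that "1.0" and "1.0.0" compare equal.
    while len(release) > 1 and release[-1] == 0:
        release = release[:-1]
    tag, number = match.group(2), match.group(3)
    if tag is None:
        pre = (1,)  # a final release sorts after its pre-releases
    else:
        pre = (0, _STAGES[tag], int(number))
    return release, pre


versions = ["1.0", "1.0a1", "1.10", "1.9", "1.0b2"]
ordered = sorted(versions, key=version_key)
# ordered == ["1.0a1", "1.0b2", "1.0", "1.9", "1.10"]
```

Unlike a heuristic comparison, this key rejects strings it does not
understand instead of guessing an order for them.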

PEP 386's goal is to describe a version scheme that can be used by all.
If we can reach a consensus there, we can move on and add it as a
reference in the update done in PEP 314, besides the dependencies
fields. Then, in Distutils, we can deprecate the existing version
comparison algorithms, provide a new one based on PEP 386 and promote
its usage.

One very important point: we will not force the community to use the
scheme described in PEP 386, but *there is* already a
de-facto convention on version schemes at PyPI if you use Pip or
easy_install, so let's have a documented standard for this,
and a reference implementation in Distutils.

There's an implementation at
http://bitbucket.org/tarek/distutilsversion that is kept up-to-date
with PEP 386.

- Work left to do in PEP 386 : another round with the community
- Goal: 2.7 / 3.2

== The fate of setup.py, and static metadata ==

Setup.py is a CLI to create distributions, install them, etc. You can
also use it to retrieve the metadata of a distribution. For example
you can call "python setup.py --name" and the name will be displayed.
That's fine. That's great for developers.

But there's a major flaw: it's Python code. It's a problem because,
depending on the complexity of this file, an OS packager who just
wants to get the metadata for the platform he's working on will run
arbitrary code that might do unwanted things (or that might not even
work).

So we are going to separate the metadata description from setup.py,
into a static configuration file that can be opened and read by anyone
without running any code. The only problem with this is the fact that
some metadata fields might depend on the execution environment. For
instance, once "Requires" is re-defined and re-introduced via PEP 314,
we will have cases where "pywin32" will be a dependency to have only
on win32 systems.

So we've worked on that lately in Distutils-SIG and came up with a
micro-language, based on a ConfigParser file, that allows writing
metadata fields that depend on sys.platform etc. I won't detail the
syntax here but the idea is that the interpretation of this file can
be done with a vanilla Python without running arbitrary code.
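
Since the final syntax is not given here, the following is only a
sketch with an invented one, to show that platform-dependent metadata
can be evaluated without executing project code. The section naming
("[metadata:<condition>]"), the "requires" key, the sys_platform
marker and the read_requires helper are all assumptions made for this
example.

```python
import configparser
import re

# Hypothetical static metadata file; the second section only applies
# when its condition is true.
SAMPLE = """\
[metadata]
name = example
version = 1.0
requires = lxml

[metadata:sys_platform == 'win32']
requires = pywin32
"""

_CONDITION_RE = re.compile(r"^sys_platform\s*(==|!=)\s*'([^']*)'$")


def condition_holds(condition, platform):
    """Evaluate the tiny condition language by pattern matching only."""
    match = _CONDITION_RE.match(condition.strip())
    if match is None:
        raise ValueError("unsupported condition: %r" % condition)
    op, value = match.groups()
    return (platform == value) if op == "==" else (platform != value)


def read_requires(text, platform):
    """Collect the 'requires' entries that apply to the given platform."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    requires = []
    for section in parser.sections():
        name, _sep, condition = section.partition(":")
        if name != "metadata":
            continue
        if condition and not condition_holds(condition, platform):
            continue
        if parser.has_option(section, "requires"):
            requires.extend(parser.get(section, "requires").split())
    return requires
```

Here read_requires(SAMPLE, "linux") gives ["lxml"], while
read_requires(SAMPLE, "win32") gives ["lxml", "pywin32"]. The platform
is passed in explicitly, so the dependencies for another platform can
be computed from any machine.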

In other words: we will be able to get the metadata for a distribution
without having to install it or to run any setup.py command. One use
case is the ability to list all dependencies a distribution requires
for a given platform, just by querying PyPI.

So I am adding this in Distutils for 2.7.

Of course setup.py stays, and this is backward compatible.

- Work left to do : publish the final syntax, and do the implementation
- Goal: 2.7 / 3.2

== The fate of bdist_* commands ==

During the last PyCon summit we said that we would remove commands
like bdist_rpm because Python is unable, due to its release cycle, to
do a good job there. Here's an example: from time to time I get
cryptic issues in the issue tracker from Fedora users (or users of any
other rpm-based system), and I have great difficulty fixing these very
specific problems properly unless some RPM expert helps out. And by
the time a problem is detected and then fixed, it can be year(s)
before the fix is available on their side. That's why, depending on
the communities, commands like bdist_rpm are just totally ignored, and
OS packagers have their own tools.

So the best way to handle this is to ask these communities to build
their own tool and to encourage them to use Distutils as a basis for
that.

This does not concern bdist_* commands for win32 because those are
very stable and don't change too much: Windows doesn't have a package
manager that would require these commands to evolve with it.

Anyways, when we said that we would remove bdist_rpm, this was very
controversial because some people use it and love it.

So what is going to happen is a status quo: no bdist_* command will be
removed but no new bdist_* command will be added. That's why I've
encouraged Andrew and Garry, who are working on a bdist_deb command,
to keep it in the "stdeb" project, and eventually we will refer to it
in the Distutils documentation if this bdist_deb complies with the
Distutils standard. It doesn't right now because it uses a custom
version of the Distribution class (through Setuptools) that doesn't
behave like Distutils' one anymore.

For Distutils, I'll add some documentation explaining this, and a
section that will list community-driven commands.

- Work left to do : write the documentation
- Goal: 2.7 / 3.2

= Distribute =

I won't explain here again why we have forked, I think it's obvious to
anyone here now. I'd rather explain what we are planning in Distribute
and how it will interact with Distutils.

Distribute has two branches:

- 0.6.x : provides a Setuptools-0.6c9 compatible version
- 0.7.x : will provide a refactoring

== 0.6.x ==

Not "much" is going to happen here, we want this branch to be helpful
to the community *today* by addressing the 40-or-so bugs that were
found in Setuptools and never fixed. This will happen soon because its
development is fast: there are up to 5 committers who are working on
it very often (and the number grows weekly).

The biggest issue with this branch is that it provides the same
packages and modules that Setuptools does, and this requires some
bootstrapping work where we make sure that, once Distribute is
installed, all distributions that require Setuptools will continue to
work. This is done by faking the metadata of Setuptools 0.6c9. That's
the only way we found to do this.

There's one major thing though: thanks to the work of Lennart, Alex,
Martin, this branch supports Python 3,
which is great to have to speed up Py3 adoption.

The goal of the 0.6.x branch is to remove as many bugs as we can, and
to try if possible to remove the patches done on Distutils. We will
support 0.6.x maintenance for years and we will promote its usage
everywhere instead of Setuptools.

Some new commands are added there, when they are helpful and don't
interact with the rest. I am thinking about "upload_docs", which lets
you upload documentation to PyPI. The goal is to move it to Distutils
at some point, if the documentation feature of PyPI stays and starts
to be used.

== 0.7.x ==

We've started to refactor Distribute with this roadmap in mind (and
no, as someone said, it's not vaporware,
we've done a lot already)

- 0.7.x can be installed and used with 0.6.x

- easy_install is going to be deprecated ! use Pip !

- the version system will be deprecated, in favor of the one in Distutils

- no more Distutils monkey-patch that happens once you use the code
 (things like 'from distutils import cmd; cmd.Command = CustomCommand')

- no more custom site.py (that is: if something is missing from
Python's site.py we'll add it there instead of patching it)

- no more namespaced packages system, if PEP 382 (namespace packages
support) makes it to 2.7

- The code is split into many packages and might be distributed under
several distributions.

   - distribute.resources: that's the old pkg_resources, but
     reorganized in clean, PEP 8 modules. This package will only
     contain the query APIs and will focus on being PEP 376
     compatible. We will promote its usage and see if Pip wants to
     use it as a basis. And maybe PyPM once it's open source ?
     (<hint> <hint>). It will probably shrink a lot though, once the
     stdlib provides PEP 376 support.

   - distribute.entrypoints: that's the old pkg_resources entry points
     system, but on its own. It uses distribute.resources.

   - distribute.index: that's package_index and a few other things,
     everything required to interact with PyPI. We will promote its
     usage and see if Pip wants to use it as a basis.

   - distribute.core (might be renamed to main): that's everything
     else, and uses the other packages.


Goal: A first release before (or when) Python 2.7 / 3.2 is out.


= Virtualenv and the multiple version support in Distribute =

(I am not saying "We" here because this part was not discussed yet
with everyone)

Virtualenv allows you to create an isolated environment to install
some distributions without polluting the main site-packages, a bit
like a user site-packages.

My opinion is that this tool exists only because Python doesn't
support the installation of multiple versions of the same
distribution. But if PEP 376 and PEP 386 support are added in Python,
we're not far from being able to provide multiple version support with
the help of importlib.

Setuptools provided multiple version support but I don't like its
implementation and the way it works. I would like to create a new
site-packages format that can contain several versions of the same
distribution, and:

- a special import system using importlib that would automatically
pick the latest version, thanks to PEP 376.
- an API to force at runtime a specific version (that would be located
at the beginning of all imports, like __future__)
- a layout that is compatible with the way OS packagers work with
Python packages
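
The version-picking part of this idea can be sketched as follows. This
is purely illustrative: the "<name>-<version>" directory layout, the
pick_version function and the numeric-only version ordering are
assumptions made for the example, not the VSP prototype, and a real
system would hook the result into the import machinery instead of just
returning a path.

```python
import os
import re

_NUMERIC_RE = re.compile(r"^\d+(?:\.\d+)*$")


def _numeric_key(version):
    return tuple(int(part) for part in version.split("."))


def pick_version(root, name, wanted=None):
    """Return the directory holding the chosen version of a distribution.

    Assumption: each version lives in "<root>/<name>-<version>/".
    With wanted=None the highest version wins; otherwise the exact
    version asked for is returned.
    """
    prefix = name + "-"
    candidates = {}
    for entry in os.listdir(root):
        path = os.path.join(root, entry)
        version = entry[len(prefix):]
        if (entry.startswith(prefix) and os.path.isdir(path)
                and _NUMERIC_RE.match(version)):
            candidates[version] = path
    if not candidates:
        raise LookupError("no version of %r found in %r" % (name, root))
    if wanted is None:
        wanted = max(candidates, key=_numeric_key)
    if wanted not in candidates:
        raise LookupError("%s %s is not installed" % (name, wanted))
    return candidates[wanted]
```

Calling pick_version(root, "demo") returns the newest installed
version, and pick_version(root, "demo", wanted="1.2") forces a
specific one, which mirrors the first two points of the list above.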

Goal: a prototype asap (one was started under the "VSP" name (virtual
site-packages) but not finished yet)

Regards
Tarek

-- 
Tarek Ziadé | http://ziade.org | オープンソースはすごい! | 开源传万世,因有你参与

