Jeremy Hylton : weblog : 2003-09-24

Python Package Index Tutorial

Wednesday, September 24, 2003

PyPI, the Python Package Index, is the latest attempt to create a comprehensive catalog of third-party Python packages. The catalog is integrated with distutils. This tutorial explains how to use setup.py to create PyPI entries.

There are six simple steps to follow to create a setup script that will work with PyPI. Once the script is written, it requires very little maintenance to update the index on each subsequent release.

  1. Register with PyPI
  2. Collect metadata
  3. Add metadata to setup.py
  4. Check PKG-INFO
  5. Run register command
  6. Check the listing on python.org

Register with PyPI

The first step is to complete the user registration form. When you create or update an entry, you need to provide a username and password. The goal, I presume, is to prevent someone else from modifying your entries.

The PyPI authentication is very weak. The username and password are sent in cleartext as part of the form data, so anyone who can observe the traffic or guess the password can impersonate you. Only a simple hash of the password is stored on the server, which leaves it vulnerable to dictionary attacks and simple theft.

You need to save the username and password in a .pypirc file in your home directory. For example:

[server-login]
username:jeremy
password:aaaaaaabb

distutils will read this information when you run the register command.
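
To check that the file is well formed before running register, you can parse it yourself. This is only a sketch: it uses the configparser module from a current Python 3 standard library, which is my assumption and not necessarily how distutils reads the file, and the read_login helper is a hypothetical name.

```python
import configparser

# The same content as the .pypirc example above.
SAMPLE_PYPIRC = """\
[server-login]
username:jeremy
password:aaaaaaabb
"""

def read_login(text):
    """Return (username, password) from the [server-login] section."""
    # RawConfigParser avoids treating "%" in a password as interpolation.
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    section = parser["server-login"]
    return section["username"], section["password"]

username, password = read_login(SAMPLE_PYPIRC)
```

A missing section or key raises KeyError, which is a quick way to spot a typo in the file.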

Collect metadata

You need to provide metadata in your setup.py script. When you run python setup.py register, the script will package up the metadata and submit it to python.org. The metadata is described in PEP 241: Metadata for Python Software Packages.

The metadata elements are passed as keyword arguments to the setup() call. Some of the metadata, like name and version, is used to create the file names for distributions. Others, like the Trove classifiers, are only used by PyPI.

Necessary metadata
Name
The name of the package
Version
A version number like 3.1.4 or 1.0a3
Summary
A one-line summary of what the package does. It's like the first line of a doc string.
Home-page
The URL of the package's home page
Author
The name of the author
Author-email
The email address of the author. (PEP 241 mentions that this might be used as a unique key in some catalog of packages, but I don't know if it actually is.)
License
PEP 241 says to put the name of the license here, but I don't think that's a good idea. The name doesn't identify a specific license. There are several Zope licenses and several PSF licenses. I recommend putting the URL of the license.
Description
A longer description of the package. PEP 241 says this is optional, but it seems too useful to omit. If someone is searching a package index, how else will they know what your package does?
Platform
A comma-separated list of supported platforms. It's not clear what exactly this is used for, so for code that should run anywhere you could just say "Any."
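
The PEP 241 field names above are not spelled the same way as the setup() keyword arguments. The sketch below records the correspondence and reports which fields a set of keyword arguments leaves empty. PEP241_TO_SETUP and missing_fields are hypothetical names I made up for illustration; they are not part of distutils.

```python
# PEP 241 field name -> setup() keyword argument.
PEP241_TO_SETUP = {
    "Name": "name",
    "Version": "version",
    "Summary": "description",
    "Home-page": "url",
    "Author": "author",
    "Author-email": "author_email",
    "License": "license",
    "Description": "long_description",
    "Platform": "platforms",
}

def missing_fields(setup_kwargs):
    """Return the PEP 241 fields that have no value in setup_kwargs."""
    return [field for field, keyword in PEP241_TO_SETUP.items()
            if not setup_kwargs.get(keyword)]

# Example: only name and version are supplied.
print(missing_fields({"name": "demo", "version": "1.0"}))
```

Note that the ZODB script later in this post passes maintainer and maintainer_email rather than author and author_email, so this sketch would flag both author fields for it.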

You can also include an arbitrary number of Trove classifiers. The classifiers describe the software according to a predefined vocabulary. They answer questions like: "What is the intended audience of the package?" and "What is its development status?"

Note that there is some overlap between the Trove classifiers and the other metadata. The classifiers include entries for license and platforms supported. It seems to me that you ought to provide the information in both places, because different software may only look in one place or the other.

Add metadata to setup.py

I'll start with a concrete example -- the parts of ZODB's setup.py that relate to metadata.

"""Zope Object Database: object database and persistence

The Zope Object Database provides an object-oriented database for
Python that provides a high degree of transparency. Applications can
take advantage of object database features with few, if any, changes
to application logic.  ZODB includes features such as a pluggable storage
interface, rich transaction support, and undo.
"""

classifiers = """\
Development Status :: 5 - Production/Stable
Intended Audience :: Developers
License :: OSI Approved :: Zope Public License
Programming Language :: Python
Topic :: Database
Topic :: Software Development :: Libraries :: Python Modules
Operating System :: Microsoft :: Windows
Operating System :: Unix
"""

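import sys
from distutils.core import setup

# Pythons before 2.3 reject the classifiers keyword,
# so wrap setup() and drop it there.
if sys.version_info < (2, 3):
    _setup = setup
    def setup(**kwargs):
        if kwargs.has_key("classifiers"):
            del kwargs["classifiers"]
        _setup(**kwargs)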

doclines = __doc__.split("\n")

setup(name="ZODB3",
      version="3.2b3",
      maintainer="Zope Corporation",
      maintainer_email="zodb-dev@zope.org",
      url = "http://www.zope.org/Wikis/ZODB/FrontPage",
      license = "http://www.zope.org/Resources/ZPL",
      platforms = ["any"],
      description = doclines[0],
      classifiers = filter(None, classifiers.split("\n")),
      long_description = "\n".join(doclines[2:]),
      )

There are only a few interesting things about the specific code. First, Python 2.3 is the earliest Python version that understands the classifiers keyword. If you want the setup script to work with earlier Pythons, you need to add some kind of workaround. (distutils wasn't designed for graceful evolution. It complains about arguments it doesn't understand.)

I create the description and long_description from the script's docstring. It seems convenient to have the information in a regular docstring, because that's what I'm used to doing with other modules.

The classifiers must be passed as a list of strings. I write them in a block as a triple-quoted string and then split them into individual strings in the setup() call. platforms also expects a list of strings.
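
Here is the splitting logic on its own, as a small sketch you can try outside a setup script. The helper names split_docstring and split_classifiers and the sample strings are mine, not part of distutils. I use a list comprehension instead of filter() because filter() returns an iterator rather than a list on Python 3.

```python
def split_docstring(doc):
    """Return (summary, long description) from a module docstring.

    The first line is the summary, the second line is expected to be
    blank, and everything after it is the long description.
    """
    lines = doc.split("\n")
    return lines[0], "\n".join(lines[2:])

def split_classifiers(block):
    """Turn a newline-separated block into a list of non-empty strings."""
    return [line for line in block.split("\n") if line]

# Hypothetical sample data in the same shape as the ZODB script.
sample_doc = "Demo: a one-line summary\n\nA longer description\nspanning two lines.\n"
sample_classifiers = "Programming Language :: Python\nTopic :: Database\n"

summary, long_description = split_docstring(sample_doc)
classifier_list = split_classifiers(sample_classifiers)
```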

Check PKG-INFO

You can use the distutils PKG-INFO file to debug the metadata you entered in setup.py. When you create a distribution using setup.py, distutils includes a PKG-INFO file that contains all the package metadata. When you run python setup.py sdist, distutils builds a source tarball and puts it in the dist directory. The tarball contains a PKG-INFO file in the top-level directory.

It's a little inconvenient to dig PKG-INFO out of the tarball, but it is helpful to double-check your metadata before uploading it to python.org.
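
If you would rather not unpack the tarball by hand, a few lines of Python can pull the file out for you. This is a sketch under some assumptions: it uses the tarfile and email modules from a current Python 3 standard library, it assumes PKG-INFO is UTF-8 text laid out as RFC 822 style headers, and read_pkg_info is a hypothetical helper name.

```python
import tarfile
from email.parser import Parser

def read_pkg_info(tarball_path):
    """Return the parsed PKG-INFO headers from a source tarball."""
    with tarfile.open(tarball_path) as tar:
        for member in tar.getmembers():
            # PKG-INFO sits directly inside the top-level directory,
            # for example ZODB3-3.2b3/PKG-INFO.
            parts = member.name.split("/")
            if len(parts) == 2 and parts[1] == "PKG-INFO":
                data = tar.extractfile(member).read().decode("utf-8")
                return Parser().parsestr(data)
    raise FileNotFoundError("no top-level PKG-INFO in %s" % tarball_path)
```

The returned object behaves like a mapping, so read_pkg_info(path)["Name"] gives the package name and read_pkg_info(path).items() lists every field for a quick visual check.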

Run register command

You should have an account, with username and password in .pypirc, and a setup.py script with all the metadata. Now run:

python setup.py register

That's it.

Check the listing on python.org

Go to http://www.python.org/pypi. You should see your package in the list of the last 20 updates. If you log in, you will also see the package in the left navigation bar under the heading "Your Packages."

You can reach the newly generated package record by clicking on the name in the "last 20 updates" table. The individual package pages have a link that says "edit." You can use the edit form to correct any problems you discover after running register.


Thanks to Andrew Kuchling and Richard Jones for comments and corrections.