[Python-ideas] PEP: Distributing a Subset of the Standard Library

Tomas Orsava torsava at redhat.com
Mon Nov 28 08:28:40 EST 2016


Hi!
We have written a draft PEP entitled "Distributing a Subset of the 
Standard Library" that aims to standardize and improve how Python 
handles omissions from its standard library. This is relevant both to 
Python itself as well as to Linux and other distributions that are 
packaging it, as they already separate out parts of the standard library 
into optionally installable packages.

Ideas leading up to this PEP were discussed on the python-dev mailing list:
https://mail.python.org/pipermail/python-dev/2016-July/145534.html

Rendered PEP: https://fedora-python.github.io/pep-drafts/pep-A.html

------------------------------------------------------------------------

Abstract
========

Python is sometimes being distributed without its full standard library.
However, there is as of yet no standardized way of dealing with importing a
missing standard library module.  This PEP proposes a mechanism for 
identifying
which standard library modules are missing and puts forth a method of how
attempts to import a missing standard library module should be handled.


Motivation
==========

There are several use cases for including only a subset of Python's standard
library.  However, there is so far no formal specification of how to 
properly
implement distribution of a subset of the standard library.  Namely, how to
safely handle attempts to import a missing *stdlib* module, and display an
informative error message.


CPython
-------

When one of Python standard library modules (such as ``_sqlite3``) cannot be
compiled during a Python build because of missing dependencies (e.g. SQLite
header files), the module is simply skipped.

If you then install this compiled Python and use it to try to import one 
of the
missing modules, Python will go through the ``sys.path`` entries looking for
it.  It won't find it among the *stdlib* modules and thus it will 
continue onto
``site-packages`` and fail with a ModuleNotFoundError_ if it doesn't 
find it.

.. _ModuleNotFoundError:
https://docs.python.org/3.7/library/exceptions.html#ModuleNotFoundError

This can confuse users who may not understand why a cleanly built Python is
missing standard library modules.


Linux and other distributions
-----------------------------

Many Linux and other distributions are already separating out parts of the
standard library to standalone packages.  Among the most commonly excluded
modules are the ``tkinter`` module, since it draws in a dependency on the
graphical environment, and the ``test`` package, as it only serves to test
Python internally and is about as big as the rest of the standard 
library put
together.

The methods of omission of these modules differ.  For example, Debian 
patches
the file ``Lib/tkinter/__init__.py`` to envelop the line ``import 
_tkinter`` in
a *try-except* block and upon encountering an ``ImportError`` it simply adds
the following to the error message: ``please install the python3-tk 
package``
[#debian-patch]_.  Fedora and other distributions simply don't include the
omitted modules, potentially leaving users baffled as to where to find them.


Specification
=============

When, for any reason, a standard library module is not to be included 
with the
rest, a file with its name and the extension ``.missing.py`` shall be 
created
and placed in the directory the module itself would have occupied. This file
can contain any Python code, however, it *should* raise a 
ModuleNotFoundError_
with a helpful error message.

Currently, when Python tries to import a module ``XYZ``, the ``FileFinder``
path hook goes through the entries in ``sys.path``, and in each location 
looks
for a file whose name is ``XYZ`` with one of the valid suffixes (e.g. 
``.so``,
..., ``.py``, ..., ``.pyc``).  The suffixes are tried in order.  If none of
them are found, Python goes on to try the next directory in ``sys.path``.

The ``.missing.py`` extension will be added to the end of the list, and
configured to be handled by ``SourceFileLoader``.  Thus, if a module is not
found in its proper location, the ``XYZ.missing.py`` file is found and
executed, and further locations are not searched.

The CPython build system will be modified to generate ``.missing.py`` 
files for
optional modules that were not built.


Rationale
=========

The mechanism of handling missing standard library modules through the 
use of
the ``.missing.py`` files was chosen due to its advantages both for CPython
itself and for Linux and other distributions that are packaging it.

The missing pieces of the standard library can be subsequently installed 
simply
by putting the module files in their appropriate location.  They will 
then take
precedence over the corresponding ``.missing.py`` files.  This makes
installation simple for Linux package managers.

This mechanism also solves the minor issue of importing a module from
``site-packages`` with the same name as a missing standard library module.
Now, Python will import the ``.missing.py`` file and won't ever look for a
*stdlib* module in ``site-packages``.

In addition, this method of handling missing *stdlib* modules can be
implemented in a succinct, non-intrusive way in CPython, and thus won't 
add to
the complexity of the existing code base.

The ``.missing.py`` file can be customized by the packager to provide any
desirable behaviour.  While we strongly recommend that these files only 
raise a
ModuleNotFoundError_ with an appropriate message, there is no reason to 
limit
customization options.

Ideas leading up to this PEP were discussed on the `python-dev mailing 
list`_.

.. _`python-dev mailing list`:
https://mail.python.org/pipermail/python-dev/2016-July/145534.html


Backwards Compatibility
=======================

No problems with backwards compatibility are expected. Distributions 
that are
already patching Python modules to provide custom handling of missing
dependencies can continue to do so unhindered.


Reference Implementation
========================

Reference implementation can be found on `GitHub`_ and is also accessible in
the form of a `patch`_.

.. _`GitHub`: https://github.com/torsava/cpython/pull/1
.. _`patch`: https://github.com/torsava/cpython/pull/1.patch


References
==========

.. [#debian-patch]
http://bazaar.launchpad.net/~doko/python/pkg3.5-debian/view/head:/patches/tkinter-import.diff


Copyright
=========

This document has been placed in the public domain.

------------------------------------------------------------------------
Regards,
Tomas Orsava
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20161128/85dcb56a/attachment.html>


More information about the Python-ideas mailing list