Notice: While Javascript is not essential for this website, your interaction with the content will be limited. Please turn Javascript on for the full experience.

PEP 648 -- Extensible customizations of the interpreter at startup

PEP:0648
Title:Extensible customizations of the interpreter at startup
Author:Mario Corchero <mariocj89 at gmail.com>
Sponsor:Pablo Galindo
BDFL-Delegate:XXXX
Discussions-To:https://discuss.python.org/t/pep-648-extensible-customizations-of-the-interpreter-at-startup/6403
Status:Draft
Type:Standards Track
Created:30-Dec-2020
Python-Version:3.10
Post-History:python-ideas: 16th Dec. python-dev: 18th Dec.

Abstract

This pep proposes supporting extensible customization of the interpreter, by allowing users to install scripts that will be executed at startup.

Motivation

System administrators, tools that repackage the interpreter and some libraries need to customize aspects of the interpreter at startup time.

This is usually achieved via sitecustomize.py for system administrators whilst libraries rely on exploiting pth files. This PEP proposes a way of achieving the same in a more user-friendly and structured way.

Limitations of pth files

If a library needs to perform any customization before an import or that relates to the general working of the interpreter, they often rely on the fact that pth files, which are loaded at startup, can include Python code that will be executed when the pth file is evaluated.

Note that pth files were originally developed to just add additional directories to sys.path, but they may also contain lines which start with "import", which will be ``exec``ed. Users have exploited this feature to allow the customizations that they needed. See setuptools [4] or betterexceptions [5] as examples.

Using pth files for this purpose is far from ideal for library developers, as they need to inject code into a single line preceded by an import, making it rather unreadable. Library developers following that practice will usually create a module that performs all actions on import, as done by betterexceptions [5], but the approach is still not really user friendly.

Additionally, it is also non-ideal for users of the interpreter as if they want to inspect what is being executed at Python startup they need to review all the pth files for potential code execution which can be spread across all site paths. Most of those pth will be "legit" pth files that just modify the path, answering the question of "what is changing my interpreter at startup" a rather complex one.

Lastly, there have been multiple suggestions for removing code execution from pth files, see [1] and [2].

Limitations of sitecustomize.py

Whilst sitecustomize is an acceptable solution, it assumes a single person is in charge of the system and the interpreter. If both the system administrator and the responsibility of provisioning the interpreter want to add customizations at the interpreter startup they need to agree on the contents of the file and combine all the changes. This is not a major limitation though, and it is not the main driver of this change, but should the change happen, it will also improve the situation for these users, as rather than having a sitecustomize.py which performs all those actions, they can have custom isolated files named after the features they want to enhance. As an example, Ubuntu could change their current sitecustomize.py to just be ubuntu_apport_python_hook. This not only better represents its intent but also gives users of the interpreter a better understanding of the modifications happening on their interpreter.

Benefits of __sitecustomize__

Having a structured way of injecting custom startup scripts, will allow supporting in a better way the cases presented above. It will result in both maintainers and users better experience as detailed, and allow CPython to deprecate and eventually remove code execution from pth files, as desired in the previously mentioned bpos. Additionally, these solutions provide a unique way to support all use-cases that before have been fulfilled via the misuse of pth files, sitecustomize.py and usercustomize.py. The use of a __sitecustomize__ will allow for packages, tools and system admins to inject scripts that will be loaded at startups through an easy to understand mechanism.

Rationale

This PEP proposes supporting extensible customization of the interpreter at startup by allowing users to install scripts into a folder named __sitecustomize__ located in a site path. Those scripts will be executed at startup time.

The site module will expose an option on its main function that allows listing all scripts that will be executed, which will allow users to quickly see all customizations that affect an interpreter.

We will also work with build backends on facilitating the installation of these files.

Why __sitecustomize__

The name aims to follow the already existing concept of sitecustomize.py. As the folder will be within sys.path, given that it is located in site paths, we choose to use double underscore around its name, to prevent colliding with the already existing sitecustomize.py.

Disabling start scripts

In some scenarios, like when the startup time is key, it might be desired to disable this option altogether. Whilst we could have added a new flag to do so, we think that the already existing flag -S [3] is already good enough, as it disables all site related manipulation. If the flag is passed in, __sitecustomize__ will not be used.

Order of execution

The scripts in __sitecustomize__ will be executed in alphabetic order after the evaluation of pth files. We considered executing them in random order, but that could result in different results depending on how the interpreter chooses to pick up those files. So even if it won't be a good practice to rely on other files being executed, we think that is better than having randomly different results on interpreter startup. We chose to run the scripts after the pth files in case an user needs to add items to the path before running a script.

Impact on startup time

If an interpreter is not using this mechanism, the impact on performance is expected to be minimal as this PEP just adds a check for __sitecustomize__ when site.py is walking the site paths looking for pth files. This impact will be reduced in the future as we will remove two other imports: "sitecustomize.py" and "usercustomize.py".

If the user has custom scripts, we think that the impact on the performance of walking each of the folders is acceptable, as the user wants to use this feature. If they need to run a time-sensitive application, they can always use -S to disable this entirely.

Running "./python -c pass" with perf on 50 iterations, repeating 50 times the command on each and getting the geometric mean on a commodity laptop did not reveal any substantial raise on CPU time.

Failure handling

Any error on any of the scripts will not be logged unless the interpreter is run in verbose mode and it should not stop the evaluation of other scripts. The user will just receive a message saying that the script failed to be executed, that verbose mode can be used to get more information. This behaviour follows the one already existing for sitecustomize.py.

Scripts naming convention

Packages will be encouraged to include the name of the package within the name of the script to avoid collisions between packages.

Relationship with sitecustomize and usercustomize

The existing logic for sitecustomize.py and usercustomize.py will be left as is, later deprecated and scheduled for removal. Once __sitecustomize__ is supported, it will provide better integration for all existing users, and even if it will indeed require a migration for system administrators, we expect the effort required to be minimal, it will just require moving and renaming the current sitecustomize.py into the new provided folder.

Identifying all installed scripts

To facilitate debugging of the Python startup, a new option will be added to the main of the site module to list all scripts that will be executed as part of the __sitecustomize__ initialization.

How to teach this

This can be documented and taught as simple as saying that the interpreter will try to look for the __sitecustomize__ folder at startup in its site paths and if it finds any scripts with .py extension, it will then execute it one by one.

For system administrators and tools that package the interpreter, we can now recommend placing files in __sitecustomize__ as they used to place sitecustomize.py. Being more comfortable on that their content won't be overridden by the next person, as they can provide with specific files to handle the logic they want to customize.

Library developers should be able to specify a new argument on tools like setuptools that will inject those new files. Something like sitecustomize_scripts=["scripts/betterexceptions.py"], which allows them to add those. Should the build backend not support that, they can manually install them as they used to do with pth files. We will recommend them to include the name of the package as part of the script's name.

Backward compatibility

We propose to add support for __sitecustomize__ in the next release of Python, add a warning on the three next releases on the deprecation and future removal of sitecustomize.py, usercustomize.py and code execution in pth files, and remove it after maintainers have had 4 releases to migrate. Ignoring those lines in pth files.

Whilst the existing sitecutzomize.py mechanism was created targeting System Administrators that placed it in a site path, the file could be actually placed anywhere in the path at the time that the interpreter was starting up. The new mechanism does not allow for users to place __sitecustomize__ folders anywhere in the path, but only in site paths. System administrators can recover a similar behavior to sitecustomize.py if they need it by adding a custom script in __sitecustomize__ which just imports sitecustomize as a migration path.

Reference Implementation

An initial implementation that passes the CPython test suite is available for evaluation [6].

This implementation is just for the reviewer to play with and check potential issues that this PEP could generate.

Rejected Ideas

Do nothing

Whilst the current status "works" it presents the issues listed in the motivation. After analysing the impact of this change, we believe it is worth given the enhanced experience it brings.

Formalize using pth files

Another option would be to just glorify and document the usage of pth files to inject code at startup code, but that is a suboptimal experience for users as listed in the motivation.

Making __sitecustomize__ a namespace package

We considered making the folder a namespace package and just import all the modules within it, which allowed searching across all paths in sys.path at initialization time and provided a way to declare dependencies between scripts by importing each other. This was rejected for multiple reasons:

  1. This was unnecessarily broadening the list of paths where arbitrary scripts are executed.
  2. The logic brought additional complexity, like what to do if a package were to install an __init__.py file in one of the locations.
  3. It's cheaper to search for __sitecustomize__ as we are looking for pth files already in the site paths compared to performing an actual import of a namespace package.

Support for shutdown custom scripts

init.d users might be tempted to implement this feature in a way that users could also add code at shutdown, but extra support for that is not needed, as Python users can already do that via atexit.

Source: https://github.com/python/peps/blob/master/pep-0648.rst