[Python-checkins] r80132 - peps/trunk/pep-3147.txt
barry.warsaw
python-checkins at python.org
Sat Apr 17 01:08:19 CEST 2010
Author: barry.warsaw
Date: Sat Apr 17 01:08:18 2010
New Revision: 80132
Log:
PEP 3147 is accepted and will be final (i.e. implemented) momentarily.
Modified:
peps/trunk/pep-3147.txt
Modified: peps/trunk/pep-3147.txt
==============================================================================
--- peps/trunk/pep-3147.txt (original)
+++ peps/trunk/pep-3147.txt Sat Apr 17 01:08:18 2010
@@ -3,7 +3,7 @@
Version: $Revision$
Last-Modified: $Date$
Author: Barry Warsaw <barry at python.org>
-Status: Draft
+Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 2009-12-16
@@ -34,15 +34,15 @@
source file is `foo.py`, CPython caches the byte code in a `foo.pyc`
file right next to the source.
-Byte code files contain two 32-bit numbers followed by the marshaled
-[2]_ code object. The 32-bit numbers represent a magic number and a
-timestamp. The magic number changes whenever Python changes the byte
-code format, e.g. by adding new byte codes to its virtual machine.
-This ensures that pyc files built for previous versions of the VM
-won't cause problems. The timestamp is used to make sure that the pyc
-file is not older than the py file that was used to create it. When
-either the magic number or timestamp do not match, the py file is
-recompiled and a new pyc file is written.
+Byte code files contain two 32-bit big-endian numbers followed by the
+marshaled [2]_ code object. The 32-bit numbers represent a magic
+number and a timestamp. The magic number changes whenever Python
+changes the byte code format, e.g. by adding new byte codes to its
+virtual machine. This ensures that pyc files built for previous
+versions of the VM won't cause problems. The timestamp is used to
+make sure that the pyc file matches the py file that was used to create
+it. When either the magic number or timestamp do not match, the py
+file is recompiled and a new pyc file is written.
In practice, it is well known that pyc files are not compatible across
Python major releases. A reading of import.c [3]_ in the Python
@@ -58,12 +58,15 @@
Ubuntu 9.10 Karmic Koala users can install Python 2.5, 2.6, and 3.1,
with Python 2.6 being the default.
-This causes a conflict for Python source files installed by the
-system (including third party packages), because you cannot compile a
-single Python source file for more than one Python version at a time.
-Thus if your system wanted to install a `/usr/share/python/foo.py`, it
-could not create a `/usr/share/python/foo.pyc` file usable across all
-installed Python versions.
+This causes a conflict for third party Python source files installed
+by the system, because you cannot compile a single Python source file
+for more than one Python version at a time. When Python finds a `pyc`
+file with a non-matching magic number, it falls back to the slower
+process of recompiling the source. Thus if your system installed a
+`/usr/share/python/foo.py`, two different versions of Python would
+fight over the `pyc` file and rewrite it each time the source is
+compiled. (The standard library is unaffected by this, since multiple
+versions of the stdlib *are* installed on such distributions.)
Furthermore, in order to ease the burden on operating system packagers
for these distributions, the distribution packages do not contain
@@ -75,10 +78,8 @@
the sheer number of packages available, this amount of work is
infeasible.
-C extensions can be source compatible across multiple versions of
-Python. Compiled extension modules are usually not compatible though,
-and PEP 384 [7]_ has been proposed to address this by defining a
-stable ABI for extension modules.
+(PEP 384 [7]_ has been proposed to address binary compatibility issues
+of third party extension modules across different versions of Python.)
Because these distributions cannot share pyc files, elaborate
mechanisms have been developed to put the resulting pyc files in
@@ -101,10 +102,19 @@
code cache files in a single directory inside every Python package
directory. This directory will be called `__pycache__`.
-Further, pyc file names will contain a magic string (tag) that
-differentiates the Python version they were compiled for. This allows
-multiple byte compiled cache files to co-exist for a single Python
-source file.
+Further, pyc file names will contain a magic string (called a "tag")
+that differentiates the Python version they were compiled for. This
+allows multiple byte compiled cache files to co-exist for a single
+Python source file.
+
+The magic tag is implementation defined, but should contain the
+implementation name and a version number shorthand, e.g. `cpython-32`.
+It must be unique among all versions of Python, and whenever the magic
+number is bumped, a new magic tag must be defined. An example `pyc`
+file for Python 3.2 is thus `foo.cpython-32.pyc`.
+
+The magic tag is available in the `imp` module via the `get_tag()`
+function. This is parallel to the `imp.get_magic()` function.
This scheme has the added benefit of reducing the clutter in a Python
package directory.
@@ -233,22 +243,28 @@
When Python searches for a module to import (say `foo`), it may find
one of several situations. As per current Python rules, the term
"matching pyc" means that the magic number matches the current
-interpreter's magic number, and the source file is not newer than the
-`pyc` file.
+interpreter's magic number, and the source file's timestamp matches
+the timestamp in the `pyc` file exactly.
-Case 1: The first import
+Case 0: The steady state
------------------------
When Python is asked to import module `foo`, it searches for a
`foo.py` file (or `foo` package, but that's not important for this
-discussion) along its `sys.path`. When Python locates the `foo.py`
-file it will look for a `__pycache__` directory in the directory where
-it found the `foo.py`. If the `__pycache__` directory is missing,
-Python will create it. Then it will parse and byte compile the
-`foo.py` file and save the byte code in `__pycache__/foo.<magic>.pyc`,
-where <magic> is defined by the Python implementation, but will be a
-human readable string such as `cpython-32`.
+discussion) along its `sys.path`. If found, Python looks to see if
+there is a matching `__pycache__/foo.<magic>.pyc` file, and if so,
+that `pyc` file is loaded.
+
+
+Case 1: The first import
+------------------------
+
+When Python locates the `foo.py`, if the `__pycache__/foo.<magic>.pyc`
+file is missing, Python will create it, also creating the
+`__pycache__` directory if necessary. Python will parse and byte
+compile the `foo.py` file and save the byte code in
+`__pycache__/foo.<magic>.pyc`.
Case 2: The second import
@@ -303,42 +319,6 @@
:scale: 75
-Magic identifiers
-=================
-
-pyc files inside of the `__pycache__` directories contain a magic
-identifier in their file names. These are mnemonic tags for the
-actual magic numbers used by the importer. For example, in Python
-3.2, we could use the hexlified [10]_ magic number as a unique
-identifier::
-
- >>> from binascii import hexlify
- >>> from imp import get_magic
- >>> 'foo.{}.pyc'.format(hexlify(get_magic()).decode('ascii'))
- 'foo.580c0d0a.pyc'
-
-This isn't particularly human friendly though. Instead, this PEP
-proposes a *magic tag* that uniquely defines `.pyc` files for the
-current version of Python. Whenever the magic number is bumped, a new
-magic tag is defined which is unique among all versions and
-implementations of Python. The actual contents of the magic tag is
-left up to the implementation, although it is recommended that the tag
-include the implementation name and a version shorthand. In general,
-magic numbers never change between Python micro releases, but the
-convention can be extended to handle magic number changes between
-pre-release development versions.
-
-For example, CPython 3.2 would have a magic tag of `cpython-32` and
-write pyc files like this: `foo.cpython-32.pyc`. When the `-O` flag
-is used, it would write `foo.cpython-32.pyo`. For backports of this
-feature to Python 2, when the `-U` flag is used, a file such as
-`foo.cpython-27u.pyc` can be written.
-
-The magic tag is available in the `imp` module via the `get_tag()`
-function. This is analogous to the `get_magic()` function already
-available in that module.
-
-
Alternative Python implementations
==================================
@@ -355,7 +335,9 @@
This feature is targeted for Python 3.2, solving the problem for those
and all future versions. It may be back-ported to Python 2.7.
Vendors are free to backport the changes to earlier distributions as
-they see fit.
+they see fit. For backports of this feature to Python 2, when the
+`-U` flag is used, a file such as `foo.cpython-27u.pyc` can be
+written.
Effects on existing code
@@ -466,6 +448,28 @@
Alternatives
============
+This section describes some alternative approaches or details that
+were considered and rejected during the PEP's development.
+
+
+Hexadecimal magic tags
+----------------------
+
+pyc files inside of the `__pycache__` directories contain a magic tag
+in their file names. These are mnemonic tags for the actual magic
+numbers used by the importer. We could have used the hexadecimal
+representation [10]_ of the binary magic number as a unique
+identifier. For example, in Python 3.2::
+
+ >>> from binascii import hexlify
+ >>> from imp import get_magic
+ >>> 'foo.{}.pyc'.format(hexlify(get_magic()).decode('ascii'))
+ 'foo.580c0d0a.pyc'
+
+This isn't particularly human friendly though, thus the magic tag
+proposed in this PEP.
+
+
PEP 304
-------
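
Editorial sketch, not part of the commit: the naming scheme this diff describes (a `__pycache__` directory next to the source, plus a magic tag in the file name) can be illustrated with a small helper. The function name `pycache_path` is hypothetical and is not the PEP's or the `imp` module's API; the tag is passed in explicitly instead of being read from the interpreter, and POSIX-style paths are assumed so the result is the same on every platform.

```python
import posixpath


def pycache_path(source_path, tag):
    """Return where the cached byte code for source_path would live.

    Hypothetical helper for illustration only: it applies the
    `__pycache__/<name>.<tag>.pyc` layout described in the diff above.
    """
    directory, filename = posixpath.split(source_path)
    stem, _ext = posixpath.splitext(filename)
    cached_name = "{}.{}.pyc".format(stem, tag)
    return posixpath.join(directory, "__pycache__", cached_name)


# Two interpreters with different tags no longer compete for one file.
print(pycache_path("/usr/share/python/foo.py", "cpython-32"))
print(pycache_path("/usr/share/python/foo.py", "cpython-27u"))
```

Because each tag produces a distinct file name, several Python versions can cache byte code for the same `foo.py` without rewriting each other's files. In current Python releases, `importlib.util.cache_from_source()` performs the equivalent mapping for the running interpreter.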