[Python-checkins] r80132 - peps/trunk/pep-3147.txt

barry.warsaw python-checkins at python.org
Sat Apr 17 01:08:19 CEST 2010


Author: barry.warsaw
Date: Sat Apr 17 01:08:18 2010
New Revision: 80132

Log:
PEP 3147 is accepted and will be final (i.e. implemented) momentarily.


Modified:
   peps/trunk/pep-3147.txt

Modified: peps/trunk/pep-3147.txt
==============================================================================
--- peps/trunk/pep-3147.txt	(original)
+++ peps/trunk/pep-3147.txt	Sat Apr 17 01:08:18 2010
@@ -3,7 +3,7 @@
 Version: $Revision$
 Last-Modified: $Date$
 Author: Barry Warsaw <barry at python.org>
-Status: Draft
+Status: Final
 Type: Standards Track
 Content-Type: text/x-rst
 Created: 2009-12-16
@@ -34,15 +34,15 @@
 source file is `foo.py`, CPython caches the byte code in a `foo.pyc`
 file right next to the source.
 
-Byte code files contain two 32-bit numbers followed by the marshaled
-[2]_ code object.  The 32-bit numbers represent a magic number and a
-timestamp.  The magic number changes whenever Python changes the byte
-code format, e.g. by adding new byte codes to its virtual machine.
-This ensures that pyc files built for previous versions of the VM
-won't cause problems.  The timestamp is used to make sure that the pyc
-file is not older than the py file that was used to create it.  When
-either the magic number or timestamp do not match, the py file is
-recompiled and a new pyc file is written.
+Byte code files contain two 32-bit big-endian numbers followed by the
+marshaled [2]_ code object.  The 32-bit numbers represent a magic
+number and a timestamp.  The magic number changes whenever Python
+changes the byte code format, e.g. by adding new byte codes to its
+virtual machine.  This ensures that pyc files built for previous
+versions of the VM won't cause problems.  The timestamp is used to
+make sure that the pyc file matches the py file that was used to create
+it.  When either the magic number or timestamp does not match, the py
+file is recompiled and a new pyc file is written.
 
 In practice, it is well known that pyc files are not compatible across
 Python major releases.  A reading of import.c [3]_ in the Python
@@ -58,12 +58,15 @@
 Ubuntu 9.10 Karmic Koala users can install Python 2.5, 2.6, and 3.1,
 with Python 2.6 being the default.
 
-This causes a conflict for Python source files installed by the
-system (including third party packages), because you cannot compile a
-single Python source file for more than one Python version at a time.
-Thus if your system wanted to install a `/usr/share/python/foo.py`, it
-could not create a `/usr/share/python/foo.pyc` file usable across all
-installed Python versions.
+This causes a conflict for third party Python source files installed
+by the system, because you cannot compile a single Python source file
+for more than one Python version at a time.  When Python finds a `pyc`
+file with a non-matching magic number, it falls back to the slower
+process of recompiling the source.  Thus if your system installed a
+`/usr/share/python/foo.py`, two different versions of Python would
+fight over the `pyc` file and rewrite it each time the source is
+compiled.  (The standard library is unaffected by this, since multiple
+versions of the stdlib *are* installed on such distributions.)
 
 Furthermore, in order to ease the burden on operating system packagers
 for these distributions, the distribution packages do not contain
@@ -75,10 +78,8 @@
 the sheer number of packages available, this amount of work is
 infeasible.
 
-C extensions can be source compatible across multiple versions of
-Python.  Compiled extension modules are usually not compatible though,
-and PEP 384 [7]_ has been proposed to address this by defining a
-stable ABI for extension modules.
+(PEP 384 [7]_ has been proposed to address binary compatibility issues
+of third party extension modules across different versions of Python.)
 
 Because these distributions cannot share pyc files, elaborate
 mechanisms have been developed to put the resulting pyc files in
@@ -101,10 +102,19 @@
 code cache files in a single directory inside every Python package
 directory.  This directory will be called `__pycache__`.
 
-Further, pyc file names will contain a magic string (tag) that
-differentiates the Python version they were compiled for.  This allows
-multiple byte compiled cache files to co-exist for a single Python
-source file.
+Further, pyc file names will contain a magic string (called a "tag")
+that differentiates the Python version they were compiled for.  This
+allows multiple byte compiled cache files to co-exist for a single
+Python source file.
+
+The magic tag is implementation defined, but should contain the
+implementation name and a version number shorthand, e.g. `cpython-32`.
+It must be unique among all versions of Python, and whenever the magic
+number is bumped, a new magic tag must be defined.  An example `pyc`
+file for Python 3.2 is thus `foo.cpython-32.pyc`.
+
+The magic tag is available in the `imp` module via the `get_tag()`
+function.  This is parallel to the `imp.get_magic()` function.
 
 This scheme has the added benefit of reducing the clutter in a Python
 package directory.
@@ -233,22 +243,28 @@
 When Python searches for a module to import (say `foo`), it may find
 one of several situations.  As per current Python rules, the term
 "matching pyc" means that the magic number matches the current
-interpreter's magic number, and the source file is not newer than the
-`pyc` file.
+interpreter's magic number, and the source file's timestamp matches
+the timestamp in the `pyc` file exactly.
 
 
-Case 1: The first import
+Case 0: The steady state
 ------------------------
 
 When Python is asked to import module `foo`, it searches for a
 `foo.py` file (or `foo` package, but that's not important for this
-discussion) along its `sys.path`.  When Python locates the `foo.py`
-file it will look for a `__pycache__` directory in the directory where
-it found the `foo.py`.  If the `__pycache__` directory is missing,
-Python will create it.  Then it will parse and byte compile the
-`foo.py` file and save the byte code in `__pycache__/foo.<magic>.pyc`,
-where <magic> is defined by the Python implementation, but will be a
-human readable string such as `cpython-32`.
+discussion) along its `sys.path`.  If found, Python looks to see if
+there is a matching `__pycache__/foo.<magic>.pyc` file, and if so,
+that `pyc` file is loaded.
+
+
+Case 1: The first import
+------------------------
+
+When Python locates the `foo.py`, if the `__pycache__/foo.<magic>.pyc`
+file is missing, Python will create it, also creating the
+`__pycache__` directory if necessary.  Python will parse and byte
+compile the `foo.py` file and save the byte code in
+`__pycache__/foo.<magic>.pyc`.
 
 
 Case 2: The second import
@@ -303,42 +319,6 @@
    :scale: 75
 
 
-Magic identifiers
-=================
-
-pyc files inside of the `__pycache__` directories contain a magic
-identifier in their file names.  These are mnemonic tags for the
-actual magic numbers used by the importer.  For example, in Python
-3.2, we could use the hexlified [10]_ magic number as a unique
-identifier::
-
-    >>> from binascii import hexlify
-    >>> from imp import get_magic
-    >>> 'foo.{}.pyc'.format(hexlify(get_magic()).decode('ascii'))
-    'foo.580c0d0a.pyc'
-
-This isn't particularly human friendly though.  Instead, this PEP
-proposes a *magic tag* that uniquely defines `.pyc` files for the
-current version of Python.  Whenever the magic number is bumped, a new
-magic tag is defined which is unique among all versions and
-implementations of Python.  The actual contents of the magic tag is
-left up to the implementation, although it is recommended that the tag
-include the implementation name and a version shorthand.  In general,
-magic numbers never change between Python micro releases, but the
-convention can be extended to handle magic number changes between
-pre-release development versions.
-
-For example, CPython 3.2 would have a magic tag of `cpython-32` and
-write pyc files like this: `foo.cpython-32.pyc`.  When the `-O` flag
-is used, it would write `foo.cpython-32.pyo`.  For backports of this
-feature to Python 2, when the `-U` flag is used, a file such as
-`foo.cpython-27u.pyc` can be written.
-
-The magic tag is available in the `imp` module via the `get_tag()`
-function.  This is analogous to the `get_magic()` function already
-available in that module.
-
-
 Alternative Python implementations
 ==================================
 
@@ -355,7 +335,9 @@
 This feature is targeted for Python 3.2, solving the problem for those
 and all future versions.  It may be back-ported to Python 2.7.
 Vendors are free to backport the changes to earlier distributions as
-they see fit.
+they see fit.  For backports of this feature to Python 2, when the
+`-U` flag is used, a file such as `foo.cpython-27u.pyc` can be
+written.
 
 
 Effects on existing code
@@ -466,6 +448,28 @@
 Alternatives
 ============
 
+This section describes some alternative approaches or details that
+were considered and rejected during the PEP's development.
+
+
+Hexadecimal magic tags
+----------------------
+
+pyc files inside of the `__pycache__` directories contain a magic tag
+in their file names.  These are mnemonic tags for the actual magic
+numbers used by the importer.  We could have used the hexadecimal
+representation [10]_ of the binary magic number as a unique
+identifier.  For example, in Python 3.2::
+
+    >>> from binascii import hexlify
+    >>> from imp import get_magic
+    >>> 'foo.{}.pyc'.format(hexlify(get_magic()).decode('ascii'))
+    'foo.580c0d0a.pyc'
+
+This isn't particularly human friendly though, thus the magic tag
+proposed in this PEP.
+
+
 PEP 304
 -------
 

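The file naming scheme added in this revision (a `__pycache__`
directory next to the source, plus a magic tag in the file name) can
be sketched in Python.  This is only an illustration of the naming
rule, not the code in import.c: `pyc_cache_path` is a hypothetical
helper, and the tag is passed in explicitly instead of being read from
`imp.get_tag()`.

```python
import posixpath


def pyc_cache_path(source_path, tag):
    """Return where the byte code for source_path would be cached.

    Hypothetical helper for illustration only; it is not part of the
    imp module.  `tag` is a magic tag such as 'cpython-32'.
    """
    directory, filename = posixpath.split(source_path)
    stem, _ext = posixpath.splitext(filename)
    cached_name = "{}.{}.pyc".format(stem, tag)
    return posixpath.join(directory, "__pycache__", cached_name)


# Two interpreters with different tags no longer compete for one file.
print(pyc_cache_path("/usr/share/python/foo.py", "cpython-32"))
print(pyc_cache_path("/usr/share/python/foo.py", "cpython-27"))
```

Because the tag is part of the file name, each interpreter version
gets its own cache file for the same `foo.py`.  Interpreters that
implement the PEP expose the same mapping through a standard library
function (`importlib.util.cache_from_source()` in current Python 3
releases), which should be preferred over a hand-written helper.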
