[Python-checkins] peps: PEP 509

victor.stinner python-checkins at python.org
Tue Apr 19 06:21:43 EDT 2016


https://hg.python.org/peps/rev/a30e820a8003
changeset:   6287:a30e820a8003
user:        Victor Stinner <victor.stinner at gmail.com>
date:        Tue Apr 19 10:43:02 2016 +0200
summary:
  PEP 509

* __setitem__() now always increases the version: remove the micro-optimization
  dict[key] is new_value
* be more explict on version++: explain that the operation must be atomic, and
  that dict methods are already atomic thanks to the GIL
* "Guard against changing dict during iteration": don't guess if the new dict
  version can be used or not. Let's discuss that later.
* rephrase/complete some sections
* add links to new threads on python-dev

files:
  pep-0509.txt |  199 ++++++++++++++++++++------------------
  1 files changed, 103 insertions(+), 96 deletions(-)


diff --git a/pep-0509.txt b/pep-0509.txt
--- a/pep-0509.txt
+++ b/pep-0509.txt
@@ -22,11 +22,11 @@
 =========
 
 In Python, the builtin ``dict`` type is used by many instructions. For
-example, the ``LOAD_GLOBAL`` instruction searchs for a variable in the
+example, the ``LOAD_GLOBAL`` instruction looks up a variable in the
 global namespace, or in the builtins namespace (two dict lookups).
 Python uses ``dict`` for the builtins namespace, globals namespace, type
-namespaces, instance namespaces, etc. The local namespace (namespace of
-a function) is usually optimized to an array, but it can be a dict too.
+namespaces, instance namespaces, etc. The local namespace (function
+namespace) is usually optimized to an array, but it can be a dict too.
 
 Python is hard to optimize because almost everything is mutable: builtin
 functions, function code, global variables, local variables, ... can be
@@ -35,26 +35,29 @@
 these checks "guards".
 
 The speedup of optimizations depends on the speed of guard checks. This
-PEP proposes to add a version to dictionaries to implement fast guards
-on namespaces.
+PEP proposes to add a private version to dictionaries to implement fast
+guards on namespaces.
 
 Dictionary lookups can be skipped if the version does not change which
-is the common case for most namespaces. Since the version is globally
-unique, the version is also enough to check if the namespace dictionary
-was not replaced with a new dictionary. The performance of a guard does
-not depend on the number of watched dictionary entries, complexity of
-O(1), if the dictionary version does not change.
+is the common case for most namespaces. The version is globally unique,
+so checking the version is also enough to check if the namespace
+dictionary was not replaced with a new dictionary.
+
+When the dictionary version does not change, the performance of a guard
+does not depend on the number of watched dictionary entries: the
+complexity is O(1).
 
 Example of optimization: copy the value of a global variable to function
 constants.  This optimization requires a guard on the global variable to
-check if it was modified. If the variable is modified, the variable must
-be loaded at runtime when the function is called, instead of using the
-constant.
+check if it was modified. If the global variable is not modified, the
+function uses the cached copy. If the global variable is modified, the
+function uses a regular lookup, and maybe also deoptimize the function
+(to remove the overhead of the guard check for next function calls).
 
 See the `PEP 510 -- Specialized functions with guards
 <https://www.python.org/dev/peps/pep-0510/>`_ for the concrete usage of
-guards to specialize functions and for the rationale on Python static
-optimizers.
+guards to specialize functions and for a more general rationale on
+Python static optimizers.
 
 
 Guard example
@@ -77,7 +80,7 @@
             """Return True if the dictionary entry did not change
             and the dictionary was not replaced."""
 
-            # read the version of the dict structure
+            # read the version of the dictionary
             version = dict_get_version(self.dict)
             if version == self.version:
                 # Fast-path: dictionary lookup avoided
@@ -98,21 +101,21 @@
 Usage of the dict version
 =========================
 
-Speedup method calls 1.2x
--------------------------
+Speedup method calls
+--------------------
 
 Yury Selivanov wrote a `patch to optimize method calls
 <https://bugs.python.org/issue26110>`_. The patch depends on the
-`implement per-opcode cache in ceval
+`"implement per-opcode cache in ceval"
 <https://bugs.python.org/issue26219>`_ patch which requires dictionary
 versions to invalidate the cache if the globals dictionary or the
 builtins dictionary has been modified.
 
 The cache also requires that the dictionary version is globally unique.
-It is possible to define a function in a namespace and call it
-in a different namespace: using ``exec()`` with the *globals* parameter
-for example. In this case, the globals dictionary was changed and the
-cache must be invalidated.
+It is possible to define a function in a namespace and call it in a
+different namespace, using ``exec()`` with the *globals* parameter for
+example. In this case, the globals dictionary was replaced and the cache
+must also be invalidated.
 
 
 Specialized functions using guards
@@ -123,28 +126,37 @@
 specialized functions with guards. It allows to implement static
 optimizers for Python without breaking the Python semantics.
 
-Example of a static Python optimizer: the `fatoptimizer
-<http://fatoptimizer.readthedocs.org/>`_ of the `FAT Python
-<http://faster-cpython.readthedocs.org/fat_python.html>`_ project
-implements many optimizations which require guards on namespaces.
-Examples:
+The `fatoptimizer <http://fatoptimizer.readthedocs.org/>`_ of the `FAT
+Python <http://faster-cpython.readthedocs.org/fat_python.html>`_ project
+is an example of a static Python optimizer. It implements many
+optimizations which require guards on namespaces:
 
 * Call pure builtins: to replace ``len("abc")`` with ``3``, guards on
   ``builtins.__dict__['len']`` and ``globals()['len']`` are required
 * Loop unrolling: to unroll the loop ``for i in range(...): ...``,
   guards on ``builtins.__dict__['range']`` and ``globals()['range']``
   are required
+* etc.
 
 
 Pyjion
 ------
 
 According of Brett Cannon, one of the two main developers of Pyjion,
-Pyjion can also benefit from dictionary version to implement
-optimizations.
+Pyjion can benefit from dictionary version to implement optimizations.
 
-Pyjion is a JIT compiler for Python based upon CoreCLR (Microsoft .NET
-Core runtime).
+`Pyjion <https://github.com/Microsoft/Pyjion>`_ is a JIT compiler for
+Python based upon CoreCLR (Microsoft .NET Core runtime).
+
+
+Cython
+------
+
+Cython can benefit from dictionary version to implement optimizations.
+
+`Cython <http://cython.org/>`_ is an optimising static compiler for both
+the Python programming language and the extended Cython programming
+language.
 
 
 Unladen Swallow
@@ -153,8 +165,8 @@
 Even if dictionary version was not explicitly mentioned, optimizing
 globals and builtins lookup was part of the Unladen Swallow plan:
 "Implement one of the several proposed schemes for speeding lookups of
-globals and builtins." Source: `Unladen Swallow ProjectPlan
-<https://code.google.com/p/unladen-swallow/wiki/ProjectPlan>`_.
+globals and builtins." (source: `Unladen Swallow ProjectPlan
+<https://code.google.com/p/unladen-swallow/wiki/ProjectPlan>`_).
 
 Unladen Swallow is a fork of CPython 2.6.1 adding a JIT compiler
 implemented with LLVM. The project stopped in 2011: `Unladen Swallow
@@ -172,23 +184,16 @@
 global version. The global version is also incremented and copied to the
 dictionary version at each dictionary change:
 
-* ``clear()`` if the dict was non-empty
+* ``clear()`` if the dict is non-empty
 * ``pop(key)`` if the key exists
 * ``popitem()`` if the dict is non-empty
-* ``setdefault(key, value)`` if the `key` does not exist
-* ``__detitem__(key)`` if the key exists
-* ``__setitem__(key, value)`` if the `key` doesn't exist or if the value
-  is not ``dict[key]``
-* ``update(...)`` if new values are different than existing values:
-  values are compared by identity, not by their content; the version can
-  be incremented multiple times
+* ``setdefault(key, value)`` if the key does not exist
+* ``__delitem__(key)`` if the key exists
+* ``__setitem__(key, value)`` always increases the version
+* ``update(...)`` if called with arguments
 
-The ``PyDictObject`` structure is not part of the stable ABI.
-
-The field is called ``ma_version_tag`` rather than ``ma_version`` to
-suggest to compare it using ``version_tag == old_version_tag`` rather
-than ``version <= old_version`` which makes the integer overflow much
-likely.
+The version increase must be atomic. In CPython, the Global Interpreter
+Lock (GIL) already protects ``dict`` methods to make them atomic.
 
 Example using an hypothetical ``dict_get_version(dict)`` function::
 
@@ -205,25 +210,23 @@
     >>> dict_get_version(d)
     103
 
-The version is not incremented if an existing key is set to the same
-value. For efficiency, values are compared by their identity:
-``new_value is old_value``, not by their content:
-``new_value == old_value``. Example::
+``dict.__setitem__(key, value)`` and ``dict.update(...)`` always
+increases the version, even if the new value is identical or is equal to
+the current value (even if ``(dict[key] is value) or (dict[key] ==
+value)``).
 
-    >>> d = {}
-    >>> value = object()
-    >>> d['key'] = value
-    >>> dict_get_version(d)
-    40
-    >>> d['key'] = value
-    >>> dict_get_version(d)
-    40
+The field is called ``ma_version_tag``, rather than ``ma_version``, to
+suggest to compare it using ``version_tag == old_version_tag``, rather
+than ``version <= old_version`` which is wrong most of the time after an
+integer overflow.
 
-.. note::
-   CPython uses some singleton like integers in the range [-5; 257],
-   empty tuple, empty strings, Unicode strings of a single character in
-   the range [U+0000; U+00FF], etc. When a key is set twice to the same
-   singleton, the version is not modified.
+
+Backwards Compatibility
+=======================
+
+Since the ``PyDictObject`` structure is not part of the stable ABI and
+the new dictionary version not exposed at the Python scope, changes are
+backward compatible.
 
 
 Implementation and Performance
@@ -234,7 +237,10 @@
 this PEP.
 
 On pybench and timeit microbenchmarks, the patch does not seem to add
-any overhead on dictionary operations.
+any overhead on dictionary operations. For example, the following timeit
+micro-benchmarks takes 318 nanoseconds before and after the change::
+
+    python3.6 -m timeit 'd={1: 0}; d[2]=0; d[3]=0; d[4]=0; del d[1]; del d[2]; d.clear()'
 
 When the version does not change, ``PyDict_GetItem()`` takes 14.8 ns for
 a dictionary lookup, whereas a guard check only takes 3.8 ns. Moreover,
@@ -286,19 +292,19 @@
   all mapping types. Implementing a new mapping type would require extra
   work for no benefit, since the version is only required on the
   ``dict`` type in practice.
-* All Python implementations must implement this new property, it gives
-  more work to other implementations, whereas they may not use the
-  dictionary version at all.
-* Exposing the dictionary version at Python level can lead the
+* All Python implementations would have to implement this new property,
+  it gives more work to other implementations, whereas they may not use
+  the dictionary version at all.
+* Exposing the dictionary version at the Python level can lead the
   false assumption on performances. Checking ``dict.__version__`` at
   the Python level is not faster than a dictionary lookup. A dictionary
-  lookup has a cost of 48.7 ns and checking a guard has a cost of 47.5
-  ns, the difference is only 1.2 ns (3%)::
+  lookup in Python has a cost of 48.7 ns and checking the version has a
+  cost of 47.5 ns, the difference is only 1.2 ns (3%)::
 
 
-    $ ./python -m timeit -s 'd = {str(i):i for i in range(100)}' 'd["33"] == 33'
+    $ python3.6 -m timeit -s 'd = {str(i):i for i in range(100)}' 'd["33"] == 33'
     10000000 loops, best of 3: 0.0487 usec per loop
-    $ ./python -m timeit -s 'd = {str(i):i for i in range(100)}' 'd.__version__ == 100'
+    $ python3.6 -m timeit -s 'd = {str(i):i for i in range(100)}' 'd.__version__ == 100'
     10000000 loops, best of 3: 0.0475 usec per loop
 
 * The ``__version__`` can be wrapped on integer overflow. It is error
@@ -313,6 +319,7 @@
   `abc.get_cache_token()
   <https://docs.python.org/3/library/abc.html#abc.get_cache_token>`_.
 * ``__version__``
+* ``__version_tag__``
 * ``__timestamp__``
 
 
@@ -322,16 +329,16 @@
 A single version per dictionary requires to keep a strong reference to
 the value which can keep the value alive longer than expected. If we add
 also a version per dictionary entry, the guard can only store the entry
-version to avoid the strong reference to the value (only strong
-references to the dictionary and to the key are needed).
+version (a simple integer) to avoid the strong reference to the value:
+only strong references to the dictionary and to the key are needed.
 
-Changes: add a ``me_version`` field to the ``PyDictKeyEntry`` structure,
-the field has the C type ``PY_UINT64_T``. When a key is created or
-modified, the entry version is set to the dictionary version which is
-incremented at any change (create, modify, delete).
+Changes: add a ``me_version_tag`` field to the ``PyDictKeyEntry``
+structure, the field has the C type ``PY_UINT64_T``. When a key is
+created or modified, the entry version is set to the dictionary version
+which is incremented at any change (create, modify, delete).
 
 Pseudo-code of an fast guard to check if a dictionary key was modified
-using hypothetical ``dict_get_version(dict)``
+using hypothetical ``dict_get_version(dict)`` and
 ``dict_get_entry_version(dict)`` functions::
 
     UNSET = object()
@@ -347,18 +354,19 @@
             """Return True if the dictionary entry did not change
             and the dictionary was not replaced."""
 
-            # read the version of the dict structure
+            # read the version of the dictionary
             dict_version = dict_get_version(self.dict)
             if dict_version == self.version:
                 # Fast-path: dictionary lookup avoided
                 return True
 
-            # lookup in the dictionary
+            # lookup in the dictionary to read the entry version
             entry_version = get_dict_key_version(dict, key)
             if entry_version == self.entry_version:
                 # another key was modified:
                 # cache the new dictionary version
                 self.dict_version = dict_version
+                self.entry_version = entry_version
                 return True
 
             # the key was modified
@@ -366,7 +374,7 @@
 
 The main drawback of this option is the impact on the memory footprint.
 It increases the size of each dictionary entry, so the overhead depends
-on the number of buckets (dictionary entries, used or unused yet). For
+on the number of buckets (dictionary entries, used or not used). For
 example, it increases the size of each dictionary entry by 8 bytes on
 64-bit system.
 
@@ -386,8 +394,8 @@
 use the ``verdict`` for namespaces (module namespace, type namespace,
 instance namespace, etc.) instead of ``dict``.
 
-Leave the ``dict`` type unchanged to not add any overhead (memory
-footprint) when guards are not needed.
+Leave the ``dict`` type unchanged to not add any overhead (CPU, memory
+footprint) when guards are not used.
 
 Technical issue: a lot of C code in the wild, including CPython core,
 expecting the exact ``dict`` type. Issues:
@@ -396,7 +404,7 @@
   use ``globals={}``. It is not possible to cast the ``dict`` to a
   ``dict`` subtype because the caller expects the ``globals`` parameter
   to be modified (``dict`` is mutable).
-* Functions call directly ``PyDict_xxx()`` functions, instead of calling
+* C functions call directly ``PyDict_xxx()`` functions, instead of calling
   ``PyObject_xxx()`` if the object is a ``dict`` subtype
 * ``PyDict_CheckExact()`` check fails on ``dict`` subtype, whereas some
   functions require the exact ``dict`` type.
@@ -427,7 +435,7 @@
 version tag" (``tp_version_tag``) and a "valid version tag" flag to
 types (the ``PyTypeObject`` structure).
 
-The type version tag is not available at the Python level.
+The type version tag is not exposed at the Python level.
 
 The version tag has the C type ``unsigned int``. The cache is a global
 hash table of 4096 entries, shared by all types. The cache is global to
@@ -478,7 +486,8 @@
 the field has the C type ``size_t``.
 
 Thread on python-dev: `About dictionary lookup caching
-<https://mail.python.org/pipermail/python-dev/2006-December/070348.html>`_.
+<https://mail.python.org/pipermail/python-dev/2006-December/070348.html>`_
+(December 2006).
 
 
 Guard against changing dict during iteration
@@ -488,13 +497,7 @@
 iteration (issue #19332) <https://bugs.python.org/issue19332>`_ which
 adds a ``ma_count`` field to the ``PyDictObject`` structure (``dict``
 type), the field has the C type ``size_t``.  This field is incremented
-when the dictionary is modified, and so is very similar to the proposed
-dictionary version.
-
-Sadly, the dictionary version proposed in this PEP doesn't help to
-detect dictionary mutation. The dictionary version changes when values
-are replaced, whereas modifying dictionary values while iterating on
-dictionary keys is legit in Python.
+when the dictionary is modified.
 
 
 PySizer
@@ -515,6 +518,10 @@
 
 Thread on the mailing lists:
 
+* python-dev: `Updated PEP 509
+  <https://mail.python.org/pipermail/python-dev/2016-April/144250.html>`_
+* python-dev: `RFC: PEP 509: Add a private version to dict
+  <https://mail.python.org/pipermail/python-dev/2016-April/144137.html>`_
 * python-dev: `PEP 509: Add a private version to dict
   <https://mail.python.org/pipermail/python-dev/2016-January/142685.html>`_
   (january 2016)

-- 
Repository URL: https://hg.python.org/peps


More information about the Python-checkins mailing list