[Python-Dev] PEP 399: Pure Python/C Accelerator Module Compatibility Requirements

Brett Cannon brett at python.org
Sat Apr 16 22:57:09 CEST 2011


In the grand python-dev tradition of "silence means acceptance", I consider
this PEP finalized and implicitly accepted.

On Tue, Apr 12, 2011 at 15:07, Brett Cannon <brett at python.org> wrote:

> Here is the next draft of the PEP. I changed the semantics requirement to
> state that 100% branch coverage is required for any Python code that is
> being replaced by accelerated C code instead of the broad "must be
> semantically equivalent". Also tweaked wording here and there to make
> certain things more obvious.
>
> ----------------------------------
>
> PEP: 399
> Title: Pure Python/C Accelerator Module Compatibility Requirements
>
> Version: $Revision: 88219 $
> Last-Modified: $Date: 2011-01-27 13:47:00 -0800 (Thu, 27 Jan 2011) $
> Author: Brett Cannon <brett at python.org>
> Status: Draft
> Type: Informational
> Content-Type: text/x-rst
> Created: 04-Apr-2011
> Python-Version: 3.3
> Post-History: 04-Apr-2011, 12-Apr-2011
>
>
> Abstract
> ========
>
> The Python standard library under CPython contains various instances
> of modules implemented in both pure Python and C (either entirely or
> partially). This PEP requires that in these instances the
> C code *must* pass the test suite used for the pure Python code
> so as to act as much like a drop-in replacement as possible
> (C- and VM-specific tests are exempt). It is also required that new
> C-based modules lacking a pure Python equivalent implementation get
> special permission to be added to the standard library.
>
>
> Rationale
> =========
>
> Python has grown beyond the CPython virtual machine (VM). IronPython_,
> Jython_, and PyPy_ are all currently viable alternatives to the
> CPython VM. This VM ecosystem that has sprung up around the Python
> programming language has led to Python being used in many different
> areas where CPython cannot be used, e.g., Jython allowing Python to be
> used in Java applications.
>
> A problem all of the VMs other than CPython face is handling modules
> from the standard library that are implemented (to some extent) in C.
> Since they do not typically support the entire `C API of Python`_ they
> are unable to use the code used to create the module. Oftentimes this
> leads these other VMs to either re-implement the modules in pure
> Python or in the programming language used to implement the VM
> (e.g., in C# for IronPython). This duplication of effort between
> CPython, PyPy, Jython, and IronPython is extremely unfortunate as
> implementing a module *at least* in pure Python would help mitigate
> this duplicate effort.
>
> The purpose of this PEP is to minimize this duplicate effort by
> mandating that all new modules added to Python's standard library
> *must* have a pure Python implementation *unless* special dispensation
> is given. This makes sure that a module in the stdlib is available to
> all VMs and not just to CPython (pre-existing modules that do not meet
> this requirement are exempt, although there is nothing preventing
> someone from adding in a pure Python implementation retroactively).
>
>
> Re-implementing parts (or all) of a module in C (in the case
> of CPython) is still allowed for performance reasons, but any such
> accelerated code must pass the same test suite (sans VM- or C-specific
> tests) to verify semantics and prevent divergence. To accomplish this,
> the test suite for the module must have 100% branch coverage of the
> pure Python implementation before the acceleration code may be added.
>
> This is to prevent users from accidentally relying
> on semantics that are specific to the C code and are not reflected in
> the pure Python implementation that other VMs rely upon. For example,
> in CPython 3.2.0, ``heapq.heappop()`` does an explicit type
> check in its accelerated C code while the Python code uses duck
> typing::
>
>
>     from test.support import import_fresh_module
>
>     c_heapq = import_fresh_module('heapq', fresh=['_heapq'])
>     py_heapq = import_fresh_module('heapq', blocked=['_heapq'])
>
>
>     class Spam:
>         """Tester class which defines no other magic methods but
>         __len__()."""
>         def __len__(self):
>             return 0
>
>
>     try:
>         c_heapq.heappop(Spam())
>     except TypeError:
>         # Explicit type check failure: "heap argument must be a list"
>         pass
>
>     try:
>         py_heapq.heappop(Spam())
>     except AttributeError:
>         # Duck typing failure: "'Spam' object has no attribute 'pop'"
>         pass
>
> This kind of divergence is a problem for users as they unwittingly
> write code that is CPython-specific. This is also an issue for other
> VM teams as they have to deal with bug reports from users thinking
> that they incorrectly implemented the module when in fact it was
> caused by an untested case.
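
A minimal, self-contained sketch of this kind of divergence, which does not depend on CPython's actual ``heapq`` internals. The two ``heappop_*`` functions and the ``exception_type()`` helper are hypothetical stand-ins, not stdlib code. Comparing the exception type each implementation raises for the same bad input is one way to surface a difference before users come to rely on it.

```python
class Spam:
    """Defines __len__() but no pop(), like the tester class above."""
    def __len__(self):
        return 0


def heappop_checked(heap):
    """Stand-in for an accelerator that type-checks its argument."""
    if not isinstance(heap, list):
        raise TypeError("heap argument must be a list")
    return heap.pop()  # simplified; a real heappop restores the heap invariant


def heappop_duck(heap):
    """Stand-in for pure Python code that relies on duck typing."""
    return heap.pop()  # simplified in the same way


def exception_type(func, arg):
    """Return the type of exception func(arg) raises, or None."""
    try:
        func(arg)
    except Exception as exc:
        return type(exc)
    return None


checked = exception_type(heappop_checked, Spam())
duck = exception_type(heappop_duck, Spam())
diverges = checked is not duck  # True means the two versions disagree
```

A shared test suite would flag ``diverges`` as a failure, forcing one implementation to be brought in line with the other.
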
>
>
> Details
> =======
>
> Starting in Python 3.3, any modules added to the standard library must
> have a pure Python implementation. This rule can only be ignored if
> the Python development team grants a special exemption for the module.
> Typically the exemption will be granted only when a module wraps a
> specific C-based library (e.g., sqlite3_). In granting an exemption it
> will be recognized that the module will be considered exclusive to
> CPython and not part of Python's standard library that other VMs are
> expected to support. Usage of ``ctypes`` to provide an
> API for a C library will continue to be frowned upon as ``ctypes``
> lacks compiler guarantees that C code typically relies upon to prevent
> certain errors from occurring (e.g., API changes).
>
> Even though a pure Python implementation is mandated by this PEP, it
> does not preclude the use of a companion acceleration module. If an
> acceleration module is provided it is to be named the same as the
> module it is accelerating with an underscore attached as a prefix,
> e.g., ``_warnings`` for ``warnings``. The common pattern to access
> the accelerated code from the pure Python implementation is to import
> it with an ``import *``, e.g., ``from _warnings import *``. This is
> typically done at the end of the module to allow it to overwrite
> specific Python objects with their accelerated equivalents. This kind
> of import can also be done before the end of the module when needed,
> e.g., an accelerated base class is provided but is then subclassed by
> Python code. This PEP does not mandate that pre-existing modules in
> the stdlib that lack a pure Python equivalent gain such a module. But
> if people do volunteer to provide and maintain a pure Python
> equivalent (e.g., the PyPy team volunteering their pure Python
> implementation of the ``csv`` module and maintaining it) then such
> code will be accepted.
>
> This requirement does not apply to modules already existing as only C
> code in the standard library. It is acceptable to retroactively add a
> pure Python implementation of a module implemented entirely in C, but
> in those instances the C version is considered the reference
> implementation in terms of expected semantics.
>
> Any new accelerated code must act as a drop-in replacement that stays
> as close to the pure Python implementation as is reasonable. Technical
> details of the VM providing the accelerated code are allowed to differ
> as necessary, e.g., a class being a ``type`` when implemented in C. To
> verify that the Python and equivalent C code operate as similarly as
> possible, both code bases must be tested using the same tests which
> apply to the pure Python code (tests specific to the C code or any VM
> do not fall under this requirement). To make sure that the test
> suite is thorough enough to cover all relevant semantics, the tests
> must have 100% branch coverage for the Python code being replaced by
> C code. This will make sure that the new acceleration code will
> operate as much like a drop-in replacement for the Python code as
> possible. Testing should still be done for issues that come up when
> working with C code even if it is not explicitly required to meet the
> coverage requirement, e.g., tests should be aware that C code typically
> has special paths for things such as built-in types, subclasses of
> built-in types, etc.
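
To illustrate why branch coverage, rather than line coverage, is the bar: in the sketch below a single negative input executes every line of ``clamp_py()`` yet never takes the path where the ``if`` condition is false. Both function names and the test cases are hypothetical, and ``clamp_accel()`` merely stands in for C code.

```python
def clamp_py(x):
    """Pure Python reference: negative values become 0."""
    if x < 0:
        x = 0
    return x


def clamp_accel(x):
    """Stand-in for an accelerated version (hypothetical)."""
    return max(x, 0)


# (input, expected) pairs. The first case takes the branch where the
# condition is true; the second takes the branch where it is false.
# Dropping the second case would keep line coverage at 100% while
# leaving one branch untested.
CASES = [(-1, 0), (5, 5)]


def passes_all_cases(func):
    """Run every case against one implementation."""
    return all(func(arg) == expected for arg, expected in CASES)


py_ok = passes_all_cases(clamp_py)
accel_ok = passes_all_cases(clamp_accel)
```
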
>
> Acting as a drop-in replacement also dictates that no public API be
> provided in accelerated code that does not exist in the pure Python
> code.  Without this requirement people could accidentally come to rely
> on a detail in the accelerated code which is not made available to
> other VMs that use the pure Python implementation. To help verify
> that the contract of semantic equivalence is being met, a module must
> be tested both with and without its accelerated code as thoroughly as
> possible.
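
One rough way to check the no-extra-public-API rule is to compare the public names of the two modules. This is only a sketch: ``public_names()`` and ``extra_public_names()`` are hypothetical helpers, the modules are fakes built with ``types.ModuleType``, and treating ``__all__`` (or, failing that, names without a leading underscore) as the public API is an assumption.

```python
import types


def public_names(module):
    """Return the set of names a module exposes publicly."""
    names = getattr(module, "__all__", None)
    if names is None:
        names = [n for n in vars(module) if not n.startswith("_")]
    return set(names)


def extra_public_names(py_module, accel_module):
    """Names public in the accelerator but absent from the Python code."""
    return public_names(accel_module) - public_names(py_module)


# Fake pure Python module and fake accelerator (both hypothetical).
py_spam = types.ModuleType("spam")
py_spam.push = lambda heap, item: heap.append(item)

accel_spam = types.ModuleType("_spam")
accel_spam.push = lambda heap, item: heap.append(item)
accel_spam.fast_only = lambda: None  # would be unavailable to other VMs

extras = extra_public_names(py_spam, accel_spam)
```

A test could assert that ``extras`` is empty; here it contains ``fast_only``, which is exactly the kind of name this requirement rules out.
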
>
> As an example, to write tests which exercise both the pure Python and
> C accelerated versions of a module, a basic idiom can be followed::
>
>
>     import collections.abc
>     from test.support import import_fresh_module, run_unittest
>     import unittest
>
>     c_heapq = import_fresh_module('heapq', fresh=['_heapq'])
>     py_heapq = import_fresh_module('heapq', blocked=['_heapq'])
>
>
>     class ExampleTest(unittest.TestCase):
>
>         def test_heappop_exc_for_non_MutableSequence(self):
>             # Raise TypeError when heap is not a
>             # collections.abc.MutableSequence.
>             class Spam:
>                 """Test class lacking many ABC-required methods
>                 (e.g., pop())."""
>                 def __len__(self):
>                     return 0
>
>             heap = Spam()
>             self.assertNotIsInstance(heap,
>                                      collections.abc.MutableSequence)
>             with self.assertRaises(TypeError):
>                 self.heapq.heappop(heap)
>
>
>     class AcceleratedExampleTest(ExampleTest):
>
>         """Test using the accelerated code."""
>
>         heapq = c_heapq
>
>
>     class PyExampleTest(ExampleTest):
>
>         """Test with just the pure Python code."""
>
>         heapq = py_heapq
>
>
>     def test_main():
>         run_unittest(AcceleratedExampleTest, PyExampleTest)
>
>
>     if __name__ == '__main__':
>         test_main()
>
>
> If this test were to provide 100% branch coverage for
> ``heapq.heappop()`` in the pure Python implementation then the
> accelerated C code would be allowed to be added to CPython's standard
> library. If it did not, then the test suite would need to be updated
> until 100% branch coverage was provided before the accelerated C code
> could be added.
>
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
>
> .. _IronPython: http://ironpython.net/
> .. _Jython: http://www.jython.org/
> .. _PyPy: http://pypy.org/
> .. _C API of Python: http://docs.python.org/py3k/c-api/index.html
> .. _sqlite3: http://docs.python.org/py3k/library/sqlite3.html
>
>