[Python-ideas] PEP 511: API for code transformers

Andrew Barnert abarnert at yahoo.com
Fri Jan 15 11:12:58 EST 2016


You linked to PEP 510 #changes. I think you wanted https://www.python.org/dev/peps/pep-0511/

Sent from my iPhone

> On Jan 15, 2016, at 10:10, Victor Stinner <victor.stinner at gmail.com> wrote:
> 
> Hi,
> 
> This PEP 511 is part of a serie of 3 PEP (509, 510, 511) adding an API
> to implement a static Python optimizer specializing functions with
> guards.
> 
> If the PEP is accepted, it will solve a long list of issues, some
> issues are old, like #1346238 which is 11 years old ;-) I found 12
> issues:
> 
> * http://bugs.python.org/issue1346238
> * http://bugs.python.org/issue2181
> * http://bugs.python.org/issue2499
> * http://bugs.python.org/issue2506
> * http://bugs.python.org/issue4264
> * http://bugs.python.org/issue7682
> * http://bugs.python.org/issue10399
> * http://bugs.python.org/issue11549
> * http://bugs.python.org/issue17068
> * http://bugs.python.org/issue17430
> * http://bugs.python.org/issue17515
> * http://bugs.python.org/issue26107
> 
> I worked to make the PEP more generic that "this hook is written for
> FAT Python". Please read the full PEP to see a long list of existing
> usages in Python of code transformers.
> 
> You may read again the discussion which occurred 4 years ago about the
> same topic:
> https://mail.python.org/pipermail/python-dev/2012-August/121286.html
> (the thread starts with an idea of AST optimizer, but is moves quickly
> to a generic API to transform the code)
> 
> Thanks to Red Hat for giving me time to experiment on this.
> Victor
> 
> 
> HTML version:
> https://www.python.org/dev/peps/pep-0510/#changes
> 
> 
> PEP: 511
> Title: API for code transformers
> Version: $Revision$
> Last-Modified: $Date$
> Author: Victor Stinner <victor.stinner at gmail.com>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 4-January-2016
> Python-Version: 3.6
> 
> Abstract
> ========
> 
> Propose an API to register bytecode and AST transformers. Add also ``-o
> OPTIM_TAG`` command line option to change ``.pyc`` filenames, ``-o
> noopt`` disables the peephole optimizer. Raise an ``ImportError``
> exception on import if the ``.pyc`` file is missing and the code
> transformers required to transform the code are missing.  code
> transformers are not needed code transformed ahead of time (loaded from
> ``.pyc`` files).
> 
> 
> Rationale
> =========
> 
> Python does not provide a standard way to transform the code. Projects
> transforming the code use various hooks. The MacroPy project uses an
> import hook: it adds its own module finder in ``sys.meta_path`` to
> hook its AST transformer. Another option is to monkey-patch the
> builtin ``compile()`` function. There are even more options to
> hook a code transformer.
> 
> Python 3.4 added a ``compile_source()`` method to
> ``importlib.abc.SourceLoader``. But code transformation is wider than
> just importing modules, see described use cases below.
> 
> Writing an optimizer or a preprocessor is out of the scope of this PEP.
> 
> Usage 1: AST optimizer
> ----------------------
> 
> Transforming an Abstract Syntax Tree (AST) is a convenient
> way to implement an optimizer. It's easier to work on the AST than
> working on the bytecode, AST contains more information and is more high
> level.
> 
> Since the optimization can done ahead of time, complex but slow
> optimizations can be implemented.
> 
> Example of optimizations which can be implemented with an AST optimizer:
> 
> * `Copy propagation
>  <https://en.wikipedia.org/wiki/Copy_propagation>`_:
>  replace ``x=1; y=x`` with ``x=1; y=1``
> * `Constant folding
>  <https://en.wikipedia.org/wiki/Constant_folding>`_:
>  replace ``1+1`` with ``2``
> * `Dead code elimination
>  <https://en.wikipedia.org/wiki/Dead_code_elimination>`_
> 
> Using guards (see the `PEP 510
> <https://www.python.org/dev/peps/pep-0510/>`_), it is possible to
> implement a much wider choice of optimizations. Examples:
> 
> * Simplify iterable: replace ``range(3)`` with ``(0, 1, 2)`` when used
>  as iterable
> * `Loop unrolling <https://en.wikipedia.org/wiki/Loop_unrolling>`_
> * Call pure builtins: replace ``len("abc")`` with ``3``
> * Copy used builtin symbols to constants
> * See also `optimizations implemented in fatoptimizer
>  <https://fatoptimizer.readthedocs.org/en/latest/optimizations.html>`_,
>  a static optimizer for Python 3.6.
> 
> The following issues can be implemented with an AST optimizer:
> 
> * `Issue #1346238
>  <https://bugs.python.org/issue1346238>`_: A constant folding
>  optimization pass for the AST
> * `Issue #2181 <http://bugs.python.org/issue2181>`_:
>  optimize out local variables at end of function
> * `Issue #2499 <http://bugs.python.org/issue2499>`_:
>  Fold unary + and not on constants
> * `Issue #4264 <http://bugs.python.org/issue4264>`_:
>  Patch: optimize code to use LIST_APPEND instead of calling list.append
> * `Issue #7682 <http://bugs.python.org/issue7682>`_:
>  Optimisation of if with constant expression
> * `Issue #10399 <https://bugs.python.org/issue10399>`_: AST
>  Optimization: inlining of function calls
> * `Issue #11549 <http://bugs.python.org/issue11549>`_:
>  Build-out an AST optimizer, moving some functionality out of the
>  peephole optimizer
> * `Issue #17068 <http://bugs.python.org/issue17068>`_:
>  peephole optimization for constant strings
> * `Issue #17430 <http://bugs.python.org/issue17430>`_:
>  missed peephole optimization
> 
> 
> Usage 2: Preprocessor
> ---------------------
> 
> A preprocessor can be easily implemented with an AST transformer. A
> preprocessor has various and different usages.
> 
> Some examples:
> 
> * Remove debug code like assertions and logs to make the code faster to
>  run it for production.
> * `Tail-call Optimization <https://en.wikipedia.org/wiki/Tail_call>`_
> * Add profiling code
> * `Lazy evaluation <https://en.wikipedia.org/wiki/Lazy_evaluation>`_:
>  see `lazy_python <https://github.com/llllllllll/lazy_python>`_
>  (bytecode transformer) and `lazy macro of MacroPy
>  <https://github.com/lihaoyi/macropy#lazy>`_ (AST transformer)
> * Change dictionary literals into collection.OrderedDict instances
> * Declare constants: see `@asconstants of codetransformer
>  <https://pypi.python.org/pypi/codetransformer>`_
> * Domain Specific Language (DSL) like SQL queries. The
>  Python language itself doesn't need to be modified. Previous attempts
>  to implement DSL for SQL like `PEP 335 - Overloadable Boolean
>  Operators <https://www.python.org/dev/peps/pep-0335/>`_ was rejected.
> * Pattern Matching of functional languages
> * String Interpolation, but `PEP 498 -- Literal String Interpolation
>  <https://www.python.org/dev/peps/pep-0498/>`_ was merged into Python
>  3.6.
> 
> `MacroPy <https://github.com/lihaoyi/macropy>`_ has a long list of
> examples and use cases.
> 
> This PEP does not add any new code transformer. Using a code transformer
> will require an external module and to register it manually.
> 
> See also `PyXfuscator <https://bitbucket.org/namn/pyxfuscator>`_: Python
> obfuscator, deobfuscator, and user-assisted decompiler.
> 
> 
> Usage 3: Disable all optimization
> ---------------------------------
> 
> Ned Batchelder asked to add an option to disable the peephole optimizer
> because it makes code coverage more difficult to implement. See the
> discussion on the python-ideas mailing list: `Disable all peephole
> optimizations
> <https://mail.python.org/pipermail/python-ideas/2014-May/027893.html>`_.
> 
> This PEP adds a new ``-o noopt`` command line option to disable the
> peephole optimizer. In Python, it's as easy as::
> 
>    sys.set_code_transformers([])
> 
> It will fix the `Issue #2506 <https://bugs.python.org/issue2506>`_: Add
> mechanism to disable optimizations.
> 
> 
> Usage 4: Write new bytecode optimizers in Python
> ------------------------------------------------
> 
> Python 3.6 optimizes the code using a peephole optimizer. By
> definition, a peephole optimizer has a narrow view of the code and so
> can only implement basic optimizations. The optimizer rewrites the
> bytecode. It is difficult to enhance it, because it written in C.
> 
> With this PEP, it becomes possible to implement a new bytecode optimizer
> in pure Python and experiment new optimizations.
> 
> Some optimizations are easier to implement on the AST like constant
> folding, but optimizations on the bytecode are still useful. For
> example, when the AST is compiled to bytecode, useless jumps can be
> emited because the compiler is naive and does not try to optimize
> anything.
> 
> 
> Use Cases
> =========
> 
> This section give examples of use cases explaining when and how code
> transformers will be used.
> 
> Interactive interpreter
> -----------------------
> 
> It will be possible to use code transformers with the interactive
> interpreter which is popular in Python and commonly used to demonstrate
> Python.
> 
> The code is transformed at runtime and so the interpreter can be slower
> when expensive code transformers are used.
> 
> Build a transformed package
> ---------------------------
> 
> It will be possible to build a package of the transformed code.
> 
> A transformer can have a configuration. The configuration is not stored
> in the package.
> 
> All ``.pyc`` files of the package must be transformed with the same code
> transformers and the same transformers configuration.
> 
> It is possible to build different ``.pyc`` files using different
> optimizer tags. Example: ``fat`` for the default configuration and
> ``fat_inline`` for a different configuration with function inlining
> enabled.
> 
> A package can contain ``.pyc`` files with different optimizer tags.
> 
> 
> Install a package containing transformed .pyc files
> ---------------------------------------------------
> 
> It will be possible to install a package which contains transformed
> ``.pyc`` files.
> 
> All ``.pyc`` files with any optimizer tag contained in the package are
> installed, not only for the current optimizer tag.
> 
> 
> Build .pyc files when installing a package
> ------------------------------------------
> 
> If a package does not contain any ``.pyc`` files of the current
> optimizer tag (or some ``.pyc`` files are missing), the ``.pyc`` are
> created during the installation.
> 
> Code transformers of the optimizer tag are required. Otherwise, the
> installation fails with an error.
> 
> 
> Execute transformed code
> ------------------------
> 
> It will be possible to execute transformed code.
> 
> Raise an ``ImportError`` exception on import if the ``.pyc`` file of the
> current optimizer tag is missing and the code transformers required to
> transform the code are missing.
> 
> The interesting point here is that code transformers are not needed to
> execute the transformed code if all required ``.pyc`` files are already
> available.
> 
> 
> Code transformer API
> ====================
> 
> A code transformer is a class with ``ast_transformer()`` and/or
> ``code_transformer()`` methods (API described below) and a ``name``
> attribute.
> 
> For efficiency, do not define a ``code_transformer()`` or
> ``ast_transformer()`` method if it does nothing.
> 
> The ``name`` attribute (``str``) must be a short string used to identify
> an optimizer. It is used to build a ``.pyc`` filename. The name must not
> contain dots (``'.'``), dashes (``'-'``) or directory separators: dots
> are used to separated fields in a ``.pyc`` filename and dashes areused
> to join code transformer names to build the optimizer tag.
> 
> .. note::
>   It would be nice to pass the fully qualified name of a module in the
>   *context* when an AST transformer is used to transform a module on
>   import, but it looks like the information is not available in
>   ``PyParser_ASTFromStringObject()``.
> 
> 
> code_transformer()
> ------------------
> 
> Prototype::
> 
>    def code_transformer(code, consts, names, lnotab, context):
>        ...
>        return (code, consts, names, lnotab)
> 
> Parameters:
> 
> * *code*: the bytecode (``bytes``)
> * *consts*: a sequence of constants
> * *names*: tuple of variable names
> * *lnotab*: table mapping instruction offsets to line numbers
>  (``bytes``)
> 
> The code transformer is run after the compilation to bytecode
> 
> 
> ast_transformer()
> ------------------
> 
> Prototype::
> 
>    def ast_transformer(tree, context):
>        ...
>        return tree
> 
> Parameters:
> 
> * *tree*: an AST tree
> * *context*: an object with a ``filename`` attribute (``str``)
> 
> It must return an AST tree. It can modify the AST tree in place, or
> create a new AST tree.
> 
> The AST transformer is called after the creation of the AST by the
> parser and before the compilation to bytecode. New attributes may be
> added to *context* in the future.
> 
> 
> Changes
> =======
> 
> In short, add:
> 
> * ``-o OPTIM_TAG`` command line option
> * ``ast.Constant``
> * ``ast.PyCF_TRANSFORMED_AST``
> * ``sys.get_code_transformers()``
> * ``sys.implementation.optim_tag``
> * ``sys.set_code_transformers(transformers)``
> 
> 
> API to get/set code transformers
> --------------------------------
> 
> Add new functions to register code transformers:
> 
> * ``sys.set_code_transformers(transformers)``: set the list of code
>  transformers and update ``sys.implementation.optim_tag``
> * ``sys.get_code_transformers()``: get the list of code
>  transformers.
> 
> The order of code transformers matter. Running transformer A and then
> transformer B can give a different output than running transformer B an
> then transformer A.
> 
> Example to prepend a new code transformer::
> 
>    transformers = sys.get_code_transformers()
>    transformers.insert(0, new_cool_transformer)
>    sys.set_code_transformers(transformers)
> 
> All AST tranformers are run sequentially (ex: the second transformer
> gets the input of the first transformer), and then all bytecode
> transformers are run sequentially.
> 
> 
> Optimizer tag
> -------------
> 
> Changes:
> 
> * Add ``sys.implementation.optim_tag`` (``str``): optimization tag.
>  The default optimization tag is ``'opt'``.
> * Add a new ``-o OPTIM_TAG`` command line option to set
>  ``sys.implementation.optim_tag``.
> 
> Changes on ``importlib``:
> 
> * ``importlib`` uses ``sys.implementation.optim_tag`` to build the
>  ``.pyc`` filename to importing modules, instead of always using
>  ``opt``. Remove also the special case for the optimizer level ``0``
>  with the default optimizer tag ``'opt'`` to simplify the code.
> * When loading a module, if the ``.pyc`` file is missing but the ``.py``
>  is available, the ``.py`` is only used if code optimizers have the
>  same optimizer tag than the current tag, otherwise an ``ImportError``
>  exception is raised.
> 
> Pseudo-code of a ``use_py()`` function to decide if a ``.py`` file can
> be compiled to import a module::
> 
>    def transformers_tag():
>        transformers = sys.get_code_transformers()
>        if not transformers:
>            return 'noopt'
>        return '-'.join(transformer.name
>                        for transformer in transformers)
> 
>    def use_py():
>        return (transformers_tag() == sys.implementation.optim_tag)
> 
> The order of ``sys.get_code_transformers()`` matter. For example, the
> ``fat`` transformer followed by the ``pythran`` transformer gives the
> optimizer tag ``fat-pythran``.
> 
> The behaviour of the ``importlib`` module is unchanged with the default
> optimizer tag (``'opt'``).
> 
> 
> Peephole optimizer
> ------------------
> 
> By default, ``sys.implementation.optim_tag`` is ``opt`` and
> ``sys.get_code_transformers()`` returns a list of one code transformer:
> the peephole optimizer (optimize the bytecode).
> 
> Use ``-o noopt`` to disable the peephole optimizer. In this case, the
> optimizer tag is ``noopt`` and no code transformer is registered.
> 
> Using the ``-o opt`` option has not effect.
> 
> 
> AST enhancements
> ----------------
> 
> Enhancements to simplify the implementation of AST transformers:
> 
> * Add a new compiler flag ``PyCF_TRANSFORMED_AST`` to get the
>  transformed AST. ``PyCF_ONLY_AST`` returns the AST before the
>  transformers.
> * Add ``ast.Constant``: this type is not emited by the compiler, but
>  can be used in an AST transformer to simplify the code. It does not
>  contain line number and column offset informations on tuple or
>  frozenset items.
> * ``PyCodeObject.co_lnotab``: line number delta becomes signed to
>  support moving instructions (note: need to modify MAGIC_NUMBER in
>  importlib). Implemented in the `issue #26107
>  <https://bugs.python.org/issue26107>`_
> * Enhance the bytecode compiler to support ``tuple`` and ``frozenset``
>  constants. Currently, ``tuple`` and ``frozenset`` constants are
>  created by the peephole transformer, after the bytecode compilation.
> * ``marshal`` module: fix serialization of the empty frozenset singleton
> * update ``Tools/parser/unparse.py`` to support the new ``ast.Constant``
>  node type
> 
> 
> Examples
> ========
> 
> .pyc filenames
> --------------
> 
> Example of ``.pyc`` filenames of the ``os`` module.
> 
> With the default optimizer tag ``'opt'``:
> 
> ===========================   ==================
> .pyc filename                 Optimization level
> ===========================   ==================
> ``os.cpython-36.opt-0.pyc``                    0
> ``os.cpython-36.opt-1.pyc``                    1
> ``os.cpython-36.opt-2.pyc``                    2
> ===========================   ==================
> 
> With the ``'fat'`` optimizer tag:
> 
> ===========================   ==================
> .pyc filename                 Optimization level
> ===========================   ==================
> ``os.cpython-36.fat-0.pyc``                    0
> ``os.cpython-36.fat-1.pyc``                    1
> ``os.cpython-36.fat-2.pyc``                    2
> ===========================   ==================
> 
> 
> Bytecode transformer
> --------------------
> 
> Scary bytecode transformer replacing all strings with
> ``"Ni! Ni!  Ni!"``::
> 
>    import sys
> 
>    class BytecodeTransformer:
>        name = "knights_who_say_ni"
> 
>        def code_transformer(self, code, consts, names, lnotab, context):
>            consts = ['Ni! Ni! Ni!' if isinstance(const, str) else const
>                      for const in consts]
>            return (code, consts, names, lnotab)
> 
>    # replace existing code transformers with the new bytecode transformer
>    sys.set_code_transformers([BytecodeTransformer()])
> 
>    # execute code which will be transformed by code_transformer()
>    exec("print('Hello World!')")
> 
> Output::
> 
>    Ni! Ni! Ni!
> 
> 
> AST transformer
> ---------------
> 
> Similary to the bytecode transformer example, the AST transformer also
> replaces all strings with ``"Ni! Ni! Ni!"``::
> 
>    import ast
>    import sys
> 
>    class KnightsWhoSayNi(ast.NodeTransformer):
>        def visit_Str(self, node):
>            node.s = 'Ni! Ni! Ni!'
>            return node
> 
>    class ASTTransformer:
>        name = "knights_who_say_ni"
> 
>        def __init__(self):
>            self.transformer = KnightsWhoSayNi()
> 
>        def ast_transformer(self, tree, context):
>            self.transformer.visit(tree)
>            return tree
> 
>    # replace existing code transformers with the new AST transformer
>    sys.set_code_transformers([ASTTransformer()])
> 
>    # execute code which will be transformed by ast_transformer()
>    exec("print('Hello World!')")
> 
> Output::
> 
>    Ni! Ni! Ni!
> 
> 
> Other Python implementations
> ============================
> 
> The PEP 511 should be implemented by all Python implementation, but the
> bytecode and the AST are not standardized.
> 
> By the way, even between minor version of CPython, there are changes on
> the AST API. There are differences, but only minor differences. It is
> quite easy to write an AST transformer which works on Python 2.7 and
> Python 3.5 for example.
> 
> 
> Discussion
> ==========
> 
> * `[Python-Dev] AST optimizer implemented in Python
>  <https://mail.python.org/pipermail/python-dev/2012-August/121286.html>`_
>  (August 2012)
> 
> 
> Prior Art
> =========
> 
> AST optimizers
> --------------
> 
> In 2011, Eugene Toder proposed to rewrite some peephole optimizations in
> a new AST optimizer: issue #11549, `Build-out an AST optimizer, moving
> some functionality out of the peephole optimizer
> <https://bugs.python.org/issue11549>`_.  The patch adds ``ast.Lit`` (it
> was proposed to rename it to ``ast.Literal``).
> 
> In 2012, Victor Stinner wrote the `astoptimizer
> <https://bitbucket.org/haypo/astoptimizer/>`_ project, an AST optimizer
> implementing various optimizations. Most interesting optimizations break
> the Python semantics since no guard is used to disable optimization if
> something changes.
> 
> In 2015, Victor Stinner wrote the `fatoptimizer
> <http://fatoptimizer.readthedocs.org/>`_ project, an AST optimizer
> specializing functions using guards.
> 
> The Issue #17515 `"Add sys.setasthook() to allow to use a custom AST"
> optimizer <https://bugs.python.org/issue17515>`_ was a first attempt of
> API for code transformers, but specific to AST.
> 
> 
> Python Preprocessors
> --------------------
> 
> * `MacroPy <https://github.com/lihaoyi/macropy>`_: MacroPy is an
>  implementation of Syntactic Macros in the Python Programming Language.
>  MacroPy provides a mechanism for user-defined functions (macros) to
>  perform transformations on the abstract syntax tree (AST) of a Python
>  program at import time.
> * `pypreprocessor <https://code.google.com/p/pypreprocessor/>`_: C-style
>  preprocessor directives in Python, like ``#define`` and ``#ifdef``
> 
> 
> Bytecode transformers
> ---------------------
> 
> * `codetransformer <https://pypi.python.org/pypi/codetransformer>`_:
>  Bytecode transformers for CPython inspired by the ``ast`` module’s
>  ``NodeTransformer``.
> * `byteplay <http://code.google.com/p/byteplay/>`_: Byteplay lets you
>  convert Python code objects into equivalent objects which are easy to
>  play with, and lets you convert those objects back into living Python
>  code objects. It's useful for applying crazy transformations on Python
>  functions, and is also useful in learning Python byte code
>  intricacies. See `byteplay documentation
>  <http://wiki.python.org/moin/ByteplayDoc>`_.
> 
> See also:
> 
> * `BytecodeAssembler <http://pypi.python.org/pypi/BytecodeAssembler>`_
> 
> 
> Copyright
> =========
> 
> This document has been placed in the public domain.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/


More information about the Python-ideas mailing list