[Python-ideas] PEP 511: API for code transformers

Victor Stinner victor.stinner at gmail.com
Sun Jan 17 06:48:59 EST 2016


2016-01-16 12:06 GMT+01:00 Petr Viktorin <encukou at gmail.com>:
> This PEP addresses two things that would benefit from different
> approaches: let's call them optimizers and extensions.
>
> Optimizers, such as your FAT, don't change Python semantics. They're
> designed to run on *all* code, including the standard library. It makes
> sense to register them as early in interpreter startup as possible, but
> if they're not registered, nothing breaks (things will just be slower).
> Experiments with future syntax (like when async/await was being
> developed) have the same needs.
>
> Syntax extensions, such as MacroPy or Hy, tend to target specific
> modules, with which they're closely coupled: The modules won't run
> without the transformer. And with other modules, the transformer either
> does nothing (as with MacroPy, hopefully), or would fail altogether (as
> with Hy). So, they would benefit from specific packages opting in. The
> effects of enabling them globally range from inefficiency (MacroPy) to
> failures or needing workarounds (Hy).

To be clear, Hylang will not benefit from my PEP. That's why it is not
mentioned in the PEP.

"Syntax extensions" look like just a special case of optimizers. I'm not
sure that it's worth treating them differently.

> The PEP is designed for optimizers. It would be good to stick to that use
> case, at least as far as the registration is concerned. I suggest noting
> in the documentation that Python semantics *must* be preserved, and
> renaming the API, e.g.::
>
>     sys.set_global_optimizers([])

I would prefer to not restrict the PEP to a specific usage.

> The "transformer" API can be used for syntax extensions as well, but the
> registration needs to be different so the effects are localized. For
> example it could be something like::
>
>     importlib.util.import_with_transformer(
>         'mypackage.specialmodule', MyTransformer())

Brett may help on this part. I don't think that it's the best way to use
importlib. importlib is already pluggable. As I wrote in the PEP, MacroPy
uses an import hook. (Maybe it should continue to use an import hook?)

> or a special flag in packages::
>
>     __transformers_for_submodules__ = [MyTransformer()]

Does it mean that you have to parse a .py file to then decide how to
transform it? It will slow down compilation of code not using transformers.

I would prefer to do that differently: always register transformers very
early, but configure each transformer to apply only to some files. The
transformer can use the filename (file extension? importlib is currently
restricted to .py files by default, no?), it can use a special variable in
the file (e.g. fatoptimizer searches for a __fatoptimizer__ variable which
is used to configure the optimizer), a configuration loaded when the
transformer is created, etc.
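
A minimal sketch of that idea, to make it concrete. It assumes the
ast_transformer(tree, context) method and the context.filename attribute
from the PEP draft. The class name, the __selective__ marker, the DEBUG
rewrite and the run() helper are all hypothetical and only there for
illustration; this is not how fatoptimizer is implemented:

```python
import ast
from types import SimpleNamespace


class _ReplaceDebug(ast.NodeTransformer):
    """Toy rewrite: replace reads of the name DEBUG with the constant False."""

    def visit_Name(self, node):
        if node.id == "DEBUG" and isinstance(node.ctx, ast.Load):
            return ast.copy_location(ast.Constant(value=False), node)
        return node


class SelectiveTransformer:
    """Hypothetical transformer that only rewrites files that opted in."""

    name = "selective"  # hypothetical transformer name

    def __init__(self, suffixes=(".py",), marker="__selective__"):
        self.suffixes = tuple(suffixes)
        self.marker = marker  # hypothetical opt-in variable name

    def _opted_in(self, tree):
        # Look for a top-level "<marker> = True" assignment in the module.
        for node in tree.body:
            if not isinstance(node, ast.Assign):
                continue
            value = node.value
            if not (isinstance(value, ast.Constant) and value.value is True):
                continue
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == self.marker:
                    return True
        return False

    def ast_transformer(self, tree, context):
        # Cheap check first: skip files whose name does not match.
        if not context.filename.endswith(self.suffixes):
            return tree
        # Then skip modules that did not opt in with the marker variable.
        if not self._opted_in(tree):
            return tree
        return ast.fix_missing_locations(_ReplaceDebug().visit(tree))


def run(source, filename, transformer):
    """Hypothetical driver: parse, transform, compile and execute source."""
    tree = ast.parse(source, filename)
    context = SimpleNamespace(filename=filename)  # stand-in context object
    tree = transformer.ast_transformer(tree, context)
    namespace = {"DEBUG": True}
    exec(compile(tree, filename, "exec"), namespace)
    return namespace
```

With this sketch, run("__selective__ = True\nresult = DEBUG\n", "mod.py",
SelectiveTransformer()) goes through the rewrite, whereas the same source
without the marker, or with a filename not ending in .py, is returned
unchanged.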

> or extending exec (which you actually might want to add to the PEP, to
> make giving examples easier)::
>
>     exec("print('Hello World!')", transformers=[MyTransformer()])

There are a lot of ways to load, compile and execute code. Starting to add
optional parameters will end up like my old PEP 410 (
https://www.python.org/dev/peps/pep-0410/ ) which was rejected because it
added an optional parameter to a lot of functions (at least 16 functions!).
(It was not the only reason to reject the PEP.)

Brett Cannon proposed adding hooks to importlib, but that would restrict the
feature to imports. See the use cases in the PEP: I would like to use the
same code transformers everywhere.

> Another thing: this snippet from the PEP sounds too verbose::
>
>     transformers = sys.get_code_transformers()
>     transformers.insert(0, new_cool_transformer)
>     sys.set_code_transformers(transformers)
>
> Can this just be a list, as with sys.path? Using the "optimizers" term::
>
>     sys.global_optimizers.insert(0, new_cool_transformer)

set_code_transformers() checks the transformer name and ensures that the
transformer has at least an AST transformer or a bytecode transformer.
That's why it's a function and not a simple list.

set_code_transformers() also gets the AST and bytecode transformer methods
only once, to provide a simple C structure for PyAST_CompileObject
(bytecode transformers) and PyParser_ASTFromStringObject (AST transformers).
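
Roughly, and only as a pure-Python sketch of the two points above (the
real checks would live in C): validate each transformer once, and look up
its methods once so that later calls do not need getattr() again. The
function name check_code_transformers() is hypothetical, and the exact
rule for a valid name (a non-empty str here) is an assumption:

```python
def check_code_transformers(transformers):
    """Validate transformers and resolve their hook methods only once.

    Returns a list of (name, ast_hook, code_hook) tuples, where a missing
    hook is stored as None.
    """
    table = []
    for transformer in transformers:
        # Assumed rule: the name must be a non-empty string.
        name = getattr(transformer, "name", None)
        if not isinstance(name, str) or not name:
            raise ValueError("transformer needs a non-empty str 'name'")

        # Look up both hooks a single time.
        ast_hook = getattr(transformer, "ast_transformer", None)
        code_hook = getattr(transformer, "code_transformer", None)
        if ast_hook is None and code_hook is None:
            raise ValueError(
                "transformer %r has neither an AST transformer "
                "nor a bytecode transformer" % name)

        table.append((name, ast_hook, code_hook))
    return table
```

A plain list such as sys.path cannot run this kind of check when an item
is inserted, which is the reason given above for using a setter function.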

Note: sys.implementation.cache_tag is modifiable without any check. If you
mess it up, importlib will probably fail badly. And the newly added
sys.implementation.optim_tag can also be modified without any check.

> This::
>
>     def code_transformer(code, consts, names, lnotab, context):
>
> It's a function, so it would be better to name it::
>
>     def transform_code(code):

Fair enough :-) But I want the context parameter to pass additional
information.

Note: if we pass a code object, the filename is already in the code object,
but there is other information to pass (see below).

> And this::
>
>     def ast_transformer(tree, context):
>
> might work better with keyword arguments::
>
>     def transform_ast(tree, *, filename, **kwargs):
>
> otherwise people might use context objects with other attributes than
> "filename", breaking when a future PEP assigns a specific meaning to them.

The idea of a context object is to be "future-proof". Future versions of
Python can add new attributes without having to modify all code
transformers (or even worse, having to use some kind of "#ifdef" in the
code depending on the Python version).

> It actually might be good to make the code transformer API extensible as
> well, and synchronize with the AST transformer::
>
>     def transform_code(code, *, filename, **kwargs):

**kwargs and context are basically the same, but I prefer a single parameter
rather than an ugly **kwargs. IMHO "**kwargs" cannot be called an API.

By the way, I recently added bytecode transformers to the PEP. In fact,
we can already add more information to its context:

* compiler_flags: flags like
* optimization_level (int): 0, 1 or 2 depending on the -O and -OO command
line options
* interactive (boolean): True if interactive mode
* etc.

=> see the compiler structure in Python/compile.c.

We will have to check that these attributes make sense for other Python
implementations, or make it clear in the PEP that, as with
sys.implementation, each Python implementation can add specific attributes,
and only a few of them are always available.
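
As a sketch of how a transformer could cope with optional attributes: read
them with getattr() and a default, so the same code works whether or not
the context provides them. The optimization_level attribute name comes
from the list above; the function name, the default value and the
assert-stripping rewrite are assumptions made for the example:

```python
import ast
from types import SimpleNamespace


class _StripAsserts(ast.NodeTransformer):
    """Replace each assert statement with pass (keeps bodies non-empty)."""

    def visit_Assert(self, node):
        return ast.copy_location(ast.Pass(), node)


def transform_with_context(tree, context):
    # Optional attribute: fall back to 0 if this context does not have it.
    level = getattr(context, "optimization_level", 0)
    if level < 1:
        return tree
    return ast.fix_missing_locations(_StripAsserts().visit(tree))


# Stand-in context objects: one without and one with the optional attribute.
minimal_context = SimpleNamespace(filename="m.py")
rich_context = SimpleNamespace(filename="m.py", optimization_level=1)
```

The same function accepts both context objects: with minimal_context the
tree is returned untouched, with rich_context the asserts are replaced.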

Victor