[Python-ideas] PEP 511: API for code transformers

Victor Stinner victor.stinner at gmail.com
Fri Jan 15 17:16:38 EST 2016


2016-01-15 21:39 GMT+01:00 Yury Selivanov <yselivanov.ml at gmail.com>:
> All your PEPs are very interesting, thanks for your hard work!
> I'm very happy to see that we're trying to make CPython faster.

Thanks.

> It's important to say that all of those issues (except 2506)
> are not bugs, but proposals to implement some nano- and
> micro- optimizations.

Hum, let me see.

>> * http://bugs.python.org/issue1346238
https://bugs.python.org/issue11549

"A constant folding optimization pass for the AST" & "Build-out an AST
optimizer, moving some functionality out of the peephole optimizer"

Well, that's a way to start working on larger optimizations.

Anyway, the peephole optimizer has many limits. Raymond Hettinger
keeps repeating that it was designed to be simple and limited, and
each time he suggests reimplementing the peephole optimizer in pure
Python (as I'm proposing).

At the AST level, we can do much better than just folding 1+1, even
without changing the Python semantics.

But I agree that the speedups from such changes are minor. Without
specialization and guards, you are limited.
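
To give an idea of what such an AST pass looks like, here is a minimal
constant folding sketch using ast.NodeTransformer. It is only an
illustration (the operator table and error handling are arbitrary
choices for the example, not what fatoptimizer does), written for
recent Python versions where ast.parse() produces the ast.Constant
node that this PEP also proposes:

    import ast
    import operator

    # Arbitrary subset of operators, just for the example.
    _OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul}

    class FoldConstants(ast.NodeTransformer):
        """Minimal sketch of AST-level constant folding."""

        def visit_BinOp(self, node):
            # Fold children first so nested expressions collapse fully.
            self.generic_visit(node)
            if (isinstance(node.left, ast.Constant)
                    and isinstance(node.right, ast.Constant)
                    and type(node.op) in _OPS):
                try:
                    value = _OPS[type(node.op)](node.left.value,
                                                node.right.value)
                except Exception:
                    return node  # keep the original expression on error
                return ast.copy_location(ast.Constant(value=value), node)
            return node

    tree = FoldConstants().visit(ast.parse("x = (1 + 1) * 60 * 60"))
    print(ast.dump(tree))  # the assignment now stores the constant 7200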


>> * http://bugs.python.org/issue2181

"optimize out local variables at end of function"

On its own, this optimization is not really interesting, but other
optimizations can produce inefficient code that it cleans up. Example
with loop unrolling:

    for i in range(2):
        print(i)

is replaced with:

    i = 0
    print(i)

    i = 1
    print(i)

with constant propagation, it becomes:

    i = 0
    print(0)

    i = 1
    print(1)

At that point, the variable i becomes useless and can be removed by
the optimization mentioned in http://bugs.python.org/issue2181:

    print(0)
    print(1)
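
For illustration, here is how such an unrolling pass could be written
with ast.NodeTransformer. It is only a sketch: the iteration limit is
arbitrary, and it blindly assumes that range() is the builtin, which
is exactly the kind of assumption that requires the guards of the FAT
Python PEPs:

    import ast

    class UnrollRange(ast.NodeTransformer):
        """Sketch: unroll "for i in range(n)" with a small constant n
        into straight-line code, as in the example above."""

        def visit_For(self, node):
            self.generic_visit(node)
            if (isinstance(node.target, ast.Name)
                    and isinstance(node.iter, ast.Call)
                    and isinstance(node.iter.func, ast.Name)
                    and node.iter.func.id == "range"  # assumed builtin!
                    and len(node.iter.args) == 1
                    and isinstance(node.iter.args[0], ast.Constant)
                    and isinstance(node.iter.args[0].value, int)
                    and 0 < node.iter.args[0].value <= 4  # arbitrary limit
                    and not node.orelse):
                new_body = []
                for value in range(node.iter.args[0].value):
                    assign = ast.Assign(
                        targets=[ast.Name(id=node.target.id,
                                          ctx=ast.Store())],
                        value=ast.Constant(value=value))
                    new_body.append(ast.copy_location(assign, node))
                    new_body.extend(node.body)
                return new_body
            return node

    tree = ast.parse("for i in range(2):\n    print(i)")
    tree = ast.fix_missing_locations(UnrollRange().visit(tree))
    exec(compile(tree, "<unrolled>", "exec"))  # prints 0 then 1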


>> * http://bugs.python.org/issue10399

"AST Optimization: inlining of function calls"

IMHO this one is really interesting, but again not on its own: only
when combined with other optimizations.


>> Usage 2: Preprocessor
>> ---------------------
>>
>> A preprocessor can be easily implemented with an AST transformer. A
>> preprocessor has various and different usages.
>
> [..]
>
> I think that most of those examples are rather weak.  Things like
> tail-call optimizations, constants declarations, pattern matching,
> case classes (from MacroPy) are nice concepts, but they should be
> either directly implemented in Python language or not used at all
> (IMHO).

At least, it allows experimenting with new things. If a transformer
becomes popular, we can start discussing integrating it into Python.

About tail recursion, I recall that Guido wrote something about it:
http://neopythonic.blogspot.fr/2009/04/tail-recursion-elimination.html

I found a lot of code transformer projects, so I understand that
there is a real need.

In a previous job, we used a text preprocessor to remove all calls to
log.debug() before releasing the code to production. It was in the
embedded world (set top boxes), where performance matters. The
preprocessor was based on long and unreliable regular expressions; I
would prefer to use the AST for that. That's the first item in my
list: "Remove debug code like assertions and logs to make the code
faster to run it for production."
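
As a sketch of what an AST-based version could look like (the "log"
name is hard-coded just for the example; a real transformer would
need a way to identify logger objects, and would have to take care
not to leave function bodies empty):

    import ast

    class StripDebugCalls(ast.NodeTransformer):
        """Sketch: drop "log.debug(...)" statements at the AST level
        instead of matching them with regular expressions."""

        def visit_Expr(self, node):
            call = node.value
            if (isinstance(call, ast.Call)
                    and isinstance(call.func, ast.Attribute)
                    and call.func.attr == "debug"
                    and isinstance(call.func.value, ast.Name)
                    and call.func.value.id == "log"):
                return None  # remove the whole statement
            return node

    source = "log.debug('x=%r', x)\nprint('kept')\n"
    tree = StripDebugCalls().visit(ast.parse(source))
    print(ast.dump(tree))  # only the print() call remains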

> Things like auto-changing dictionary literals to OrderedDict
> objects or in-Python DSLs will only help in creating hard to
> maintain code base.  I say this because I have a first-hand
> experience with decorators that patch opcodes, and import
> hooks that rewrite AST.  When you get back to your code years
> after it was written, you usually regret about doing those things.

To be honest, I don't plan to use such macros myself: they look too
magic and change Python semantics too much. But I don't want to
prevent users from doing cool things in their own sandbox. In my
experience, Python developers are good enough to make their own
decisions.

When the f-string PEP was discussed, I was strongly opposed to
allowing *any* Python expression in f-strings. But Guido said that
language designers must not restrict users. Well, something like
that; I'm probably misquoting him ;-)


> All in all, I think that adding a blessed API for preprocessors
> shouldn't be a focus of this PEP.  MacroPy works right now
> with importlib, and I think it's a good solution for it.

Do you mean that we should add the feature but add a warning in the
doc like "don't use it for evil things"?

I don't think that we can forbid specific usages of an API. The only
strong way to ensure that users will not misuse an API is to not add
the API at all (reject the PEP) :-) So I chose instead to document
the different kinds of usage of code transformers, just to show how
they can be used.


> I propose to only expose new APIs on the C level,
> and explicitly mark them as provisional and experimental.
> It should be clear, that those APIs are only for
> *writing optimizers*, and nothing else.

Currently, the PEP adds:

* -o OPTIM_TAG command line option
* sys.implementation.optim_tag
* sys.get_code_transformers()
* sys.set_code_transformers(transformers)
* ast.Constant
* ast.PyCF_TRANSFORMED_AST


importlib uses sys.implementation.optim_tag and
sys.get_code_transformers(). *If* we want to remove them, we need to
find another way to expose this information to importlib.
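
To show how these pieces would fit together, here is a sketch of
registering a transformer with the proposed sys functions. It can
only run on an interpreter implementing the PEP, and the
NoopTransformer class is an invented example following the
transformer interface described in the PEP (a name attribute plus an
ast_transformer() method):

    import sys

    class NoopTransformer:
        # The name is used to build sys.implementation.optim_tag,
        # which importlib uses to pick matching .pyc files.
        name = "noop"

        def ast_transformer(self, tree, context):
            # A real transformer would rewrite the tree here.
            return tree

    # Proposed PEP 511 API: add our transformer to the registered ones.
    transformers = sys.get_code_transformers()
    transformers.append(NoopTransformer())
    sys.set_code_transformers(transformers)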

I really like ast.Constant and would like to add it, but it's really
a minor part of the PEP. I don't think that it's controversial.

PyCF_TRANSFORMED_AST can only be exposed at the C level.

"-o OPTIM_TAG command line option" is a shortcut to set
sys.implementation.optim_tag. optim_tag can be set manually. But the
problem is to be able to set the optim_tag before the first Python
module is imported. It doesn't seem easy to avoid this change.
According to Brett, the whole PEP can be simplified to this single
command line option :-)


> [off-topic] I do think that having a macro system similar to
> Rust might be a good idea.  However, macro in Rust have explicit
> and distinct syntax, they have the necessary level of
> documentation and tooling.  But this is a separate matter
> deserving its own PEP ;)

I agree that extending the Python syntax is out of the scope of PEP 511.


> Would it be possible to (or does it make any sense):
>
> 1. Add new APIs for AST transformers (only exposed on the C
> level!)
>
> 2. Remove the peephole optimizer.

FYI my fatoptimizer is quite slow, but it implements a lot of
optimizations, many more than the Python peephole optimizer.

I fear that the conversions are expensive:

* AST (light) internal objects => Python (heavy) AST objects
* (run AST optimizers implemented in Python)
* Python (heavy) AST objects => AST (light) internal objects

So for the near future, I prefer to keep the peephole optimizer
implemented in C. The performance of the optimizer itself matters
when you run a short script using "python script.py" (without
ahead-of-time compilation).


> I also want to say this: I'm -1 on implementing all three PEPs
> until we see that FAT is able to give us at least 10% performance
> improvement on micro-benchmarks.  We still have several months
> before 3.6beta to see if that's possible.

I prefer not to start benchmarking fatoptimizer yet, because I spent
3 months just designing the API, fixing bugs, etc. I only spent a
small fraction of that time on writing optimizations. I expect
significant speedups with more optimizations like function inlining.
If you are curious, take a look at the todo list:
https://fatoptimizer.readthedocs.org/en/latest/todo.html

I understand that an optimizer which does not produce faster code is
not really interesting. My PEPs require many changes which would
become part of the public API and have to be maintained later.

I already changed PEP 509 and PEP 510 to make the changes private
(only visible in the C API).

Victor

