[Python-Dev] Release of astoptimizer 0.3

Nick Coghlan ncoghlan at gmail.com
Tue Sep 11 14:57:25 CEST 2012


On Tue, Sep 11, 2012 at 8:41 PM, Victor Stinner
<victor.stinner at gmail.com> wrote:
> * Call builtin functions if arguments are constants. Examples:
>
>   - len("abc") => 3
>   - ord("A") => 65

This is fine in an external project, but should never be added to the
standard library. The barrier to semantic changes that break
monkeypatching should be high.

Yes, this is frustrating as it eliminates a great many interesting
static optimisations that are *probably* OK. That's one of the reasons
why PyPy uses tracing - it can perform these optimisations *and* still
include the appropriate dynamic checks.

However, the double barrier of third party module + off by default is
a suitable activation barrier for ensuring people know that what
they're doing is producing bytecode that doesn't behave like standard
Python any more (e.g. tests won't be able to shadow builtins or
optimised module references). Optimisations that break the language
semantics are heading towards the same territory as the byteplay and
withhacks modules (albeit not as evil internally).
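
To make the concern concrete, here is a minimal sketch of how shadowing a builtin changes the result of a call that a constant folder would have replaced. The function name label_width and the fake len are made up for illustration, and exec with a private namespace merely stands in for a module that a test later patches.

```python
# Hypothetical module source; label_width is an illustrative name.
source = '''
def label_width():
    return len("abc")   # len is looked up when the call runs
'''

namespace = {}
exec(source, namespace)              # behaves like importing a small module
plain = namespace["label_width"]()   # resolved to the real builtin

# A test shadows the builtin at module level, as mock-style tests often do.
namespace["len"] = lambda obj: 42
shadowed = namespace["label_width"]()

print(plain, shadowed)
```

With standard lookup semantics the second call sees the shadowing name. Had len("abc") been folded to 3 at compile time, both calls would return 3 and the shadowing would be silently ignored.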

> * Call methods of builtin types if the object and arguments are constants.
>   Examples:
>
>   - u"h\xe9ho".encode("utf-8") => b"h\xc3\xa9ho"
>   - "python2.7".startswith("python") => True
>   - (32).bit_length() => 6
>   - float.fromhex("0x1.8p+0") => 1.5

That last one isn't constant, it's a name lookup. Very cool
optimisations for literals, though.
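
One way to see the difference is to ask the parser what each method is being called on. This is only a sketch, assuming Python 3.8 or later (where literals parse as ast.Constant); the helper name receiver_kind is invented for the example.

```python
import ast

def receiver_kind(expr):
    """Return the AST node type of the object a method is called on."""
    call = ast.parse(expr, mode="eval").body   # an ast.Call node
    return type(call.func.value).__name__

literal_receiver = receiver_kind('(32).bit_length()')
name_receiver = receiver_kind('float.fromhex("0x1.8p+0")')

print(literal_receiver, name_receiver)
```

The first receiver is a literal baked into the code, while the second is the name float, which is resolved at run time and can therefore be shadowed like any other builtin.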

> * Call functions of the math and string modules that have no
>   side effects. Examples:
>
>   - math.log(32) / math.log(2) => 5.0
>   - string.atoi("5") => 5

Same comment applies here as for the builtin optimisation: fine in an
external project, not in the standard library (even if it's off by
default - merely having it there is still an official endorsement of
deliberately breaking the dynamic lookup semantics of our own
language).

> * Format strings for str%args and print(arg1, arg2, ...) if arguments
>   are constants and the format string is valid.
>   Examples:
>
>   - "x=%s" % 5 => "x=5"
>   - print(1.5) => print("1.5")

The print example runs afoul of the general rule above: not in the
standard library, because you're changing the values seen by a mocked
version of print().
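
A small sketch of what a mocked print() observes, using unittest.mock from the standard library. The function name report is hypothetical.

```python
from unittest import mock

def report():
    print(1.5)      # the call an optimiser might rewrite to print("1.5")

# Replace the builtin for the duration of the call, as a test would.
with mock.patch("builtins.print") as fake_print:
    report()

received = fake_print.call_args[0][0]   # first positional argument
```

Unoptimised, the mock receives the float 1.5. After the rewrite it would receive the string "1.5", so an assertion such as fake_print.assert_called_once_with(1.5) would start failing even though the real print() output looks identical.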

> * Simplify expressions. Examples:
>
>   - not(x in y) => x not in y

This (and the "is" equivalent) should be OK.

>   - 4 and 5 and x and 6 => x and 6

So long as this is just constant folding, that should be fine, too.
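
A quick spot check (not a proof) that the two rewrites above agree with the originals for a handful of values. The sample values are arbitrary.

```python
samples = [0, 1, "", "a", None, [], [0]]

for x in samples:
    # 4 and 5 are constant truthy values, so they drop out of the chain.
    assert (4 and 5 and x and 6) == (x and 6)

container = [1, 2, 3]
for x in (2, 5):
    # Both spellings perform the same containment test and negate it.
    assert (not (x in container)) == (x not in container)

checked = len(samples)
```

Neither rewrite touches a name lookup: x is still evaluated exactly once, and only the literal operands are removed.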

>
> * Loop: replace range() with xrange() on Python 2, and list with
>   tuple.  Examples:
>
>   - for x in range(n): ... => for x in xrange(n): ...
>   - for x in [1, 2, 3]: ... => for x in (1, 2, 3): ...

Name lookup optimisations again: not in the standard library.

> * Evaluate unary and binary operators, subscripts and comparisons if all
>   arguments are constants. Examples:
>
>   - 1 + 2 * 3 => 7
>   - not True => False
>   - "abc" * 3 => "abcabcabc"
>   - "abcdef"[:3] => "abc"
>   - (2, 7, 3)[1] => 7
>   - frozenset("ab") | frozenset("bc") => frozenset("abc")
>   - None is None => True
>   - "2" in "python2.7" => True
>   - "def f(): return 2 if 4 < 5 else 3" => "def f(): return 2"

Yep, literals are good.
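
For illustration, folding of literal-only expressions can be sketched with ast.NodeTransformer. This is not astoptimizer's implementation; the names FoldConstants and fold are invented here, only three operators are handled, and Python 3.8 or later is assumed for ast.Constant.

```python
import ast
import operator

# Only a few operators, to keep the sketch short.
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

class FoldConstants(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)               # fold the operands first
        func = _BIN_OPS.get(type(node.op))
        if (func is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            try:
                value = func(node.left.value, node.right.value)
            except Exception:
                return node                    # leave errors for run time
            return ast.copy_location(ast.Constant(value=value), node)
        return node

def fold(source):
    """Parse an expression and return its tree with literal BinOps folded."""
    tree = FoldConstants().visit(ast.parse(source, mode="eval"))
    return ast.fix_missing_locations(tree)

folded = fold("1 + 2 * 3").body     # a single Constant node
partly = fold("n + 2 * 3").body     # still a BinOp: n is a name, not a literal
```

Because both operands must already be literals, nothing here depends on what a name refers to at run time, which is why this class of optimisation is safe.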

> * Remove dead code. Examples:
>
>   - def f(): return 1; return 2 => def f(): return 1
>   - if DEBUG: print("debug") => pass (with DEBUG declared as False)
>   - while 0: ... => pass

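Dangerous. Some statements change how the whole function is compiled
even when they can never execute, so removing them as dead code changes
behaviour: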

def f(): return 1; yield
if DEBUG: yield
while 0: yield

>>> def f():
...     if 0:
...         global x
...     return x
...
>>> f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in f
NameError: global name 'x' is not defined
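
The yield cases can be checked the same way. A minimal sketch, assuming Python 3.3 or later (where a generator may contain "return 1"); the function names are illustrative.

```python
import inspect

def with_dead_yield():
    return 1
    yield            # never reached, but its presence is what matters

def after_naive_removal():
    return 1         # same body with the "dead" yield deleted

before = inspect.isgeneratorfunction(with_dead_yield)
after = inspect.isgeneratorfunction(after_naive_removal)
```

The first function returns a generator object when called, and the second returns 1, so deleting the unreachable yield changes the type of the result.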


> Unsafe optimizations are disabled by default. Optimizations can be enabled
> using a Config class with "features" like "builtin_funcs" (builtin functions
> like len()) or "pythonbin" (optimized code will be executed by the same
> Python binary executable).
>
> astoptimizer.patch_compile() can be used to hook the optimizer into the
> compile() builtin function. On Python 3.3, it is enough to use the optimizer
> on imports (thanks to importlib). On older versions, the compileall
> module can be used to compile a whole project using the optimizer.
>
> I haven't benchmarked anything yet; I focused on fixing bugs (not
> generating invalid code). I will start benchmarks when the "variables"
> feature (ex: "x=1; print(x)" => "x=1; print(1)") works. There is
> experimental support for variables, but it is still too aggressive and
> generates invalid code in some cases (see the TODO file).
>
> I plan to implement other optimizations like unrolling loops or converting
> a loop to a list comprehension, see the TODO file.
>
> Don't hesitate to propose more optimizations if you have some ideas ;-)

Mainly just a request to be *very*, *very* clear that the unsafe
optimisations will produce bytecode that *does not behave like Python*
with respect to name lookup semantics. Mock-based testing that relies
on name shadowing will therefore not work correctly, and neither will
direct monkeypatching.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

