[Python-Dev] Release of astoptimizer 0.3

Victor Stinner victor.stinner at gmail.com
Tue Sep 11 23:47:06 CEST 2012


>> * Call builtin functions if arguments are constants. Examples:
>>
>>    - len("abc") => 3
>>    - ord("A") => 65
>
> Does this mean transformation print("A") => None and output at compile time?

Only a subset of builtin functions are called at compile time, and not
with any constant value. See astoptimizer/config_builtin_funcs.py.

print() has other optimizations, but it is not called at compile time.

>> * Loop: replace range() with xrange() on Python 2,
>
> range(0, 10**10, 10**9)

Oh. I didn't know this difference between range and xrange. I should
add more checks on arguments.

>> * Evaluate unary and binary operators, subscript and comparaison if all
>>    arguments are constants. Examples:
>>
>>    - 1 + 2 * 3 => 7
>
> Does this mean 1/0 raises exception at compile time?

Such trap are handled carefully. There are many checks to decide if an
operator can be optimized or not.
Example: "\U0010ffff"[0] is not optimized by default.

>> * Remove dead code. Examples:
>>
>>    - def f(): return 1; return 2 => def f(): return 1
>>    - if DEBUG: print("debug") => pass with DEBUG declared as False
>>    - while 0: ... => pass
>
> As Nick said, it could change the semantics, if there were local variable
> assignment or yield in removed code.

yield is handled correctly, but local variable is a new problem.

I should find something to detect local variables. I already have code
to handle variable, but it not fully enabled because there are known
bugs.

> It would be good if the optimizer checked if the built-in names overridden
> in this module or function and disabled this dangerous optimization in such
> case.

Simple cases like "len=chr" or "from math import floor as int" are now
handled correctly in the new version 0.3.1.

> set([1, 2, 3]) => {1, 2, 3}

I don't see how to create literal set or frozen in AST.

set([...]) may be converted to set((...)) at least.

It would be nice to have a new ast.Lit() in Python 3.4: the idea was
already proposed (with a patch, 0_ast.patch) in the issue #11549.

> set([x for ...]) => {x for ...}
> dict([(k, v) for ...]) => {k: v for ...}
> dict((k, v) for ...) => {k: v for ...}
> ''.join([s for ...]) => ''.join(s for ...)
> a.extend([s for ...]) => a.extend(s for ...)

These optimizations look correct.

> (f(x) for x in a) => map(f, a)
> (x.y for x in a) => map(operator.attrgetter('y'), a)
> (x[0] for x in a) => map(operator.itemgetter(0), a)
> (2 * x for x in a) => map((2).__mul__, a)
> (x in b for x in a) => map(b.__contains__, a)
> map(lambda x: x.strip(), a) => (x.strip() for x in a)

Is it faster? :-)

> x in ['i', 'em', 'cite'] => x in {'i', 'em', 'cite'}

A list can contain non-hashable objects, whereas a set can not.

> x == 'i' or x == 'em' or x == 'cite'] => x in {'i', 'em', 'cite'}

You need to know the type of x. Depending on the type, x.__eq__ and
x.__contains__ may be completly different.

> a = []; for ...: a.append(x) => a = [x for ...]
> a = (); for ...: a.add(x) => a = {x for ...}
> a = {}; for ...: a[k] = v => a = {k: v for ...}

Yeah, the first example is already in the TODO list :-) But it's not
trivial. I prefer to start with unrolling loops :-)

> for ...: f.write(...) => __fwrite = f.write; for ...: __fwrite(...)

f.write lookup cannot be optimized.

> x = x + 1 => x += 1
> x = x + ' ' => x += ' '

I don't know if these optimizations are safe.

> x = x + [y] => x.append(y)
> x = x + [y, z] => x.extend([y, z])

It behaves differently if x is not a list, but str for example.

> 'x=%s' % repr(x) => 'x=%a' % (x,)

I don't understand this one.

> 'x=%s' % x + s => 'x=%s%s' % (x, s)
> x = x + ', [%s]' % y => x = '%s, [%s]' % (x, y)

Doesn't work if s type is not str.

> range(0, x) => range(x)

Is it faster?

> for i in range(len(a)): x = a[i] ... => for i, x in enumerate(a): ...
> i = 0; for x in a: i += 1 ... => for i, x in enumerate(a, 1): ...
> i = 1; for x in a: ... i += 1 => for i, x in enumerate(a, 1): ...
> s = 0; for ...: if ...: s += 1 => s = sum(1 for ... if ...)

These optimizations look interesting.

> while True: s = f.readline(); if not s: break; ... => for s in f: ...

Too much assumptions on f type.

> def f(x): ... len() ... => def f(x, __len = len): ... __len() ...

Ah yes, I read somewhere than locals are faster than builtins. But
this specific example is wrong, f(1, 2) must fail. It is possible to
write something like:

def create_f():
  len = len
  def f(x):
     ... len ...
  return f
f = create_f(); del f

Victor


More information about the Python-Dev mailing list