[Python-ideas] Adding "+" and "+=" operators to dict

Eric Snow ericsnowcurrently at gmail.com
Fri Feb 13 19:18:22 CET 2015


On Fri, Feb 13, 2015 at 9:55 AM, Chris Barker <chris.barker at noaa.gov> wrote:
> But this:
> "Some languages may be able to optimize a + b + c + d "
>
> Got me thinking -- this is actually a really core performance problem for
> numpy. In that case, the operators are really being used for math, so you
> can't argue that they shouldn't be supported for that reason -- and we
> really want readable code:
>
> y = a*x**2 + b*x + c
>
> really reads well, but it does create a lot of temporaries that kill
> performance for large arrays. You can optimize that by hand by doing
> something like:
>
> y = x**2
> y *= a
> y += b*x
> y += c
>
> which really reads poorly!
>
> So I've thought for years that we should have a "numpython" interpreter that
> would parse out each expression, check its types, and if they were all numpy
> arrays, generate an optimized version that avoided temporaries, maybe even
> did nifty things like do the operations in cache-friendly blocks, multi
> thread them, whatever. (there is a package called numexpr that does all these
> things already).

This should be pretty do-able with an import hook (or similarly a REPL
hook) without the need to roll your own interpreter.  There is already
prior art. [1][2][3][4]  I've been meaning for a while to write a
library to make this easier and mitigate the gotchas.  One of the
trickiest is that it can skew tracebacks (a la macros or compiler
optimizations in GDB).
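
As a rough illustration of the compile step such a hook could run, here is
a minimal sketch using only the standard `ast` module.  The names
`SquareToMul` and `compile_transformed` are made up for this example, and
the rewrite itself (`x ** 2` to `x * x`) is only a stand-in for a real
optimization; it is not safe in general.  A real import hook would call
something like `compile_transformed` from its loader.

```python
import ast


class SquareToMul(ast.NodeTransformer):
    """Rewrite `name ** 2` as `name * name` (illustrative rewrite only)."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        if (isinstance(node.op, ast.Pow)
                and isinstance(node.left, ast.Name)
                and isinstance(node.right, ast.Constant)
                and type(node.right.value) is int
                and node.right.value == 2):
            new = ast.BinOp(
                left=node.left,
                op=ast.Mult(),
                right=ast.Name(id=node.left.id, ctx=ast.Load()),
            )
            # Copy the original position so tracebacks still point at the
            # source line; larger rewrites are where tracebacks get skewed.
            return ast.copy_location(new, node)
        return node


def compile_transformed(source, filename="<transformed>"):
    """Parse, rewrite, and compile a module's source (hypothetical helper)."""
    tree = ast.parse(source, filename=filename)
    tree = SquareToMul().visit(tree)
    ast.fix_missing_locations(tree)  # fill in positions for the new nodes
    return compile(tree, filename, "exec")


source = "y = a*x**2 + b*x + c\n"
namespace = {"a": 2, "b": 3, "c": 4, "x": 5}
exec(compile_transformed(source), namespace)
print(namespace["y"])
```

The transformer knows nothing about the run-time types of `a` and `x`, so
this is only sound if the rewritten operators really are equivalent for
those types.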

>
>
> But maybe CPython itself could do an optimization step like that -- examine
> an entire expression for types, and if they all support the right
> operations, re-structure it in a more optimized way.

It's tricky in a dynamic language like Python to do a lot of
optimization at compile time, particularly in the face of
operator overloading.  The problem is that there is no guarantee of
some name's type (other than `object`) nor of the behavior of the
type's methods (including operators).  So optimizations like the one
you are suggesting make assumptions that cannot be checked until
run-time (to ensure the optimization is applied safely).

The stinky part is that the vast majority of the time, perhaps even
always, your optimization could be applied at compile-time.  But
because Python is so flexible, the possibility is always there
that someone did something that breaks your assumptions.  So run-time
optimization is your only recourse.
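
To make that concrete, here is a contrived sketch.  `Bag` is a
hypothetical class, not anything from a real library: its `+` returns
`self` when the other operand is empty, which is perfectly legal.  The
"reuse the temporary" rewrite assumes that `a + b` always yields a fresh
object, and that assumption fails here.

```python
class Bag:
    """Hypothetical container whose + returns self when the other side is empty."""

    def __init__(self, items=()):
        self.items = list(items)

    def __add__(self, other):
        if not other.items:
            return self  # legal, but no fresh object is created
        return Bag(self.items + other.items)

    def __iadd__(self, other):
        self.items.extend(other.items)  # in-place update
        return self


def plain(a, b, c):
    return a + b + c


def rewritten(a, b, c):
    # The "optimized" form: update the first temporary in place
    # instead of creating a second one.
    t = a + b
    t += c
    return t


a, b, c = Bag([1]), Bag(), Bag([2])

plain_result = plain(a, b, c).items
a_after_plain = list(a.items)        # a is untouched

rewritten_result = rewritten(a, b, c).items
a_after_rewrite = list(a.items)      # a has been mutated behind the caller's back
```

Both forms return the same items, but only the rewritten one changes `a`.
Nothing in the source text of `a + b + c` tells the compiler which kind of
type it is dealing with.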

To some extent PyPy is state-of-the-art when it comes to
optimizations, so it may be worth taking a look if the problem is
tractable for Python in general.  PEP 484 ("Type Hints") may help in a
number of ways too.  As for CPython, it has a peephole optimizer for
compile-time, and there has been some effort to optimize at the AST
level.  However, as already noted you would need the optimization to
happen at run-time.

The only solution that comes to my mind is that the compiler (and/or
optimizers) could leave a hint (emit an extra byte code, etc.) that
subsequent code is likely to be able to have some optimization
applied.  Then the interpreter could do whatever check it needs to
make sure and then apply the optimized code.  Alternately the compiler
could generate an explicit branch for the potential optimization (so
that the interpreter wouldn't need to deal with it).  I suppose there
are a number of possibilities along that line, but I'm not an expert
on the compiler and interpreter.
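
Written out by hand in pure Python, the "explicit branch" variant might
look roughly like the sketch below.  `concat4` and `Counting` are made-up
names for illustration, and plain lists stand in for whatever type the
optimization targets; this is not how CPython does anything today.

```python
def concat4(a, b, c, d):
    """Evaluate a + b + c + d, with a guarded fast path for plain lists."""
    operands = (a, b, c, d)
    # Guard: exact type check, so subclasses that override __add__
    # fall through to the generic path and keep their own semantics.
    if all(type(x) is list for x in operands):
        result = []
        for x in operands:
            result.extend(x)  # one result list, no intermediate lists
        return result
    # Generic path: ordinary operator dispatch.
    return a + b + c + d


class Counting(list):
    """Hypothetical list subclass that counts calls to its own __add__."""
    calls = 0

    def __add__(self, other):
        Counting.calls += 1
        return Counting(list(self) + list(other))


fast = concat4([1], [2], [3], [4])            # takes the fast path
slow = concat4(Counting([1]), [2], [3], [4])  # guard fails, generic path
```

The guard costs a few type checks on every call, which is the price of
deferring the decision to run-time.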

>
> Granted, doing this in the general case would be pretty impossible but if
> the hooks are there, then individual type-based optimizations could be done
> -- like the current interpreter has for adding strings.

Hmm. That is another approach too.  You pay a cost for not baking it
into the compiler/interpreter, but you also get more flexibility and
portability.

-eric

[1] Peter Wang did a lightning talk at PyCon (in Santa Clara).
[2] I believe Numba uses an import hook to do its thing.
[3] Paul Tagliamonte's hy: https://github.com/hylang/hy
[4] Li Haoyi's macropy: https://github.com/lihaoyi/macropy

