Multiple assignments simplification

bearophileHUGS at lycos.com
Wed Oct 12 20:10:53 EDT 2005


The current version of ShedSkin (http://shedskin.sourceforge.net/, an
experimental Python to C++ compiler) is still at an alpha stage, and the
development work is focused on debugging and on implementing some more
basic Python functionality. But hopefully in future versions more work
will go into making the resulting C++ programs as fast as possible.

One of the many details that may be improved is the compilation to C++
of Python "parallel" assignments. There are complex situations like
this one:

| a = 1
| b = 2
| c = 3
| def fun(a):
|   global b
|   return c, b ** (b+a)
| (a, b), c = fun(a), a
| print a, b, c # prints 3 8 1


Probably it isn't necessary to find an optimal solution for complex
situations like this one. ShedSkin (SS) uses a basic and simple
algorithm to translate all the complex cases.
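
To make the semantics of the example above concrete: Python first
evaluates the whole right-hand side, and only then binds the targets,
from left to right. Here is a small sketch of those two steps (written
for Python 3; the names parallel_assign_demo, rhs, a2, b2 and c2 are
mine and are used only for illustration, this is not SS code):

```python
def parallel_assign_demo():
    a, b, c = 1, 2, 3

    def fun(x):
        # Reads b and c from the enclosing scope, like the globals in the example.
        return c, b ** (b + x)

    # Step 1: the whole right-hand side is evaluated with the OLD values.
    rhs = (fun(a), a)
    # Step 2: the resulting tuple is unpacked into the targets.
    (a2, b2), c2 = rhs
    return rhs, (a2, b2, c2)


print(parallel_assign_demo())
```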

But maybe for simpler and more regular situations it can be useful to
find a better (or optimal) solution, as with a swap:

a = 1
b = 2
a, b = b, a

At the moment SS translates it as something like:

int __0, __1, a, b;
int __main() {
  a = 1;
  b = 2;
  __0 = b;
  __1 = a;
  a = __0;
  b = __1;

SS just copies all the right-hand variables into temporaries before
the assignment. If such a swap is inside a very long loop, then maybe a
simplification can speed up the program a little (I don't know whether
C++ compilers can do such optimizations themselves).

This is another example of such "regular" situations:

a, b, c, d, e, f = range(6)
a, b, c, d, e = b, d, e, f, c
print a, b, c, d, e, f

At the moment SS translates its central part just as:

__1 = b;
__2 = d;
__3 = e;
__4 = f;
__5 = c;
a = __1;
b = __2;
c = __3;
d = __4;
e = __5;

The two sides of the assignment aren't just permutations of each other:
some variables can appear on only one side (like f, which is read but
not assigned), some can appear two or more times on the right
(duplication), and some others can be absent from it.
Code like this can be faster (and hopefully still correct):

a = b
aux_1 = c
c = e
e = aux_1
b = d
d = f
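
The "hopefully still correct" part can be checked by running both
versions on the same starting values. A quick sketch (Python 3; the
function names parallel and sequential are just for illustration):

```python
def parallel():
    a, b, c, d, e, f = range(6)
    a, b, c, d, e = b, d, e, f, c  # the original parallel assignment
    return a, b, c, d, e, f


def sequential():
    a, b, c, d, e, f = range(6)
    # The six simple assignments proposed above.
    a = b
    aux_1 = c
    c = e
    e = aux_1
    b = d
    d = f
    return a, b, c, d, e, f


print(parallel() == sequential())
```

This only tests one set of starting values, but since they are all
distinct, any wrong move would show up as a mismatch.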

The parallel assignment a, b, c, d, e = b, d, e, f, c can be
represented as:
[0, 1, 2, 3, 4], [1, 3, 4, 5, 2]
(Numbers represent variables. The first list is always sorted,
equivalent to range(n).)
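
A small sketch of how such a representation could be built from the
variable names (Python 3; the function encode is hypothetical, and it
assumes that every target variable appears only once on the left):

```python
def encode(targets, sources):
    """Number the variables in order of first appearance:
    targets first, then any variable seen only on the right."""
    index = {}
    for name in list(targets) + list(sources):
        index.setdefault(name, len(index))
    return [index[t] for t in targets], [index[s] for s in sources]


print(encode("abcde", "bdefc"))
```

Because the targets are numbered first, the first list always comes out
as range(n), and variables that are only read (like f) get the numbers
from n upwards.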

Do you know some algorithm (or can you give some suggestions) to
minimize the number of simple assignments needed for a "regular"
situation like that?
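
One possible starting point, as far as I can tell, is to emit first
every assignment whose target is no longer read by any pending
assignment, and to use a temporary only when nothing but cycles
remains. The sketch below (Python 3) is only an assumption about how
this could be done, not what SS does, and it is not guaranteed to give
the minimum in every case (with duplicated sources a variable that has
already been written can sometimes stand in for the temporary). The
name sequentialize and the temporary name "aux" are made up for the
example; the targets are assumed to be distinct, and "aux" is assumed
not to clash with a real variable.

```python
def sequentialize(targets, sources, temp="aux"):
    """Turn a parallel assignment into a list of (dst, src) simple moves."""
    # Pending moves dst <- src; self-assignments (x = x) need no code.
    pending = {d: s for d, s in zip(targets, sources) if d != s}
    moves = []
    while pending:
        # A move is safe when no pending move still reads its destination.
        still_read = set(pending.values())
        ready = [d for d in pending if d not in still_read]
        if ready:
            for d in ready:
                moves.append((d, pending.pop(d)))
        else:
            # Only cycles remain: save one destination in the temporary
            # and make its readers read the temporary instead.
            d = next(iter(pending))
            moves.append((temp, d))
            for k in list(pending):
                if pending[k] == d:
                    pending[k] = temp
    return moves


for dst, src in sequentialize("abcde", "bdefc"):
    print(dst, "=", src)
```

For the example above this gives six simple assignments, the same
number as the hand-written version, and a plain swap comes out as three
assignments with one temporary.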

Note that in most situations the number of variables is quite small
(but it can be big).
(Also note that in the "regular" situation I've ignored the problem of
mixed variable types; this is something SS has to take care of too.)

Bye and thank you,
bearophile




More information about the Python-list mailing list