[Python-ideas] CPS transform for Python

Devin Jeanpierre jeanpierreda at gmail.com
Sat Nov 10 16:50:26 CET 2012


On Sat, Nov 10, 2012 at 6:57 AM, Laurens Van Houtven <_ at lvh.cc> wrote:
> The README suggests that doing this optimization on a bytecode level may be
> better than doing it on a source/AST level. Can you explain why?

I would suggest it's because CPS directly represents control flow, as
does bytecode (which has explicit jumps and so on). ASTs represent
control flow indirectly, in that there are more constructs that can do
more varied forms of jumping around. I think that it's often harder to
write program analysis or transformation on syntax trees than on
bytecode or control flow graphs, but that's from my own experience
doing such, and may not be true for some sorts of situations.

In particular there's a directly analogous CPS form for bytecode,
where every bytecode instruction in a bytecode string is considered to
be equivalent to a CPS function, and continuations represent returns
from functions and not expression evaluation, and the CPS functions
just navigate through the bytecode stream using tail calls, while
keeping track of the expression evaluation stack and the call stack
and globals dict.

Some examples follow here (I'm doing purely functional stuff because
too tired to think through the effects of mutation). I'm obviously not
trying to be completely correct here, just trying to get the idea
across that bytecode and CPS are more closely linked than AST and CPS.
Note that exception support is entirely missing, and needs its own
stack to keep track of exception handlers.

A[i] = JUMP_FORWARD(d)
def B[i](continuation, stack, callstack, globals):
    B[i + d](continuation, stack, callstack, globals)

A[i] = BINARY_MULTIPLY()
def B[i](continuation, stack, callstack, globals):
    B[i+1](continuation, stack[:-2] + [stack[-2] * stack[-1]],
callstack, globals)

A[i] = LOAD_FAST(local)
def B[i](continuation, stack, callstack, globals):
    B[i+1](continuation, stack + [callstack[local]], callstack, globals)

A[i] = RETURN_VALUE()
def B[i](continuation, stack, callstack, globals):
    continuation(stack[-1])

A[i] = CALL_FUNCTION(argc)
def B[i](continuation, stack, callstack, globals):
    f = stack[-1];
    stack = stack[:-1]
    args = stack[-argc:]
    stack = stack[- argc]
    # let's please pretend getcallargs does the right thing
    # (it doesn't.)
    locals = inspect.getcallargs(f, *args)
    # get where f is in the bytecode (magic pseudocode)
    jump_location = f.magic_bytecode_location
    B[jump_location](
        lambda returnvalue: B[i+1](
            continuation,
            stack + [returnvalue],
            callstack,
            globals),
        [],
        callstack + [{}], # new locals dict
        f.func_globals)

and so on.

At least, I think I have that right.

-- Devin



More information about the Python-ideas mailing list