[Python-Dev] Generator details

Guido van Rossum guido@CNRI.Reston.VA.US
Sat, 10 Jul 1999 11:48:43 -0400


I've been thinking some more about Tim's single-frame generators, and
I think I understand better how to implement them now.

(And yes, it was a mistake of me to write that the suspend() and
resume() methods shouldn't be accessible to the user!  Also thanks for
the clarification of how to write a recursive generator.)

Let's say we have a generator function like this:

generator reverse(l):
    i = len(l)
    while i > 0:
        i = i-1
        suspend l[i]

and a for loop like this:

for i in reverse(range(10)):
    print i

What is the expanded version of the for loop?  I think this will work:

__value, __frame = call_generator(reverse, range(10))
while __frame:
    i = __value
    # start of original for loop body
    print i
    # end of original for loop body
    __value, __frame = resume_frame(__frame)

(Note that when the original for loop body contains 'continue', this
should jump to the resume_frame() call.  This is just pseudo code.)

Now we must define two new built-in functions: call_generator() and
resume_frame().

- call_generator() is like apply() but it returns a pair (result,
frame) where result is the function result and frame is the frame,
*if* the function returned via suspend.  If it returned via return,
call_generator() returns None for the frame.

- resume_frame() does exactly what its name suggests.  It has the same
return convention as call_generator().

Note that the for loop throws away the final (non-suspend) return
value of the generator -- this just signals the end of the loop.

How to translate the generator itself?  I've come up with two
versions.

First version: add a new bytecode SUSPEND, which does the same as
RETURN but also marks the frame as resumable.  call_generator() then
calls the function using a primitive which allows it to specify the
frame (e.g. a variant of eval_code2 taking a frame argument).  When
the call returns, it looks at the resumable bit of the frame to decode
whether to return (value, frame) or (value, None).  resume_frame()
simply marks the frame as non-resumable and continues its execution;
upon return it does the same thing as call_generator().

Alternative translation version: introduce a new builtin get_frame()
which returns the current frame.  The statement "suspend x" gets
translated to "return x, get_frame()" and the statement "return x"
(including the default "return None" at the end of the function) gets
translated to "return x, None".  So our example turns into:

def reverse(l):
    i = len(l)
    while i > 0:
        i = i-1
        return l[i], get_frame()
    return None, None

This of course means that call_generator() can be exactly the same as
apply(), and in fact we better get rid of it, so the for loop
translation becomes:

__value, __frame = reverse(range(10))
while __frame:
    ...same as before...

In a real implementation, get_frame() could be a new bytecode; but it
doesn't have to be (making for easier experimentation).

(get_frame() makes a fine builtin; there's nothing inherently
dangerous to it, in fact people get it all the time, currently using
horrible hacks!).

I'm not sure which is better; the version without call_generator()
allows you to create your own generator without using the 'generator'
and 'suspend' keywords, calling get_frame() explicitly.

Loose end: what to do when there's a try/finally around a suspend?
E.g.

generator foo(l):
    try:
        for i in l:
           suspend i+1
    finally:
        print "Done"

The second translation variant would cause "Done" to be printed on
each suspend *and* on the final return.  This is confusing (and in
fact I think resuming the frame would be a problem since the return
breaks down the try-finally blocks).

So I guess the SUSPEND bytecode is a better implementation -- it can
suspend the frame without going through try-finally clauses.

Then of course we create another loose end: what if the for loop
contains a break?  Then the frame will never be resumed and its
finally clause will never be executed!  This sounds bad.  Perhaps the
destructor of the frame should look at the 'resumable' bit and if set,
resume the frame with a system exception, "Killed", indicating an
abortion?  (This is like the kill() call in Generator.py.)  We can
increase the likelihood that the frame's desctructor is called at the
expected time (right when the for loop terminates), by deleting
__frame at the end of the loop.  If the resumed frame raises another
exception, we ignore it.  Its return value is ignored.  If it suspends
itself again, we resume it with the "Killed" exception again until it
dies (thoughts of the Blank Knight come to mind).

I am beginning to like this idea.  (Not that I have time for an
implementation...  But it could be done without Christian's patches.)

--Guido van Rossum (home page: http://www.python.org/~guido/)