Revisiting Generators and Subgenerators

Winston winstonw at stratolab.com
Thu Mar 25 17:16:12 EDT 2010


I have been reading PEP 380 because I am writing a video game/simulation
in Jython and I need cooperative multitasking. PEP 380 touches on my
problem but does not quite solve it for me. I have the following proposal
as an alternative to PEP 380. I don't know if this is the right way to
introduce my idea, but my writeup is below. Any thoughts?

------------------------
Proposal for a new Generator Syntax in Python 3K--
A Baton object for generators to allow subfunctions to yield, and to
make them symmetric.

Abstract
--------
    Generators can be used to make coroutines, but they require the
    programmer to take special care in how the generator is written. In
    particular, only the generator function itself may yield a value. We
    propose a modification to generators in Python 3 in which a "Baton"
    object is given to both sides of a generator. Both sides use the
    baton to pass execution, and values, to the other side.

    The advantages of a baton object over the current scheme are: (1)
    the generator function can pass the baton to a subfunction, which
    addresses the need behind PEP 380, and (2) after creation, both sides
    of the generator are symmetric: both can call yield(), send(), and
    next(), and the calls do the same thing on either side. This means
    programming with generators is the same as programming with normal
    functions. No special contortions are needed to pass values back up
    to a yield statement at the top.

Motivation
----------
    Generators make certain programming tasks easier: (a) writing an
    iterator of infinite length, (b) emulating coroutines and cooperative
    multitasking with the help of a "trampoline function", and (c) making
    both sides of a producer-consumer pattern easy to write, since both
    sides can appear to be the caller.

    On the down side, generators as currently implemented in Python 3.1
    require the programmer to take special care in how the generator is
    written. In particular, only the generator function may yield a
    value; subfunctions called by the generator function may not.

    Here are two use cases in which generators are commonly used, but
    where the current limitation causes less readable code:

    1) A long generator function which the programmer wants to split
        into several functions. The subfunctions should be able to yield
        a result. Currently the subfunctions have to pass values up to
        the main generator and have it yield the results back. Similarly,
        subfunctions cannot receive values that the caller sends with
        generator.send().

    2) Generators are great for cooperative multitasking. A common use
        case is agent simulators where many small "tasklets" need to run
        and then pass execution over to other tasklets. Video games are a
        common scenario, as is SimPy. Without cooperative multitasking,
        each tasklet must be contorted to run in a small piece and then
        return. Generators help with this, but a more complicated
        algorithm which is best decomposed into several functions must
        still be contorted, because the subfunctions cannot yield or
        receive data from generator.send().
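
    The limitation described in the two use cases above can be seen in a
    few lines of today's Python. This is only an illustrative sketch; the
    names emit_pair, broken and workaround are made up for the example.

```python
def emit_pair(x):
    # This helper contains `yield`, so calling it builds a *new* generator
    # object; it does not yield through whoever called it.
    yield x
    yield x * 2


def broken(sequence):
    for x in sequence:
        emit_pair(x)   # creates a generator and discards it; nothing is yielded
    return
    yield              # unreachable; only here to make `broken` a generator


def workaround(sequence):
    for x in sequence:
        # The outer generator has to re-yield every value by hand.
        for value in emit_pair(x):
            yield value


print(list(broken([1, 2])))      # []
print(list(workaround([1, 2])))  # [1, 2, 2, 4]
```

    The re-yield loop in workaround() forwards values outward only; values
    passed in with send() would need extra plumbing on top of it.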

    Here is also a nice description of how coroutines make programs
    easier to read and write:
        http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

Proposal
--------
    If there is a way to make a subfunction of a generator yield and
    receive data from generator.send(), then the two problems above are
    solved.

    A Baton object represents the passing of execution from one line of
    code to another. For example, the code below declares a generator.
    The first parameter of the generator is the baton (the "context"),
    which represents the other side of the execution frame. A program
    creates a Baton like so:

        generator f( baton ):
            # compute something
            baton.yield( result )
            # compute something
            baton.yield( result )

        baton = f()
        while True:
            print( baton.yield() )


    A generator function, denoted with the keyword "generator" instead of
    "def", returns a "baton" when called. Generators have the following
    method:
        __call__( args... ) --
            This creates a Baton object which is passed back to the
            caller, i.e. the code that executed the Baton() command. Once
            the baton starts working, the two sides are symmetric, so we
            will call the first frame "frame A" and the code inside
            'function' "frame B". Frame A is returned a baton object. As
            soon as frame A calls baton.yield(), frame B begins, i.e.
            'function' starts to run. 'function' is passed the baton as
            its first argument, followed by any additional arguments.
            When frame B yields, the value that it yields is returned to
            frame A as the result of its yield().
    Batons have the following methods:
        yield( arg=None ) -- This method saves the current execution
            state, restores the other execution state, and resumes the
            other function from where it last left off, or from the
            beginning if this is the first time.
            If the optional 'arg' is given, then the other side is
            "returned" this value from its last yield(). Note that, as
            with generators, the first call to yield() may not pass an
            argument.
        next() -- This method is the same as yield(None). next()
            allows the baton to be an iterator.
        __iter__() -- A baton is an iterator, so this just returns the
            baton itself. It is needed to allow use of batons in "for"
            statements.
        start() -- This starts the frame B function running. It may only
            be called on a new baton. It starts the baton running in
            frame B and returns the Baton object to the caller in
            frame A. Any value from the first yield is lost.

                baton = Baton( f ).start()
            It is equivalent to:
                baton = Baton( f )  # Create the baton
                baton.yield()       # Begin executing in frame B


Examples
--------

    Simple Generator:
        generator doubler( baton, sequence ):
            for a in sequence:
                print( a )
                baton.yield( a+a )

        baton = doubler( [3,8,2] )
        # The for statement calls baton.__iter__(), and then baton.next()
        for j in baton:
            print( j )

    Complicated Generator broken into parts:

        generator Complicated( baton, sequence ):
            '''A generator function, but there are no yield statements
                in this function--they are in subfunctions.'''
            a = next( sequence )
            if is_special(a):
                parse_special( baton, a, sequence)
            else:
                parse_regular( baton, a, sequence )

        def parse_special( baton, first, rest ):
            # process first
            baton.yield()
            b = next( rest )
            parse_special( baton, b, rest )

        def parse_regular( baton, first, rest ):
            # more stuff
            baton.yield()

        baton = Complicated( iter('some data here') )
        baton.yield()
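
    For comparison, here is how the same decomposition might look with
    the "yield from" delegation that PEP 380 proposes. It needs an
    interpreter that implements PEP 380 (Python 3.3 and later). The body
    of is_special and the yielded tuples are placeholders invented for
    this sketch, and the recursion stops when the input runs out.

```python
def is_special(ch):
    # Hypothetical test; the original leaves is_special() undefined.
    return ch.isupper()


def parse_special(first, rest):
    yield ("special", first)                  # stands in for "process first"
    following = next(rest, None)
    if following is not None:
        # Every level of the call chain must delegate explicitly.
        yield from parse_special(following, rest)


def parse_regular(first, rest):
    yield ("regular", first)                  # stands in for "more stuff"


def complicated(sequence):
    a = next(sequence)
    if is_special(a):
        yield from parse_special(a, sequence)
    else:
        yield from parse_regular(a, sequence)


print(list(complicated(iter("AB"))))  # [('special', 'A'), ('special', 'B')]
print(list(complicated(iter("ab"))))  # [('regular', 'a')]
```

    Note that every helper has to become a generator itself and every
    caller has to write "yield from", whereas with a baton the helpers
    stay ordinary functions.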


    Cooperative Multitasker:

        class Creature( object ):
            def __init__(self, world):
                self.world = world

            generator start( self, baton ):
                '''Designated entry point for tasklets'''
                # Baton saved for later. Used in other methods like escape()
                self.baton = baton
                self.run()

            def run(self):
                pass # override me in your subclass

            def escape(self):
                # set direction and velocity away from the nearest creature
                self.baton.yield()

            def chase(self):
                while True:
                    # set direction and velocity TOWARDS nearest creature
                    self.baton.yield()
                    # if near enough, try to pounce
                    self.baton.yield()


        class Vegetarian( Creature ):
            def run(self):
                if self.world.is_creature_visible():
                    self.escape()
                else:
                    # do nothing
                    self.baton.yield()

        class Carnivore( Creature ):
            def run(self):
                if self.world.is_creature_visible():
                    self.chase()
                else:
                    # do nothing
                    self.baton.yield()

        w = SimulationWorld()
        v = Vegetarian( w ).start()
        c = Carnivore( w ).start()
        while True:
            v.yield()
            c.yield()
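
    The driver loop above can be approximated today with plain generators
    and a round-robin queue. In this sketch the tasklet bodies and the
    log list are invented for illustration; note that each tasklet must
    itself be a generator function, which is the restriction the baton is
    meant to lift.

```python
from collections import deque


def vegetarian(name, log):
    for step in range(2):
        log.append((name, "graze", step))
        yield                      # give the other tasklets a turn


def carnivore(name, log):
    for step in range(3):
        log.append((name, "hunt", step))
        yield


def run_round_robin(tasklets):
    """Resume each tasklet in turn until all of them have finished."""
    ready = deque(tasklets)
    while ready:
        task = ready.popleft()
        try:
            next(task)             # run the tasklet up to its next yield
        except StopIteration:
            continue               # finished: do not reschedule it
        ready.append(task)


log = []
run_round_robin([vegetarian("v", log), carnivore("c", log)])
for entry in log:
    print(entry)
```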


Benefits
--------

    This new syntax for a generator provides all the benefits of the old
    generator, including use as a coroutine. Additionally, it makes both
    sides of the generator almost symmetric, i.e. they both "yield" or
    "send" to the other. And since baton objects can be passed around,
    subfunctions can yield back to the other execution frame. This
    addresses the problem that PEP 380 sets out to solve.

    My ideas for syntax above are not fixed. The important concept here
    is that the two sides of a generator function each have a "baton" to
    represent the other side. The baton can be passed to subfunctions,
    and values can be sent, via the baton, to the other side.

    This new syntax for a generator will break all existing programs
    that use generators. But we happen to be at the start of Python 3K,
    where new paradigms are being examined.

Alternative Syntax
------------------

    With old-style generators, g.next() and g.send( 1 ) are conceptually
    the same as "yield" and "yield 1" inside the generator. Both pass
    execution to the other side, and the second form also passes a value.
    Yet they currently have different syntax. Once we have a baton
    object, we can get rid of one of these forms. g.next() is needed to
    support iterators. How about we keep baton.next() and
    baton.send( 1 ), and get rid of yield completely?
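
    The relationship between the two caller-side forms can be checked
    with an ordinary generator in Python 3, where g.next() is spelled
    next( g ). The echo generator is just a toy for this sketch.

```python
def echo():
    received = None
    while True:
        # Hand back whatever the caller last sent in.
        received = yield received


g = echo()
first = next(g)          # starts the generator; it yields None
a = g.send(1)            # resumes with 1, which is echoed back
b = next(g)              # next(g) resumes with None ...
c = g.send(None)         # ... exactly like g.send(None)
print(first, a, b, c)    # None 1 None None

fresh = echo()
try:
    fresh.send(1)        # a just-started generator cannot receive a value
    rejected = False
except TypeError:
    rejected = True
print(rejected)          # True
```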

    Perhaps instead of a "generator" keyword to denote the generator
    function, a "fork" keyword should be used to begin the second
    execution frame. For example:

        def f( baton ):
            # compute something
            baton.send( result )
            # compute something
            baton.send( result )

        baton = fork f()
        while True:
            print( baton.next() )

    or maybe the "yield" keyword can be used here:

        def f( baton ):
            # compute something
            baton.send( result )
            # compute something
            baton.send( result )

        baton = yield f
        while True:
            print( baton.next() )


