Revisiting Generators and Subgenerators

Winston winstonw at stratolab.com
Thu Mar 25 17:39:51 EDT 2010


Here's my proposal again, hopefully with better formatting so that it
is easier to read.


-Winston
-----------------

Proposal for a new Generator Syntax in Python 3K--
A Baton object for generators to allow subfunctions to yield, and to
make them symmetric.

Abstract
--------
    Generators can be used to make coroutines. But they require
    the programmer to take special care in how he writes his
    generator. In particular, only the generator function may
    yield a value. We propose a modification to generators in
    Python 3 where a "Baton" object is given to both sides of a
    generator. Both sides use the baton object to pass execution
    to the other side, and also to pass values to the other side.

    The advantages of a baton object over the current scheme are:
    (1) the generator function can pass the baton to a
    subfunction, solving the needs of PEP 380, and (2) after
    creation both sides of the generator are symmetric: both can
    call yield(), send(), and next(), and these do the same thing
    on either side. This means programming with generators is the
    same as programming with normal functions. No special
    contortions are needed to pass values back up to a yield
    command at the top.

Motivation
----------
    Generators make certain programming tasks easier: (a) they
    can implement an iterator of infinite length, (b) with a
    "trampoline function" they can emulate coroutines and
    cooperative multitasking, and (c) they can make both sides of
    a producer-consumer pattern easy to write, since both sides
    can appear to be the caller.

    On the down side, generators as they currently are
    implemented in Python 3.1 require the programmer to take
    special care in how he writes his generator. In particular,
    only the generator function may yield a value--subfunctions
    called by the generator function may not yield a value.

    Here are two use-cases in which generators are commonly
    used, but where the current limitation causes less readable
    code:

    1) a long generator function which the programmer wants to
    split into several functions. The subfunctions should be able
    to yield a result. Currently the subfunctions have to pass
    values up to the main generator and have it yield the results
    back. Similarly subfunctions cannot receive values that the
    caller sends with generator.send().

    2) generators are great for cooperative multitasking. A
    common use-case is agent simulators where many small
    "tasklets" need to run and then pass execution over to other
    tasklets. Video games are a common scenario, as is SimPy.
    Without cooperative multitasking, each tasklet must be
    contorted to run in a small piece and then return. Generators
    help with this, but a more complicated algorithm that is best
    decomposed into several functions must still be contorted,
    because the subfunctions cannot yield or receive data from
    generator.send().
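
    To make the first use-case concrete, here is a minimal sketch of
    the workaround that current generators require. The function
    names (parse_header, parse_body, parse_document) are made up for
    illustration and are not part of the proposal.

```python
# Sketch of the workaround needed with current generators: helpers must
# themselves be generators, and the top-level generator re-yields each
# of their values by hand. All names here are hypothetical.

def parse_header(lines):
    # Written as a generator only because it needs to produce a value.
    yield "header: " + next(lines)

def parse_body(lines):
    for line in lines:
        yield "body: " + line

def parse_document(lines):
    lines = iter(lines)
    # The helpers cannot yield to our caller directly, so every value
    # has to be passed up through an explicit relay loop.
    for item in parse_header(lines):
        yield item
    for item in parse_body(lines):
        yield item

results = list(parse_document(["title", "first", "second"]))
print(results)
```

    Note that these relay loops only pass values upward. Anything the
    caller sends with generator.send() arrives in parse_document and
    would need extra code to be forwarded to a helper.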

    Here is also a nice description of how coroutines make
    programs easier to read and write:
        http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

Proposal
--------
    If there is a way to make a sub-function of a generator yield
    and receive data from generator.send(), then the two problems
    above are solved.

    A Baton object represents the passing of execution from one
    line of code to another. For example, the code below declares
    a generator. The first parameter of the generator is the
    baton (the "context"), which represents the other side of the
    execution frame. A program creates a Baton like so:

        generator f( baton ):
            # compute something
            baton.yield( result )
            # compute something
            baton.yield( result )

        baton = f()
        while True:
            print( baton.yield() )


    A generator function, denoted with the keyword "generator"
    instead of "def", will return a "baton". Generators have the
    following methods:
        __call__( args... ) --
            This creates a Baton object which is passed back to
            the caller, i.e. the code that executed the Baton()
            command. Once the baton starts working, the two sides
            are symmetric. So we will call the first frame, frame
            A and the code inside 'function' frame B. Frame A is
            returned a baton object. As soon as frame A calls
            baton.yield(), frame B begins, i.e. 'function' starts
            to run. function is passed the baton as its first
            argument, and any additional arguments are also
            passed in. When frame B yields, any value that it
            yields will be returned to frame A as the result of
            its yield().

    Batons have the following methods:
        yield( arg=None ) -- This method will save the current
            execution state, restore the other execution state,
            and start running the other function from where it
            last left off, or from the beginning if this is the
            first time. If the optional 'arg' is given, then the
            other side will be "returned" this value from its
            last yield(). Note that, as with current generators,
            the first call to yield() may not pass an argument.
        next() -- This method is the same as yield(None). next()
            allows the baton to be an iterator.
        __iter__() -- A baton is an iterator so this just returns
            the baton back. But it is needed to allow use of
            batons in "for" statements.
        start() -- This starts the frame B function running. It
            may only be called on a new baton. It starts the
            baton running in frame B, and returns the Baton
            object to the caller in frame A. Any value from the
            first yield is lost.

                baton = Baton( f ).start()
            It is equivalent to:
                baton = Baton( f )  # Create the baton
                baton.yield()       # Begin executing in frame B
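
    The methods described above can be approximated in today's
    Python with a helper thread. The sketch below is only an
    emulation under stated assumptions: the Baton class is
    hypothetical, the method is named switch() because "yield" is a
    reserved word, start() is omitted, and a thread stands in for the
    frame switching that the proposal would do natively.

```python
import threading


class Baton:
    """Hypothetical thread-based emulation of the proposed baton."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args
        self._value = None
        self._finished = False
        self._started = False
        self._wake_a = threading.Semaphore(0)  # released to resume frame A
        self._wake_b = threading.Semaphore(0)  # released to resume frame B
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        self._wake_b.acquire()           # wait for frame A's first switch()
        try:
            self._func(self, *self._args)
        finally:
            self._finished = True
            self._value = None
            self._wake_a.release()       # hand control back one last time

    def switch(self, arg=None):
        """Stand-in for the proposed yield(): pass control and a value."""
        in_b = threading.current_thread() is self._thread
        if not in_b:
            if self._finished:
                raise StopIteration
            if not self._started:
                self._started = True
                self._thread.start()
        self._value = arg
        (self._wake_a if in_b else self._wake_b).release()
        (self._wake_b if in_b else self._wake_a).acquire()
        if not in_b and self._finished:
            raise StopIteration
        return self._value

    def __iter__(self):
        return self

    def __next__(self):
        return self.switch(None)


def emit_pair(baton, a):
    # A plain function, not a generator: it hands values to frame A itself.
    baton.switch(a)
    baton.switch(a + a)


def pairs(baton, sequence):
    for a in sequence:
        emit_pair(baton, a)


def echo(baton):
    received = baton.switch("ready")   # receives what frame A sends next
    baton.switch(received * 2)


values = list(Baton(pairs, [3, 8]))
b = Baton(echo)
first = b.switch()
second = b.switch(21)
print(values, first, second)
```

    Here emit_pair() is an ordinary subfunction that yields to the
    other side, and echo() receives a value from the other side,
    which are the two capabilities the proposal is after.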



Examples
--------

    Simple Generator:
        generator doubler( baton, sequence ):
            for a in sequence:
                print( a )
                baton.yield( a+a )

        baton = doubler( [3,8,2] )
        # For statement calls baton.__iter__, and then baton.next()
        for j in baton:
            print( j )

    Complicated Generator broken into parts:

        generator Complicated( baton, sequence ):
            '''A generator function, but there are no yield
            statements in this function--they are in
            subfunctions.'''
            a = next( sequence )
            if is_special(a):
                parse_special( baton, a, sequence)
            else:
                parse_regular( baton, a, sequence )

        def parse_special( baton, first, rest ):
            # process first
            baton.yield()
            b = next( rest )
            parse_special( baton, b, rest )

        def parse_regular( baton, first, rest ):
            # more stuff
            baton.yield()

        baton = Complicated( iter('some data here') )
        baton.yield()


    Cooperative Multitasker:

        class Creature( object ):
            def __init__(self, world):
                self.world = world

            generator start( self, baton ):
                '''Designated entry point for tasklets'''
                # Baton saved for later. Used in other
                # methods like escape()
                self.baton = baton
                self.run()

            def run(self):
                pass # override me in your subclass

            def escape(self):
                # set direction and velocity away
                # from nearby creatures
                self.baton.yield()

            def chase(self):
                while True:
                    # set direction and velocity TOWARDS
                    # nearest creature
                    self.baton.yield()
                    # if near enough, try to pounce
                    self.baton.yield()


        class Vegetarian( Creature ):
            def run(self):
                if self.world.is_creature_visible():
                    self.escape()
                else:
                    # do nothing
                    self.baton.yield()

        class Carnivore( Creature ):
            def run(self):
                if self.world.is_creature_visible():
                    self.chase()
                else:
                    # do nothing
                    self.baton.yield()

        w = SimulationWorld()
        v = Vegetarian( w ).start()
        c = Carnivore( w ).start()
        while True:
            v.yield()
            c.yield()
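
    For comparison, the driving loop at the end of this example can
    be written with current generators as a simple round-robin
    scheduler. This is a sketch only: creature() and
    run_round_robin() are hypothetical names, and each tasklet is
    reduced to a loop that records its steps.

```python
from collections import deque


def creature(name, steps, log):
    # Each bare yield hands control back to the scheduler loop.
    for step in range(steps):
        log.append((name, step))
        yield


def run_round_robin(tasklets):
    queue = deque(tasklets)
    while queue:
        task = queue.popleft()
        try:
            next(task)            # run the tasklet up to its next yield
        except StopIteration:
            continue              # tasklet finished: drop it
        queue.append(task)        # otherwise schedule it again


log = []
run_round_robin([creature("vegetarian", 2, log),
                 creature("carnivore", 3, log)])
print(log)
```

    With current generators, behaviours such as escape() and chase()
    would also have to be generators whose values run() relays by
    hand, which is the contortion the baton is meant to remove.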


Benefits
--------

    This new syntax for a generator provides all the benefits of
    the old generator, including use like a coroutine.
    Additionally, it makes both sides of the generator almost
    symmetric, i.e. they both "yield" or "send" to the other. And
    since the baton objects are passed around, subfunctions can
    yield back to the other execution frame. This addresses the
    problem that PEP 380 sets out to solve.

    My ideas for syntax above are not fixed; the important
    concept here is that the two sides of the generator functions
    will have a "baton" to represent the other side. The baton
    can be passed to sub-functions, and values can be sent, via
    the baton, to the other side.

    This new syntax for a generator will break all existing
    programs. But we happen to be at the start of Python 3K, where
    new paradigms are being examined.

Alternative Syntax
------------------

    yield, next, and send are redundant
    ------------------------------------
    With old style generators, g.next() and g.send( 1 ) are
    conceptually the same as "yield" and "yield 1" inside the
    generator. They both pass execution to the other side, and
    the second form passes a value. Yet they currently have
    different syntax. Once we have a baton object, we can get rid
    of one of these forms. g.next() is needed to support
    iterators. How about we keep baton.next() and
    baton.send( 1 ), and get rid of yield completely?
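
    The correspondence between next(), send() and yield in current
    generators can be seen in a short example. This is a sketch
    using Python 3 spelling, where g.next() is written next(g); the
    accumulator() function is a made-up illustration.

```python
def accumulator():
    total = 0
    while True:
        # The value passed to send() becomes the result of this yield;
        # next() delivers None instead.
        received = yield total
        if received is not None:
            total += received


g = accumulator()
first = next(g)         # runs to the first yield
second = g.send(5)      # resumes the generator with the value 5
third = next(g)         # resumes it with None
fourth = g.send(None)   # equivalent to next(g)

# A just-started generator cannot be sent a value other than None.
fresh = accumulator()
try:
    fresh.send(1)
    raised = False
except TypeError:
    raised = True

print(first, second, third, fourth, raised)
```

    Both calls pass execution to the generator; send() additionally
    passes a value, which mirrors "yield" versus "yield 1" on the
    other side.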

    Use a keyword to invoke a generator rather than declare one
    -----------------------------------------------------------
    Perhaps instead of a "generator" keyword to denote the
    generator function, a "fork" keyword should be used to begin
    the second execution frame. For example:

        def f( baton ):
            # compute something
            baton.send( result )
            # compute something
            baton.send( result )

        baton = fork f()
        while True:
            print( baton.next() )

    or maybe the "yield" keyword can be used here:

        def f( baton ):
            # compute something
            baton.send( result )
            # compute something
            baton.send( result )

        baton = yield f
        while True:
            print( baton.next() )


