[Python-checkins] CVS: python/nondist/peps pep-0288.txt,NONE,1.1

Barry Warsaw bwarsaw@users.sourceforge.net
Mon, 01 Apr 2002 08:10:21 -0800


Update of /cvsroot/python/python/nondist/peps
In directory usw-pr-cvs1:/tmp/cvs-serv24035

Added Files:
	pep-0288.txt 
Log Message:
PEP 288, Generators Attributes and Exceptions, Hettinger


--- NEW FILE: pep-0288.txt ---
PEP: 288
Title: Generators Attributes and Exceptions
Version: $Revision: 1.1 $
Last-Modified: $Date: 2002/04/01 16:10:19 $
Author: python@rcn.com (Raymond D. Hettinger)
Status: Deferred
Type: Standards Track
Created: 21-Mar-2002
Python-Version: 2.4
Post-History:


Abstract

    This PEP introduces ideas for enhancing the generators introduced
    in Python version 2.2 [1].  The goal is to increase the
    convenience, utility, and power of generators by providing a
    mechanism for passing data into a generator and for triggering
    exceptions inside a generator.

    These mechanisms were first proposed along with two other
    generator tools in PEP 279 [7].  They were split off to this
    separate PEP to allow the ideas more time to mature and for
    alternatives to be considered.


Rationale

    Python 2.2 introduced the concept of an iterable interface as
    proposed in PEP 234 [2].  The iter() factory function was provided
    as a common calling convention and deep changes were made to use
    iterators as a unifying theme throughout Python.  The unification
    came in the form of establishing a common iterable interface for
    mappings, sequences, and file objects.

    Generators, as proposed in PEP 255 [1], were introduced as a means for
    making it easier to create iterators, especially ones with complex
    internal execution or variable states.  

    The next step in the evolution of generators is to extend the
    syntax of the 'yield' keyword to enable generator parameter
    passing.  The resulting increase in power simplifies the creation
    of consumer streams which have a complex execution state and/or
    variable state.

    A better alternative being considered is to allow generators to
    accept attribute assignments.  This allows data to be passed in a
    standard Python fashion.

    A related evolutionary step is to add a generator method to enable
    exceptions to be passed to a generator.  Currently, there is no
    clean method for triggering exceptions from outside the generator.
    Also, generator exception passing helps mitigate the try/finally
    prohibition for generators.
  
    These suggestions are designed to take advantage of the existing
    implementation and require little additional effort to
    incorporate.  They are backwards compatible and require no new
    keywords.  They are being recommended for Python version 2.4.


Reference Implementation

    There is not currently a CPython implementation; however, a
    simulation module written in pure Python is available on
    SourceForge [5].  The simulation is meant to allow direct
    experimentation with the proposal.

    There is also a module [6] with working source code for all of the
    examples used in this PEP.  It serves as a test suite for the
    simulator and it documents how each of the features works in
    practice.

    The authors and implementers of PEP 255 [1] were contacted to
    provide their assessment of whether the enhancement was going to
    be straightforward to implement and require only minor
    modification of the existing generator code.  Neil felt the
    assertion was correct.  Ka-Ping thought so also.  GvR said he
    could believe that it was true.  Tim did not have an opportunity
    to give an assessment.


Specification for Generator Parameter Passing

    1. Allow 'yield' to assign a value as in:

        def mygen():
            while 1:
                x = yield None
                print x

    2. Let the .next() method take a value to pass to the generator as in:

        g = mygen()
        g.next()       # runs the generator until the first 'yield'
        g.next(1)      # '1' is bound to 'x' in mygen(), then printed
        g.next(2)      # '2' is bound to 'x' in mygen(), then printed

    The control flow of 'yield' and 'next' is unchanged by this
    proposal.  The only change is that a value can be sent into the
    generator.  By analogy, consider the quality improvement from
    GOSUB (which had no argument passing mechanism) to modern
    procedure calls (which can pass in arguments and return values).

    Most of the underlying machinery is already in place; only the
    communication needs to be added by modifying the parse syntax to
    accept the new 'x = yield expr' syntax and by allowing the .next()
    method to accept an optional argument.
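
    The following is a minimal sketch of the intended behaviour and is
    not part of the proposal.  It assumes a present-day Python
    interpreter, in which 'yield' is an expression and generators
    have a send() method that plays the role proposed here for
    next(arg).  The names echo() and received are illustrative only.

```python
def echo(received):
    # 'received' is a caller-supplied list so the effect is observable.
    while True:
        x = yield None          # the value sent in is bound to x
        received.append(x)


received = []
g = echo(received)
next(g)        # runs the generator until the first 'yield'
g.send(1)      # 1 is bound to x inside echo(), then recorded
g.send(2)      # 2 is bound to x inside echo(), then recorded
```

    The first call carries no value because no 'yield' is waiting to
    receive one yet.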

    Yield is more than just a simple iterator creator.  It does
    something else truly wonderful -- it suspends execution and saves
    state.  It is good for a lot more than writing iterators.  This
    proposal further expands its capability by making it easier to
    share data with the generator.

    The .next(arg) mechanism is especially useful for:
        1. Sending data to any generator
        2. Writing lazy consumers with complex execution states
        3. Writing co-routines (as demonstrated in Dr. Mertz's articles [3])

    The proposal is a clear improvement over the existing alternative
    of passing data via global variables.  It is also much simpler,
    more readable and easier to debug than an approach involving the
    threading module with its attendant mutexes, semaphores, and data
    queues.  A class-based approach competes well when there are no
    complex execution states or variable states.  However, when the
    complexity increases, generators with parameter passing are much
    simpler because they automatically save state (unlike classes
    which must explicitly save the variable and execution state in
    instance variables).
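
    To make the comparison concrete, here is a hedged sketch of one
    consumer written both ways.  The input protocol (a count followed
    by that many numbers, repeated) and the names SumRecords and
    sum_records are invented for illustration.  The sketch assumes a
    present-day interpreter where send() stands in for the proposed
    next(arg).

```python
class SumRecords:
    """Class-based consumer: the execution state is kept by hand."""

    def __init__(self, results):
        self.results = results
        self.remaining = None     # None means "expecting a count"
        self.total = 0

    def feed(self, value):
        if self.remaining is None:
            self.remaining, self.total = value, 0
        else:
            self.total += value
            self.remaining -= 1
        if self.remaining == 0:
            self.results.append(self.total)
            self.remaining = None


def sum_records(results):
    """Generator consumer: the position in the code is the state."""
    while True:
        count = yield
        total = 0
        for _ in range(count):
            total += yield
        results.append(total)


stream = [2, 10, 20, 3, 1, 1, 1]
by_class, by_gen = [], []
c = SumRecords(by_class)
g = sum_records(by_gen)
next(g)                       # advance to the first yield
for v in stream:
    c.feed(v)
    g.send(v)
```

    Both versions record the same sums; the generator needs no
    'remaining' field because the loop itself remembers where it is.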

    Note A: This proposal changes 'yield' from a statement to an
    expression with binding and precedence similar to lambda.


Examples
                                
    Example of a Complex Consumer

    The encoder for arithmetic compression sends a series of
    fractional values to a complex, lazy consumer.  That consumer
    makes computations based on previous inputs and only writes out
    when certain conditions have been met.  After the last fraction is
    received, it has a procedure for flushing any unwritten data.


    Example of a Consumer Stream

        def filelike(packagename, appendOrOverwrite):
            cum = []
            if appendOrOverwrite == 'w+':
                cum.extend(packages[packagename])
            try:
                while 1:
                    dat = yield None
                    cum.append(dat)
            except FlushStream:
                packages[packagename] = cum

        ostream = filelike('mydest','w')   # Analogous to file.open(name,flag)
        ostream.next()                     # Advance to the first yield
        ostream.next(firstdat)             # Analogous to file.write(dat)
        ostream.next(seconddat)
        ostream.throw(FlushStream)         # Throw is proposed below


    Example of a Complex Consumer

    Loop over the picture files in a directory, shrink them one at a
    time to thumbnail size using PIL [4], and send them to a lazy
    consumer.  That consumer is responsible for creating a large blank
    image, accepting thumbnails one at a time and placing them in a 5
    by 3 grid format onto the blank image.  Whenever the grid is full,
    it writes out the large image as an index print.  A FlushStream
    exception indicates that no more thumbnails are available and that
    the partial index print should be written out if there are one or
    more thumbnails on it.
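
    A rough sketch of the control flow of such a consumer follows.
    It deliberately leaves out PIL: thumbnails are plain strings and
    the list 'pages' stands in for the index prints that would be
    written to disk.  The names index_printer, pages, cols and rows
    are assumptions, and the sketch relies on the send() and throw()
    methods of present-day generators.

```python
class FlushStream(Exception):
    """Signals that no more thumbnails are available."""


def index_printer(pages, cols=5, rows=3):
    sheet = []                          # thumbnails placed so far
    try:
        while True:
            thumb = yield
            sheet.append(thumb)
            if len(sheet) == cols * rows:
                pages.append(sheet)     # grid is full: write it out
                sheet = []
    except FlushStream:
        if sheet:                       # write the partial index print
            pages.append(sheet)


pages = []
printer = index_printer(pages)
next(printer)                           # advance to the first yield
for n in range(17):
    printer.send("thumb%d" % n)
try:
    printer.throw(FlushStream)
except StopIteration:
    pass                                # the generator has finished
```

    In present-day Python, throw() raises StopIteration in the caller
    when the generator returns after handling the exception, hence
    the try/except around the final call.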


    Example of a Producer and Consumer Used Together in a Pipe-like Fashion

        # Analogy to Linux style pipes:  source | upper | sink
        sink = sinkgen()
        sink.next()
        for word in source():
            sink.next(word.upper())
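
    The fragment above leaves source() and sinkgen() undefined.  One
    possible completion is sketched here, assuming a present-day
    interpreter where send() takes the place of next(arg).  The word
    list and the 'collected' list are placeholders.

```python
def source():
    # Producer: an ordinary generator.
    for word in ["spam", "eggs", "ham"]:
        yield word


def sinkgen(out):
    # Consumer: receives one word per resumption.
    while True:
        word = yield
        out.append(word)


collected = []
sink = sinkgen(collected)
next(sink)                        # advance to the first yield
for word in source():
    sink.send(word.upper())       # the 'upper' stage of the pipe
```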


Comments

    Comments from GvR:  We discussed this at length when we were hashing
        out generators and coroutines, and found that there's always a
        problem with this: the argument to the first next() call has
        to be thrown away, because it doesn't correspond to a yield
        statement. This looks ugly (note that the example code has a
        dummy call to next() to get the generator going). But there
        may be useful examples that can only be programmed (elegantly)
        with this feature, so I'm reserving judgment.  I can believe
        that it's easy to implement.

    Comments from Ka-Ping Yee:  I also think there is a lot of power to be
        gained from generator argument passing.

    Comments from Neil Schemenauer:  I like the idea of being able to pass
        values back into a generator.  I originally pitched this idea
        to Guido but in the end we decided against it (at least for
        the initial implementation).  There was a few issues to work
        out but I can't seem to remember what they were.  My feeling
        is that we need to wait until the Python community has more
        experience with generators before adding this feature.  Maybe
        for 2.4 but not for 2.3.  In the mean time you can work around
        this limitation by making your generator a method.  Values can
        be passed back by mutating the instance.

    Comments from Magnus Lie Hetland:  I like the generator parameter
        passing mechanism. Although I see no need to defer it,
        deferral seems to be the most likely scenario, and in the
        meantime I guess the functionality can be emulated either by
        implementing the generator as a method, or by passing a
        parameter with the exception passing mechanism.
                               
    Author response:  Okay, consider this proposal deferred until version 2.4
        so the idea can fully mature.  I am currently teasing out two
        alternatives which may eliminate the issue with the initial
        next() call not having a corresponding yield.
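
    The work-around mentioned in the comments (making the generator a
    method and passing values by mutating the instance) might look
    like the sketch below.  The class name Accumulator and its
    attributes are hypothetical.

```python
class Accumulator:
    def __init__(self):
        self.value = None     # the caller writes here before resuming
        self.seen = []

    def run(self):
        # A generator method: it reads self.value each time it runs.
        while True:
            self.seen.append(self.value)
            yield None


acc = Accumulator()
g = acc.run()
acc.value = "first"
next(g)
acc.value = "second"
next(g)
```

    No dummy first call is needed, because the value is in place
    before the generator runs at all.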


Alternative 1:  Submit
                                
        Instead of next(arg), use a separate method, submit(arg).
        Submit would behave just like next() except that on the first
        call, it will call next() twice.  The word 'submit' has the
        further advantage of being explicit in its intent.  It also
        allows checking for the proper number of arguments (next
        always has zero and submit always has one).  Using this
        alternative, the call to the consumer stream looks like this:

        ostream = filelike('mydest','w') 
        ostream.submit(firstdat)             # No call to next is needed
        ostream.submit(seconddat)
        ostream.throw(FlushStream)           # Throw is proposed below
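
        The behaviour described for submit() can be approximated
        today with a small wrapper.  This is only a sketch: the class
        Submittable and the collector() generator are hypothetical,
        and send() is used where the proposal would use next(arg).

```python
class Submittable:
    """Hypothetical wrapper adding submit(arg) to a generator."""

    def __init__(self, gen):
        self._gen = gen
        self._started = False

    def submit(self, arg):
        if not self._started:
            next(self._gen)       # silent first advance to the yield
            self._started = True
        return self._gen.send(arg)


def collector(out):
    while True:
        dat = yield None
        out.append(dat)


received = []
ostream = Submittable(collector(received))
ostream.submit("firstdat")        # no priming call to next is needed
ostream.submit("seconddat")
```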


Alternative 2:  Generator Attributes
                                
        Instead of generator parameter passing, enable writable
        generator attributes: g.data=firstdat; g.next().  The code on
        the receiving end is written knowing that the attribute is set
        from the very beginning.  This solves the problem because the
        first next call does not need to be associated with a yield
        statement.

        This solution uses a standard Python tool, object attributes,
        in a standard way.  It is also explicit in its intention and
        provides some error checking (the receiving code raises an
        AttributeError if the expected field has not been set before the
        call).
                                
        The one unclean part of this approach is that the generator
        needs some way to reference itself (something parallel to the
        use of the function name in a recursive function or to the use
        of 'self' in a method).  The only way I can think of is to
        introduce a new system variable, __self__, in any function
        that employs a yield statement.  Using this alternative, the
        code for the consumer stream looks like this:

            def filelike(packagename, appendOrOverwrite):
                cum = []
                if appendOrOverwrite == 'w+':
                    cum.extend(packages[packagename])
                try:
                    while 1:
                        cum.append(__self__.dat)
                        yield None
                except FlushStream:
                    packages[packagename] = cum

            ostream = filelike('mydest','w')
            ostream.dat = firstdat; ostream.next()
            ostream.dat = seconddat; ostream.next()
            ostream.throw(FlushStream)         # Throw is proposed below
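
        Since no __self__ variable exists, the idea can only be
        simulated.  The sketch below uses a hypothetical wrapper,
        AttrGenerator, which accepts attribute assignment and passes
        itself to the generator function as an explicit first
        argument ('me') in place of __self__.  It assumes the
        throw() method of present-day generators.

```python
class FlushStream(Exception):
    pass


class AttrGenerator:
    """Hypothetical stand-in for a generator with writable attributes."""

    def __init__(self, func, *args):
        self._gen = func(self, *args)   # 'self' plays the role of __self__

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._gen)

    def throw(self, exc):
        return self._gen.throw(exc)


packages = {}


def filelike(me, packagename):
    cum = []
    try:
        while True:
            cum.append(me.dat)          # AttributeError if 'dat' is unset
            yield None
    except FlushStream:
        packages[packagename] = cum


ostream = AttrGenerator(filelike, "mydest")
ostream.dat = "firstdat"; next(ostream)
ostream.dat = "seconddat"; next(ostream)
try:
    ostream.throw(FlushStream)
except StopIteration:
    pass                                # the generator has finished
```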


Specification for Generator Exception Passing

    Add a .throw(exception) method to the generator interface:

        import time

        def logger():
            start = time.time()
            log = []
            try:
                while 1:
                    log.append( time.time() - start )
                    yield log[-1]
            except WriteLog:
                writelog(log)

        g = logger()
        for i in [10,20,40,80,160]:
            testsuite(i)
            g.next()
        g.throw(WriteLog)

    There is no existing work-around for triggering an exception
    inside a generator.  This is a true deficiency.  It is the only
    case in Python where active code cannot be excepted to or through.

    Generator exception passing also helps address an intrinsic
    limitation on generators, the prohibition against their using
    try/finally to trigger clean-up code [1].  Without .throw(), the
    current work-around forces the resolution or clean-up code to be
    moved outside the generator.
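
    A self-contained variation on the logger example is sketched
    below.  It assumes the throw() method found on present-day
    generators, and it replaces time.time() and writelog() with an
    injected clock and a plain list so that the result is
    deterministic.  The names clock, written and ticks are
    illustrative.

```python
class WriteLog(Exception):
    pass


def logger(clock, written):
    start = clock()
    log = []
    try:
        while True:
            log.append(clock() - start)
            yield log[-1]
    except WriteLog:
        written.extend(log)       # clean-up runs inside the generator


ticks = iter([0, 1, 3, 6])        # fake clock readings
written = []
g = logger(lambda: next(ticks), written)
for _ in range(3):
    next(g)
try:
    g.throw(WriteLog)             # raised at the suspended 'yield'
except StopIteration:
    pass                          # the generator has finished
```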

    Note A: The name of the throw method was selected for several
    reasons.  Raise is a keyword and so cannot be used as a method
    name.  Unlike raise which immediately raises an exception from the
    current execution point, throw will first return to the generator
    and then raise the exception.  The word throw is suggestive of
    putting the exception in another location.  The word throw is
    already associated with exceptions in other languages.

    Alternative method names were considered: resolve(), signal(),
    genraise(), raiseinto(), and flush().  None of these seem to fit
    as well as throw().

    Note B: The throw syntax should exactly match raise's syntax:

        throw([expression, [expression, [expression]]])

    Accordingly, it should be implemented to handle all of the following:

        raise string                    g.throw(string)
        raise string, data              g.throw(string,data)
        raise class, instance           g.throw(class,instance)
        raise instance                  g.throw(instance)
        raise                           g.throw()


    Comments from GvR:  I'm not convinced that the cleanup problem that
        this is trying to solve exists in practice. I've never felt
        the need to put yield inside a try/except. I think the PEP
        doesn't make enough of a case that this is useful.

        This one gets a big fat -1 until there's a good motivational
        section.

    Comments from Ka-Ping Yee:  I agree that the exception issue needs to
        be resolved and [that] you have suggested a fine solution.

    Comments from Neil Schemenauer:  The exception passing idea is one I
        hadn't thought of before and looks interesting.  If we enable
        the passing of values back, then we should add this feature
        too.

    Comments from Magnus Lie Hetland:  Even though I cannot speak for the
        ease of implementation, I vote +1 for the exception passing
        mechanism.

    Comments from the Community:  The response has been mostly favorable.  One
        negative comment from GvR is shown above.  The other was from
        Martin von Loewis who was concerned that it could be difficult
        to implement and is withholding his support until a working
        patch is available.  To probe Martin's comment, I checked with
        the implementers of the original generator PEP for an opinion
        on the ease of implementation.  They felt that implementation
        would be straightforward and could be grafted onto the
        existing implementation without disturbing its internals.

    Author response:  When the sole use of generators is to simplify writing
        iterators for lazy producers, then the odds of needing
        generator exception passing are slim.  If, on the other hand,
        generators are used to write lazy consumers, create
        coroutines, generate output streams, or simply for their
        marvelous capability for restarting a previously frozen state,
        THEN the need to raise exceptions will come up frequently.

        I'm no judge of what is truly Pythonic, but am still
        astonished that there can exist blocks of code that can't be
        excepted to or through, that the try/finally combination is
        blocked, and that the only work-around is to rewrite as a
        class and move the exception code out of the function or
        method being excepted.


References

    [1] PEP 255 Simple Generators
        http://python.sourceforge.net/peps/pep-0255.html

    [2] PEP 234 Iterators
        http://python.sourceforge.net/peps/pep-0234.html

    [3] Dr. David Mertz's draft column for Charming Python.
        http://gnosis.cx/publish/programming/charming_python_b5.txt
        http://gnosis.cx/publish/programming/charming_python_b7.txt

    [4] PIL, the Python Imaging Library can be found at:
        http://www.pythonware.com/products/pil/

    [5] A pure Python simulation of every feature in this PEP is at:
        http://sourceforge.net/tracker/download.php?group_id=5470&atid=305470&file_id=17348&aid=513752

    [6] The full, working source code for each of the examples in this PEP
        along with other examples and tests is at:
        http://sourceforge.net/tracker/download.php?group_id=5470&atid=305470&file_id=17412&aid=513756

    [7] PEP 279 Enhanced Generators
        http://python.sourceforge.net/peps/pep-0279.html


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End: