[Python-ideas] Proto-PEP on a 'yield from' statement

Bruce Frederiksen dangyogi at gmail.com
Fri Feb 13 07:27:20 CET 2009


Raymond Hettinger wrote:
>> I would think that in addition to forwarding send values to the 
>> subgenerator, that throw exceptions sent to the delegating generator 
>> also be forwarded to the subgenerator.  If the subgenerator does not 
>> handle the exception, then it should be re-raised in the delegating 
>> generator.  Also, the subgenerator close method should be called by 
>> the delegating generator.  
>
> I recommend dropping the notion of forwarding from the proposal.
> The idea is use-case challenged, complicated, and should not be
> hidden behind new syntax.
>
> Would hate for this to become a trojan horse proposal
> when most folks just want a fast iterator pass-through mechanism:
I don't really understand your objection.  How does adding the ability 
to forward send/throw values and to close the subgenerator get in the 
way of using this as a fast iterator pass-through mechanism?

I agree that 98% of the time the simple pass-through mechanism is all 
that will be required of this new feature.  And I agree that this alone 
is sufficient motivation to want to see this feature added.  But I have 
done quite a bit of work with nested generators and ended up having to 
use itertools.chain, which also doesn't support the full generator 
behavior.  Specifically, in my case, I needed itertools.chain to close 
the subgenerator so that finally clauses in the subgenerator get run 
when they should on Jython and IronPython.
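
(To make the issue concrete: a subgenerator with a finally clause only 
runs that clause promptly if something calls its close method, rather 
than waiting for the garbage collector.  Here release_resource is just 
a stand-in for whatever cleanup has to happen.)

def subgen():
    try:
        yield 1
        yield 2
    finally:
        release_resource()   # must run as soon as iteration stops

g = subgen()
next(g)                      # get the first value
g.close()                    # runs the finally clause right now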

I put in a request for this and was turned down.  I found an alternative 
way to do it, but it's somewhat ugly:

import itertools

class chain_context(object):
    # context manager around itertools.chain.from_iterable that closes
    # the current subgenerator (and the outer iterator) on exit
    def __init__(self, outer_it):
        self.outer_it = outer_iterable(outer_it)
    def __enter__(self):
        return itertools.chain.from_iterable(self.outer_it)
    def __exit__(self, type, value, tb):
        self.outer_it.close()

class outer_iterable(object):
    # wraps the outer iterable and remembers the current inner iterable
    # so that close can be forwarded to it
    def __init__(self, outer_it):
        self.outer_it = iter(outer_it)
        self.inner_it = None
    def __iter__(self):
        return self
    def close(self):
        if hasattr(self.inner_it, '__exit__'):
            self.inner_it.__exit__(None, None, None)
        elif hasattr(self.inner_it, 'close'):
            self.inner_it.close()
        if hasattr(self.outer_it, 'close'):
            self.outer_it.close()
    def next(self):
        ans = self.outer_it.next()
        if hasattr(ans, '__enter__'):
            # the inner item is a context manager: enter it and iterate
            # over whatever it returns
            self.inner_it = ans
            return ans.__enter__()
        ans = iter(ans)
        self.inner_it = ans
        return ans

and then use it as:

    with chain_context(gen(x) for x in iterable) as it:
        for y in it:
            ...

So from my own experience, I would strongly argue that the new yield 
from should at least honor the generator close method.  Perhaps some 
people here have never run Python with a different garbage collector 
that doesn't immediately reclaim garbage objects, so they don't 
understand the need for this.  Jython and IronPython are both just 
coming out with their 2.5 support, so expect to hear more of these 
complaints in the not too distant future from that crowd...

But I am baffled as to why the Python community adopts these extra 
methods on generators and then refuses to support them anywhere else 
(for loops, itertools).  Is this a case of "well, I didn't vote for 
them, so I'm not going to play ball"?  If that's the case, then perhaps 
send and throw should be retracted.  I know that close is necessary 
when you move away from the reference counting collector, so I'll fight 
to keep that, as well as fight to get the rest of Python to play ball 
with it.  I haven't
seen a need for send or throw myself.  I've played a lot with send and 
it always seems to get too complicated, so I wouldn't fight for that 
one.  I can imagine possible uses for throw, but haven't hit them yet 
myself in actual practice; so I'd only fight somewhat for throw.  If 
send/throw were mistakes, let's document that and urge people not to use 
them and make a plan for deprecating them and removing them from the 
language; and figure out what the right answers are.

But if send/throw/close were not mistakes and are done deals, then let's 
support them!  In all of these cases, adding full support for 
send/throw/close does not require that you use any of them.  It does not 
prevent using simple iterators rather than full blown generators.  It 
does not diminish in any way the current capabilities of these other 
language features.  It simply supports and allows the use of 
send/throw/close when needed.  Otherwise, why did we put 
send/throw/close into the language in the first place?

I would dearly love to see the for statement fully support close and 
throw, since that's where you use generators 99% of the time.  Maybe 
this one needs different syntax to not break existing code.  I'm not 
very good with clever syntax, so you may be able to improve on these:

for i from gen(x):

for i finally in gen(x):

for i in gen(x) closing throwing:

for i in final gen(x):

for gen(x) yielding i:

for gen(x) as i:

The idea is that close should be called when the for loop terminates 
(for any reason), and uncaught exceptions in the for body should be sent 
to the generator using throw, and then only propagated outside of the 
for statement if they are not handled by throw.  And, yes, the for 
statement should not do these things if a simple iterator is used rather 
than a generator.
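
Roughly, the idea is that something like "for i in final gen(x): 
body(i)" would behave like the following helper; the name 
for_with_close_and_throw is made up here purely for illustration:

def for_with_close_and_throw(gen, body):
    try:
        item = next(gen)
        while True:
            try:
                body(item)
            except Exception as exc:
                # offer the exception to the generator first; if the
                # generator handles it, throw returns the next value it
                # yields, otherwise the exception propagates out of the
                # loop as usual
                item = gen.throw(exc)
            else:
                item = next(gen)
    except StopIteration:
        pass
    finally:
        gen.close()          # run the generator's finally clauses promptly

As noted, with a plain iterator the real statement would simply skip 
the throw and close calls.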

If you wanted to support the send method too, then maybe something like:

for gen1(x) | gen2(y) as i:

where the values yielded by gen1 are sent to gen2 with send, and then 
the values yielded by gen2 are bound to i.
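
To make the intent concrete, under the current send semantics (one 
value yielded per value sent in) the piping could be spelled roughly 
like the helper below; pipe_send and add_one are made-up names, not 
existing APIs:

def pipe_send(source, sink):
    # feed each value from source into sink via send and yield whatever
    # sink yields back; assumes sink yields exactly once per send
    next(sink)               # prime sink up to its first yield
    try:
        for value in source:
            yield sink.send(value)
    finally:
        sink.close()

def add_one():
    value = yield            # wait for the first value sent in
    while True:
        value = yield value + 1

# so  "for gen1(x) | add_one() as i:"  would mean roughly
#     "for i in pipe_send(gen1(x), add_one()):"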

If this were adopted, I would also recommend that if gen2 were a 
function rather than a generator, then the function be called on each 
value yielded by gen1 and the results of the function bound to i.  Then

for gen(x) | fun as i:

would be like:

for map(fun, gen(x)) as i:

Of course, this leads to simply using map rather than | to combine 
generators, by making map use send if passed a generator as its first 
argument:

for map(gen2(y), gen1(x)) as i:

But this doesn't scale as well syntactically when you want to chain 
several generators together.

for map(gen3(z), map(gen2(y), gen1(x))) as i:

vs

for gen1(x) | gen2(y) | gen3(z) as i:

Unfortunately, the way that send is currently defined, gen2 can't skip 
values to act as a filter or generate multiple values for one value sent 
in.  To do this would require that the operations of getting another 
value sent in and yielding values be separated, rather than combined as 
they are for send.  One way to do this is to use callbacks for getting 
another value.  This could be done using the current next semantics by 
simply treating the callback as an iterator and passing it as another 
parameter to the generator:

for gen2(y, gen1(x)) as i:

This is exactly what's currently being done by the itertools functions.  
But this also doesn't scale well syntactically when stacking up several 
generators.
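
For instance, a filter written in that itertools style just takes the 
upstream iterator as an ordinary argument, which is what lets it skip 
values or emit several per input (filter_gen is a made-up name):

def filter_gen(pred, source):
    for value in source:     # the upstream iterator is just a parameter
        if pred(value):
            yield value

# for gen2(y, gen1(x)) as i:   is then essentially
# for i in filter_gen(some_pred, gen1(x)):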

A better way would be to allow send and next to raise a new NextValue 
exception when the generator wants another value sent in.  Then a new 
receive expression would be used in the generator to get the value.  
This would act like an iterator within the generator:

def filter(pred):
    for var in receive:
        if pred(var):
            yield var

which would be used like this down at the basic iterator level:

it = filter(some_pred)
for x in some_iterable:
    try:
        value = it.send(x)
        while True:
            process(value)
            value = next(it)
    except NextValue:
        pass

and this would be done automatically by the new for statement:

for some_iterable | filter(some_pred) as value:
    process(value)

This also allows generators to generate multiple values for each value 
received:

def repeat(n):
    for var in receive:
        for i in range(n):
            yield var

for some_iterable | repeat(3) as value:
    process(value)
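
For what it's worth, this receive/NextValue protocol can be 
approximated today by having each stage yield a sentinel whenever it 
wants another value, with a small driver feeding values in via send.  
Everything below (NEED_VALUE, filter_stage, repeat_stage, pipe) is made 
up for illustration only:

NEED_VALUE = object()        # stands in for the proposed NextValue signal

def filter_stage(pred):
    while True:
        value = yield NEED_VALUE       # "receive": ask for the next input
        if pred(value):
            yield value

def repeat_stage(n):
    while True:
        value = yield NEED_VALUE
        for i in range(n):
            yield value

def pipe(source, stage):
    # drive one stage: feed it values from source on demand and pass
    # everything else it yields downstream
    source = iter(source)
    try:
        item = next(stage)
        while True:
            if item is NEED_VALUE:
                item = stage.send(next(source))
            else:
                yield item
                item = next(stage)
    except StopIteration:
        pass
    finally:
        stage.close()

# roughly:  for some_iterable | filter(some_pred) | repeat(3) as value:
#     for value in pipe(pipe(some_iterable, filter_stage(some_pred)),
#                       repeat_stage(3)):
#         process(value)

The nesting of pipe calls is exactly the syntactic scaling problem the 
| form is meant to avoid.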

With the new yield from syntax, your threesomes example becomes:

def threesomes():
    yield from receive | repeat(3)

Or even just:

def threesomes():
    return repeat(3)

Other functions can be done in this style too:

def map(fn):
    for var in receive:
        yield fn(var)

So that stacking these all up is much more readable syntactically:

for gen1(x) | filter(some_pred) | map(add_1) | threesomes() as i:

You have to admit that this is much more readable than:

for threesomes(map(add_1, filter(some_pred, gen1(x)))) as i:

-bruce frederiksen


