[Python-Dev] PEP 380 (yield from a subgenerator) comments

Fri Mar 27 18:04:18 CET 2009

At 03:28 AM 3/27/2009 -0400, Scott Dial wrote:
>P.J. Eby wrote:
> > One remaining quirk or missing piece: ISTM there needs to be a way to
> > extract the return value without using a yield-from statement.  I mean,
> > you could write a utility function like:
> >
> >    def unyield(geniter):
> >        try:
> >            while 1: geniter.next()
> >        except GeneratorReturn as v:
> >            return v.value
>
>My first thought was to ask why it was not equivalent to say:
>
>     x = yield g
>     x = yield from g
>
>This would seem like a more obvious lack of parallelism to pick on wrt.
>return values.

Because yield-from means you're "inlining" the generator, such that 
sends go into that generator, rather than into the current generator.
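
To make the difference concrete, here is a small sketch.  It assumes
the yield-from syntax as proposed in the PEP; the names inner,
delegating and plain are made up for illustration only.

```python
def inner():
    # Receives whatever the caller passes to send().
    received = yield "inner ready"
    yield "inner got %r" % (received,)

def delegating():
    # yield from: sends are forwarded into inner(), not kept here.
    yield from inner()

def plain():
    # Plain yield: the generator object itself is the yielded value,
    # and the sent value lands here, in plain().
    received = yield inner()
    yield "plain got %r" % (received,)

d = delegating()
first_d = next(d)            # produced by inner()
second_d = d.send("hello")   # "hello" is delivered inside inner()

p = plain()
first_p = next(p)            # the un-started inner() generator object
second_p = p.send("hello")   # "hello" is delivered to plain() itself
```

With yield-from, the caller is effectively talking to inner(); with a
plain yield, it is only ever talking to plain().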

>This unyield() operation seems contrived. Never before have you been
>able to write a generator that returns a value, why would these suddenly
>become common practice? The only place a return value seems useful is
>when refactoring a generator and you need to mend having loss of a
>shared scope. What other use is there for a return value?

The use case these features are being proposed for is replacing 
most of the stack-management code that coroutine trampolines 
currently need.  In such a case, you're likely using 
generators to perform long-running asynchronous operations, or else 
coroutines where two functions cooperate to produce a result, 
each with its own control flow.

For example, you might have a generator that yields socket objects to 
wait for them to be ready to read or write, then returns a line of 
text read from the socket.  You would unyield this if you wanted to 
write top-level code that was *not* also such a task.  Similarly, you 
might write coroutines where one reads data from a file and sends it 
to a parser, and then the parser sends data back to a main program.

In either case, an unyield would either be the synchronous top-level 
loop of the program, or part of the top-level code.  Either you need 
to get the finished top-level object from your parser at the end of 
its operation, or you are waiting for all your asynchronous I/O tasks 
to complete.
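
As a rough sketch of the first case: the code below assumes the
return value ends up being carried on StopIteration (which is how the
PEP eventually shipped in Python 3.3) rather than on a separate
GeneratorReturn exception.  The read_line task and its
"wait-for-read" token are hypothetical stand-ins for real socket
handling.

```python
def unyield(geniter):
    """Run geniter to completion, discarding yields; return its result."""
    try:
        while True:
            next(geniter)
    except StopIteration as stop:
        # Assumption: the return value rides on StopIteration.value.
        return stop.value

def read_line(chunks):
    # Hypothetical task: each yield stands in for "wait until readable".
    buf = ""
    for chunk in chunks:
        yield "wait-for-read"
        buf += chunk
        if "\n" in buf:
            return buf.split("\n", 1)[0]
    return buf

# Synchronous top-level code that is *not* itself a task.
line = unyield(read_line(["hel", "lo\nwor", "ld"]))
```

Here unyield() is the synchronous top-level loop: it ignores every
yielded token and only cares about the final return value.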

>It would seem unfortunate for it to be considered a runtime error since
>this would prevent sharing a generator amongst "yield from" and
>non-"yield from" use cases.

Has anyone shown a use case for doing so?  I might be biased due to 
previous experience with these things, but I don't see how you write 
a function where both the yielded values *and* the return value are 
useful...  and if you did, you'd still need some sort of unyield operation.

Notice that in both the I/O and coroutine use cases, the point of 
yielding is primarily *to allow other code to execute*, and possibly 
pass a value back IN to the generator.  The values passed *out* by 
the generator are usually either ignored, or else an indicator of 
what the generator wants passed back in or of what sort of event it 
is waiting for before it's resumed.

In other words, they're usually not data -- they're just something 
that gets looped over as the task progresses.

>As Greg has said a number of times, we allow functions to return values
>with them silently being ignored all the time.

Sure.  But right now, the return value of a generator function *is 
the generator*.  And you're free to ignore that, sure.

But this is a "second" return value that only goes to a special place 
with special syntax -- without that syntax, you can't access it.

But in the use cases where you'd actually want to make such a 
function return a value to begin with, it's because that value is the 
value you *really* want from the function -- the only reason it's a 
generator is because it needs to be paused and resumed along the way 
to getting that return value.

If you're writing a function that yields values for reasons other 
than control flow, it's probably a bad idea for it to also have a 
"return" value...  because then you'd need an unyield operation to 
get at the data.

And it seems to me that people are saying, "but that's no problem, 
I'll just use yield-from to get the value".  But that doesn't *work*, 
because it turns the function where you use it into another generator!
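
A minimal sketch of the problem, again assuming the return value is
delivered via StopIteration.value; compute and caller are
illustrative names, not anything from the PEP.

```python
def compute():
    yield "pause"      # control-flow yield; the value is not data
    return 42          # the "special" return value

def caller():
    # Using yield from to fetch the result makes caller() a generator too.
    result = yield from compute()
    return result + 1

outcome = caller()     # a generator object, not the number

# Something still has to drive it to completion to see the value.
try:
    while True:
        next(outcome)
except StopIteration as stop:
    final = stop.value
```

Calling caller() hands back a generator, so the problem has only been
pushed up one level; some non-generator code still has to run the
loop at the end.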

The generators have to *stop* somewhere, in order for you to *use* 
their return values -- which makes the return feature ONLY relevant 
to coroutine use cases -- i.e., places where you have trampolines or 
a top-level loop to handle the yields...

And conversely, if you *have* such a generator, its real return value 
is the special return value, so you're not going to be able to use it 
outside the coroutine structure...  so "ignoring its return value" 
doesn't make any sense.  You'd have to write a loop over the 
generator, *just to ignore the value*...  which once again is why 
you'd want an unyield operation of some kind.

That's why special return values should be special: you have to 
handle them differently in order to receive that return value... and 
it's monumentally confusing to look at a function with a normal 
'return' that never actually "returns" that value.

A lot of the emails written about this fail to grasp the effects of 
the control flow proposed by the PEP.  IMO, this should be taken as 
evidence that using a plain "return" statement is in fact confusing, 
*even to Python-Dev participants who have read the PEP*.

We would be much better off with something like "yield return X" or 
"return from yield with X", as it would highlight this 
otherwise-obscure and "magical" difference in control flow.