[Python-ideas] Possible PEP 380 tweak

Greg Ewing greg.ewing at canterbury.ac.nz
Fri Oct 29 09:18:05 CEST 2010


I've been pondering the whole close()-returning-a-value
thing, and I've convinced myself once again that it's a
bad idea.

Essentially the problem is that we're trying to make
the close() method, and consequently GeneratorExit,
serve two different and incompatible roles.

One role (the one it currently serves) is as an
emergency bail-out mechanism. In that role, when we
have a stack of generators delegating via yield-from,
we want things to behave as though the GeneratorExit
originates in the innermost one and propagates back
out of the entire stack. We don't want any of the
intermediate generators to catch it and turn it
into a StopIteration, because that would give the
next outer one the misleading impression that it's
business as usual, but it's not.

This is why PEP 380 currently specifies that, after
calling the close() method of the subgenerator,
GeneratorExit is unconditionally re-raised in the
delegating generator.
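To make that concrete, here is a sketch of those semantics using
Python 3's yield from (the generator names and the log list are
hypothetical, just for tracing what runs):

```python
log = []

def inner():
    try:
        while True:
            yield
    except GeneratorExit:
        log.append("inner cleanup")
        raise  # re-raise so the delegating generator is not resumed

def outer():
    yield from inner()
    log.append("outer resumed")  # never reached when close() is called

g = outer()
next(g)    # advance to the first yield inside inner()
g.close()  # GeneratorExit is raised at the innermost yield and
           # propagates back out through the whole stack
# log is ["inner cleanup"]; "outer resumed" is never logged
```

If inner() had turned the GeneratorExit into a StopIteration instead,
outer() would have carried on after the yield-from as if nothing were
wrong, which is exactly the misleading behaviour described above.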

The proponents of close()-returning-a-value, however,
want GeneratorExit to serve another role: as a way
of signalling to a consuming generator (i.e. one that
is having values passed into it using send()) that
there are no more values left to pass in.

It seems to me that this is analogous to a function
reading values from a file, or getting them from an
iterator. The behaviour that's usually required in
the presence of delegation is quite different in those
cases.

Consider a function f1, that calls another function
f2, which loops reading from a file. When f2 reaches
the end of the file, this is a signal that it should
finish what it's doing and return a value to f1, which
then continues in its usual way.

Similarly, if f2 uses a for-loop to iterate over
something, when the iterator is exhausted, f2 continues
and returns normally.
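A trivial sketch of that pattern (f1 and f2 here are hypothetical
stand-ins for the functions described above):

```python
def f2(lines):
    # Loops over an iterator; when it is exhausted, f2 just
    # finishes what it's doing and returns a value.
    total = 0
    for line in lines:
        total += len(line)
    return total

def f1(lines):
    # f1 receives f2's result and continues in its usual way.
    n = f2(lines)
    return "read %d characters" % n

result = f1(iter(["ab", "cde"]))   # -> "read 5 characters"
```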

I don't see how GeneratorExit can be made to fulfil
this role, i.e. as a "producer exhausted" signal,
without compromising its existing one. And if that
idea is dropped, the idea of close() returning a value
no longer has much motivation that I can see.

So how should "producer exhausted" be signalled, and
how should the result of a consumer generator be returned?

As for returning the result, I think it should be done
using the existing PEP 380 mechanism, i.e. the generator
executes a "return", consequently raising StopIteration
with the value. A delegating generator will then see
this as the result of a yield-from and continue normally.
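With the PEP 380 semantics as eventually implemented in Python 3.3,
that looks like the following (averager() and delegator() are
hypothetical names for illustration):

```python
def averager():
    # Consumer that delivers its result via "return", i.e. by
    # raising StopIteration with the value attached.
    total = count = 0
    while True:
        x = yield
        if x is None:          # sentinel: producer exhausted
            break
        total += x
        count += 1
    return total / count

def delegator(results):
    # The delegating generator sees the subgenerator's return value
    # as the value of the yield-from expression and continues normally.
    avg = yield from averager()
    results.append(avg)
    yield "done"

results = []
d = delegator(results)
next(d)
d.send(10)
d.send(20)
reply = d.send(None)   # averager returns 15.0; delegator resumes
# results == [15.0], reply == "done"
```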

As for the signalling mechanism, I think that's entirely
a matter for the producer and consumer to decide between
themselves. One way would be to send() in a sentinel value,
if there is a suitable out-of-band value available.
Another would be to throw() in some pre-arranged exception,
perhaps EOFError as a suggested convention.
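The throw()-based convention would look something like this (again a
sketch; treating EOFError as "producer exhausted" is just the
suggested convention, not anything built into the language):

```python
def summer_exc():
    # Consumer that treats a thrown-in EOFError as end-of-stream.
    tot = 0
    try:
        while True:
            tot += yield
    except EOFError:
        pass
    return tot

s = summer_exc()
next(s)
s.send(1)
s.send(2)
try:
    s.throw(EOFError)      # signal that there are no more values
except StopIteration as e:
    result = e.value       # 3
```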

If we look at files as an analogy, we see a similar range
of conventions. Most file reading operations return an empty
string or bytes object on EOF. Some, such as readline(),
raise an exception, because the empty element of the relevant
type is also a valid return value.

As an example, a consumer generator using None as a
sentinel value might look like this:

   def summer():
     tot = 0
     while 1:
       x = yield
       if x is None:
         break
       tot += x
     return tot

and a producer using it:

   s = summer()
   next(s)
   for x in values:
     s.send(x)
   try:
     s.send(None)
   except StopIteration as e:
     result = e.value

Having to catch StopIteration is a little tedious, but it
could easily be encapsulated in a helper function:

   def close_consumer(g, sentinel):
     try:
       g.send(sentinel)
     except StopIteration as e:
       return e.value

The helper function could also take care of another issue
that arises. What happens if a delegating consumer carries
on after a subconsumer has finished and yields again?

The analogous situation with files is trying to read from
a file that has already signalled EOF before. In that case,
the file simply signals EOF again. Similarly, calling
next() on an exhausted iterator raises StopIteration again.
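For instance, with a plain iterator:

```python
it = iter([1])
next(it)               # -> 1
# The iterator is now exhausted; every further next() raises
# StopIteration again, just like repeated reads at EOF on a file.
for _ in range(2):
    try:
        next(it)
    except StopIteration:
        pass           # signalled again each time
```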

So, if a "finished" consumer yields again, and we are using
a sentinel value, the yield should return the sentinel again.
We can get this behaviour by writing our helper function like
this:

   def close_consumer(g, sentinel):
     while 1:
       try:
         g.send(sentinel)
       except StopIteration as e:
         return e.value
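The loop is what makes delegation work: if the delegating consumer
starts a second subconsumer after the first one finishes, the sentinel
simply keeps arriving at each new yield until the whole stack returns.
A sketch (repeating summer() and the helper so it is self-contained;
the delegating consumer is a hypothetical example):

```python
def summer():
    tot = 0
    while 1:
        x = yield
        if x is None:
            break
        tot += x
    return tot

def close_consumer(g, sentinel):
    # Keep sending the sentinel until the consumer stack returns.
    while 1:
        try:
            g.send(sentinel)
        except StopIteration as e:
            return e.value

def delegating():
    first = yield from summer()
    second = yield from summer()   # resumed with the sentinel again
    return (first, second)

d = delegating()
next(d)
for x in (1, 2, 3):
    d.send(x)
result = close_consumer(d, None)   # -> (6, 0)
```

The first send(None) finishes the first summer() with 6; the delegator
then yields again from a fresh summer(), which the looping helper
immediately terminates with the sentinel, giving 0.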

So in summary, I think PEP 380 and current generator
semantics are fine as they stand with regard to the
behaviour of close(). Signalling the end of a stream of
values to a consumer generator can and should be handled
by convention, using existing facilities.

-- 
Greg


