[Python-ideas] PEP 479: Change StopIteration handling inside generators

Steven D'Aprano steve at pearwood.info
Thu Nov 20 03:06:15 CET 2014


On Thu, Nov 20, 2014 at 03:24:07AM +1100, Chris Angelico wrote:

> If you write __next__, you write in a "raise StopIteration" when it's
> done. If you write __getattr__, you write in "raise AttributeError" if
> the attribute shouldn't exist. Those are sufficiently explicit that it
> should be reasonably clear that the exception is the key. But when you
> write a generator, you don't explicitly raise:

That's not true in practice. See my reply to Nick: there is lots of code 
out there which uses StopIteration to exit generators. Some of that code 
isn't very good code -- I've seen "raise StopIteration" immediately 
before falling out the bottom of the generator -- but until now it has 
worked, and the impression some people have gotten is that it is 
actually preferred.


> def gen():
>     yield 1
>     yield 2
>     yield 3
>     return 4

Until Python 3.3, that was a syntax error. For the majority of people, 
who are still using Python 2.7, it is *still* a syntax error. To write 
this in a backwards-compatible way, you have to exit the generator with:

    raise StopIteration(4)
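Concretely, the two spellings meet in the middle: in Python 3.3+ the 
generator's return value rides out on the StopIteration instance, in 
its `value` attribute. A minimal sketch:

```python
def gen():
    yield 1
    yield 2
    yield 3
    return 4  # SyntaxError before Python 3.3

g = gen()
print(next(g), next(g), next(g))  # 1 2 3
try:
    next(g)
except StopIteration as exc:
    print(exc.value)  # 4 -- the generator's return value
```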


> The distinction in __next__ is between returning something and raising
> something. The distinction in a generator is between "yield" and
> "return". Why should a generator author have to be concerned about one
> particular exception having magical meaning?

I would put it another way: informally, the distinction between a 
generator and a function is that generators use yield where functions 
use return. Most people are happy with that informal definition; a full 
pedantic explanation of coroutines will just confuse them or bore them. 
The rule they will learn is:

* use return in functions
* use yield in generators

That makes generators that use both surprising. Since most generators 
either run forever or fall out the bottom when they are done, I expect 
that seeing a generator with a return in it is likely to surprise a lot 
of people. I've known for many years that return works in a generator, 
and I still do a double-take whenever I see it.


> Imagine this scenario:
> 
> def producer():
>     """Return user input, or raise KeyboardInterrupt"""
>     return input("Enter the next string: ")
> 
> def consumer():
>     """Process the user's input"""
>     while True:
>         try:
>             command = producer()
>         except KeyboardInterrupt:
>             break
>         dispatch(command)
> 
> 
> Okay, now let's make a mock producer:
> 
> strings = ["do stuff","do more stuff","blah blah"]
> def mock_producer():
>     if strings: return strings.pop(0)
>     raise KeyboardInterrupt
> 
> That's how __next__ works, only with a different exception, and I
> think people would agree that this is NOT a good use of
> KeyboardInterrupt.

Why not? How else are you going to communicate something out of band to 
the consumer except via an exception? 

We can argue about whether KeyboardInterrupt is the right exception to 
use or not, but if you insist that this is a bad protocol then you're 
implicitly saying that the iterator protocol is also a bad protocol.
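That out-of-band signal is exactly what a for loop relies on under the 
hood. A sketch of the protocol, reusing Chris's mock strings -- this is 
roughly what "for command in items:" expands to:

```python
items = iter(["do stuff", "do more stuff", "blah blah"])
while True:
    try:
        command = next(items)
    except StopIteration:  # the protocol's out-of-band "done" signal
        break
    print(command)
```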


> If you put a few extra layers in between the
> producer and consumer, you'd be extremely surprised that an unexpected
> KeyboardInterrupt just quietly terminated a loop.

You might be, but since I've paid attention to the protocol rules, I 
won't be. Sorry to be harsh, but how clear do we have to be? 
StopIteration terminates iterators, and generators are iterators. That 
rule may or may not be inconvenient, it might be annoying (but sometimes 
useful), it might hide bugs, it might even be something that we can 
easily forget until reminded, but if it comes as a "surprise" that just means 
you don't know how the iterator protocol works.

There are good reasons for changing this behaviour, but pandering to 
people who don't know how the iterator protocol works is not one of 
them.


> Yet this is exactly
> what the generator-and-for-loop model creates: a situation in which
> StopIteration, despite not being seen at either end of the code, now
> has magical properties.

That's exactly how the protocol works. Even if you write "return" in 
your generator, it still raises StopIteration.
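The "magical property" Chris describes is easy to demonstrate. A sketch: 
under the semantics current at the time of this thread, a stray 
StopIteration raised inside a generator body silently truncates the 
output; the PEP's change (Python 3.7+) turns that leak into a 
RuntimeError instead:

```python
def broken():
    yield 1
    raise StopIteration  # a stray "done" signal inside the body
    yield 2  # never reached

try:
    print(list(broken()))  # [1] under the pre-PEP 479 semantics
except RuntimeError as exc:
    # With PEP 479 in effect, the leak becomes a visible error.
    print("RuntimeError:", exc)
```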


> Without the generator, *only* __next__ has
> this effect, and that's exactly where it's documented to be.

The documentation states that __next__ raises StopIteration, it doesn't 
say that *only* __next__ should raise StopIteration.

https://docs.python.org/3/library/stdtypes.html#iterator.__next__

I trust that we all expect to be able to factor out the raise into a 
helper function or method, yes? It truly would be surprising if this 
failed:


import random

class MyIterator:
    def __iter__(self):
        return self
    def __next__(self):
        return something()


def something():
    # Toy helper function: half the time produce a value,
    # otherwise signal the end of iteration.
    if random.random() < 0.5:
        return "Spam!"
    raise StopIteration



Now let's write this as a generator:

def gen():
    while True:
        yield something()


which is much nicer than:

def gen():
    while True:
        try:
            yield something()
        except StopIteration:
            return   # converted by Python into raise StopIteration



-- 
Steven

