[Python-Dev] Please reconsider PEP 479.

Guido van Rossum guido at python.org
Wed Nov 26 21:34:48 CET 2014


On Wed, Nov 26, 2014 at 3:24 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 26 November 2014 at 18:30, Greg Ewing <greg.ewing at canterbury.ac.nz>
> wrote:
> > Guido van Rossum wrote:
> >>
> >> Hm, that sounds like you're either being contrarian or Chris and I have
> >> explained it even worse than I thought.
> >
> > I'm not trying to be contrary, I just think the PEP could
> > explain more clearly what you're trying to achieve. The
> > rationale is too vague and waffly at the moment.
> >
> >> Currently, there are cases where list(x for x in xs if P(x)) works while
> >> [x for x in xs if P(x)] fails (when P(x) raises StopIteration). With the
> >> PEP, both cases will raise some exception
> >
> > That's a better explanation, I think.
>

It's now in the PEP.
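With the PEP's semantics in effect (the default since Python 3.7), both forms do indeed fail loudly, though with different exceptions. A minimal illustration (P is an illustrative "leaky" predicate, not from the thread):

```python
def P(x):
    # An illustrative predicate that "leaks" a StopIteration.
    raise StopIteration

xs = [1, 2, 3]

# Generator-expression form: the leaked StopIteration escapes the
# generator frame and PEP 479 converts it to RuntimeError.
try:
    list(x for x in xs if P(x))
except RuntimeError:
    print("genexp: RuntimeError")

# List-comprehension form: no generator frame is involved, so the
# StopIteration simply propagates to the caller.
try:
    [x for x in xs if P(x)]
except StopIteration:
    print("listcomp: StopIteration")
```

Before the PEP, the first form silently produced an empty (or truncated) list instead of raising.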


> The other key aspect is that it changes the answer to the question
> "How do I gracefully terminate a generator function?". The existing
> behaviour has an "or" in the answer: "return from the generator frame,
> OR raise StopIteration from the generator frame". That then leads to
> the follow on question: "When should I use one over the other?".
>
> The "from __future__ import generator_stop" answer drops the "or", so
> it's just: "return from the generator frame".
>

That's now also in the PEP.


> Raising *any* exception inside the generator, including StopIteration,
> then counts as non-graceful termination, bringing generators into line
> with the PEP 343 philosophy that "hiding flow control in macros makes
> your code inscrutable", where here, the hidden flow control is relying
> on the fact that a called function raising StopIteration will
> currently always gracefully terminate generator execution.
>

Right.


> The key downside is that it means relatively idiomatic code like:
>
>     def my_generator():
>         ...
>         yield next(it)
>         ...
>

I probably considered this an upside of generators when they were
introduced. :-(


> Now needs to be written out explicitly as:
>
>     def my_generator():
>         ...
>         try:
>             yield next(it)
>         except StopIteration:
>             return
>         ...
>
> That's not especially easy to read, and it's also going to be very
> slow when working with generator based producer/consumer pipelines.
>

I want to consider this performance argument seriously. Chris did a little
benchmark but I don't think he compared the right things -- he showed that
"yield from" becomes 5% slower with his patch and that a while loop is
twice as slow as "yield from" with or without his patch. I have no idea why
his patch would slow down "yield from" but I doubt it's directly related --
his change only adds some extra code when a generator frame is left with an
exception, but his "yield from" example code (
https://github.com/Rosuav/GenStopIter/blob/485d1/perftest.py) never raises
(unless I really don't understand how the implementation of "yield from"
actually works :-).

I guess what we *should* benchmark is this:

def g(depth):
    if depth > 0:
        it = g(depth-1)
        yield next(it)
    else:
        yield 42

vs. the PEP-479-ly corrected version:

def g(depth):
    if depth > 0:
        it = g(depth-1)
        try:
            yield next(it)
        except StopIteration:
            pass
    else:
        yield 42

This sets up "depth" generators each with a try/except, and then at the
very bottom yields a single value (42) which pops up all the way to the
top, never raising StopIteration.

I wrote the benchmark and here are the code and results:
https://gist.github.com/gvanrossum/1adb5bee99400ce615a5
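For those who don't want to chase the link, a comparable harness (a sketch along the lines of the gist, not its exact script) looks like this:

```python
import timeit

def g_plain(depth):
    # Pre-PEP style: a StopIteration from next() would silently end
    # this generator (never actually triggered in this benchmark).
    if depth > 0:
        it = g_plain(depth - 1)
        yield next(it)
    else:
        yield 42

def g_corrected(depth):
    # PEP-479-corrected style: the possible StopIteration is caught
    # explicitly, at the cost of a try/except setup in every frame.
    if depth > 0:
        it = g_corrected(depth - 1)
        try:
            yield next(it)
        except StopIteration:
            pass
    else:
        yield 42

if __name__ == "__main__":
    for fn in (g_plain, g_corrected):
        t = timeit.timeit(lambda: list(fn(50)), number=10_000)
        print(f"{fn.__name__}: {t / 10_000 / 50 * 1e9:.0f} ns per frame")
```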

It's clear that the extra try/except setup isn't free, even if the except
is never triggered. My summary of the results is that the try/except setup
costs 100-200 nsec, while the rest of the code executed in the frame takes
about 600-800 nsec. (Hardware: MacBook Pro with 2.8 GHz Intel Core i7.)

Incidentally, the try/except cost has come down greatly from Python 2.7,
where it was over a microsecond!

I also tried a variation where the bottommost generator doesn't yield a
value. The conclusion is about the same -- the try/except version is 150
nsec slower.

So now we have a number to worry about (150 nsec for a try/except) and I
have to think about whether or not that's likely to have a noticeable
effect in realistic situations. One recommendation follows: if you have a
loop inside your generator, and there's a next() call in the loop, put the
try/except around the loop, so you pay the setup cost only once (unless the
loop is most likely to have zero iterations, and unlikely to have more than
one).
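As a sketch of that recommendation (the function name and even-number filter are illustrative, not from the thread):

```python
def filtered_evens(it):
    # Hoisting the try/except outside the loop means the ~150 nsec
    # setup cost is paid once per call, not once per iteration.
    # (The alternative -- a try/except around each next(it) inside
    # the loop -- pays it every time around.)
    try:
        while True:
            item = next(it)
            if item % 2 == 0:
                yield item
    except StopIteration:
        return
```

The trade-off reverses only if the loop almost always runs zero times, since then the setup is pure overhead.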


> After thinking about that concern for a while, I'd like to suggest the
> idea of having a new builtin "allow_implicit_stop" decorator that
> swaps out a GENERATOR code object that has the new "EXPLICIT_STOP"
> flag set for one with it cleared (attempting to apply
> "allow_implicit_stop" to a normal function would be an error).
>
> Then the updated version of the above example would become:
>
>     @allow_implicit_stop
>     def my_generator():
>         ...
>         yield next(it)
>         ...
>
> Which would be semantically equivalent to:
>
>     def my_generator():
>         try:
>             ...
>             yield next(it)
>             ...
>         except StopIteration:
>             return
>
> but *much* faster (especially if used in a producer/consumer pipeline)
> since it would allow a single StopIteration instance to propagate
> through the entire pipeline, rather than creating and destroying new
> ones at each stage.
>

I think we can put a number to "much faster" now -- 150 nsec per try/except.

I have serious misgivings about that decorator though -- I'm not sure how
viable it is to pass a flag from the function object to the execution
(which takes the code object, which is immutable) and how other Python
implementations would do that. But I'm sure it can be done through sheer
willpower. I'd call it the @hettinger decorator in honor of the PEP's most
eloquent detractor. :-)

> Including such a feature in the PEP would also make the fix to
> contextlib simpler: we'd just update it so that
> contextlib._GeneratorContextManager automatically calls
> "allow_implicit_stop" on the passed in generator functions.
>
> Single-source Python 2/3 code would also benefit in a 3.7+ world,
> since libraries like six and python-future could just define their own
> version of "allow_implicit_stop" that referred to the new builtin in
> 3.5+, and was implemented as an identity function in other versions.
>

Well, yes, but you could also trivially write single-source Python, without
the use of any adapter library, by just writing the try/except.
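For reference, Nick's adapter-library idea amounts to something like this ("allow_implicit_stop" is only the proposed name; no released Python ships it, so the identity fallback is what you would get everywhere today):

```python
import builtins

# Use the (hypothetical, never-added) builtin if present; otherwise
# fall back to an identity decorator, which is a no-op everywhere.
allow_implicit_stop = getattr(builtins, "allow_implicit_stop",
                              lambda genfunc: genfunc)

@allow_implicit_stop
def my_generator(it):
    yield next(it)
```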


> Cheers,
> Nick.
>
> P.S. While I'm less convinced this part is a good idea, if
> "allow_implicit_stop" accepted both generator functions *and*
> generator objects, then folks could even still explicitly opt in to
> the "or stop()" trick - and anyone reading the code would have a name
> to look up to see what was going on.


Well, here's how you could write the @hettinger decorator without any help
from the compiler: catch RuntimeError, check for a chained StopIteration,
and re-raise that. It wouldn't be any faster, but it would emulate the
pre-PEP world perfectly. :-)
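A sketch of that compiler-free decorator, following the recipe above (the name is the joke from earlier; the details are illustrative):

```python
import functools

def hettinger(genfunc):
    """Emulate pre-PEP-479 behaviour: a StopIteration leaking out of
    the wrapped generator's frame (which PEP 479 converts into a
    RuntimeError chained from the original StopIteration) is treated
    as graceful termination, as it was before the PEP."""
    @functools.wraps(genfunc)
    def wrapper(*args, **kwargs):
        gen = genfunc(*args, **kwargs)
        while True:
            try:
                yield next(gen)
            except StopIteration:
                return  # gen exhausted normally
            except RuntimeError as exc:
                if isinstance(exc.__cause__, StopIteration):
                    return  # re-interpret as a normal stop
                raise
    return wrapper
```

As noted, this is no faster than the try/except spelling (each value now crosses an extra generator frame), and this sketch only supports plain iteration, not send() or throw().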

-- 
--Guido van Rossum (python.org/~guido)