[Python-ideas] "Iteration stopping" syntax [Was: Is this PEP-able? for X in ListY while conditionZ:]

Ron Adam ron3200 at gmail.com
Sat Jun 29 18:46:03 CEST 2013



On 06/28/2013 10:16 PM, Andrew Barnert wrote:
> On Jun 28, 2013, at 18:50, Shane Green
> <shane at umbrellacode.com
> <mailto:shane at umbrellacode.com>> wrote:
>
>> Yes, but it only works for generator expressions and not comprehensions.
>
> This is the point of options #1 and #2: make StopIteration work in comps
> either (1) by redefining comprehensions in terms of genexps or (2) by fiat.
>
> After some research, it turns out that these are equivalent. Replacing any
> [comprehension] with list(comprehension) is guaranteed by the language (and
> the CPython implementation) to give you exactly the same value unless (a)
> something in the comp raises StopIteration, or (b) something in the comp
> relies on reflective properties (e.g., sys._getframe().f_code.co_flags)
> that aren't guaranteed anyway.
>
> So, other than being 4 characters more verbose and 40% slower, there's
> already an answer for comprehensions.

Right..  Any solution also must not slow down the existing simpler cases.
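
For anyone who wants to try the difference being discussed, here is a small sketch. The stop() helper follows the name used in the quoted message; the two wrapper functions are made up for illustration. One caveat: on interpreters where PEP 479 is in effect (the default from Python 3.7), a StopIteration raised inside a generator body is turned into a RuntimeError, so the list(genexp) form no longer truncates quietly there.

```python
def stop():
    """Hypothetical helper from the thread: abort iteration by raising."""
    raise StopIteration


def via_comprehension(seq, limit):
    # A list comprehension is not a generator, so the exception
    # simply propagates out to the caller.
    try:
        return [x if x < limit else stop() for x in seq]
    except StopIteration:
        return "StopIteration escaped"


def via_genexp(seq, limit):
    # Before PEP 479, list() swallowed the StopIteration and returned the
    # partial list. With PEP 479 active it surfaces as RuntimeError instead.
    try:
        return list(x if x < limit else stop() for x in seq)
    except RuntimeError:
        return "RuntimeError (PEP 479)"


print(via_comprehension(range(10), 3))
print(via_genexp(range(10), 3))
```

Which result the second call gives depends on the interpreter version, so no expected output is shown.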

For those who haven't looked at the C code yet, there is this comment there.


/* List and set comprehensions and generator expressions work by creating a
nested function to perform the actual iteration. This means that the
iteration variables don't leak into the current scope.
The defined function is called immediately following its definition, with 
the result of that call being the result of the expression.
The LC/SC version returns the populated container, while the GE version is
flagged in symtable.c as a generator, so it returns the generator object
when the function is called.
This code *knows* that the loop cannot contain break, continue, or return,
so it cheats and skips the SETUP_LOOP/POP_BLOCK steps used in normal loops.

Possible cleanups:
- iterate over the generator sequence instead of using recursion
*/

I don't know how much the SETUP_LOOP/POP_BLOCK costs in time. It probably 
only makes a big difference in nested cases.


> And if either of those problems is unacceptable, a patch for #1 or #2 is
> actually pretty easy.
>
> I've got two different proofs of concept: one actually implements the comp
> as passing the genexp to list, the other just wraps everything after the
> BUILD_LIST and before the RETURN_VALUE in the equivalent of try: ...
> except StopIteration: pass. I need to add some error handling to the C
> code, and for #2 write sufficient tests that verify that it really does
> work exactly like #1, but I should have working patches to play with in a
> couple days.
>
>> My opinion of that workaround is that it’s also a step backward in terms
>> of readability.  I suspect.
>>
>> if i < 50 else stop() would probably also work, since it throws an
>> exception.  That’s better, IMHO.

Once a function is added that is called on every iteration, then a regular 
for loop with a break (without the function call) will run quicker.
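
A rough way to measure that on your own machine is sketched below. It uses itertools.takewhile with a lambda as a stand-in for "a function called on every iteration"; that is my substitution, not the stop() spelling from the thread, and DATA, LIMIT and the function names are invented for the example. I make no claim about which one wins on a given build.

```python
import timeit
from itertools import takewhile

DATA = list(range(1000))  # illustrative input
LIMIT = 500               # illustrative cut-off


def with_break():
    # Plain for loop: the test is inline, no extra call per item.
    out = []
    for x in DATA:
        if x >= LIMIT:
            break
        out.append(x)
    return out


def with_takewhile():
    # The lambda is called once per item until it returns False.
    return list(takewhile(lambda x: x < LIMIT, DATA))


for fn in (with_break, with_takewhile):
    seconds = timeit.timeit(fn, number=200)
    print(f"{fn.__name__}: {seconds:.4f}s for 200 runs")
```

Both functions return the same list, so only the timing differs.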


I think what matters is that it's fast and is easy to explain. The first 
two examples here are the existing variations.  The third case would be the 
added break case.  (The exact spelling of the expression may be different.)


       # [x for x in seq]
       for x in iter:
           append x     # LIST_APPEND byte code, not a method call

       # [x for x in seq if expr]
       for x in iter:
           if expr: append x

       # [x for x in seq if expr break]
       for x in iter:
           if expr: break
           append x
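
The three cases above are pseudocode for what the bytecode does. Written out as ordinary Python they look roughly like this. The function names and the pred argument are made up, and a real comprehension uses the LIST_APPEND opcode where this sketch calls result.append.

```python
def case1(seq):
    # [x for x in seq]
    result = []
    for x in seq:
        result.append(x)
    return result


def case2(seq, pred):
    # [x for x in seq if pred(x)]
    result = []
    for x in seq:
        if pred(x):
            result.append(x)
    return result


def case3(seq, pred):
    # Proposed: [x for x in seq if pred(x) break]  (not valid Python today)
    result = []
    for x in seq:
        if pred(x):
            break
        result.append(x)
    return result


print(case2(range(10), lambda x: x % 2 == 0))
print(case3(range(10), lambda x: x >= 3))
```

Note that in case 3 the test says when to stop, while in case 2 it says what to keep.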

The generator comps have YIELD_VALUE in place of LIST_APPEND.

This last case is the simplest variation for an early exit.  It only 
differs from the second case by having a BREAK_LOOP after the 
POP_JUMP_IF_FALSE instruction, along with SETUP_LOOP and POP_BLOCK before 
and after the loop.
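
If you want to look at the opcodes yourself, something like the following should work. The opnames() helper is my own; it walks nested code objects because, depending on the CPython version, the comprehension body is either a separate code object or inlined. Also note that SETUP_LOOP and BREAK_LOOP were removed in CPython 3.8, so the exact opcode names in this discussion apply to older interpreters.

```python
import dis
import types


def opnames(source):
    """Collect opcode names from `source` and any nested code objects."""
    names = set()
    pending = [compile(source, "<sketch>", "eval")]
    while pending:
        code = pending.pop()
        for instr in dis.get_instructions(code):
            names.add(instr.opname)
        # Comprehension and genexp bodies may live in co_consts.
        pending.extend(c for c in code.co_consts
                       if isinstance(c, types.CodeType))
    return names


list_ops = opnames("[x for x in seq if x]")
gen_ops = opnames("(x for x in seq if x)")

print("LIST_APPEND in list comp:", "LIST_APPEND" in list_ops)
print("YIELD_VALUE in genexp:", "YIELD_VALUE" in gen_ops)
print("SETUP_LOOP in list comp:", "SETUP_LOOP" in list_ops)
```

Nothing is executed here, only compiled, so seq does not need to exist.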

I am curious about how many places in the library adding break to these 
would make a difference. If there aren't any, or only a few, then it's 
probably not needed.  But then again, maybe it's worth a good look before 
dismissing it.
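
One way to get a first estimate is to scan source files for loops shaped like case 3. This is only a sketch: the helper names and the SAMPLE text are invented, and it matches just the narrowest pattern (a for loop whose body is "if cond: break" followed by a single .append call), so it would undercount real candidates.

```python
import ast


def is_append_call(stmt):
    """True for a statement of the form `something.append(value)`."""
    return (
        isinstance(stmt, ast.Expr)
        and isinstance(stmt.value, ast.Call)
        and isinstance(stmt.value.func, ast.Attribute)
        and stmt.value.func.attr == "append"
    )


def is_if_break(stmt):
    """True for `if cond: break` with no else branch."""
    return (
        isinstance(stmt, ast.If)
        and not stmt.orelse
        and len(stmt.body) == 1
        and isinstance(stmt.body[0], ast.Break)
    )


def find_candidates(source):
    """Return line numbers of for loops shaped like case 3."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.For)
            and len(node.body) == 2
            and is_if_break(node.body[0])
            and is_append_call(node.body[1])
        ):
            hits.append(node.lineno)
    return hits


SAMPLE = """\
out = []
for x in data:
    if x > 50:
        break
    out.append(x)
for y in data:
    out.append(y)
"""

print(find_candidates(SAMPLE))  # -> [2]
```

Running find_candidates over each .py file in a library checkout would give a rough count to argue from.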

Cheers,
    Ron


(* dis.dis seems to be adding some extra unneeded lines, a second, dead 
JUMP_ABSOLUTE to the top of the loop for case 2 above, and a "JUMP_FORWARD 
0" in the third case.  Seems odd, but these don't affect what we are 
talking about here.)
