[Python-ideas] for/except/else
Wolfgang Maier
wolfgang.maier at biologie.uni-freiburg.de
Thu Mar 2 06:06:15 EST 2017
On 02.03.2017 06:46, Nick Coghlan wrote:
> On 1 March 2017 at 19:37, Wolfgang Maier
> <wolfgang.maier at biologie.uni-freiburg.de> wrote:
>
> Now here's the proposal: allow an except (or except break) clause to
> follow for/while loops that will be executed if the loop was
> terminated by a break statement.
>
> Now while it's possible that Nick had a good reason not to do so,
>
>
> I never really thought about it, as I only use the "else:" clause for
> search loops where there aren't any side effects in the "break" case
> (other than the search result being bound to the loop variable), so
> while I find "except break:" useful as an explanatory tool, I don't have
> any practical need for it.
>
> I think you've made as strong a case for the idea as could reasonably be
> made :)
>
> However, Steven raises a good point that this would complicate the
> handling of loops in the code generator a fair bit, as it would add up
> to two additional jump targets in cases wherever the new clause was used.
>
> Currently, compiling loops only needs to track the start of the loop
> (for continue), and the first instruction after the loop (for break).
> With this change, they'd also need to track:
>
> - the start of the "except break" clause (for break when the clause is used)
> - the start of the "else" clause (for the non-break case when both
> trailing clauses are present)
>
I think you could get away with only one additional jump target as I
showed in my previous reply to Steven. The heavier burden would be on
the parser, which would have to distinguish the existing and the two new
loop variants (loop with except clause, loop with except and else
clause) but, anyway, that's probably not really the point.
What weighs heavier, I think, is your design argument.
> The design level argument against adding the clause is that it breaks
> the "one obvious way" principle, as the preferred form for search loops
> look like this:
>
>     for item in iterable:
>         if condition(item):
>             break
>     else:
>         # Else clause either raises an exception or sets a default value
>         item = get_default_value()
>
>     # If we get here, we know "item" is a valid reference
>     operation(item)
>
> And you can easily switch the `break` out for a suitable `return` if you
> move this into a helper function:
>
>     def find_item_of_interest(iterable):
>         for item in iterable:
>             if condition(item):
>                 return item
>         # The early return means we can skip using "else"
>         return get_default_value()
>
> Given that basic structure as a foundation, you only switch to the
> "nested side effect" form if you have to:
>
>     for item in iterable:
>         if condition(item):
>             operation(item)
>             break
>     else:
>         # Else clause neither raises an exception nor sets a default value
>         condition_was_never_true(iterable)
>
> This form is generally less amenable to being extracted into a reusable
> helper function, since it couples the search loop directly to the
> operation performed on the bound item, whereas decoupling them gives you
> a lot more flexibility in the eventual code structure.
>
> The proposal in this thread then has the significant downside of only
> covering the "nested side effect" case:
>
>     for item in iterable:
>         if condition(item):
>             break
>     except break:
>         operation(item)
>     else:
>         condition_was_never_true(iterable)
>
> While being even *less* amenable to being pushed down into a helper
> function (since converting the "break" to a "return" would bypass the
> "except break" clause).
I'm actually not quite buying this last argument. If you wanted to
refactor this to "return" instead of "break", you could simply put the
return into the except break block. In many real-world situations with
multiple breaks from a loop this could actually make things easier
instead of worse.
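
Since "except break" is only proposed syntax, this refactoring can only be approximated in current Python. A minimal sketch, assuming a flag variable stands in for the new clause; find_and_handle and its on_found/on_missing callbacks are hypothetical names, not taken from the thread:

```python
def find_and_handle(iterable, condition, on_found, on_missing):
    """Emulate the proposed for / except break / else with a flag."""
    broke = False
    item = None
    for item in iterable:
        if condition(item):
            broke = True
            break
    if broke:
        # Stands in for the body of the proposed "except break:" clause;
        # the return lives here instead of replacing the break itself.
        return on_found(item)
    # Stands in for the body of the "else:" clause.
    return on_missing()
```

With several break statements in one loop, each of them only has to set the flag, and the shared follow-up code (including the return) appears once.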
Personally, the "nested side effect" form makes me uncomfortable every
time I use it because the side effects on breaking or not breaking the
loop don't end up at the same indentation level and not necessarily
together. However, I'm gathering from the discussion so far that not too
many people are thinking like me about this point, so maybe I should
simply adjust my mind-set.
All that said, this is a very nice abstract view on things! I really
learned quite a bit from this, thank you :)
As always though, reality can be expected to be quite a bit more
complicated than theory so I decided to check the stdlib for real uses
of break. This is quite a tedious task since break is used in many
different ways and I couldn't come up with a good automated way of
classifying them. So what I did is just go through stdlib code (in
reverse alphabetical order) containing the break keyword and put it into
categories manually. I only got up to socket.py before losing my
enthusiasm, but here's what I found:
- overall I looked at 114 code blocks that contain one or more breaks
- 84 of these are trivial use cases that simply break out of a while
True block or terminate a while/for loop prematurely (no use for any
follow-up clause there)
- 8 more are causing a side-effect before a single break, and it would
be pointless to put this into an except break clause
- 3 more cause different, non-redundant side-effects before different
breaks from the same loop and, obviously, an except break clause would
not help them either
=> So the vast majority of breaks need *neither* an except break *nor*
an else clause, but that's just as expected.
Of the remaining 19 non-trivial cases
- 9 are variations of your classical search idiom above, i.e., there's
an else clause there and nothing more is needed
- 6 are variations of your "nested side-effects" form presented above
with debatable (see above) benefit from except break
- 2 do not use an else clause currently, but have multiple breaks that
do partly redundant things that could be combined in a single except
break clause
- 1 is an example of breaking out of two loops; from sre_parse._parse_sub:
[...]
    # check if all items share a common prefix
    while True:
        prefix = None
        for item in items:
            if not item:
                break
            if prefix is None:
                prefix = item[0]
            elif item[0] != prefix:
                break
        else:
            # all subitems start with a common "prefix".
            # move it out of the branch
            for item in items:
                del item[0]
            subpatternappend(prefix)
            continue # check next one
        break
[...]
This could have been written as:
[...]
    # check if all items share a common prefix
    while True:
        prefix = None
        for item in items:
            if not item:
                break
            if prefix is None:
                prefix = item[0]
            elif item[0] != prefix:
                break
        except break:
            break
        # all subitems start with a common "prefix".
        # move it out of the branch
        for item in items:
            del item[0]
        subpatternappend(prefix)
[...]
- finally, 1 is a complicated break dance to achieve something that
clearly would have been easier with except break; from typing.py:
[...]
    def __subclasscheck__(self, cls):
        if cls is Any:
            return True
        if isinstance(cls, GenericMeta):
            # For a class C(Generic[T]) where T is co-variant,
            # C[X] is a subclass of C[Y] iff X is a subclass of Y.
            origin = self.__origin__
            if origin is not None and origin is cls.__origin__:
                assert len(self.__args__) == len(origin.__parameters__)
                assert len(cls.__args__) == len(origin.__parameters__)
                for p_self, p_cls, p_origin in zip(self.__args__,
                                                   cls.__args__,
                                                   origin.__parameters__):
                    if isinstance(p_origin, TypeVar):
                        if p_origin.__covariant__:
                            # Covariant -- p_cls must be a subclass of p_self.
                            if not issubclass(p_cls, p_self):
                                break
                        elif p_origin.__contravariant__:
                            # Contravariant. I think it's the opposite. :-)
                            if not issubclass(p_self, p_cls):
                                break
                        else:
                            # Invariant -- p_cls and p_self must equal.
                            if p_self != p_cls:
                                break
                    else:
                        # If the origin's parameter is not a typevar,
                        # insist on invariance.
                        if p_self != p_cls:
                            break
                else:
                    return True
                # If we break out of the loop, the superclass gets a chance.
        if super().__subclasscheck__(cls):
            return True
        if self.__extra__ is None or isinstance(cls, GenericMeta):
            return False
        return issubclass(cls, self.__extra__)
[...]
which could be rewritten as:
[...]
    def __subclasscheck__(self, cls):
        if cls is Any:
            return True
        if isinstance(cls, GenericMeta):
            # For a class C(Generic[T]) where T is co-variant,
            # C[X] is a subclass of C[Y] iff X is a subclass of Y.
            origin = self.__origin__
            if origin is not None and origin is cls.__origin__:
                assert len(self.__args__) == len(origin.__parameters__)
                assert len(cls.__args__) == len(origin.__parameters__)
                for p_self, p_cls, p_origin in zip(self.__args__,
                                                   cls.__args__,
                                                   origin.__parameters__):
                    if isinstance(p_origin, TypeVar):
                        if p_origin.__covariant__:
                            # Covariant -- p_cls must be a subclass of p_self.
                            if not issubclass(p_cls, p_self):
                                break
                        elif p_origin.__contravariant__:
                            # Contravariant. I think it's the opposite. :-)
                            if not issubclass(p_self, p_cls):
                                break
                        else:
                            # Invariant -- p_cls and p_self must equal.
                            if p_self != p_cls:
                                break
                    else:
                        # If the origin's parameter is not a typevar,
                        # insist on invariance.
                        if p_self != p_cls:
                            break
                except break:
                    # If we break out of the loop, the superclass gets a chance.
                    if super().__subclasscheck__(cls):
                        return True
                    if self.__extra__ is None or isinstance(cls, GenericMeta):
                        return False
                    return issubclass(cls, self.__extra__)
                return True
[...]
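
The sre_parse example above can be tried out in current Python. A self-contained sketch, assuming plain lists stand in for the parser's items and a flag stands in for the proposed "except break: break"; both function names are hypothetical, and the guard for an empty items list is an addition that the original code does not need:

```python
def strip_common_prefix_else(items):
    """Current idiom: for/else plus continue/break to leave two loops."""
    if not items:
        return []
    common = []
    while True:
        prefix = None
        for item in items:
            if not item:
                break
            if prefix is None:
                prefix = item[0]
            elif item[0] != prefix:
                break
        else:
            # All items start with the same element: move it out.
            for item in items:
                del item[0]
            common.append(prefix)
            continue  # check next one
        break
    return common


def strip_common_prefix_flag(items):
    """Same logic, with a flag emulating "except break: break"."""
    if not items:
        return []
    common = []
    while True:
        prefix = None
        broke = False
        for item in items:
            if not item:
                broke = True
                break
            if prefix is None:
                prefix = item[0]
            elif item[0] != prefix:
                broke = True
                break
        if broke:  # stands in for "except break: break"
            break
        for item in items:
            del item[0]
        common.append(prefix)
    return common
```

Both variants remove the shared leading elements in place and return them; the flag version keeps the "common prefix" code at the loop's own indentation level, which is the effect the proposed clause is after.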
My summary: I do see use-cases for the except break clause, but,
admittedly, they are relatively rare and may not be worth the hassle of
introducing new syntax.