PEP: statements in control structures (Re: Conditional Expressions don't solve the problem)
Huaiyu Zhu
huaiyu at gauss.almadan.ibm.com
Wed Oct 17 17:37:46 EDT 2001
I've been writing this for several evenings. It's not polished, but IMHO it
solves most of the problems discussed in this thread. So here it goes. :-)
PEP: Statements in Control Structures
1. INTRODUCTION:
One long-standing complaint about Python syntax is that it distinguishes
expressions from statements and does not allow statements in the
condition part of flow control structures, so it is not allowed to write
    while x = next(): process(x)
    if x = next(): process(x)
This is to avoid troublesome bugs like
    if x = 0: ...
when the programmer actually wanted
    if x == 0: ...
However, this restriction of not allowing statements in conditions also
has its own costs, mainly increased verbosity, which sometimes itself
leads to other subtle bugs. This issue is not an artificial one, because
the current control-flow structures do not represent the most general
case naturally.
In this proposal we present an extended syntax of control structures
that allows statements before conditions, without the risk of problems
associated with mixing statements with expressions. The new syntax is
    if ( stmt ; )* expr : suite
    ( elif ( stmt ; )* expr : suite )*
    [ else : suite ]
    while ( stmt ; )* expr : suite
    [ else : suite ]
Compared with Python's current syntax
    if expr : suite
    ( elif expr : suite )*
    [ else : suite ]
    while expr : suite
    [ else : suite ]
the new syntax allows simple statements in control structures just
before the conditions, separated by ";".
Essentially the same syntax was proposed by Kevin Digweed [1] in 1999,
which received some favorable comments but was subsequently lost in a
discussion with over-generalization (see the relevant thread). This
author, unaware of the earlier discussion, proposed it independently in
2000 [2]. Reference [1] was recently mentioned by Hamish Lawson [3].
2. PROPOSAL:
The extension allows zero or more simple statements separated by ";" to
be placed between the keywords "while", "if" and "elif" and their
corresponding conditions. These statements are executed in sequence at
the point just before the condition expression is to be evaluated.
In the following,
    statements := (simple_statement ";" )* simple_statement
In other words, they are what can be written in one line separated by
";" in the current syntax.
2.1. The if-elif-else structure is extended to the following
    "if" [ statements ";" ] expression ":" statements
    ( "elif" [ statements ";" ] expression ":" statements )*
    [ "else:" statements ]
The semantics of
    if stmts1; expr1:
        stmts2
    elif stmts3; expr2:
        stmts4
    elif stmts5; expr3:
        stmts6
    else:
        stmts7
is equivalent to the semantics of
    stmts1
    if expr1:
        stmts2
    else:
        stmts3
        if expr2:
            stmts4
        else:
            stmts5
            if expr3:
                stmts6
            else:
                stmts7
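To see the expansion at work, here is a minimal sketch in present-day Python 3 (an assumption, since the proposal predates it). The function lookup_chain and its primary and fallback dictionaries are hypothetical names chosen only for illustration; the comments map each line to the stmts/expr placeholders used above.

```python
def lookup_chain(key, primary, fallback):
    """Nested form that a two-branch 'if stmts; expr' chain would expand to."""
    trace = []
    # stmts1: always executed
    value = primary.get(key)
    trace.append("primary")
    if value is not None:            # expr1
        return value, trace          # stmts2
    else:
        # stmts3: executed only because expr1 was false
        value = fallback.get(key)
        trace.append("fallback")
        if value is not None:        # expr2
            return value, trace      # stmts4
        else:
            return None, trace       # final else branch
```

The trace list records which of the "statements before the condition" actually ran, which makes it visible that the second lookup happens only when the first condition fails.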
2.2. The while-else structure is extended to the following
    "while" [ statements ";" ] expression ":" statements
    [ "else:" statements ]
The semantics of
    while stmts1; expr1:
        stmts2
        if expr2: break
        stmts3
    else:
        stmts4
is equivalent to the semantics of
    __hidden_variable = 0
    while 1:
        stmts1
        if not expr1: break
        stmts2
        if expr2:
            __hidden_variable = 1
            break
        stmts3
    if not __hidden_variable:
        stmts4
    del __hidden_variable
where __hidden_variable is a variable not used in this block.
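The expansion can be exercised in present-day Python 3 (an assumption; the proposal itself targets the 2001 language). In this sketch the names find_first, is_end and is_match are hypothetical, the flag broke_out plays the role of __hidden_variable, and the input is assumed to contain an item for which is_end is true.

```python
def find_first(items, is_end, is_match):
    """Flag-based equivalent of a 'while stmts; expr: ... else: ...' loop."""
    it = iter(items)
    broke_out = False                # stands in for __hidden_variable
    result = None
    while True:
        x = next(it)                 # stmts1 (assumes an end marker exists)
        if is_end(x):                # "if not expr1: break"
            break
        if is_match(x):              # expr2
            broke_out = True
            result = x
            break
    if not broke_out:
        result = "not found"         # stmts4: the body of the else clause
    return result
```

Only the second break sets the flag, so the code standing in for the else clause runs exactly when the loop ended because its condition failed.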
3. RATIONALE:
3.1. General looping structure.
A general looping structure, commonly known as loop-and-a-half, looks like
    A
    loop:
        B
        if not C: break
        D
    E
If B is empty, this can be represented in current Python as
    A
    while C:
        D
    E
The new syntax allows the more general case even when B is not empty
    A
    while B; C:
        D
    E
Putting B between "while" and the condition is not just syntactic
sugar, as will be shown in the following.
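For reference, this is how the loop-and-a-half pattern is usually spelled in current syntax. The sketch is written for Python 3 (an assumption), and total_until_blank is a hypothetical function; the comments mark the parts A to E of the schema above.

```python
import io


def total_until_blank(stream):
    """Sum integer lines until a blank line or the end of input."""
    total = 0                                  # A
    while True:
        line = stream.readline().strip()       # B: action before the test
        if not line:                           # if not C: break
            break
        total += int(line)                     # D
    return total                               # E
```

For example, total_until_blank(io.StringIO("1\n2\n\n5\n")) stops at the blank line and never reads the 5.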
3.2. Better break-else interaction:
Python allows an else-clause for the while statement. The naive way
of writing the general loop-and-a-half in current Python interferes
with the else clause. For example,
    while x = next(); not x.is_end:
        y = process(x)
        if y.is_what_we_are_looking_for(): break
    else:
        raise "not found"
cannot be written in this naive version:
    while 1:
        x = next()
        if x.is_end: break
        y = process(x)
        if y.is_what_we_are_looking_for(): break
    else:
        raise "not found"
This is because there are two breaks with different meanings: only the
second one should suppress the else clause. The fully equivalent version
in current syntax has to use one extra variable to keep track of the
breaks that affect the else clause:
    __hidden_variable = 0
    while 1:
        x = next()
        if x.is_end: break
        y = process(x)
        if y.is_what_we_are_looking_for():
            __hidden_variable = 1
            break
    if not __hidden_variable:
        raise "not found"
    del __hidden_variable
The improvement of the new syntax is quite obvious.
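The failure of the naive version can be observed directly. The following Python 3 sketch (an assumption, as are the names naive_search and else_ran, and the use of None as the end marker) records whether the else clause ever runs.

```python
def naive_search(items, is_match):
    """Naive loop-and-a-half: every exit is a break, so else never runs."""
    it = iter(items)
    else_ran = False
    found = None
    while True:
        x = next(it, None)       # None marks the end of input (assumption)
        if x is None:
            break                # end of data, but still a break
        if is_match(x):
            found = x
            break                # successful search
    else:
        else_ran = True          # unreachable: the loop exits only via break
    return found, else_ran
```

Whether or not a match exists, else_ran stays False, so the "not found" action attached to the else clause would be silently skipped.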
3.3. Flatter conditional structures:
A general nested "if-else-and-a-half" structure is like
    A
    if B:
        C
    else:
        D
        if E:
            F
        else:
            G
            if H:
                I
            else:
                ...
which can be written in the new syntax as
    if A; B:
        C
    elif D; E:
        F
    elif G; H:
        I
    else:
        ...
The advantage of this syntax pattern is similar to that of "elif"
itself, namely to transform a nested branching structure into a flat
branching structure. Using a flat structure in place of a nested
structure is in keeping with one of the good traditions of Python.
4. EXAMPLES:
4.1. Action needed before condition:
    while x = get_next(); x:
        whatever(x)
4.2. Condition does not need to be a method of an object in assignment:
    while line = readline(); 'o' in line:
        line = process(line)
        if 'e' in line: break
    else:
        print "never met break"
4.3. Has similar power to C's for statement
    for (start; action, !end; incr) {
        do_something;
        if (cond) break;
        do_other;
    }
can be written as
    start
    while action; not end:
        do_something
        if cond: break
        do_other
        incr
4.4. More complex example:
    if x = dict[a]; x: proc1(x)
    elif x = next(x); x.ok(): proc2(x)
    elif x.change(); property(x): proc3(x)
    ...
The equivalent in the current syntax is not flat:
    x = dict[a]
    if x: proc1(x)
    else:
        x = next(x)
        if x.ok(): proc2(x)
        else:
            x.change()
            if property(x): proc3(x)
            else:
                ...
Alternatively, it requires at least a two level hack with "while":
    while 1:
        x = dict[a]
        if x:
            proc1(x)
            break
        x = next(x)
        if x.ok():
            proc2(x)
            break
        x.change()
        if property(x):
            proc3(x)
            break
        ...
        break
It is seen that the new syntax removes a substantial amount of clutter,
thereby increasing readability and expressiveness.
5. ISSUES:
5.1. Syntax errors:
This structure is safe against single typing errors:
- missing colon would be detected at newline because of keywords
- missing last expression will be detected at the colon, because
the condition must be an expression
- typing = instead of == in the expression will be detected
- typing == instead of = in a statement will be detected
- typing ; instead of : will be detected as a missing :
- typing : instead of ; will be detected as multiple :
5.2. Obfuscation:
The new syntax does not diminish the distinction between statements
and expressions.
Specifically, the structure
    if S1; S2; E: S
is built upon statements and expressions according to the syntax of
"if". It does not mean that (S1; S2; E) itself becomes a magical
super-expression. The same is true for "elif" and "while".
Consequently, without changing the syntax of "for", the following is
meaningless and not allowed
    for S1; S2; a in B: C
Since the change is only in the syntax of "if", "elif" and "while",
not in the fundamentals of expressions and statements, there is not
much more chance of obfuscation than existing syntax.
5.3. Compatibility:
This extension is fully backward compatible, because the extended
syntax is currently invalid syntax.
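This claim can be checked mechanically with the built-in compile(), which parses source text without running it. The sketch below is for Python 3 (an assumption), the helper is_valid_syntax is a hypothetical name, and the result reflects whichever interpreter executes it rather than the 2001 interpreter.

```python
def is_valid_syntax(source):
    """Return True if the given source text compiles as a Python module."""
    try:
        # Compiling only parses the text; the names used in it need not exist.
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# A plain comparison in the condition is accepted by the parser.
accepts_comparison = is_valid_syntax("if x == 0: pass")
# Statements before the condition are rejected by the current grammar.
accepts_extended_if = is_valid_syntax("if x = f(); x: pass")
accepts_extended_while = is_valid_syntax("while x = f(); x: pass")
```

Because the extended forms are rejected today, no existing program can change meaning if the grammar is later widened to accept them.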
5.4. Generality:
In a comment on Kevin's proposal, Guido mentioned [4] that this is
not general enough to allow short-circuit conditions:
    while (x = f(); x) and (x.y = g(); x.y):
        "whatever"
which is equivalent to the following (assuming no trailing "else"):
    while 1:
        x = f()
        if not x: break
        else:
            x.y = g()
            if not x.y: break
        "whatever"
With the extension of "if" and "elif" this could be written in
quite readable form:
    while 1:
        if x = f(); not x: break
        elif x.y = g(); not x.y: break
However, there is indeed a complication when there is both a "break"
and an "else". A general solution would either require allowing
super expressions like (S1; S2; E) or a new keyword, such as
"until", or two flavors of "break". It is unclear whether such
situations are important enough for such more radical changes.
Note that the similar problem with "if" is already solved:
    if ((x = f.readline(); x) and
        (y = f.readline(); y)):
        print x, y
can be written as
    if x = f.readline(); not x: pass
    elif y = f.readline(); not y: pass
    else:
        print x, y
6. ALTERNATIVES:
We show that the following alternatives have more problems than the
proposed extension.
6.1. Allowing special assignment in conditions. For example:
    while a:=next(): process(a)
This has many problems:
6.1.1. It does not allow other actions before conditions, such as
    while x.get_next(); x: process(x)
So it does not really solve the problem. On the other hand, if
arbitrary statements were allowed in an expression with some
syntactical trick, then it would completely blur the distinction
between statement and expression.
6.1.2. It does not allow for arbitrary expressions in condition, like
    while x=next(); some_property_of(x):
        ...
The proposal
    if some_property_of(x:=next()):
        ...
is ugly, and if it were to become a general rule, it would allow
much obfuscation, such as
    a = f(b:=3+g(not c:=4) - c) * b(c)
6.2. Regarding (S1; S2; S3; E) as an expression that can appear in other
places. This is more general than the current proposal and solves one
more problem: short-circuit condition in "while" loop with "break"
statement and "else" clause (see 5.4 above).
However, it appears to be more general than necessary, and easily
leads to obfuscations such as
    x = f((y = (x = a; x = {x.next(): x.next()}; x); y[0].next()))
The problems associated with statements in expressions appear to far
outweigh the benefit in the one particular example of 5.4.
6.3. Using iterators: In Python 2.2, it is possible to write
    for line in file:
        if line=='end': break
        process_line(line)
in place of
    while line=file.readline(); line != 'end':
        process_line(line)
However, this does not solve the problem we are considering
completely. It is more suitable for objects that are naturally used
(and reused) in iterations so that it is profitable to create
iterators for them. It is not practical to define an iterator for
every action that might go before a condition:
    while char=file.readline()[3]; char != 'x':
        process(char)
    while string = raw_input(prompt); not string.startswith("n"):
        process(string)
It does not solve the nested else-if problem, either.
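To make the cost concrete, here is a sketch of the dedicated iterator that the first of these loops would need in current syntax. It is written for Python 3 (an assumption); fourth_chars and collect_until_x are hypothetical helpers, and stopping quietly at a line shorter than four characters is likewise an assumption, since the loop above does not say what should happen then.

```python
import io


def fourth_chars(stream):
    """Yield line[3] for each line; stop at the first line that is too short."""
    for line in stream:
        if len(line) <= 3:
            return               # assumption: a short line ends the iteration
        yield line[3]


def collect_until_x(stream):
    """Gather the fourth character of each line until an 'x' is seen."""
    collected = []
    for char in fourth_chars(stream):
        if char == "x":
            break
        collected.append(char)
    return collected
```

A separate generator of this kind would be needed for each distinct action placed before a condition, which is the overhead the paragraph above objects to.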
6.4. Conditional expression, like (if a then b else c). This solves a
completely different problem. It is not related to this proposal.
6.5. Repeat action before and after loop:
    line = file.readline()
    while line:
        do_something()
        line = file.readline()
This is a maintenance liability - it is easy to go out of sync.
This does not apply to the "if" structure, either.
6.6. An alternative syntax might be
    val = dict1[key1]; if val: process1(val)
    else: val = dict2[key2]; if val: process2(val)
    else: val = dict3[key3]; if val: process3(val)
    ...
This looks more consistent and logical. One problem of putting
keyword "if" in the middle of a line is that it is less prominent,
although syntax-colored editors do help.
However, similar syntax is not right for while-loops:
    A; while B: C
gives the impression that A is only done once. So this alternative
is not tenable.
7. IMPLEMENTATION:
I do not know enough about the possible implementation. It does not
appear to have fundamental difficulties. The changes are likely to
occur at places where statements are parsed.
8. SUMMARY
The extension allowing a sequence of simple statements between the keywords
"while", "if", "elif" and their corresponding condition expressions
solves several problems. It makes code more readable in many
situations, without risk of obfuscation in other situations.
It compares favorably to several alternatives, both existing and
proposed, because, essentially, the original problem is about statements
before condition expressions in control structures. It is not a demand
to blur the distinction between statements and expressions, and should
not be solved in such a fashion.
9. REFERENCES
[1] http://groups.google.com/groups?hl=en&selm=78naok%242ed%241%40nnrp1.
dejanews.com
[2] http://www.geocities.com/huaiyu_zhu/python/ififif.txt
[3] http://mail.python.org/pipermail/python-list/2001-October/068332.html
[4] http://groups.google.com/groups?q=g:thl2213523209d&hl=en&selm=199901271725.
MAA14695%40eric.cnri.reston.va.us
(The google references are longer than my news client allows, so they are
split to multiple lines.)