[Python-ideas] Is this PEP-able? for X in ListY while conditionZ:
MRAB
python at mrabarnett.plus.com
Tue Jul 2 01:34:32 CEST 2013
On 01/07/2013 23:44, Oscar Benjamin wrote:
> On 1 July 2013 21:29, David Mertz <mertz at gnosis.cx> wrote:
>> However, I see the point made by a number of people that the 'while' clause
>> has no straightforward translation into an unrolled loop, and is probably
>> ruled out on that basis.
>
> My thought (in keeping with the title of the thread) is that the comprehension
>
>     data = [x for y in stuff while z]
>
> would unroll as the loop
>
>     for y in stuff while z:
>         data.append(x)
>
> which would also be valid syntax and have the obvious meaning. This is
> similar to Nick's suggestion that 'break if' be usable in the body of
> the loop so that
>
>     data = [x for y in stuff; break if not z]
>
> would unroll as
>
>     for y in stuff:
>         break if not z
>         data.append(x)
>
> Having a while clause on for loops is not just good because it saves a
> couple of lines but because it clearly separates the flow control from
> the body of the loop (another reason I dislike 'break if'). In other
> words I find the flow of the loop
>
>     for p in primes() while p < 100:
>         print(p)
>
> easier to understand (immediately) than
>
>     for p in primes():
>         if p >= 100:
>             break
>         print(p)
>
> These are just trivially small examples. As the body of the loop grows
> in complexity the readability benefit of moving 'if not z: break' into
> the top line becomes more significant.
>
> You can get the same separation of concerns using takewhile at the
> expense of a different kind of readability
>
>     for p in takewhile(lambda p: p < 100, primes()):
>         print(p)
>
> However there is another problem with using takewhile in for loops
> which is that it discards an item from the iterable. Imagine parsing a
> file such as:
>
> csvfile = '''# data.csv
> # This file begins with an unspecified number of header lines.
> # Each header line begins with '#'.
> # I want to keep these lines but need to parse them separately.
> # The first non-comment line contains the column headers
> x y z
> 1 2 3
> 4 5 6
> 7 8 9'''.splitlines()
>
> You can do
>
>     csvfile = iter(csvfile)
>     headers = []
>     for line in csvfile:
>         if not line.startswith('#'):
>             break
>         headers.append(line[1:].strip())
>     fieldnames = line.split()
>     for line in csvfile:
>         yield {name: int(val) for name, val in zip(fieldnames, line.split())}
>
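The loop above uses a bare 'yield', so it presumably lives inside a generator function. A runnable sketch under that assumption follows; the function name parse_rows, the headers argument, and the shortened sample data are hypothetical and not from the quoted message.

```python
def parse_rows(lines, headers):
    """Hypothetical wrapper: fill `headers` with comment lines, yield data rows."""
    lines = iter(lines)
    for line in lines:
        if not line.startswith('#'):
            break  # 'line' now holds the column-header line
        headers.append(line[1:].strip())
    else:
        return  # input had no non-comment line, so there is nothing to parse
    fieldnames = line.split()
    for line in lines:
        yield {name: int(val) for name, val in zip(fieldnames, line.split())}


csvfile = '''# data.csv
# Each header line begins with '#'.
x y z
1 2 3
4 5 6'''.splitlines()

headers = []
rows = list(parse_rows(csvfile, headers))
print(headers)
print(rows)
```

Because the break leaves 'line' bound to the first non-comment line, the column names are still available after the first loop.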
> However if you use takewhile like
>
>     for line in takewhile(lambda line: line.startswith('#'), csvfile):
>         headers.append(line[1:].strip())
>
> then after the loop 'line' holds the last comment line. The discarded
> column header line is gone and cannot be recovered; takewhile is
> normally only used when the entire remainder of the iterator is to be
> discarded.
>
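The loss of the first failing item can be checked directly. This is a small sketch with made-up sample lines, not data from the quoted message:

```python
from itertools import takewhile

lines = iter(['# a', '# b', 'x y z', '1 2 3'])

# takewhile pulls 'x y z' from the iterator to test it, then drops it.
comments = list(takewhile(lambda s: s.startswith('#'), lines))

# Whatever is left in the underlying iterator starts after the dropped item.
remaining = list(lines)

print(comments)   # ['# a', '# b']
print(remaining)  # ['1 2 3']
```

The 'x y z' line appears in neither list, which is the behaviour described above.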
> I would propose that
>
>     for line in csvfile while line.startswith('#'):
>         headers.append(line)
>
> would result in 'line' referencing the item that failed the while predicate.
>
So:
    for item in generator while is_true(item):
        ...
is equivalent to:
    for item in generator:
        if not is_true(item):
            break
        ...
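Since the 'while' form is only proposed syntax, the equivalence can be tried in current Python with the explicit break and, for comparison, with itertools.takewhile. A minimal sketch; the squares generator and the is_small predicate are made-up stand-ins for 'generator' and 'is_true':

```python
from itertools import count, takewhile


def squares():
    """Hypothetical infinite generator: 1, 4, 9, 16, ..."""
    for n in count(1):
        yield n * n


def is_small(item):
    """Hypothetical predicate standing in for is_true()."""
    return item < 50


# The unrolled form: stop at the first item that fails the predicate.
with_break = []
for item in squares():
    if not is_small(item):
        break
    with_break.append(item)

# The same items collected via takewhile.
with_takewhile = list(takewhile(is_small, squares()))

print(with_break)      # [1, 4, 9, 16, 25, 36, 49]
print(with_takewhile)  # [1, 4, 9, 16, 25, 36, 49]
print(item)            # 64, the item that failed the predicate
```

In the break version 'item' still refers to the failing value after the loop, which matches what Oscar proposes for the 'while' clause.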
By similar reasoning(?):
    for item in generator if is_true(item):
        ...
is equivalent to:
    for item in generator:
        if not is_true(item):
            continue
        ...
If we have one, shouldn't we also have the other?
If only comprehensions have the 'if' form (IIRC, it has already been
rejected for multi-line 'for' loops), then shouldn't only
comprehensions have the 'while' form?
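For reference, here is a rough sketch of how the two behaviours differ in current Python, where comprehensions accept 'if' but not 'while'. The sample list is made up, and takewhile is used only as a stand-in for the proposed 'while' clause:

```python
from itertools import takewhile

data = [1, 2, 9, 3, 4]

# Existing 'if' clause: skips failing items and keeps going.
filtered = [x for x in data if x < 5]

# The same thing unrolled with 'continue'.
via_continue = []
for x in data:
    if not x < 5:
        continue
    via_continue.append(x)

# Stand-in for a 'while' clause: stops at the first failing item.
truncated = [x for x in takewhile(lambda x: x < 5, data)]

print(filtered)      # [1, 2, 3, 4]
print(via_continue)  # [1, 2, 3, 4]
print(truncated)     # [1, 2]
```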