for / while else doesn't make sense
Steven D'Aprano
steve at pearwood.info
Sun Jun 5 02:35:39 EDT 2016
On Sun, 5 Jun 2016 01:29 pm, Lawrence D’Oliveiro wrote:
> On Saturday, June 4, 2016 at 11:37:18 PM UTC+12, Ned Batchelder wrote:
>> On Friday, June 3, 2016 at 11:43:33 PM UTC-4, Lawrence D’Oliveiro wrote:
>> > On Saturday, June 4, 2016 at 3:00:36 PM UTC+12, Steven D'Aprano wrote:
>> > > You can exit a loop because you have run out of items to process, or
>> > > you can exit the loop because a certain condition has been met.
>> >
>> > But why should they be expressed differently?
>> >
>> >     item_iter = iter(items)
>> >     while True :
>> >         item = next(item_iter, None)
>> >         if item == None :
>> >             break
>> >         if is_what_i_want(item) :
>> >             break
>> >     #end while
>>
>> Do you actually write loops like this?
>
> Is that a non-trolling question? Yes. All the time.
Really? Well, you'd fail my code review, because that code is broken. If
items contains None, your loop will silently end early. That's a bug.
>> If this appeared in a code review, first we'd have a conversation about
>> what this code was meant to do ...
>
> I would hope not.
Clearly. Nevertheless, it's a conversation that needs to be had.
>> ...and then I would ask, "Why aren't you using a for loop?"
>
> ... and then I would ask, “Didn’t you read my previous postings where I
> pointed out the issues with them?”
I don't think that very many people would agree with you or consider them
problems at all. They're more like features than problems. Your objections
to for-loops feel kind of like "I don't like bread knives because they make
it too easy to slice bread".
Okay, you don't like for-loops, because they make looping a fixed number of
times with an optional early exit too much of a "cognitive burden" for you.
You have my sympathy, but nobody else I've come across in nearly two
decades of Python programming finds them a cognitive burden.
> Here <https://en.wikibooks.org/wiki/Python_Programming/Databases> is
> another example: see the section “Looping on Field Breaks”.
That section was written by you and is not independent confirmation that
others agree with your issues with for-loops.
> A while-True scales gracefully to complex situations like that.
Graceful like a hippopotamus.
I don't know that the situation is complex; your description is pretty clear
and to the point:
    Consider the following scenario: your sales company database has
    a table of employees, and also a table of sales made by each
    employee. You want to loop over these sale entries, and produce
    some per-employee statistics.
but the while loop you have certainly is complex. If I understand your
intent correctly, then I think this is both more elegant and likely faster
than the while loop you use:
# Beware of bugs in the following code:
# I have only proven it is correct, I haven't tested it.
rows = db_iter(
    db = db,
    cmd =
        "select employees.name, sales.amount, sales.date from"
        " employees left join sales on employees.id = sales.employee_id"
        " order by employees.name, sales.date"
)
default = {'total sales': 0.0,
           'number of sales': 0,
           'earliest date': None,
           'latest date': None}
prev_employee_name = None
stats = default.copy()
for (employee_name, amount, date) in rows:
    if employee_name != prev_employee_name:
        if prev_employee_name is not None:
            # Print the previous employee's stats
            report(prev_employee_name, stats)
        # and prepare for the next employee.
        prev_employee_name = employee_name
        stats = default.copy()
    if amount is None:
        # The left join yields a row of NULLs for an employee with no sales.
        continue
    stats['total sales'] += amount
    stats['number of sales'] += 1
    if stats['earliest date'] is None:
        stats['earliest date'] = date
    stats['latest date'] = date
if prev_employee_name is not None:
    report(prev_employee_name, stats)
No breaks needed at all, which makes it much more understandable: you know
instantly from looking at the code that it processes every record exactly
once, then exits.
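
For comparison, the same field-break grouping can be sketched with itertools.groupby from the standard library. This is a swapped-in technique, not the loop discussed above; the function name per_employee_stats and the sample rows are made up, and the rows are assumed to arrive sorted by employee name and then date, as the query's "order by" would give.

```python
from itertools import groupby
from operator import itemgetter


def per_employee_stats(rows):
    """Yield (employee_name, stats) for rows already sorted by employee name."""
    for name, group in groupby(rows, key=itemgetter(0)):
        # Drop the (name, None, None) row a left join gives for no sales.
        sales = [(amount, date) for _, amount, date in group
                 if amount is not None]
        dates = [date for _, date in sales]
        yield name, {
            'total sales': sum(amount for amount, _ in sales),
            'number of sales': len(sales),
            'earliest date': dates[0] if dates else None,
            'latest date': dates[-1] if dates else None,
        }


rows = [  # made-up sample data, ordered by name, then date
    ('Alice', 10.0, '2016-01-02'),
    ('Alice', 5.0, '2016-03-04'),
    ('Bob', None, None),
    ('Carol', 7.5, '2016-02-01'),
]
for name, stats in per_employee_stats(rows):
    print(name, stats)
```

Like the for-loop, this visits every record exactly once and needs no break; groupby takes care of noticing when the name changes.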
But it is a *tiny* bit ugly, due to the need to print the last employee's
statistics after the loop is completed. We can fix that in two ways:
(1) Give up the requirement to print each employee's stats as they are
completed, and print them all at the end; or
(2) Put a sentinel at the end of rows.
The first may not be suitable for extremely large data sets, but it is
especially elegant:
rows = db_iter(...)  # arguments as above
default = {'total sales': 0.0,
           'number of sales': 0,
           'earliest date': None,
           'latest date': None}
stats = {}
for (employee_name, amount, date) in rows:
    # One stats record per employee, created on first sight.
    record = stats.setdefault(employee_name, default.copy())
    if amount is None:
        # The left join yields a row of NULLs for an employee with no sales.
        continue
    record['total sales'] += amount
    record['number of sales'] += 1
    if record['earliest date'] is None:
        record['earliest date'] = date
    record['latest date'] = date
for employee_name in stats:
    report(employee_name, stats[employee_name])
As you now have all the statistics available, you can look for
under-performing or over-performing sales people, run comparisons between
staff, etc.
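
As a rough illustration of such comparisons, the sketch below assumes a dict mapping each employee name to a stats record of the shape used above. The figures are invented, and "below average total sales" is only one possible meaning of under-performing.

```python
# Hypothetical per-employee results: name -> stats record.
stats = {
    'Alice': {'total sales': 15.0, 'number of sales': 2},
    'Bob': {'total sales': 0.0, 'number of sales': 0},
    'Carol': {'total sales': 7.5, 'number of sales': 1},
}


def total(name):
    """Total sales for one employee."""
    return stats[name]['total sales']


best = max(stats, key=total)
worst = min(stats, key=total)
average = sum(total(name) for name in stats) / len(stats)
below_average = sorted(name for name in stats if total(name) < average)

print(best, worst, average, below_average)
```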
Solution (2) using a sentinel gets rid of the need to print anything outside
of the loop by simply ensuring that the very last record is a meaningless
sentinel that can be ignored:
from itertools import chain
rows = db_iter(...)  # arguments as above
default = {'total sales': 0.0,
           'number of sales': 0,
           'earliest date': None,
           'latest date': None}
prev_employee_name = None
stats = default.copy()
# The sentinel is wrapped in a list so that chain yields it as one record.
for (employee_name, amount, date) in chain(rows, [('', 0, None)]):
    if employee_name != prev_employee_name:
        if prev_employee_name is not None:
            # Print the previous employee's stats
            report(prev_employee_name, stats)
        # and prepare for the next employee.
        prev_employee_name = employee_name
        stats = default.copy()
    if amount is None:
        # The left join yields a row of NULLs for an employee with no sales.
        continue
    stats['total sales'] += amount
    stats['number of sales'] += 1
    if stats['earliest date'] is None:
        stats['earliest date'] = date
    stats['latest date'] = date
Again, there are no breaks needed, so you know that every record is
processed exactly once, and all but the last (the sentinel) is printed.
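
One detail worth checking when appending a sentinel: itertools.chain iterates over each of its arguments, so a bare tuple is flattened into its fields, while a one-element list containing the tuple is yielded as a single record. A small sketch with a made-up sample row:

```python
from itertools import chain

rows = [('Alice', 10.0, '2016-01-02')]  # made-up sample row
sentinel = ('', 0, None)

# Passing the tuple itself: chain iterates over its three fields.
flattened = list(chain(rows, sentinel))
# Wrapping it in a list: chain yields the tuple as one final record.
appended = list(chain(rows, [sentinel]))

print(flattened)
print(appended)
```

Only the second form can be unpacked as (employee_name, amount, date) on the last pass through the loop.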
--
Steven