do ... while loop

Tim Peters tim_one at email.msn.com
Sat Oct 16 06:19:24 EDT 1999


[Tim]
>> The usual Python idiom for:
>>
>>     do {
>>         xxx;
>>         yyy;
>>     } while (condition);
>>
>> is simply:
>>
>>     while 1:
>>         xxx
>>         yyy
>>         if not condition:
>>             break

[Gerrit Holl]
> Huh?
> Why don't use the following code:
>
> while condition:
>     xxx
>     yyy
>
> I don't understand the difference.

Then let's see whether your parents do!  Even if they don't program, you may
be amazed at how quickly they grasp the essential difference between these
code snippets:

    while gerrit.feels_like_doing_chores():
        gerrit.do_a_chore()

versus

    while 1:
        gerrit.do_a_chore()
        if not gerrit.feels_like_doing_chores():
            break

There's always at least one more chore to be done, so both snippets are
correct.  The difference-- if you execute these loops once daily --is about
365 chores per year <wink>.
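
To make that difference concrete, here is a minimal sketch in present-day
Python.  The function names and the never_in_the_mood() stand-in are
hypothetical, invented for illustration; they assume a Gerrit who never feels
like doing chores.

```python
def chores_with_top_test(feels_like_it):
    """Exit test at the top: the body may run zero times."""
    done = 0
    while feels_like_it():
        done += 1
    return done


def chores_with_bottom_test(feels_like_it):
    """Exit test at the bottom: the body always runs at least once."""
    done = 0
    while True:
        done += 1
        if not feels_like_it():
            break
    return done


def never_in_the_mood():
    # Hypothetical stand-in for gerrit.feels_like_doing_chores().
    return False


print(chores_with_top_test(never_in_the_mood))     # 0
print(chores_with_bottom_test(never_in_the_mood))  # 1
```

One chore per run is the whole difference, and it shows up only when the
condition is already false on entry.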

There are two common reasons for putting a loop exit somewhere other than
the top, but one reason is much better than the other:

1) You *can't* compute the exit condition at the top; e.g., here you can't
know whether all the lines in the file have been read before you try to read
another one:

    lines = 0
    f = open(somefile)
    while 1:
        if f.readline():
            lines = lines + 1
        else:
            break
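
As a runnable sketch of case 1 in present-day Python: the count_lines()
wrapper and the io.StringIO object standing in for a real file are
assumptions added for illustration, not part of the snippet above.

```python
import io


def count_lines(f):
    """Count lines by reading until readline() returns an empty string."""
    lines = 0
    while True:
        # The only way to learn whether another line exists is to try
        # reading it, so the exit test cannot sit at the top of the loop.
        if f.readline():
            lines += 1
        else:
            break
    return lines


# An in-memory stand-in for open(somefile).
fake_file = io.StringIO("one\ntwo\nthree\n")
print(count_lines(fake_file))  # 3
```

A blank line in the middle of the file still counts, because readline()
returns "\n" for it and only returns "" at end of file.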

2) You *know* the exit condition will be false the first time through the
loop, and it's expensive (or just humiliating <wink>) to check it; e.g.,

    while 1:
        gerrit.do_a_chore()
        if parents.say_ok_after_begging_for_an_hour_to_be_excused(gerrit):
            break

Since #2 is an efficiency trick, and for correctness relies on *knowing*
that the exit condition will be false at the top of the loop, it's not
something to get into the habit of doing (with computers, what you "know" is
eventually wrong -- and if you're wrong only one time in a million, you'll
be wrong hundreds of times every second <wink>).

Here's a story that will bore you to tears:  the FORTRAN language has a loop
construct that looks like

    do i = 1, 10
        print *, i
    enddo

That loop goes around 10 times, like the Python

    for i in range(1, 11):
        print i

Now when FORTRAN was first designed, they forgot to define what happens if
the first number is larger than the second, as in

    do i = 10, 1
        etc

Because it's a tiny bit cheaper (in machine code, on most machine
architectures) to put the "is the loop done yet?" test at the bottom of a
loop than at the top, IBM implemented FORTRAN loops that way, and that meant
*all* loops executed their bodies at least once -- even when the first
number was larger than the second.

Almost everyone else implemented FORTRAN loops the sane way; that is, to
skip the loop body entirely when the first number is larger than the second.
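
The two behaviors can be modeled in Python.  This is only a sketch of the
loop semantics described above, not of any compiler's generated code, and
both function names are hypothetical.

```python
def do_loop_bottom_test(first, last):
    """Model of a DO loop tested at the bottom: the body always runs once."""
    visited = []
    i = first
    while True:
        visited.append(i)   # loop body
        i += 1
        if i > last:        # "is the loop done yet?" asked after the body
            break
    return visited


def do_loop_top_test(first, last):
    """Model of a DO loop tested at the top: the body may run zero times."""
    visited = []
    i = first
    while i <= last:        # asked before the body
        visited.append(i)   # loop body
        i += 1
    return visited


print(do_loop_bottom_test(10, 1))  # [10]
print(do_loop_top_test(10, 1))     # []
```

For "do i = 1, 10" the two models agree; they differ only when the first
number is larger than the second.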

As consequences:  FORTRAN programs were not portable between IBM's and other
machines; grownups wasted many months yelling at each other in FORTRAN
committee meetings trying to repair this mess years later (and the Fortran
77 standard eventually said IBM's way was wrong); and unfortunate compiler
writers (like me <wink>) had to explain this over and over again to angry
IBM users, and implement ridiculous compiler options to make reasonable
compilers act like IBM's.

So that's what you cause when you put a loop exit at the bottom just to save
a nanosecond <wink>.

>> The criminally insane sometimes write the last line instead as:
>>
>>         if not condition: break
>>
>> but putting it on separate lines (as God intended) is key to making the
>> idiom instantly (after you're used to it -- about two days) recognizable
>> for what it is.

> I'll change my code. As far as I remember, this isn't mentioned in Guido's
> style guide; am I right?

There's an old saying on comp.lang.python <ahem>:  "If you meet Guido on the
Internet, kill him".  The rules in Guido's style guide are some of the
habits he believes make his code more readable, reliable, and easier to
change over time.  It's a good start, but it's not exhaustive, and in the
end you have to invent Gerrit's style guide based on what *you've* found
effective.  Which way do you find more readable?  Which way makes it easier
to find the loop exits three months later?  Is there some reason that
finding a loop's exits three months later may be especially valuable?  Or is
it just a waste of time?  You don't get any points for guessing Guido's
answers.  To be fair, you don't get any points for finding your own answers
either.  But at least they're your answers then, instead of a guess about
somebody else's <wink>.

anyone-says-different-is-criminally-insane-ly y'rs  - tim





