Looping [was Re: Python and the need for speed]

Sun Apr 16 12:30:12 EDT 2017

On Sun, 16 Apr 2017 10:06 pm, bartc wrote:

> On 16/04/2017 03:51, Steve D'Aprano wrote:
>> On Sat, 15 Apr 2017 10:17 pm, bartc wrote:
> 
>>> Yes, I'm constantly surprised at this, as such syntax has a very low
>>> cost (in my last compiler, supporting 'while' for example only added 30
>>> lines to the project).
>>
>> That's the advantage of writing your own private language and having no
>> users except for yourself. You get to cut corners. Python has tens of
>> thousands of users, and doesn't have that luxury.
> 
> Here are the lines of code in my C compiler which are necessary to
> support 'while': https://pastebin.com/BYFV7EWr
> 
> (45 lines shown, but there are exactly 30 Loc if blanks are excluded.)
> 
> I'd be interested in knowing why implementing While in Python would need
> significantly more. 

Have you looked at the source code?

https://github.com/python/cpython

> (The 30 Loc figure is with support for loops /in 
> general/ already in place, and is for /adding/ a new loop statement, in
> this case 'while')

What part of *testing* and *documenting* do you not understand?

Do you have any unit tests for your compiler?  How about regression tests --
when you fix a bug, do you write a regression test to ensure it never
creeps back in?

Do you have any documentation for your compiler? Does it include doctests?
Are there any tutorials that beginners can read?

I'm guessing you don't have any of those things. The code snippet you posted
doesn't even have any comments.

Here is the declaration and comment for a regression test in Python's test
suite, checking against the return of a bug in the while statement:

    def test_break_continue_loop(self):
        """This test warrants an explanation. It is a test specifically 
        for SF bugs #463359 and #462937. The bug is that a 'break' 
        statement executed or exception raised inside a try/except inside
        a loop, *after* a continue statement has been executed in that loop,
        will cause the wrong number of arguments to be popped off the stack
        and the instruction pointer reset to a very small number (usually
        0.) Because of this, the following test *must* be written as a
        function, and the tracking vars *must* be function arguments with
        default values. Otherwise, the test will loop and loop.
        """

(I've reformatted the comment as a docstring to make it easier to wrap.)

The test itself is 16 lines of code, plus 8 more lines of explanation for
why the test exists. That's for just one bug. Here's a snippet of another
test, testing that when "while 0" is optimized away, the else clause is
not:

        # Issue1920: "while 0" is optimized away,
        # ensure that the "else" clause is still present.
        x = 0
        while 0:
            x = 1
        else:
            x = 2
        self.assertEqual(x, 2)

Do I need to go on?

As a lone-wolf developer with a user base of exactly one person, perhaps you
don't care about tests, documentation, tutorials, or even comments in your
code. But Python has dozens of developers who have to understand
each other's code. It is open source and receives patches and contributions
from hundreds more. It has tens or hundreds of thousands of users with high
expectations about quality.

Python's requirements are a bit more strict than "well, it seems to be
working okay, so that's good enough for me".

If Python added something like:

    loop N times:
        ...

we would need *at least* the following:

- a test that `loop` was interpreted as a keyword;
- a test that `times` was interpreted as a keyword;
- a test that `loop 0 times` didn't execute the body at all;
- a test that `loop 1 times` executed the body exactly once;
- a test that `loop N times` executed the body exactly N times, for
  a few different (and probably randomly selected) values of N;
- a test that the statement handled negative integers and floats correctly;
- a test that the statement handled non-numeric values correctly 
  (probably by raising an exception);
- a test that `continue` worked as expected inside this statement;
- a test that `break` worked as expected;
- a test that `return` worked as expected;
- a test that `raise` worked as expected;
- a test that signals will be caught correctly inside the statement;

and possibly more.
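
To give a feel for how quickly that list turns into code, here is a rough
sketch of a handful of those tests. Since `loop N times` is not real Python
syntax, the sketch uses a made-up helper, `repeat(n, body)`, built on
`for _ in range(n)` as a stand-in. The helper, the class name and the choice
that negative counts mean "zero iterations" are all assumptions for
illustration, not anything from Python's actual test suite.

```python
import unittest


def repeat(n, body):
    """Hypothetical stand-in for `loop n times`: call body() n times."""
    for _ in range(n):
        body()


class LoopNTimesTests(unittest.TestCase):
    def count_runs(self, n):
        # Record one entry per execution of the loop body.
        runs = []
        repeat(n, lambda: runs.append(1))
        return len(runs)

    def test_zero_times_skips_body(self):
        self.assertEqual(self.count_runs(0), 0)

    def test_one_time_runs_body_once(self):
        self.assertEqual(self.count_runs(1), 1)

    def test_n_times_runs_body_n_times(self):
        for n in (2, 7, 100):
            self.assertEqual(self.count_runs(n), n)

    def test_negative_count_skips_body(self):
        # Assumption: negatives behave like range(), i.e. zero iterations.
        self.assertEqual(self.count_runs(-3), 0)

    def test_non_integer_count_raises(self):
        # range() rejects floats, strings and None with TypeError.
        for bad in (2.5, "3", None):
            with self.assertRaises(TypeError):
                repeat(bad, lambda: None)

    def test_exception_in_body_propagates(self):
        def body():
            raise ValueError("boom")
        with self.assertRaises(ValueError):
            repeat(3, body)
```

That is six tests and it still says nothing about `continue`, `break`,
`return`, signals, or whether `loop` and `times` are treated as keywords.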

[...]
> No, we're talking about a loop. It must be just about the simplest thing
> to implement in a language (compared with a type system, or code
> generation).

It still needs tests and documentation. In the absence of formal correctness
proofs, any code not covered by tests should be considered buggy.

> BTW supporting a dedicated endless loop was only about 20 lines in
> another project. Believe me, it is /nothing/. I wish other aspects were
> just as trivial. It didn't even need a dedicated keyword.

Of course it's "nothing" for you, since by the evidence given you don't
bother with the hard parts. No comments, no communication with other
developers (except to gloat over how awesome you are and how stupid their
choices are), no documentation, no tests.

>  > - describing the various syntax forms;
>  > - explaining how they differ;
>  > - tutorials for beginners showing each form;
> 
> And you don't have to explain how an endless loop should be written as
> 'while True', meanwhile advising against using 'while 1'?

No you don't, because `while True` or `while 1` or for that matter `while
['this list', 'is', 'a', 'true object']` are easily derived from
understanding (1) while loops; and (2) truthy values.
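
As a small illustration of that point, the sketch below uses a list as the
loop condition: a non-empty list is truthy and an empty one is falsy, so
nothing beyond the ordinary `while` statement is involved. The function names
`drain` and `count_to` are made up for the example.

```python
def drain(items):
    """Pop items until the list is empty; the list itself is the condition."""
    seen = []
    while items:  # truthy while non-empty, falsy once empty
        seen.append(items.pop())
    return seen


def count_to(limit):
    """`while 1` is the same loop as `while True`, just spelled differently."""
    count = 0
    while 1:
        count += 1
        if count >= limit:
            break
    return count


words = ['this list', 'is', 'a', 'true object']
drained = drain(words)
```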

"repeat forever" needs documentation, because it is syntax.

(And what on earth makes you think that `while 1` needs advising against?)

>> The more choices you offer, the harder that decision becomes:
>>
>> - numeric Pascal or C-style loop
>> - foreach style loop
>> - repeat while condition (test at start)
>> - repeat until condition (test at start)
>> - do ... while condition (test at end)
>> - do ... until condition (test at end)
>> - repeat forever
> 
> So, how many kinds of sequences do you have in Python?
> 
> lists
> tuples
> namedtuples
> arrays
> bytearrays
> string ?
> memoryview
> 
> plus all those ordered types. My head is already spinning!

"Ordered types"?

You forgot bytes and deques. But you should try Java if you really want your
head to explode.

What's your point? Python has eight or ten or a dozen sequence types because
they're all different, sometimes *radically* different, and they all have
important uses that other types cannot fulfill.

> The loops I use are categorised as:
> 
> * Endless

That's just a special case of "until some condition is true".

> * N times

That's just a special case of "over an integer sequence".

> * Until some condition is true

You missed one: do you check the condition at the start of each iteration,
or at the end? That makes the difference between the loop running zero or
more times, or one or more times.

> * Iterate over an integer sequence

And that in turn is just a special case of iterating over a set of values.

> * Iterate over a set of values of some object

Out of your five fundamental loops, you missed one important distinction,
and invented three unimportant ones.
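
A minimal sketch of how those special cases collapse in Python, assuming
nothing beyond the built-in `while` and `for` statements (the function names
are invented for the example):

```python
def endless_until(limit):
    """'Endless' is `while` with a condition that is always true."""
    ticks = 0
    while True:
        if ticks >= limit:
            break
        ticks += 1
    return ticks


def n_times(n):
    """'N times' is iteration over an integer sequence, ignoring the value."""
    hits = 0
    for _ in range(n):
        hits += 1
    return hits


def over_integers(start, stop):
    """An integer sequence is just one more iterable of values."""
    result = []
    for i in range(start, stop):
        result.append(i)
    return result


def over_values(values):
    """The general case: iterate over any collection of values."""
    result = []
    for value in values:
        result.append(value)
    return result
```

The last two functions have the same body apart from what follows `in`,
which is the point: `range(start, stop)` is simply another iterable.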

It is true that Python doesn't support repeat...until condition loops, where
the condition is checked at the end. And I consider that a (minor) weakness
of Python.
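
The usual workaround is `while True` with a `break` at the bottom. Here is a
minimal sketch, with a hypothetical helper `repeat_until` wrapping the idiom
so the "body runs at least once" behaviour is easy to see:

```python
def repeat_until(body, done):
    """Run body() at least once, then stop as soon as done() is true."""
    while True:
        body()
        if done():  # the test comes at the end of each iteration
            break


values = []
repeat_until(lambda: values.append(len(values)), lambda: len(values) >= 3)

# Even when the condition is already true, the body still runs once.
ran = []
repeat_until(lambda: ran.append("once"), lambda: True)
```

A plain `while not done(): body()` with the same arguments would run the
body zero times in the second case.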

> Other languages like to have even more elaborate schemes. That includes
> advanced uses of Python's for loop, where, for example, there are
> multiple loop variables.
> 
> I wonder how much testing that took to get it right?

Probably a lot. And it would be worth it even if it took fifty times more
testing.

[...]
>>> But very common requirements are endless loops, and repeat N times
>>> without needing an explicit counter.
>>
>> If by "very common" you mean "occasionally", I agree.
> 
> Most of my loops start off as endless loops, until I can determine the
> actual terminating condition, and where it best goes. Sometimes they
> stay as endless loops.
> 
> (Sometimes, I turn a normal statement into an endless loop with a 'do'
> prefix. This is an unusual feature but I use it quite a bit:
> 
>     doswitch nextchar()           # looping version of 'switch'
>     when 'A'..'Z' then ....
>     else exit                     # ie. break out of loop
>     end
> )

How delightfully eccentric of you :-)

>>> Python's byte-code does at least optimise out the check that '1' is
>>> true, but that's not what the reader sees, which is 'loop while 1 is
>>> true'. And one day it will be:
>>>
>>>      while l:
>>>          body
>>>
>>> that can be mistaken for that common idiom.
>>
>> The problem there is not the loop, but the foolish use of lowercase l as
>> a variable name.
> 
> Maybe. But if 'while 1' wasn't used, then that's one less thing to
> double-check. (I just looked at a random bunch of Python code; there
> were 25 instances of 'while True:', but 40 of 'while 1:'.)

What on earth are you talking about? Even if Python had a dedicated "loop
forever" statement, "while 1" would still be perfectly legal code.

For what it's worth, there are about 170 uses of "while 1" or "while True"
in the standard library, out of just over 400 while loops in total:

[steve@ando ~]$ cd /usr/local/lib/python3.5/
[steve@ando python3.5]$ grep "while 1:" *.py | wc -l
45
[steve@ando python3.5]$ grep "while True:" *.py | wc -l
126
[steve@ando python3.5]$ grep "while .*:" *.py | wc -l
415

which is quite rare compared to for loops:

[steve@ando python3.5]$ grep "for .*:" *.py | wc -l
1435
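
The grep counts are approximate, since they match text rather than
statements (comments, strings and comprehensions can all match). A rough
sketch of a stricter count using the standard `ast` module follows. The
function name `count_loops` and the "while-forever" label are made up for
illustration, and the `ast.Constant` check assumes Python 3.8 or later,
which is newer than the 3.5 install used above.

```python
import ast
from collections import Counter


def count_loops(source):
    """Count loop statements in Python source text by walking its AST."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.For, ast.AsyncFor)):
            counts["for"] += 1
        elif isinstance(node, ast.While):
            counts["while"] += 1
            test = node.test
            # Python 3.8+: literals such as 1 and True are ast.Constant nodes.
            if isinstance(test, ast.Constant) and test.value in (1, True):
                counts["while-forever"] += 1
    return counts


sample = '''
while 1:
    break
while True:
    break
while x:
    x -= 1
for i in range(3):
    pass
# while 1: in a comment is not counted
'''

totals = count_loops(sample)
```

Running it over a directory would just mean reading each `.py` file and
summing the counters.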

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.