Python Iterables struggling using map() built-in

Steven D'Aprano steve+comp.lang.python at pearwood.info
Tue Dec 9 19:10:49 EST 2014


Terry Reedy wrote:

> On 12/9/2014 12:03 AM, Terry Reedy wrote:
>>> Roy Smith wrote:
>>>
>>>> Chris Angelico wrote:
>>
>>>>> def myzip(*args):
>>>>>      iters = map(iter, args)
>>>>>      while iters:
>>>>>          res = [next(i) for i in iters]
>>>>>          yield tuple(res)
>>>>
>>>> Ugh.  When I see "while foo", my brain says, "OK, you're about to see a
>>>> loop which is controlled by the value of foo being changed inside the
>>>> loop".
> 
> What is nasty to me is that to understand the loop, one must do a whole
> program analysis to determine both that 'iters' is not rebound and that
> the list it is bound to is not mutated.

When people say "whole program analysis", they usually mean the entire
application including all its modules and libraries, not a four-line
function, excluding the def header. (Or three lines if you get rid of the
unnecessary temporary variable 'res'.) Did it take you a long time to read
all two lines of the while loop to determine that iters is not modified or
rebound?
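
For anyone following along on Python 3, here is a minimal sketch of the
function under discussion. The list() call and the try/except are
assumptions added here so that it runs on current interpreters; they are
not part of Chris's original, which relied on map() returning a list.

```python
def myzip(*args):
    # Assumption: list() is added for Python 3, where map() is lazy.
    iters = list(map(iter, args))
    while iters:
        try:
            res = [next(i) for i in iters]
        except StopIteration:
            # Assumption: end the generator explicitly. Since PEP 479
            # (Python 3.7+), a StopIteration escaping a generator body
            # is turned into RuntimeError.
            return
        yield tuple(res)
```

The loop itself is unchanged: iters is bound once, before the loop, and
is neither mutated nor rebound inside it.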

I really think you guys are trying too hard to make this function seem more
complicated than it is. If you find it so hard to understand a simple
function with four short lines, one wonders how you would possibly cope
with real code. Purely by coincidence, I have the source to the "pyclbr"
module from the standard library open in a text editor. I see a _readmodule
function that looks, in part, like this:


    try:
         for tokentype, token, start, _end, _line in g:
             if ...
                 while ...
             elif ...
                 while ...
                 if ...
                 if ...
                     if ...
                 else ...
             elif ...
                 while ...
                 if ...
                 if ...
                     while ...
                         if ...
                             if ...
                             else ...
                                 if ...
                                     if ...
                                         if ...
                         if ...
                         elif ...
                             if ...
                         elif ...
                         elif ...


at which point I'm about halfway through the try block and I'm giving up.

https://hg.python.org/cpython/file/3.4/Lib/pyclbr.py

This, presumably, is good enough for the standard library, but the four-line
version of zip is supposed to be too hard for mortal man to comprehend.
That's funny :-)


> To do the latter, one must not
> only read the loop body, but also preceding code to make sure the list
> is not aliased.

The preceding code is exactly *one* line, a single assignment binding the
name iters to the list. The while loop body is exactly two lines, one if
you dump the unnecessary 'res' temporary variable:

    yield tuple([next(i) for i in iters])
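
A caveat, assuming Python 3 rather than the Python 2 behaviour the code
was written for: there map() returns a lazy, one-shot iterator instead of
a list, so the name is not bound to a list unless the call is wrapped in
list(). A small sketch of the difference (the sample data is made up for
illustration):

```python
args = ([1, 2, 3], "ab")

lazy = map(iter, args)            # Python 3: a map object, not a list
truthy_before = bool(lazy)        # map objects are always truthy
first = [next(i) for i in lazy]   # consumes the map object
second = [next(i) for i in lazy]  # nothing left to iterate over
truthy_after = bool(lazy)         # still truthy, even though exhausted

eager = list(map(iter, args))     # a real list, reusable on every pass
pass_one = [next(i) for i in eager]
pass_two = [next(i) for i in eager]
```

With the lazy version, "while iters" would never become false and every
pass after the first would produce an empty result.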


Quite frankly Terry, I do not believe for a second that somebody like you
who can successfully maintain IDLE is struggling to understand this myzip()
function.

Wait... is this like the Four Yorkshiremen sketch from Monty Python, only
instead of complaining about how hard you had it as children, you're all
trying to outdo each other about how difficult you find it to read this
function? If so, well done, you really had me for a while.



> Once the logic is clear and 'localized', even a simple compiler like
> CPython's can see that this is a loop-forever construct and that the
> loop test is unnecessary.  So it can be removed.

Ah, now that's nice. You're suggesting that by moving the loop condition
outside of the while statement, the compiler can generate more efficient
byte code. It only needs to test iters once, not at the start of every
loop.

That is the first interesting argument I've seen so far!

On the one hand, as micro-optimizations go, it will be pretty micro.
Particularly compared to the cost of starting and stopping a generator, I
doubt that will save any meaningful time, at least not enough to make up
for the extra effort in having to read and comprehend one more line of
code.

On the other hand, *premature optimization*. In general, one shouldn't write
more complex code so the compiler can optimize it, one should write simpler
code and have a smarter compiler. If *we* are capable of recognising that
iters is not modified in the body of the loop, then the compiler should be
capable of it too. (If it isn't, it is because nobody has bothered to give
the compiler sufficient smarts, not because it can't be done.) So a good
compiler should be able to compile "while iters" into "if iters: while
True" so long as iters is not modified in the body of the loop. 
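
To make the proposed rewrite concrete, here is a sketch of the two
spellings side by side. The function names, the list() call and the
explicit StopIteration handling are assumptions for illustration on
Python 3, not code from the thread. Both produce the same results; the
only difference is how often iters is tested.

```python
def zip_test_every_pass(*args):
    iters = list(map(iter, args))
    while iters:                 # iters is tested at the top of every pass
        try:
            res = [next(i) for i in iters]
        except StopIteration:
            return
        yield tuple(res)


def zip_test_once(*args):
    iters = list(map(iter, args))
    if iters:                    # iters is tested exactly once
        while True:
            try:
                res = [next(i) for i in iters]
            except StopIteration:
                return
            yield tuple(res)
```

Running dis.dis() on each function is one way to compare the byte code
generated for the two loops; the exact output depends on the CPython
version.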

Still, that's a nice observation: sometimes more complex source code can
lead to simpler byte code.



-- 
Steven



