[Python-Dev] Termination of two-arg iter()

Sun, 14 Jul 2002 20:03:33 -0400

[Guido]
> But if you fall through the end of the first loop, i.e. you exhaust
> the iterator prematurely, you should do something else in your logic.

I'm not clear on why falling through should be considered premature
termination of the iterator.  If you're looking for a boundary, it may be
normal for it not to be there.  For example, here's something that
suppresses #if 0 blocks, copying everything else to stdout; there's really
nothing special about an input file that doesn't have any #if 0 blocks.

"""
f = file("some_file")
get = iter(f.readline, "")

depth = 0
while True:
    # Copy lines until #if 0.
    for line in get:
        if line == "#if 0\n":
            depth = 1
            break
        else:
            print line,

    # Ignore lines through matching #endif.
    for line in get:
        if line.startswith("#if"):  # also counts #ifdef / #ifndef
            depth += 1
        elif line == "#endif\n":
            depth -= 1
            if depth == 0:
                break
    else:
        break

if depth:
    raise SyntaxError("%d unclosed #if blocks" % depth)
"""

This is quite natural -- even elegant.

> An else clause on the for loop might be a good place to do something
> appropriate.

It is, but if StopIteration is sticky, putting one on more than one of the
loops is just clutter:  either loop can detect EOF, so there's no need for
both to.
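
To make that concrete, here's a minimal sketch of the same filter in current
Python, with an else clause on the second loop only.  The name strip_if0 and
the use of io.StringIO in place of a real file are my own stand-ins, and
counting #ifdef/#ifndef as openers is an assumption about what's wanted.  It
relies on the two-arg iter() staying stopped once readline has returned "".

```python
import io


def strip_if0(text):
    """Return the lines of `text` that lie outside '#if 0' blocks.

    Hypothetical helper: takes a string instead of a file for easy testing.
    """
    f = io.StringIO(text)
    get = iter(f.readline, "")  # two-arg iter: stops when readline returns ""
    kept = []
    depth = 0
    while True:
        # Copy lines until "#if 0".  No else clause is needed on this loop.
        for line in get:
            if line == "#if 0\n":
                depth = 1
                break
            kept.append(line)

        # Skip lines through the matching "#endif".
        for line in get:
            if line.startswith("#if"):  # assumption: #ifdef/#ifndef nest too
                depth += 1
            elif line == "#endif\n":
                depth -= 1
                if depth == 0:
                    break
        else:
            # `get` is exhausted, whichever loop happened to hit EOF first.
            break

    if depth:
        raise SyntaxError("%d unclosed #if blocks" % depth)
    return kept
```

An input with no #if 0 blocks at all runs off the end of the first loop, and
the else clause on the second loop still ends the while loop, because the
exhausted iterator just raises StopIteration again.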

> I haven't used this idiom often enough to know whether that places an
> undue burden on the programmer.  I think the reported cases fall
> mostly in the category "I didn't know it could do that and it took me
> a long time to track it down."

If you can't guess what .next() might do after raising StopIteration the
first time, that can't make things easier to track down <wink>.

> I also note that even if the PEP specifies that StopIteration is a
> sink state and we fix all built-in iterators to make it so, it's easy
> for an iterator implementation to do the wrong thing (especially since
> often an extra state bit is necessary to implement the sink-state
> property).

I agree, although if the docs are clear about the requirement, it's not
beyond ordinary skill to implement it.
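
As a rough illustration of how little is involved, here's a sketch of a
hand-written iterator carrying the extra state bit.  UntilSentinel is a
hypothetical class that mimics what two-arg iter() does, written with the
current __next__ spelling of .next(); it is not the built-in's actual
implementation.

```python
class UntilSentinel:
    """Call func() until it returns sentinel, then stay stopped for good."""

    def __init__(self, func, sentinel):
        self.func = func
        self.sentinel = sentinel
        self.done = False  # the extra state bit

    def __iter__(self):
        return self

    def __next__(self):
        if self.done:
            # Sink state: never call func() again after the first stop.
            raise StopIteration
        value = self.func()
        if value == self.sentinel:
            self.done = True
            raise StopIteration
        return value


source = [1, 2, 0, 3, 4]
it = UntilSentinel(lambda: source.pop(0), 0)
first_pass = list(it)   # consumes 1, 2 and the sentinel 0
second_pass = list(it)  # the iterator is already in its sink state
```

Without the done flag, the second pass would call func() again and start
handing out the values that sit after the sentinel.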

> The question is, should we place the burden on iterator users to avoid
> calling next() after the first StopIteration, or should we place the
> burden on iterator implementations?

I don't think that's the real choice.  If it's left undefined by the
protocol, then some iterators will be deliberately defined to "do something
useful" if called after their first StopIteration.  Then the burden isn't on
the user to avoid it, but to keep track of which iterators do and don't "do
something useful" after they said they stopped.
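
Here's a sketch of the kind of iterator I have in mind.  Paragraphs is
invented purely for illustration:  it raises StopIteration at each blank
line and then resumes on the next call.

```python
class Paragraphs:
    """Yield lines, raising StopIteration at each blank line, then resume."""

    def __init__(self, lines):
        self._lines = iter(lines)

    def __iter__(self):
        return self

    def __next__(self):
        line = next(self._lines)  # raises StopIteration at the real end
        if line == "":
            raise StopIteration   # only the end of one paragraph
        return line


p = Paragraphs(["a", "b", "", "c"])
para1 = list(p)  # stops at the blank line
para2 = list(p)  # the "stopped" iterator produces more
para3 = list(p)  # now it is really exhausted
```

A caller looping over p can't tell "end of this paragraph" from "really
done" without knowing this class's private rules, and that's knowledge the
user has to carry around for every iterator that plays the same game.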

> Since by far the most common iterator use case is still a single for
> loop, which already does the right thing, it's not at all clear to me
> which is worse.

Well, there are more users of iterators than implementers.  Or if there
aren't, we screwed up <wink>.