[Python-ideas] Function to return first(or last) true value from list

Oscar Benjamin oscar.j.benjamin at gmail.com
Fri Feb 21 00:51:54 CET 2014


On 20 February 2014 22:38, אלעזר <elazarg at gmail.com> wrote:
>
> 2014-02-21 0:00 GMT+02:00 Steven D'Aprano <steve at pearwood.info>:
>>
>> On Thu, Feb 20, 2014 at 04:14:17PM +0000, Oscar Benjamin wrote:
>> >
>> > The thing is just that bare next is not something that's widely
>> > recognised as being dangerous. I've seen examples of this kind of bug
>> > in samples from many Python aficionados (including at least one from
>> > you Terry).
>>
>> Why is it dangerous, and what kind of bug?
>>
>> If you're talking about the fact that next(it) can raise StopIteration,
>> I think you are exaggerating the danger. Firstly, quite often you don't
>> mind if it raises StopIteration, since that's what you would have done
>> anyway. Secondly, I don't see why raising StopIteration is so much more
>> dangerous than (say) IndexError or KeyError.
>>
> I had this bug just the other day. I did not plan for the empty case, since
> it was obvious that the empty case is a bug, so I relied on the exception
> being raised in this case. But I did not get the exception since it was
> caught in a completely unrelated for loop. It took me a while to figure out
> what's going on, and it would've taken even more for someone else, not
> familiar with my assumption or with the whole StopIteration thing (which I
> believe is the common case). An IndexError or a KeyError would have been
> great in such a case.

Exactly. The bug I had manifested in a StopIteration that was raised
in a semi-deterministic (dependent on slowly varying data) fashion
after ~1 hour of processing. Had it resulted in an IndexError I would
have seen a traceback and could have fixed the bug within about 5
minutes.

But StopIteration doesn't necessarily bubble up in the same way as
other exceptions because it can be caught and silently suppressed by a
for loop (or any consumer of iterators). So instead of a traceback I
had truncated output data. It took some time to discover and verify
that the output data was truncated. Then it took some time rerunning
the script under pdb, which was no help since it couldn't latch onto
the suppressed exception. I assumed that it must be an exception but
there were no try/except clauses anywhere in the code. Eventually I
found it by peppering the code with:

try:
    ...
except Exception as e:
    import pdb; pdb.set_trace()

It took most of a day for me to track that down instead of 5 minutes
precisely because StopIteration is not like other exceptions.
Admittedly I'd spot a similar bug much quicker now simply because I'm
aware of the possibility.

A simplified version of the bug is shown below:

def itermerge(datasources):
    for source in datasources:
        iterator = iter(source)
        first = next(iterator)
        for item in iterator:
            yield first * item

data = [
    [1, 1, 2, 3],
    [1, 4, 5, 6],
    [],  # Who put that there?
    [1, 7, 8, 9],
]

for item in itermerge(data):
    print(item)

If you run the above then you get:

$ python tmp.py
1
2
3
4
5
6

So the data is silently truncated at the empty iterable.
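
One way to make the empty case loud is to catch StopIteration at the
bare next call and raise something that a for loop will not swallow.
The following is only a sketch: the name itermerge_checked, the choice
of ValueError and the error message are assumptions made for
illustration, not something taken from the original code.

```python
def itermerge_checked(datasources):
    # Hypothetical variant of itermerge: an empty source raises
    # ValueError instead of leaking StopIteration out of the generator.
    for index, source in enumerate(datasources):
        iterator = iter(source)
        try:
            first = next(iterator)
        except StopIteration:
            raise ValueError("data source %d is empty" % index)
        for item in iterator:
            yield first * item


data = [
    [1, 1, 2, 3],
    [1, 4, 5, 6],
    [],  # Who put that there?
    [1, 7, 8, 9],
]

try:
    for item in itermerge_checked(data):
        print(item)
except ValueError as exc:
    print("error:", exc)
```

This prints the same six values and then reports the empty source
instead of stopping quietly. Note that Python 3.7 and later turn a
StopIteration that escapes a generator body into a RuntimeError
(PEP 479), so the silent truncation shown above is what earlier
versions do.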

> It is *very* similar to the "and or" story.

Exactly. Some of the worst programming idioms are the ones that mostly
work but fall apart in special cases. Leaking StopIteration is fine...
as long as you don't do it in a generator, or a user-defined iterator,
or *any* code called by a generator/iterator.
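
The "code called by an iterator" case can be sketched with map, which
is lazy on Python 3: a StopIteration raised inside the mapped function
looks to the consumer exactly like the end of the iteration. The
helper names first_of and first_of_checked are hypothetical, and
raising ValueError is just one possible choice.

```python
def first_of(seq):
    # Bare next: raises StopIteration if seq is empty.
    return next(iter(seq))


def first_of_checked(seq):
    # Pass a sentinel default to next so that an empty seq can be
    # detected and reported as a ValueError instead.
    missing = object()
    value = next(iter(seq), missing)
    if value is missing:
        raise ValueError("empty sequence")
    return value


rows = [[1, 2], [3], [], [4, 5]]

# list() treats the StopIteration leaking out of first_of as the end
# of the map iterator, so the result stops at the empty row.
truncated = list(map(first_of, rows))
print(truncated)  # [1, 3]

try:
    list(map(first_of_checked, rows))
except ValueError as exc:
    print("error:", exc)  # error: empty sequence
```

The two-argument form of next never raises StopIteration, which is why
the checked helper is safe to call from inside any iterator.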


Oscar

