[Python-ideas] Membership of infinite iterators

Nick Coghlan ncoghlan at gmail.com
Wed Oct 18 07:08:04 EDT 2017


On 18 October 2017 at 20:39, Koos Zevenhoven <k7hoven at gmail.com> wrote:

> On Oct 18, 2017 13:29, "Nick Coghlan" <ncoghlan at gmail.com> wrote:
>
> On 18 October 2017 at 19:56, Koos Zevenhoven <k7hoven at gmail.com> wrote:
>
>> I'm unable to reproduce the "uninterruptible with Ctrl-C" problem with
>> infinite iterators. At least itertools doesn't seem to have it:
>>
>> >>> import itertools
>> >>> for i in itertools.count():
>> ...     pass
>> ...
>>
>
> That's interrupting the for loop, not the iterator. This is the test case
> you want for the problem Jason raised:
>
>     >>> "a" in itertools.count()
>
> Be prepared to suspend and terminate the affected process, because Ctrl-C
> isn't going to help :)
>
>
> I'm writing from my phone now, because I was dumb enough to try list(count())
>

Yeah, that's pretty much the worst case example, since the machine starts
thrashing memory long before it actually gives up and starts denying the
allocation requests :(


> But should it be fixed in list or in count?
>

That one can only be fixed in count() - list already checks
operator.length_hint(), so implementing itertools.count.__length_hint__()
to always raise an exception would be enough to handle the container
constructor case.
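
As a rough sketch of that idea: the wrapper class below and its choice of
ValueError are assumptions for illustration only, not anything
itertools.count does today. It shows list() giving up immediately when
__length_hint__ raises:

```python
import itertools
import operator


class InfiniteCount:
    """Hypothetical stand-in for itertools.count() with a failing length hint."""

    def __init__(self, start=0, step=1):
        self._it = itertools.count(start, step)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __length_hint__(self):
        # ValueError is an arbitrary choice for this sketch. A TypeError
        # raised here would be swallowed by operator.length_hint(), which
        # would then fall back to its default value.
        raise ValueError("InfiniteCount never terminates")


try:
    list(InfiniteCount())
except ValueError as exc:
    # list() asks for the length hint before consuming any items,
    # so the error surfaces before any memory is allocated.
    print("list() refused:", exc)
```

The exception type matters here: anything other than TypeError propagates
out of the length hint check.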

The open question would then be the cases that don't pre-allocate memory,
but still always attempt to consume the entire iterator:

    min(itr)
    max(itr)
    sum(itr)
    functools.reduce(op, itr)
    "".join(itr)

And those which *may* attempt to consume the entire iterator, but won't
necessarily do so:

    x in itr
    any(itr)
    all(itr)

The items in the first category could likely be updated to check
length_hint and propagate any errors immediately, since they don't provide
any short circuiting behaviour - feeding them an infinite iterator is a
guaranteed uninterruptible infinite loop, so checking for a length hint
won't break any currently working code (operator.length_hint defaults to
returning zero if a type doesn't implement __length_hint__).
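
A minimal sketch of what such a check could look like, written as a
pure-Python wrapper rather than a change to the builtin. The names
checked_sum and Endless are hypothetical and exist only for this example:

```python
import operator


class Endless:
    """Hypothetical infinite iterator whose length hint raises."""

    def __iter__(self):
        return self

    def __next__(self):
        return 1

    def __length_hint__(self):
        raise ValueError("iterator is infinite")


def checked_sum(iterable, start=0):
    # Only the side effect matters: any error from __length_hint__
    # propagates here, and the returned hint itself is ignored.
    operator.length_hint(iterable)
    return sum(iterable, start)


# Finite input behaves exactly like sum().
print(checked_sum(iter([1, 2, 3])))

# Types without __length_hint__ (such as generators) report the default
# of zero, so they pass the check and are consumed as before.
print(operator.length_hint(x for x in ()))

try:
    checked_sum(Endless())
except ValueError as exc:
    print("refused:", exc)
```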

I'm tempted to say the same for the APIs in the latter category as well,
but their short-circuiting semantics mean those can technically have
well-defined behaviour, even when given an infinite iterator:

    >>> any(itertools.count())
    True
    >>> all(itertools.count())
    False
    >>> 1 in itertools.count()
    True

It's only the "never short-circuits" branch that is ill-defined for
non-terminating input. So for these, the safer path would be to emit a
DeprecationWarning in 3.7 if length_hint fails, and then pass the exception
through in 3.8+.
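
The warning phase might look roughly like this for any(). This is only a
sketch under the same assumptions as before: checked_any and Endless are
hypothetical names, and the real change would live inside the builtins:

```python
import itertools
import operator
import warnings


class Endless:
    """Hypothetical infinite iterator whose length hint raises."""

    def __init__(self):
        self._it = itertools.count()

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __length_hint__(self):
        raise ValueError("iterator is infinite")


def checked_any(iterable):
    try:
        operator.length_hint(iterable)
    except Exception as exc:
        # Transitional behaviour: warn, but still run the check, since
        # short-circuiting may make the result well-defined.
        warnings.warn(
            f"any() on an iterator whose length hint failed: {exc!r}",
            DeprecationWarning,
            stacklevel=2,
        )
    return any(iterable)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = checked_any(Endless())

print(result)
print([w.category.__name__ for w in caught])
```

In the later phase, the try/except would simply be dropped so that the
exception propagates to the caller.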

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia