[Python-ideas] sentinel_exception argument to `iter`

Fri Feb 7 10:28:14 CET 2014

On 2/7/2014 3:28 AM, Steven D'Aprano wrote:
> On Thu, Feb 06, 2014 at 09:36:19PM -0500, Terry Reedy wrote:
>> On 2/6/2014 7:10 PM, Ram Rachum wrote:
>>
>>> `iter` has a very cool `sentinel` argument. I suggest an additional
>>> argument `sentinel_exception`; when it's supplied, instead of waiting
>>> for a sentinel value, we wait for a sentinel exception to be raised, and
>>> then the iteration is finished.
>>>
>>> This'll be useful to construct things like this:
>>>
>>>      my_iterator = iter(my_deque.popleft, IndexError)
>>>
>>> What do you think?
>>
>> I think this would be a great idea if simplified to reuse the current
>> parameter.
>
> That would be a backwards-incompatible change, for exactly the reason
> you give below:

In a later message, I reversed myself, in spite of the actual C 
signature making it a bit messy. Although help says 'iter(callable, 
sentinel)', the actual signature is iter(*args), with any attempt to 
pass an arg by keyword raising
TypeError: iter() takes no keyword arguments

Let n = len(args). Then the code switches as follows:
n=0: TypeError: iter expected at least 1 arguments, got 0
n>2: TypeError: iter expected at most 2 arguments, got 3
n=1: Return __iter__ or __getitem__ wrapper.
n=2: if callable(args[0]): return callable_iterator(*args),
      else: TypeError: iter(v, w): v must be callable
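
The switch above can be checked from Python. This is only a small sketch: the helper name raises_type_error is made up for illustration, and it checks the exception type only, since the exact message text may differ between versions.

```python
from collections import deque

def raises_type_error(*args, **kwargs):
    """Hypothetical helper: True if iter(*args, **kwargs) raises TypeError."""
    try:
        iter(*args, **kwargs)
    except TypeError:
        return True
    return False

# n=1: wraps __iter__ (or __getitem__) of the argument.
one_arg_result = list(iter([1, 2, 3]))

# n=2: calls args[0] repeatedly until it returns a value equal to args[1].
d = deque([1, 2, 0, 3])
two_arg_result = list(iter(d.popleft, 0))

# The error cases: n=0, n>2, non-callable first arg with n=2, keyword use.
no_args = raises_type_error()
too_many = raises_type_error(len, 0, 1)
not_callable = raises_type_error([1, 2, 3], 0)
by_keyword = raises_type_error(d.popleft, sentinel=0)
```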

If the current signature were merely extended, then I believe the new 
signature would have to be (if possible)
iter(*args, stop_iter=<private exception>)
But having parameter args[1] (sentinel) be position-only, with no 
default, while added parameter stop_iter is keyword only, with a 
(private) default, would be a bit weird.

So instead I would suggest making the new signature be
iter(iter_or_call, sentinel=<private object>, stop_iter=<private 
exception>). If sentinel and stop_iter are both default, use current n=1 
code, else pass all 3 args to modified callable_iterator that compares 
sentinel to return values and catches stop_iter exceptions.
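
A rough pure-Python sketch of that three-parameter form, to make the dispatch concrete. Every name here (iter_sketch, _MISSING, _NeverRaised, _callable_iterator) is a placeholder invented for illustration; the real change would be in C and nothing like this exists in the stdlib.

```python
from collections import deque

_MISSING = object()  # assumed stand-in for the private default sentinel

class _NeverRaised(Exception):
    """Assumed stand-in for the private default exception."""

def _callable_iterator(func, sentinel, stop_iter):
    # Generator playing the role of the modified callable_iterator.
    while True:
        try:
            value = func()
        except stop_iter:
            return  # the stop_iter exception ends the iteration
        if sentinel is not _MISSING and value == sentinel:
            return  # the sentinel return value ends the iteration
        yield value

def iter_sketch(iter_or_call, sentinel=_MISSING, stop_iter=_NeverRaised):
    if sentinel is _MISSING and stop_iter is _NeverRaised:
        return iter(iter_or_call)  # current n=1 code path
    if not callable(iter_or_call):
        raise TypeError("iter_sketch(v, ...): v must be callable")
    return _callable_iterator(iter_or_call, sentinel, stop_iter)

# Stop only on an exception (Ram's deque example).
by_exception = list(iter_sketch(deque([1, 2, 3]).popleft, stop_iter=IndexError))

# Stop only on a return value (today's two-argument behaviour).
by_value = list(iter_sketch(deque([1, 2, 0, 3]).popleft, 0))

# Stop on either, with the two values not having to be the same.
by_either = list(iter_sketch(deque([1, 2]).popleft, 0, IndexError))
```

With this shape an exception class passed as sentinel is still compared against return values only, and one passed as stop_iter is only ever caught, so the two roles stay separate.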

Either way, the user could choose to only stop on a return value, only 
stop on an exception, or stop on either with the two values not having 
to be the same. The only thing that would break is code that depends on 
a TypeError being raised, but we allow ourselves to break such code when 
extending functions.

>> It can work in Python because exceptions are objects like
>> anything else and can be passed as arguments.
>
> Right. And there is a big difference between *returning* an exception
> and *raising* an exception, which is why a new parameter (or a new
> function) is required. A function might legitimately return exception
> objects for some reason:
>
> exceptions_to_be_tested = iter(
>      [IndexError(msg), ValueError, StopIteration, TypeError]
>      )
>
> def func():
>      # pre- or post-processing might happen
>      return next(it)

Did you mean next(exceptions_to_be_tested)?

> for exception in iter(func, StopIteration):
>      # assume the exceptions are caught elsewhere
>      raise exception
>
>
> With the current behaviour, that will raise IndexError and ValueError,
> then stop. With the suggested change in behaviour, it will raise all
> four exceptions.

No, it would still only raise the first two as it would still stop with 
the return of StopIteration. But the certainty that people would be 
confused by the double use of one parameter is reason enough not to do it.

> We cannot assume that an exception is never a legitimate return result
> from the callable. "Iterate until this exception is raised" and "iterate
> until this value is returned" are very different things and it is folly
> to treat them as the same.

>> [...]
>> I consider the threat to backward compatibility, because of the added
>> test for exceptions, theoretical rather than actual. It is very rare to
>> write a function that returns an exception,

While *writing* such a function might be very rare, Yuri showed how easy 
it is to create such a callable by binding a collection instance to a 
method.

> Rare or not, I've done it, it's allowed by the language, and it is
> inappropriate to conflate returning a class or instance with raising an
> exception.
>
> It doesn't matter whether it is rare. It is rare to write:
>
> iter(func, ({}, {}))
>
> nevertheless it would be poor design to have iter treat tuples of
> exactly two dicts as a special case.
>
> Exceptions are first-class values like strings, ints, and tuples
> containing exactly two dicts. They should be treated exactly the same as
> any other first-class value.

Agreed.

-- 
Terry Jan Reedy