resetting an iterator

Terry Reedy tjreedy at udel.edu
Thu May 21 18:21:37 EDT 2009


I will clarify by starting over with current definitions.

Ob is an iterator iff next(ob) either returns an object or raises 
StopIteration and continues to raise StopIteration on subsequent calls.

Ob is an iterable iff iter(ob) returns an iterator.

It is intentional that the protocol definitions be minimal, so that they 
can be used as widely as possible.
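
As a rough illustration of the two minimal definitions, here is a sketch in 
Python 3 syntax.  The class names Countdown and Launch are made up for this 
example and are not part of any library.

```python
class Countdown:
    """Hypothetical minimal iterator: it defines only __next__."""

    def __init__(self, start):
        self.current = start

    def __next__(self):
        if self.current <= 0:
            # Once exhausted, every later call raises StopIteration again.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


class Launch:
    """Hypothetical minimal iterable: iter(ob) returns an iterator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Each call hands out a new, independent iterator.
        return Countdown(self.start)
```

Countdown deliberately omits __iter__, so it satisfies only the minimal 
iterator definition given above; next(Countdown(3)) works, and so does 
list(Launch(3)).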

As a convenience, the definition of iterators is given a slight 
complication.  They are defined as a subcategory of iterables, with the 
requirement that iter(iterator) be that same iterator.  This means 
that iterators need the following boilerplate:
   def __iter__(self): return self
The extra burden is slight since most iterators are based on builtins or 
generator functions or expressions, which add the boilerplate 
automatically.  The convenience is that one may write

def f(iterable_or_iterator):
   it = iter(iterable_or_iterator)
   ...

instead of

def f(iterable_or_iterator):
   if is_iterable(iterable_or_iterator):
     it = iter(iterable_or_iterator)
   else:
     it = iterable_or_iterator

In particular, the internal function that implements for loops can do 
the former.

In other words, a small bit of boilerplate added to iterators, mostly 
automatically, saves boilerplate in the use of iterators and iterables.
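
A small sketch of that convenience, using only builtins and a generator 
expression.  The helper name take_two is hypothetical and exists only for 
this illustration.

```python
def take_two(iterable_or_iterator):
    """Hypothetical helper that accepts either an iterable or an iterator."""
    # iter() returns the argument itself when it is already an iterator.
    it = iter(iterable_or_iterator)
    return [next(it), next(it)]


numbers = [10, 20, 30]
list_iterator = iter(numbers)
generator = (n * 2 for n in numbers)

# Builtin iterators and generators supply the __iter__ boilerplate themselves.
same_list_iterator = iter(list_iterator) is list_iterator
same_generator = iter(generator) is generator

# A list is an iterable, not an iterator: each iter() call gives a new iterator.
fresh_each_time = iter(numbers) is not iter(numbers)

from_list = take_two(numbers)         # argument is an iterable
from_generator = take_two(generator)  # argument is already an iterator
```

The same take_two body serves both kinds of argument without any 
is_iterable test.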

When the protocols were defined, there was discussion about whether or 
not to require 'continue to raise StopIteration'.  For instance, an 
iterator that returns objects derived from external input might not have 
any new external input now but expect to get some in the future.  It was 
decided that such iterators should either wait and block the thread or 
return a 'Not now' indicator such as None.  StopIteration should 
consistently mean 'Done, over and out' so for loops, for instance, would 
know to exit.
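
One way the non-blocking 'Not now' option might look, as a sketch only.  The 
class InputFeed and its push() and close() methods are invented for this 
example; the choice of None as the indicator follows the paragraph above.

```python
from collections import deque


class InputFeed:
    """Hypothetical iterator over items supplied from outside."""

    def __init__(self):
        self._pending = deque()
        self._closed = False

    def push(self, item):
        # Refuse input after close() so StopIteration stays permanent.
        if self._closed:
            raise ValueError("feed is closed")
        self._pending.append(item)

    def close(self):
        self._closed = True

    def __iter__(self):
        return self

    def __next__(self):
        if self._pending:
            return self._pending.popleft()
        if self._closed:
            # 'Done, over and out': raised on this and every later call.
            raise StopIteration
        # 'Not now': nothing available yet, but more may arrive later.
        return None
```

A consumer of such a feed has to treat None as 'try again later', which 
means None cannot also be a legitimate data item here.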

The OP proposes that StopIteration should instead mean 'Done until 
reset', without defining 'reset'.  Some comments:
* This would complicate the protocol.
* There are real use cases, and reiterability is a real issue.  But ...
* Depending on the meaning, resetting may or may not be possible.
* When it is possible, it can potentially be done today with a .send() 
method.
* Many use cases are easier with a new iterator.  For instance

for i in iterable: block1()
for i in iterable: block2()

is easier to write than

it = iter(iterable)
for i in it: block1()
it.reset()
for i in it: block2()

and, for practical problems, the second version saves too little time to 
compensate for the extra boilerplate.
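
For the .send() point, here is one possible sketch of a generator that can be 
restarted while it is still running.  The generator name resettable and the 
RESET command value are assumptions made for this example, and 'reset' is 
taken to mean 'start again from the first item of a sequence'.

```python
RESET = "reset"  # hypothetical command value, chosen for this sketch


def resettable(sequence):
    """Hypothetical generator over a sequence; send(RESET) restarts it."""
    index = 0
    while index < len(sequence):
        command = yield sequence[index]
        if command == RESET:
            index = 0    # the next value yielded is the first item again
        else:
            index += 1   # plain next() sends None, so just advance


gen = resettable([1, 2, 3])
first = next(gen)            # 1
second = next(gen)           # 2
restarted = gen.send(RESET)  # send() itself returns the first item, 1
rest = list(gen)             # [2, 3]
```

This works only before exhaustion.  Once the generator has raised 
StopIteration it is finished, and a further send() raises StopIteration too, 
which is consistent with the protocol as currently defined.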

Terry Jan Reedy



