[Python-ideas] Introduce collections.Reiterable

Steven D'Aprano steve at pearwood.info
Fri Sep 20 11:48:58 CEST 2013


On Thu, Sep 19, 2013 at 11:02:57PM +1000, Nick Coghlan wrote:
> On 19 September 2013 22:18, Steven D'Aprano <steve at pearwood.info> wrote:
[...]
> > At the moment, dict views aren't directly iterable (you can't call
> > next() on them). But in principle they could have been designed as
> > re-iterable iterators.
> 
> That's not what iterable means. The iterable/iterator distinction is
> well defined and reflected in the collections ABCs:

Actually, I think the collections ABC gets it wrong, according to both 
common practice and the definition given in the glossary:

http://docs.python.org/3.4/glossary.html

More on this below.

As for my comment above, dict views don't obey the iterator protocol 
themselves, as they have no __next__ method, nor do they obey the 
sequence protocol, as they are not indexable. Hence they are not 
*directly* iterable, but they are *indirectly* iterable, since they have 
an __iter__ method which returns an iterator.

I don't think this is a critical distinction. I think it is fine to call 
views "iterable", since they can be iterated over. On the rare occasion 
that it matters, we can just do what I did above, and talk about objects 
which are directly iterable (e.g. iterators, sequences, generator 
objects) and those which are indirectly iterable (e.g. dict views).
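
To make the direct/indirect distinction concrete, here is a small
sketch using a dict keys view. The dict contents and the variable
names are only illustrative.

```python
d = {"a": 1, "b": 2}
view = d.keys()

# A view has __iter__ but no __next__, so next() rejects it outright.
try:
    next(view)
    view_is_iterator = True
except TypeError:
    view_is_iterator = False

# Going through iter() first yields a real iterator over the view.
it = iter(view)
first = next(it)

# Each pass asks the view for a fresh iterator, so it can be
# iterated over more than once.
pass_one = list(view)
pass_two = list(view)
```

So the view is "indirectly" iterable in the sense used above: it
only ever hands out iterators, it is never one itself.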


> * iterables are objects that return iterators from __iter__.

That definition is incomplete, because iterable objects include those 
that obey the sequence protocol. This is not only by long-standing 
tradition (pre-dating the introduction of iterators, if I remember 
correctly), but also as per the definition in the glossary. Alas, 
collections.Iterable gets this wrong:

py> from collections.abc import Iterable
py> class Seq:
...     def __getitem__(self, index):
...         if 0 <= index < 5:
...             return index + 1000
...         raise IndexError
...
py> s = Seq()
py> isinstance(s, Iterable)
False
py> list(s)  # definitely iterable
[1000, 1001, 1002, 1003, 1004]


(Note that although Seq obeys the sequence protocol, and can be 
iterated over, it is not a fully-fledged Sequence since it has no 
__len__.)

I think this is a bug in the Iterable ABC, but I'm not sure how one 
might fix it.
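
This does not fix the ABC, but a common workaround is to ask iter()
directly instead of using isinstance. A minimal sketch, in which
can_iterate is a hypothetical helper name and not anything in the
standard library:

```python
from collections.abc import Iterable


class Seq:
    """Old-style sequence: only __getitem__, no __iter__ or __len__."""

    def __getitem__(self, index):
        if 0 <= index < 5:
            return index + 1000
        raise IndexError


def can_iterate(obj):
    """Hypothetical helper: report whether iter(obj) succeeds."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


s = Seq()
abc_says = isinstance(s, Iterable)  # the ABC only looks for __iter__
iter_says = can_iterate(s)          # iter() also accepts __getitem__
```

Note that calling iter() on an iterator is harmless (it returns the
same object without consuming anything), so the check itself does not
use up any items.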



> * iterators are the subset of iterables that return "self" from
> __iter__, and expose a next (2.x) or __next__ (3.x) method

That is certainly correct. All iterators are iterables, but not all 
iterables are iterators.
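
A short illustration of that subset relationship, using a list and
the iterator obtained from it (the names are only for illustration):

```python
from collections.abc import Iterable, Iterator

data = [1, 2, 3]
it = iter(data)

# An iterator returns itself from __iter__; a list returns a new object.
iterator_returns_self = iter(it) is it
list_returns_self = iter(data) is data

# Both are iterables, but only one of them is an iterator.
both_iterable = isinstance(data, Iterable) and isinstance(it, Iterable)
only_it_is_iterator = isinstance(it, Iterator) and not isinstance(data, Iterator)

# The iterator is used up after one pass; the list is not.
first_pass = list(it)
second_pass = list(it)
list_again = list(data)
```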


> That "iterators return self from __iter__" is important, since almost
> everywhere Python iterates over something, it calls "_itr = iter(obj)"
> first.

And iter() itself falls back on the sequence protocol (__getitem__ 
called with 0, 1, 2, ... until IndexError) when there is no __iter__.
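
Roughly, the lookup order can be modelled in pure Python as follows.
This is a simplified sketch and not the interpreter's actual code:
simple_iter and _getitem_iterator are hypothetical names, and the
real iter() handles some corner cases that are ignored here.

```python
def _getitem_iterator(obj):
    """Yield obj[0], obj[1], ... until IndexError is raised."""
    index = 0
    while True:
        try:
            item = obj[index]
        except IndexError:
            return
        yield item
        index += 1


def simple_iter(obj):
    """Simplified model of iter(obj): __iter__ first, then __getitem__."""
    cls = type(obj)
    if hasattr(cls, "__iter__"):
        return cls.__iter__(obj)
    if hasattr(cls, "__getitem__"):
        return _getitem_iterator(obj)
    raise TypeError("%r object is not iterable" % cls.__name__)


class Seq:
    def __getitem__(self, index):
        if 0 <= index < 5:
            return index + 1000
        raise IndexError


from_getitem = list(simple_iter(Seq()))   # takes the __getitem__ branch
from_dunder_iter = list(simple_iter([1, 2]))  # takes the __iter__ branch
```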


> So, my question is a genuine one. While, *in theory*, an object can
> define a stateful __iter__ method that (e.g.) only works the first
> time it is called, or returns a separate object that still stores its
> "current position" information on the original container, I simply
> can't think of a non-pathological case where "isinstance(obj,
> Iterable) and not isinstance(obj, Iterator)" would give the wrong
> answer.
> 
> In theory, yes, an object could obviously pass that test and still not
> be Reiterable, but I'm interested in what's true in *practice*.

I don't think you and I are actually in disagreement here. This is 
Python, and one could write an iterator class that is reiterable, or an 
iterable object (as determined by isinstance) which cannot be iterated 
over, but I think we can dismiss them as pathological cases. Even if 
such unusual objects are useful, it is the caller's responsibility, not 
the callee's, to use them safely and appropriately with functions that 
are expecting them.
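
For concreteness, here is the test from the quoted text wrapped in a
function, along with one deliberately pathological class that fools
it. Both looks_reiterable and OneShot are hypothetical names made up
for this sketch.

```python
from collections.abc import Iterable, Iterator


def looks_reiterable(obj):
    """Heuristic from the quoted text: an iterable that is not an iterator."""
    return isinstance(obj, Iterable) and not isinstance(obj, Iterator)


class OneShot:
    """Pathological: __iter__ keeps returning the same stored iterator."""

    def __init__(self, items):
        self._it = iter(items)

    def __iter__(self):
        return self._it


list_ok = looks_reiterable([1, 2, 3])
view_ok = looks_reiterable({"a": 1}.keys())
iterator_ok = looks_reiterable(iter([1, 2, 3]))
generator_ok = looks_reiterable(x for x in range(3))

# OneShot passes the test but can only be consumed once.
one_shot = OneShot([1, 2])
one_shot_passes = looks_reiterable(one_shot)
first_pass = list(one_shot)
second_pass = list(one_shot)
```

The heuristic gives the expected answer for the ordinary cases and
only goes wrong for the contrived one, which is the point being made
above.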


-- 
Steven

