[Python-Dev] Single- vs. Multi-pass iterability

Guido van Rossum guido@python.org
Tue, 16 Jul 2002 09:50:22 -0400


> > argue that making the file object its own iterator is only confusing;
> > given that I'm also not sure what problem it solves, I'm at best +0 on
> 
> Personally, I think it solves at least a teaching problem -- it
> helps me teach the difference between iterators and iterables.  In
> the Europython tutorial I had to gloss a bit over the fact that the
> difference was rather blurred.  According to the principles I
> mentioned, as easiest for the audience to understand and apply, the
> file object SHOULD have been an iterator, not an iterable -- i.e. it
> SHOULD have been the case that f is iter(f) when f is a file object
> -- but it wasn't.  When it IS, that's one less micro-wart I need to
> mention when teaching or writing about it.

I dunno.  The presence of seek() and write() makes the behavior of
files a rather unique blend of iterator and iterable.
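
A minimal sketch of that blend, assuming a modern Python 3 interpreter
and using io.StringIO as a stand-in for a real file (both are
assumptions, not part of the original discussion).  The object is
exhausted after one pass, like an iterator, yet seek(0) rewinds it so
it can be traversed again, which a plain iterator cannot do.

```python
import io

# io.StringIO is only a stand-in for a real file object here.
f = io.StringIO("one\ntwo\n")

first_pass = list(f)    # consumes every line
second_pass = list(f)   # nothing left: behaves like an exhausted iterator

f.seek(0)               # rewinding is a file feature, not an iterator feature
third_pass = list(f)    # the same lines are available again

print(first_pass, second_pass, third_pass)
```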

> I don't see any downside to having this micro-wart removed.  In
> particular, I don't see what's confusing.  Things that respond to
> iter(x) fall in two categories:
>     iterators: also have x.next(), and iter(x) is x
>     iterables: iter(x) is not x, so you can presumably get another
>         iterator out of x at some later point in time if needed.
> It's not QUITE as simple as this, but moving file objects from
> the second category to the first seems to _simplify_ things a bit.

I worry that equating a file with its iterator makes it more likely
that people mix next() with readline() or seek(), which doesn't work
(at least not until the I/O system is rewritten).

I'd be more comfortable with teaching people that you should *either*
use a file in a for loop (the common case, probably) *or* use its
native I/O methods (readline() etc.), but not mix both.
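
The two styles, kept separate, might look like the sketch below.  It
assumes Python 3 and uses io.StringIO in place of a real file; the
function names are hypothetical.  It does not reproduce the mixing
problem described above, which concerns the file implementation under
discussion; it only shows each style used on its own.

```python
import io

TEXT = "alpha\nbeta\ngamma\n"

def lines_via_for(f):
    """Style 1: treat the file purely as something to loop over."""
    return [line for line in f]

def lines_via_readline(f):
    """Style 2: use only the native I/O method readline()."""
    lines = []
    while True:
        line = f.readline()
        if not line:        # an empty string signals end of file
            break
        lines.append(line)
    return lines

# Each style gets its own fresh file-like object; neither mixes with the other.
for_lines = lines_via_for(io.StringIO(TEXT))
readline_lines = lines_via_readline(io.StringIO(TEXT))
```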

> E.g.:
> 
> def useIterable(x):
>     try:
>         it = iter(x)
>     except TypeError:
>         raise TypeError("Need iterable object, not %s" % type(x))
>     if it is x:
>         raise TypeError("Need iterable object, not iterator")
>     # keep happily using it and/or x as needed, and in particular
>     # the code is able to call it1 = iter(x) if it needs to iterate again
> 
> Not perfect -- but having a file-object argument fail this simplistic
> test seems better to me, less confusing, than having it pass.
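
To see how the quoted test would treat a file once f is iter(f) holds,
here is a rough sketch.  It assumes Python 3, where io.StringIO (used
as a stand-in for a file) is already its own iterator; use_iterable is
a restyled copy of the quoted function and accepted is a hypothetical
helper added only to make the outcome easy to inspect.

```python
import io

def use_iterable(x):
    """Restyled copy of the quoted check: reject non-iterables and iterators."""
    try:
        it = iter(x)
    except TypeError:
        raise TypeError("Need iterable object, not %s" % type(x))
    if it is x:
        raise TypeError("Need iterable object, not iterator")
    return it

def accepted(x):
    """Hypothetical helper: True if use_iterable(x) does not raise."""
    try:
        use_iterable(x)
    except TypeError:
        return False
    return True

list_ok = accepted([1, 2, 3])             # re-iterable, so it passes
file_ok = accepted(io.StringIO("a\nb\n")) # its own iterator, so it fails
int_ok = accepted(42)                     # not iterable at all
```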

This actually looks like an example of the "look before you leap"
(LBYL) syndrome, which you disapproved of recently.

> So, I, personally, am +1.  It might be even nicer (from the point of
> view of teaching, at least) if iterating on f interoperated more
> smoothly with other method calls on f, but I do see your point that
> the right way to achieve THAT would be a complete rewrite of the I/O
> system, and thus a vastly heavier project than the current one.
> Still, the current step seems to be in the right direction.

Somehow I'd rather emphasize the relative brokenness of the current
situation.  Anyway, I'm somewhere between -0 and +0 (inclusive) on
this.

--Guido van Rossum (home page: http://www.python.org/~guido/)