[Python-Dev] Single- vs. Multi-pass iterability

Guido van Rossum guido@python.org
Wed, 17 Jul 2002 13:38:35 -0400


> > > The file object, unless you make it into an iterator, is not "a
> > > container" like all others and just sits there -- a bit of a wart.
> >
> > I must be misunderstanding.  How does making the file object into an
> > iterator make it a container???
> 
> My fault for unclear expression!  I mean: if it's an iterator, it's an
> iterator.  All OTHER iterables (iterables that aren't iterators) are
> (what some call) containers.
> 
> It's not QUITE that way, but Python would be easier to teach if
> it were.

But leaving the file object as an exception to the rule helps as a
reminder that it's just a rule of thumb and cannot be taken as
absolute law.

> > > to Oren but he disliked it (because it meant readline would not
> > > respect its numeric argument, if any, in that case).
> >
> > Hm, you should've sent it to me.  The numeric argument was a mistake I
> > think.  Who ever uses it?
> 
> Not me, and I think it's advisory anyway according to the docs.
> 
> Still, it doesn't solve the reference-loop-between-two-deuced-things-
> that-don't-cooperate-with-gc problem.  And I can't see how either
> could be made into a WEAK reference given that xreadlines objects
> in other contexts need to hold a strong ref to the file they work on --
> we'd have to refactor xreadlines objects too, a core part holding a
> weak ref and a shell around it (holding a strong ref to the file) to
> support ordinary calls to xreadlines.xreadlines.  Messy:-(.

I don't think that a weak ref to the file would be sufficient for
xreadlines -- e.g.

    for line in open(filename):
        print line,

would close the file right away.

Likewise, the file needs a strong ref to the xreadlines, otherwise the
following would create a new iterator in the second for loop, and lose
data buffered by the first iterator.

    f = open(filename)
    it = iter(f)
    for i in range(10):
        it.next()
    del it
    for line in f:
        print line,

I think I will have to reject Oren's patch because of this, and the
situation with file iterators will remain as it is: once you've asked
for the iterator, all operations on the file are unsafe, and the only
way to get back to using the file is to abandon the file and do an
absolute seek on the file.  (This is sort of like switching between
the raw integer file descriptor and the stream object in C -- or in
Python if you care to use f.fileno() and os.read() etc.)

--Guido van Rossum (home page: http://www.python.org/~guido/)