feof status (was: Re: [Python-Dev] Rehabilitating fgets)

Eric S. Raymond esr@thyrsus.com
Mon, 8 Jan 2001 14:17:50 -0500


Guido van Rossum <guido@python.org>:
> [Eric]
> > A file is at EOF when attempts to read more data from it will fail
> > returning no data.
> 
> I was afraid you would say this.  That's not a condition that's easy
> to calculate without doing I/O, *and* that's not the condition that
> you are interested in for your problem.  According to your definition,
> f.eof() should be true in this example:
> 
>     f = open("/etc/passwd")
>     f.seek(0, 2)                 # Seek to end of file
>     print f.eof()                # What will this print???
>     print `f.readline()`         # Will print ''

I agree that after f.seek(0, 2) f is in an end-of-file condition.  But
I think it's precisely the definition that would be useful for my
problem.  Contrary to what you say, I think my definition of EOF is
quite sharp -- a sequential read would return no data.

Better to think of what I need as an "is there data waiting?" query.
I should have framed it that way, rather than about EOFness, from the
beginning.

> But getting the right result here requires a lot of knowledge about
> how the file is implemented!  While you've explained how this can be
> implemented on Unix, it can't be implemented with just the tools that
> stdio gives us.

Granted.  However, it looks possible that "is there data waiting"
*can* be portably implemented with the help of fstat(2), which by
precedent is also part of Python's toolkit.
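
A minimal sketch of the fstat idea, for regular files only, written in
present-day Python 3 rather than the Python of this thread.  The helper
name data_waiting() is hypothetical, not an existing file method.  It
compares the current offset with the size fstat reports, so it says
nothing useful about pipes, FIFOs, sockets or ttys.

```python
import os
import tempfile

def data_waiting(f):
    """Return True if a sequential read on regular file f would return data.

    Hypothetical helper: compares the current offset with the size
    reported by fstat. Only meaningful for regular, seekable files.
    """
    size = os.fstat(f.fileno()).st_size
    return f.tell() < size

with tempfile.TemporaryFile() as f:
    f.write(b"root:x:0:0\n")
    f.flush()                      # make sure fstat sees the written bytes
    f.seek(0)
    at_start = data_waiting(f)     # offset 0, size > 0
    f.seek(0, 2)                   # seek to end of file, as in the example above
    at_end = data_waiting(f)       # offset == size
```

The answer is only a snapshot: another process can append to the file
between the check and the next read.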

> I also don't want to make f.eof() a non-portable feature: *if*
> it is provided, it's too important for that.

Agreed.

> Note that stdio's feof() doesn't have this definition!  It is set when
> the last *read* (or getc(), etc.) stumbled upon an EOF condition.
> That's also of limited value; it's mostly defined so you can
> distinguish between errors and EOF when you get a short read.  The
> stdio feof() flag would be false in the above example.

OK.  You're right about that.  I should have thought more clearly about
the difference between the state of stdio and the state of the underlying
file or device.  Access to stdio state won't do by itself.

> > This is where it bites that I can't test for EOF with a read(0).
> 
> And can you tell me a system where you *can* test for EOF with a
> read(0)?  I've never heard of such a thing.  The Unix read() system
> call has the same properties as Python's f.read().  I'm pretty sure
> that fread() with a zero count also doesn't give you the information
> you're after.

I'd have to test -- but what Unix read(2) does in this case isn't
really my point.  My real point is that I can't probe for whether
there's data waiting to be read in what seems like the obvious way.  I
expect Python to compensate for the deficiencies of the underlying C,
not reflect them.
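
To illustrate why a zero-length read cannot serve as the probe, here is
a small sketch in present-day Python 3 (an assumption; the thread
predates it).  read(0) returns an empty result whether or not data
remains, so it cannot distinguish the two states.

```python
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"some data\n")
    f.seek(0)
    probe_with_data = f.read(0)    # data is still waiting to be read
    f.seek(0, 2)                   # move to end of file
    probe_at_end = f.read(0)       # nothing left to read
    real_read_at_end = f.read(1)   # only a nonzero read reveals EOF
```

Both probes come back empty; only the nonzero read at the end shows
that no data is left, and by then the I/O has already been attempted.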

> > Just having the plain-file case work would, IMHO, be justification
> > enough for this method.  If it turns out to be portable across Mac and
> > Windows sockets as well, *huge* win.  Could this be tested by someone
> > with access to Windows and Mac systems?
> 
> I don't see the huge win.

Try "polling after a non-blocking open".  A lower-overhead and more 
natural way to do it than with a poller object.  (This is on my mind 
because I used a poller object to query FIFOs just last week.)
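
For comparison, a sketch of the poller-object approach in present-day
Python 3, assuming a Unix system where select.poll() is available.  An
anonymous pipe stands in for the FIFO, and the helper name
readable_now() is hypothetical.

```python
import os
import select

def readable_now(fd):
    """Return True if a read on fd would not block (zero-timeout poll)."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    events = poller.poll(0)        # timeout of 0 ms: return immediately
    return any(event & select.POLLIN for _, event in events)

r, w = os.pipe()
before = readable_now(r)           # nothing written yet
os.write(w, b"move\n")
after = readable_now(r)            # one line is now waiting
os.close(r)
os.close(w)
```

This works, but it takes a separate object and a registration step to
answer a question about a single descriptor.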

The game system I'm working on, BTW, has another point of interest for
this list.  It is a rather large and complex suite of C programs that
makes heavy use of dynamic-memory allocation; I am translating to
Python partly in order to avoid chronic misallocation problems (leaks
and wild pointers) and partly because the thing needed to be rewritten
anyway to eliminate global state so I can embed it in a multithreaded
server.

Side-by-side comparison of the original C and its translation should
be quite an interesting educational experience once it's done.  That
just might be my next year's paper.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

It is the assumption of this book that a work of art is a gift, not a
commodity.  Or, to state the modern case with more precision, that works of
art exist simultaneously in two "economies," a market economy and a gift
economy.  Only one of these is essential, however: a work of art can survive
without the market, but where there is no gift there is no art.
	-- Lewis Hyde, The Gift: Imagination and the Erotic Life of Property