[Python-3000] Non-blocking I/O? (Draft PEP for New IO system)

Daniel Stutzbach daniel at stutzbachenterprises.com
Wed Mar 7 01:44:37 CET 2007


+1

I liked the original design at first, but after fleshing it out and
seeing all of the corner cases raised here, I think the Buffered I/O
interface for non-blocking I/O would have to be so different that it
would not make much sense to make it the same type.

The raw I/O layer semantics are unchanged, right?  (.read() returns 0
bytes on EOF, returns None on an EAGAIN/EWOULDBLOCK condition, and
raises IOError on any other problem.)
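
To make those three outcomes concrete, here is a rough sketch of a
raw read with those semantics. It is only an illustration, not the
PEP's implementation: the name raw_read() is made up, and it assumes
a POSIX system and a Python that provides os.set_blocking().

```python
import errno
import os


def raw_read(fd, n):
    """Hypothetical raw-layer read (the name is an assumption).

    Returns b"" on EOF, None if the read would block, and lets any
    other OSError (IOError) propagate to the caller.
    """
    try:
        return os.read(fd, n)
    except OSError as e:
        if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return None          # no data available right now
        raise                    # any other problem


# Exercise the three cases on a non-blocking pipe.
r, w = os.pipe()
os.set_blocking(r, False)

first = raw_read(r, 100)     # nothing written yet, so it would block
os.write(w, b"hello")
second = raw_read(r, 100)    # data is available
os.close(w)
third = raw_read(r, 100)     # writer closed and pipe drained: EOF
os.close(r)
```

The point of returning None rather than raising is that the caller
does not have to catch the exception and inspect errno on every call.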

On 3/6/07, Guido van Rossum <guido at python.org> wrote:
> Reading this and all the other discussion on the proper semantics for
> non-blocking I/O I think I may have overreached in trying to support
> non-blocking I/O at all levels of the new I/O stack. There probably
> aren't enough use cases for wanting to support readline() returning
> None if no full line of input is available yet to warrant the
> additional complexities -- and I haven't even looked very carefully at
> incremental codecs, which introduce another (small) buffer.
>
> I think maybe a useful simplification would be to support special
> return values to capture EWOULDBLOCK (or equivalent) in the raw I/O
> interface only. I think it serves a purpose here, since without such
> support, code doing raw I/O would either require catching IOError all
> the time and inspecting it for EWOULDBLOCK (or other platform specific
> values!), or not using the raw I/O interface at all, requiring yet
> another interface for raw non-blocking I/O.
>
> The buffering layer could then raise IOError (or perhaps a special
> subclass of it) if the raw I/O layer ever returned one of these; e.g.
> if a buffered read needs to go to the raw layer to satisfy a request
> and the raw read returns None, then the buffered read needs to raise
> this error if no data has been taken out of the buffer yet; or it
> should return a short read if some data was already consumed (since
> it's hard to "unconsume" data, especially if the requested read length
> is larger than the buffer size, or if there's an incremental encoder
> involved). Thus, applications can assume that a short read means
> either EOF or nonblocking I/O; most apps can safely ignore the latter
> since it must be explicitly turned on by the app.
>
> For writing, if the buffering layer receives a short write, it should
> try again; but if it receives an EWOULDBLOCK, it should likewise raise
> the abovementioned error, since repeated attempts to write in this
> case would just end up spinning the CPU without making progress. (We
> should not raise an error if a single short write happens, since AFAIK
> this is possible for TCP sockets even in blocking mode, witness the
> addition of the sendall() method.)
>
> This means that the buffering layer that sits directly on top of the
> raw layer must still be prepared to deal with the special return
> values from non-blocking I/O, but its API to the next layer up doesn't
> need special return values, since it turns these into IOErrors, and
> the next layer(s) up won't have to deal with it nor reflect it in
> their API.
>
> Would this satisfy the critics of the current design?
>
> --Guido
>
> On 3/4/07, Adam Olsen <rhamph at gmail.com> wrote:
> > On 3/4/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > > I'm having trouble seeing what the use case is for
> > > the buffered non-blocking writes being discussed here.
> > >
> > > Doing asynchronous I/O usually *doesn't* involve
> > > putting the file descriptor into non-blocking mode.
> > > Instead you use select() or equivalent, and only
> > > try to read or write when the file is reported as
> > > being ready.
> >
> > I can't say which is more common, but non-blocking has a safer feel.
> > Normal code would be select-driven in both, but if you screw up with
> > non-blocking you get an error, whereas blocking you get a mysterious
> > hang.
> >
> > accept() is the exception.  It's possible for a connection to
> > disappear between the time select() returns and the time you call
> > accept(), so you need to be non-blocking to avoid hanging.
> >
> > >
> > > For this to work properly, the select() needs to
> > > operate at the *bottom* of the I/O stack. Any
> > > buffering layers sit above that, with requests for
> > > data propagating up the stack as the file becomes
> > > ready.
> > >
> > > In other words, the whole thing has to have the
> > > control flow inverted and work in "pull" mode
> > > rather than "push" mode. It's hard to see how this
> > > could fit into the model as a minor variation on
> > > how writes are done.
> >
> > Meaning it needs to be a distinct interface and explicitly designed as such.
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
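
As I read the buffered-layer rules quoted above, they come out
roughly like the sketch below. Everything named here is hypothetical
(WouldBlockError, buffered_read, buffered_write), the internal buffer
is left out, and "raw" is assumed to be any object whose read(k)
returns bytes, b"" on EOF or None, and whose write(b) returns a byte
count or None.

```python
class WouldBlockError(IOError):
    """Hypothetical IOError subclass: the raw layer would block."""


def buffered_read(raw, n):
    """Read up to n bytes from raw, looping over raw.read().

    If the raw layer would block before any data was obtained, raise
    WouldBlockError; if some data was already obtained, return it as
    a short read instead of trying to "unconsume" it.
    """
    chunks = []
    got = 0
    while got < n:
        data = raw.read(n - got)
        if data is None:             # raw layer would block
            if got == 0:
                raise WouldBlockError("no data available yet")
            break                    # short read
        if not data:                 # EOF
            break
        chunks.append(data)
        got += len(data)
    return b"".join(chunks)


def buffered_write(raw, data):
    """Write all of data to raw, retrying after short writes.

    A short write is retried; a would-block result raises
    WouldBlockError, since retrying would only spin the CPU.
    """
    view = memoryview(data)
    while len(view) > 0:
        n = raw.write(view)
        if n is None:                # raw layer would block
            raise WouldBlockError("write would block")
        view = view[n:]              # drop what was written, retry
```

With this shape, the layers above the buffering layer only ever see
ordinary return values or an IOError, which is the simplification
being proposed.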


-- 
Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC