[Python-3000] Non-blocking I/O? (Draft PEP for New IO system)

Wed Mar 7 01:57:54 CET 2007

On 3/6/07, Daniel Stutzbach <daniel at stutzbachenterprises.com> wrote:
> +1
>
> I liked the original design at first, but after fleshing it out
> and seeing all of the corner cases raised here, the Buffered I/O
> interface for non-blocking would have to be so different that it would
> not make much sense to make it the same type.
>
> The raw I/O layer semantics are unchanged, right?  (.read() returns 0
> bytes on EOF, None on an EAGAIN/EWOULDBLOCK condition, and raises
> IOError on any other problem)

Yup.
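
For concreteness, here is a minimal sketch of those raw-layer read
semantics on top of os.read(). The helper name raw_read is hypothetical
and is not part of the draft PEP; it only illustrates the three outcomes
(empty bytes on EOF, None when the call would block, an exception for
anything else).

```python
import errno
import os


def raw_read(fd, n):
    """Read up to n bytes from file descriptor fd (hypothetical helper).

    Returns b"" at EOF, None if the descriptor is non-blocking and no
    data is available yet, and lets any other OSError/IOError propagate.
    """
    try:
        return os.read(fd, n)
    except OSError as exc:
        # EAGAIN and EWOULDBLOCK are the same value on many platforms,
        # but not all, so check for both.
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return None
        raise
```

With a pipe whose read end has been put into non-blocking mode, this
should return None while the pipe is empty, the written bytes once some
arrive, and b"" after the write end is closed.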

> On 3/6/07, Guido van Rossum <guido at python.org> wrote:
> > Reading this and all the other discussion on the proper semantics for
> > non-blocking I/O I think I may have overreached in trying to support
> > non-blocking I/O at all levels of the new I/O stack. There probably
> > aren't enough use cases for wanting to support readline() returning
> > > None if no full line of input is available yet to warrant the
> > additional complexities -- and I haven't even looked very carefully at
> > incremental codecs, which introduce another (small) buffer.
> >
> > I think maybe a useful simplification would be to support special
> > return values to capture EWOULDBLOCK (or equivalent) in the raw I/O
> > interface only. I think it serves a purpose here, since without such
> > support, code doing raw I/O would either require catching IOError all
> > the time and inspecting it for EWOULDBLOCK (or other platform specific
> > values!), or not using the raw I/O interface at all, requiring yet
> > another interface for raw non-blocking I/O.
> >
> > The buffering layer could then raise IOError (or perhaps a special
> > subclass of it) if the raw I/O layer ever returned one of these; e.g.
> > if a buffered read needs to go to the raw layer to satisfy a request
> > and the raw read returns None, then the buffered read needs to raise
> > this error if no data has been taken out of the buffer yet; or it
> > should return a short read if some data was already consumed (since
> > it's hard to "unconsume" data, especially if the requested read length
> > > is larger than the buffer size, or if there's an incremental decoder
> > involved). Thus, applications can assume that a short read means
> > either EOF or nonblocking I/O; most apps can safely ignore the latter
> > > since it must be explicitly turned on by the app.
> >
> > For writing, if the buffering layer receives a short write, it should
> > try again; but if it receives an EWOULDBLOCK, it should likewise raise
> > the abovementioned error, since repeated attempts to write in this
> > case would just end up spinning the CPU without making progress. (We
> > should not raise an error if a single short write happens, since AFAIK
> > this is possible for TCP sockets even in blocking mode, witness the
> > addition of the sendall() method.)
> >
> > This means that the buffering layer that sits directly on top of the
> > raw layer must still be prepared to deal with the special return
> > values from non-blocking I/O, but its API to the next layer up doesn't
> > need special return values, since it turns these into IOErrors, and
> > the next layer(s) up won't have to deal with it nor reflect it in
> > their API.
> >
> > Would this satisfy the critics of the current design?
> >
> > --Guido
> >
> > On 3/4/07, Adam Olsen <rhamph at gmail.com> wrote:
> > > On 3/4/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > > > I'm having trouble seeing what the use case is for
> > > > the buffered non-blocking writes being discussed here.
> > > >
> > > > Doing asynchronous I/O usually *doesn't* involve
> > > > putting the file descriptor into non-blocking mode.
> > > > Instead you use select() or equivalent, and only
> > > > try to read or write when the file is reported as
> > > > being ready.
> > >
> > > I can't say which is more common, but non-blocking has a safer feel.
> > > Normal code would be select-driven in both, but if you screw up with
> > > > non-blocking you get an error, whereas with blocking you get a mysterious
> > > hang.
> > >
> > > accept() is the exception.  It's possible for a connection to
> > > disappear between the time select() returns and the time you call
> > > accept(), so you need to be non-blocking to avoid hanging.
> > >
> > > >
> > > > For this to work properly, the select() needs to
> > > > operate at the *bottom* of the I/O stack. Any
> > > > buffering layers sit above that, with requests for
> > > > data propagating up the stack as the file becomes
> > > > ready.
> > > >
> > > > In other words, the whole thing has to have the
> > > > control flow inverted and work in "pull" mode
> > > > rather than "push" mode. It's hard to see how this
> > > > could fit into the model as a minor variation on
> > > > how writes are done.
> > >
> > > Meaning it needs to be a distinct interface and explicitly designed as such.
> >
> > --
> > --Guido van Rossum (home page: http://www.python.org/~guido/)
> >
>
>
> --
> Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC
>

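The buffering rules proposed in the quoted message could be sketched
roughly as follows. This is only an illustration under assumptions: the
names WouldBlockError, BufferedReaderSketch and write_fully are
hypothetical, and the raw object is assumed to expose read(n) returning
bytes or None and write(b) returning a byte count or None, as described
above. It is not the PEP's implementation.

```python
class WouldBlockError(IOError):
    """Hypothetical IOError subclass: the raw layer would have blocked."""


class BufferedReaderSketch:
    def __init__(self, raw, buffer_size=8192):
        self.raw = raw                  # assumed: raw.read(n) -> bytes or None
        self.buffer_size = buffer_size
        self.buf = b""

    def read(self, n):
        chunks = []
        remaining = n
        # Serve as much as possible from the existing buffer first.
        if self.buf and remaining > 0:
            taken, self.buf = self.buf[:remaining], self.buf[remaining:]
            chunks.append(taken)
            remaining -= len(taken)
        while remaining > 0:
            data = self.raw.read(max(remaining, self.buffer_size))
            if data is None:
                if chunks:
                    break               # data already consumed: short read
                raise WouldBlockError("read would block")
            if not data:
                break                   # EOF: also a short (or empty) read
            piece = data[:remaining]
            chunks.append(piece)
            self.buf = data[remaining:]  # keep any surplus for later reads
            remaining -= len(piece)
        return b"".join(chunks)


def write_fully(raw, data):
    """Retry short writes; raise if the raw layer reports it would block."""
    offset = 0
    while offset < len(data):
        written = raw.write(data[offset:])  # assumed: returns int or None
        if written is None:
            # Retrying here would only spin the CPU, so report an error.
            raise WouldBlockError("write would block")
        offset += written
    return offset
```

Layers above this one then see only ordinary return values or an
IOError, and never the None sentinel from the raw layer.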
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)