[Python-ideas] non-blocking buffered I/O

Guido van Rossum guido at python.org
Mon Oct 29 23:08:54 CET 2012


On Mon, Oct 29, 2012 at 2:25 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Mon, 29 Oct 2012 10:03:00 -0700
> Guido van Rossum <guido at python.org> wrote:
>> >> Then there is a
>> >> BufferedReader class that implements more traditional read() and
>> >> readline() coroutines (i.e., to be invoked using yield from), the
>> >> latter handy for line-oriented transports.
>> >
>> > Well... It would be nice if BufferedReader could re-use the actual
>> > io.BufferedReader and its fast readline(), read(), readinto()
>> > implementations.
>>
>> Agreed, I would love that too, but the problem is, *this*
>> BufferedReader defines methods you have to invoke with yield from.
>> Maybe we can come up with a solution for sharing code by modifying the
>> _io module though; that would be great! (I've also been thinking of
>> layering TextIOWrapper on top of these.)
>
> There is a rather infamous issue about _io.BufferedReader and
> non-blocking I/O at http://bugs.python.org/issue13322
> It is a bit problematic because currently non-blocking readline()
> returns '' instead of None when no data is available, meaning EOF can't
> be easily detected :(

Eeew!

> Once this issue is solved, you could use _io.BufferedReader, and
> work around the "partial read/readline result" issue by iterating
> (hopefully in most cases there is enough data in the buffer to
> return a complete read or readline, so the C optimizations are useful).

Yes, that's what I'm hoping for.

> Here is how it may work:
>
> def __init__(self, fd):
>     self.fd = fd
>     self.bufio = _io.BufferedReader(...)
>
> def readline(self):
>     chunks = []
>     while True:
>         line = self.bufio.readline()
>         if line is not None:
>             chunks.append(line)
>             if line == b'' or line.endswith(b'\n'):
>                 # EOF or EOL
>                 return b''.join(chunks)
>         yield from scheduler.block_r(self.fd)
>
> def read(self, n):
>     chunks = []
>     bytes_read = 0
>     while True:
>         data = self.bufio.read(n - bytes_read)
>         if data is not None:
>             chunks.append(data)
>             bytes_read += len(data)
>             if data == b'' or bytes_read == n:
>                 # EOF or read satisfied
>                 break
>         yield from scheduler.block_r(self.fd)
>     return b''.join(chunks)
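
For anyone who wants to run the quoted readline() loop, here is a minimal
self-contained sketch. It is not the proposed implementation:
FakeBufferedReader, block_r() and run() are hypothetical stand-ins, and
the fake reader assumes the behaviour hoped for once issue13322 is fixed
(None means "no data yet", b'' means EOF).

```python
import collections


class FakeBufferedReader:
    """Hypothetical stand-in for a non-blocking buffered reader.

    Assumed contract: readline() returns None when no data is ready,
    b'' at EOF, and otherwise a complete or partial line.
    """

    def __init__(self, results):
        self._results = collections.deque(results)

    def readline(self):
        # Once the scripted results run out, report EOF.
        return self._results.popleft() if self._results else b''


def block_r(fd):
    """Hypothetical scheduler hook: suspend until fd is readable."""
    yield ('wait_read', fd)


def readline(bufio, fd):
    """Coroutine that collects chunks until a full line or EOF."""
    chunks = []
    while True:
        line = bufio.readline()
        if line is not None:
            chunks.append(line)
            if line == b'' or line.endswith(b'\n'):
                # EOF or EOL
                return b''.join(chunks)
        yield from block_r(fd)


def run(coro):
    """Toy driver: resume the coroutine immediately after every wait."""
    waits = 0
    try:
        while True:
            next(coro)
            waits += 1
    except StopIteration as stop:
        return stop.value, waits


reader = FakeBufferedReader([None, b'GET / ', None, b'HTTP/1.0\r\n'])
result, waits = run(readline(reader, fd=3))
```

The toy driver never really waits; a real scheduler would resume the
coroutine only once the descriptor is reported readable.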

Hm... I wonder if it would make more sense if these standard APIs were
to return specific exceptions, like the ssl module does in
non-blocking mode? Look here (I updated since posting last night):
http://code.google.com/p/tulip/source/browse/sockets.py#142
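
A rough sketch of what the exception-based style could look like for a
buffered reader. This only illustrates the idea, under loud assumptions:
WantRead is a hypothetical exception modelled on ssl.SSLWantReadError,
and FakeStream, block_r() and run() are made-up helpers. The stream is
assumed to keep partial lines in its own buffer, so a returned b'' can
only mean EOF.

```python
class WantRead(Exception):
    """Hypothetical analog of ssl.SSLWantReadError for buffered reads."""


class FakeStream:
    """Hypothetical stream: None in the script means 'would block'."""

    def __init__(self, events):
        self._events = list(events)

    def readline(self):
        if not self._events:
            return b''          # script exhausted: unambiguous EOF
        event = self._events.pop(0)
        if event is None:
            raise WantRead      # no complete line buffered yet
        return event


def block_r(fd):
    """Hypothetical scheduler hook: suspend until fd is readable."""
    yield ('wait_read', fd)


def readline(stream, fd):
    """Coroutine that retries the call each time the stream wants more."""
    while True:
        try:
            return stream.readline()
        except WantRead:
            yield from block_r(fd)


def run(coro):
    """Toy driver: resume the coroutine immediately after every wait."""
    try:
        while True:
            next(coro)
    except StopIteration as stop:
        return stop.value


line = run(readline(FakeStream([None, None, b'hello\n']), fd=3))
eof = run(readline(FakeStream([]), fd=3))
```

With this shape the caller never has to tell None and b'' apart: "no data
yet" arrives as an exception, and every return value is real data or EOF.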

> As for TextIOWrapper, AFAIR it doesn't handle non-blocking I/O at all
> (but my memories are vague).

Same suggestion... (I only found out about ssl's approach to async I/O
a few days ago. It felt brilliant and right to me. But maybe I'm
missing something?)

> By the way I don't know how this whole approach (of mocking socket-like
> or file-like objects with coroutine-y read() / readline() methods)
> lends itself to plugging into Windows' IOCP.

Me neither. I hope Steve Dower can tell us.

> You may rely on some raw
> I/O object that registers a callback when a read() is requested and
> then yields a Future object that gets completed by the callback.
> I'm sure Richard has some ideas about that :-)

Which Richard?
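
To make the quoted raw-I/O idea concrete, here is a toy sketch of a
read() that registers a callback and yields a Future completed by that
callback. Everything in it is hypothetical (Future, FakeProactor,
RawReader, run()); it does not touch IOCP or any real Windows API and
only shows the control flow.

```python
class Future:
    """Minimal hypothetical Future holding a single result."""

    def __init__(self):
        self._done = False
        self._result = None

    def done(self):
        return self._done

    def result(self):
        return self._result

    def set_result(self, value):
        self._done = True
        self._result = value


class FakeProactor:
    """Hypothetical stand-in for a completion-based I/O backend."""

    def __init__(self):
        self._pending = []

    def start_read(self, n, on_complete):
        # Remember the request; the callback fires when data "arrives".
        self._pending.append((n, on_complete))

    def complete_next(self, data):
        n, on_complete = self._pending.pop(0)
        on_complete(data[:n])


class RawReader:
    """Raw I/O object whose read() is a coroutine yielding a Future."""

    def __init__(self, proactor):
        self._proactor = proactor

    def read(self, n):
        future = Future()
        # The completion callback is simply the Future's set_result.
        self._proactor.start_read(n, future.set_result)
        data = yield future
        return data


def run(coro, proactor, incoming):
    """Toy driver: complete each pending read, then resume the coroutine."""
    incoming = list(incoming)
    try:
        future = next(coro)
        while True:
            proactor.complete_next(incoming.pop(0))
            future = coro.send(future.result())
    except StopIteration as stop:
        return stop.value


proactor = FakeProactor()
data = run(RawReader(proactor).read(4), proactor, [b'abcdef'])
```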

-- 
--Guido van Rossum (python.org/~guido)


