[Python-Dev] Status of PEP 3145 - Asynchronous I/O for subprocess.popen

Nick Coghlan ncoghlan at gmail.com
Fri Mar 28 11:49:54 CET 2014


On 28 March 2014 20:20, Victor Stinner <victor.stinner at gmail.com> wrote:
> 2014-03-28 2:16 GMT+01:00 Josiah Carlson <josiah.carlson at gmail.com>:
>> def do_login(...):
>>     proc = subprocess.Popen(...)
>>     current = proc.recv(timeout=5)
>>     last_line = current.rstrip().rpartition('\n')[-1]
>>     if last_line.endswith('login:'):
>>         proc.send(username)
>>         if proc.readline(timeout=5).rstrip().endswith('password:'):
>>             proc.send(password)
>>             if 'welcome' in proc.recv(timeout=5).lower():
>>                 return proc
>>     proc.kill()
>
> I don't understand this example. How is it "asynchronous"? It looks
> like blocking calls. In my definition, asynchronous means that you can
> call this function twice on two processes, and they will run in
> parallel.

Without reading all the references in PEP 3145 again, I now seem to
recall that the problem it was aimed at was the current deadlock
warnings in the subprocess docs: if you're not careful to keep
reading from the stdout and stderr pipes while writing to stdin, you
can fill up the kernel buffers and deadlock while communicating with
the subprocess. So the "asynchronous" part is being able to happily
write large amounts of data to the stdin pipe without fear of deadlock
with a subprocess that has just written large amounts of data to its
stdout or stderr pipes.
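That deadlock is easy to reproduce. A minimal sketch (the echo child
spawned via sys.executable here is my own illustration, not anything
from the PEP) of both the unsafe pattern and the communicate()
workaround:

```python
import subprocess
import sys

# Child that echoes stdin back to stdout as it reads it.
child = "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)"

proc = subprocess.Popen(
    [sys.executable, "-c", child],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)

payload = b"x" * (1024 * 1024)  # comfortably bigger than a typical OS pipe buffer

# Deadlock-prone version (don't do this):
#     proc.stdin.write(payload)   # blocks once the child's stdout pipe fills,
#     proc.stdin.close()          # because nothing here is reading it
#
# communicate() reads stdout while it writes stdin, so neither side stalls.
out, _ = proc.communicate(payload)
assert out == payload
```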

So, from the perspective of the user, it behaves like a synchronous
blocking operation, but on the backend it needs to use asynchronous
read and write operations to avoid deadlock. I suspect it would be a
relatively thin wrapper around run_until_complete().
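A sketch of what such a wrapper might look like, built on asyncio's
subprocess support (the function names here are hypothetical, chosen
for illustration only):

```python
import asyncio
import sys

async def _communicate(args, data):
    # asyncio interleaves the stdin writes with the stdout reads,
    # which is what avoids the pipe-buffer deadlock.
    proc = await asyncio.create_subprocess_exec(
        *args,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
    )
    out, _ = await proc.communicate(data)
    return out

def communicate(args, data):
    # Synchronous, blocking front end over an asynchronous backend.
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(_communicate(args, data))
    finally:
        loop.close()

# Echo child: large writes in both directions, no deadlock.
echo = "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)"
result = communicate([sys.executable, "-c", echo], b"y" * (1024 * 1024))
assert result == b"y" * (1024 * 1024)
```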

Also, as far as where such functionality should live in the standard
library goes, it's entirely possible for it to live in its natural
home of "subprocess". To make that work, the core subprocess.Popen
functionality would need to be moved to a _subprocess module that
both subprocess and asyncio depend on, allowing subprocess to also
depend on asyncio without creating a circular import.

So I'll go back on my original comment: assuming I've now remembered
its intended effects correctly, PEP 3145 remains a valid proposal,
independent of (but potentially relying on) asyncio. The problem it
is designed to solve is all those notes in the current subprocess
module like "Do not use stdout=PIPE or stderr=PIPE with this
function. As the pipes are not being read in the current process,
the child process may block if it generates enough output to a pipe
to fill up the OS pipe buffer." It would address them by using an
asynchronous backend while still presenting a synchronous API.

And rather than adding a new API, I'd hope it could propose just
getting rid of those warnings by reimplementing the current
deadlock-prone APIs on top of run_until_complete() and exploring the
potential consequences for backwards compatibility.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
