Non-blocking read with popen subprocess

Fri Jul 31 18:27:19 EDT 2009

On Jul 30, 7:30 pm, Jonathan Gardner <jgard... at jonathangardner.net>
wrote:
> On Jul 30, 5:24 pm, Dhanesh <dhanesh... at gmail.com> wrote:
>
>
>
> > how can I we have a non blocking read ?
>
> See http://docs.python.org/library/popen2.html#flow-control-issues
>
> Note well: In the non-blocking world, you have to use select() or
> poll() to get your job done.

I solved a problem just yesterday regarding select on a Popen.stdout
file descriptor that was created with a 100 MB buffer.  I'd love it if
anybody could critique my solution and my understanding of my
findings.

In my case, select would return with my Popen object ready for read,
I'd go off and try to read, and then my read would block.  After a few
strace runs I found that Popen.stdout.read(), which I am assuming is
the same as the read() method of a builtin file object, would make
multiple read system calls, in many cases blocking until it had
collected my default buffer size.

Sure enough, it's in the docs: http://docs.python.org/library/stdtypes.html?highlight=fread
"""
Note that this method may call the underlying C function fread() more
than once in an effort to acquire as close to size bytes as possible.
Also note that when in non-blocking mode, less data than was requested
may be returned, even if no size parameter was given.
"""

I couldn't find a way to tell subprocess to return my stdout fd in
non-blocking mode, but then it occurred to me that the problem was not
so much the fd and how it was opened as the read call itself.

A non-blocking fd will have a read call fail with EWOULDBLOCK when
asked for more bytes than are in the buffer, while a blocking fd would
have read just return fewer bytes ...do I have this right?
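
One way to check that picture is a small experiment on a plain pipe.
What follows is only a sketch, assuming Python 3.5 or later on Linux
(os.set_blocking does not exist in older versions); probe_pipe_reads
is a made-up name, and os.pipe() stands in for a real subprocess.

```python
import os


def probe_pipe_reads():
    """Report what os.read() does on a pipe in both modes."""
    r, w = os.pipe()
    try:
        os.write(w, b"hello")           # only 5 bytes are in the pipe
        blocking = os.read(r, 1024)     # blocking fd, asks for 1024 bytes

        os.set_blocking(r, False)       # set O_NONBLOCK on the read end
        try:
            os.read(r, 1024)            # the pipe is empty at this point
            empty_result = "returned"
        except BlockingIOError:         # errno is EAGAIN / EWOULDBLOCK
            empty_result = "would block"

        os.write(w, b"abc")
        nonblocking = os.read(r, 1024)  # non-blocking fd with data waiting
        return blocking, empty_result, nonblocking
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(probe_pipe_reads())
```

On Linux this should show both modes returning a short read when some
data is waiting, with the would-block error only appearing when the
pipe is empty.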

I can handle assembling the data fine, I just don't want to block.

I wrote a little test program to spit out random output on stdout to
test a Popen object's stdout with os.read.  The len() of the result of
os.read was all over the place depending on how quickly I re-ran
os.read on the fd.  There were slight delays from time to time, but I
attributed that to the interpreter being context switched in, ssh
delays ...or the weather...in general the return was pretty fast.

I switched my program to use os.read and now it's snappy and spends
most of its time blocking on select, exactly where I want it to.

(This is on Linux, btw; I guess os.read may behave differently elsewhere.)
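
The select-then-os.read pattern described above might look roughly
like this.  It is a minimal sketch rather than the actual program:
read_available, chunk_size and timeout are hypothetical names, and it
assumes Python 3 on a Unix system where select() works on pipes.

```python
import os
import select
import subprocess
import sys


def read_available(cmd, chunk_size=65536, timeout=1.0):
    """Collect a child's stdout using select() plus os.read()."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()
    chunks = []
    while True:
        # Block here, not in read, until the pipe is readable or we time out.
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            continue  # timed out; a real program could do other work here
        # os.read issues a single read(2) and returns whatever is waiting,
        # up to chunk_size bytes, instead of looping to fill a buffer.
        data = os.read(fd, chunk_size)
        if not data:  # empty result means the child closed its stdout
            break
        chunks.append(data)
    proc.stdout.close()
    proc.wait()
    return b"".join(chunks)


if __name__ == "__main__":
    child = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 1000)"]
    print(len(read_available(child)))
```

The point of going through proc.stdout.fileno() is to bypass the
buffered file object entirely, so nothing ever loops trying to fill a
buffer after select has reported the fd as readable.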

> You may want to look at "communicate" (http://docs.python.org/library/subprocess.html#popen-objects) which may be what you need.