subprocess and non-blocking IO (again)

Donn Cave donn at u.washington.edu
Mon Oct 10 14:47:34 EDT 2005


In article <434a80e4_2 at news3.prserv.net>,
 Marc Carter <mcarter at uk.ibm.com> wrote:
> I am trying to rewrite a PERL automation which started a "monitoring" 
> application on many machines, via RSH, and then multiplexed their 
> collective outputs to stdout.
> 
> In production there are lots of these subprocesses but here is a 
> simplified example what I have so far (python n00b alert!)
> - SNIP ---------
> import subprocess,select,sys
> 
> speakers=[]
> lProc=[]
> 
> for machine in ['box1','box2','box3']:
>      p = subprocess.Popen( ('echo '+machine+';sleep 2;echo goodbye;sleep 
> 2;echo cruel;sleep 2;echo world'), stdout=subprocess.PIPE, 
> stderr=subprocess.STDOUT, stdin=None, universal_newlines=True )
>      lProc.append( p )
>      speakers.append( p.stdout )
> 
> while speakers:
>      speaking = select.select( speakers, [], [], 1000 )[0]
>      for speaker in speaking:
>          speech = speaker.readlines()
>          if speech:
>              for sentence in speech:
>                  print sentence.rstrip('\n')
>                  sys.stdout.flush() # sanity check
>          else: # EOF
>              speakers.remove( speaker )
> - SNIP ---------
> The problem with the above is that the subprocess buffers all its output 
> when used like this and, hence, this automation is not informing me of 
> much :)

You're using C stdio, through the Python file object.  This is
sort of subprocess's fault, for returning a file object in the
first place, but in any case you can expect your input to be
buffered, because that's what C stdio does.
When you call readlines(), you further guarantee that you
won't go on to the next statement until the child process exits
and its pipe closes, because that's what readlines() does: it
returns _all_ lines of output.
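
To see the readlines() behaviour in isolation, here is a minimal
sketch, assuming a current Python 3 on a Unix-like system.  The
child command is made up for illustration: it writes one line
immediately and a second line a second later.

```python
import subprocess
import sys
import time

# Hypothetical child: prints one line right away (flushed), then a
# second line one second later, then exits.
child = ("import time; print('first', flush=True); "
         "time.sleep(1); print('second')")
p = subprocess.Popen([sys.executable, "-c", child], stdout=subprocess.PIPE)

start = time.monotonic()
lines = p.stdout.readlines()   # does not return until end-of-file
elapsed = time.monotonic() - start
p.wait()

# Both lines arrive together, only after the child has closed the pipe.
print(lines, "after %.1f seconds" % elapsed)
```

The first line was available almost at once, but readlines() hands
it over only when the pipe reaches end-of-file.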

If you want to use select(), don't use the file object's
read functions.  Use os.read() to read data from the pipe's file
descriptor (p.stdout.fileno()).  This is how you avoid the
buffering.
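
A rough sketch of that approach, not a drop-in replacement for your
script.  It assumes a current Python 3 on a Unix-like system, uses
shell=True so the command string goes through /bin/sh, and the
echo/sleep commands are only stand-ins for the real rsh calls.

```python
import os
import select
import subprocess

# Hypothetical stand-ins for the remote monitoring commands.
machines = ["box1", "box2", "box3"]

procs = {}  # pipe file descriptor -> (machine name, Popen object)
for machine in machines:
    cmd = "echo %s; sleep 1; echo goodbye; sleep 1; echo world" % machine
    p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT)
    procs[p.stdout.fileno()] = (machine, p)

chunks = []  # (machine, bytes) pairs in arrival order
while procs:
    # Wait until at least one pipe has data or has reached end-of-file.
    ready, _, _ = select.select(list(procs), [], [], 10.0)
    for fd in ready:
        machine, p = procs[fd]
        # os.read() returns whatever is available right now (up to
        # 4096 bytes) instead of waiting for a full buffer or for EOF.
        data = os.read(fd, 4096)
        if data:
            chunks.append((machine, data))
            print("%s: %r" % (machine, data))
        else:
            # An empty read means EOF: the child closed its end.
            del procs[fd]
            p.stdout.close()
            p.wait()
```

Note that os.read() hands back chunks of bytes, not lines, so a
chunk may hold part of a line or several lines.  If you need whole
lines you have to split and reassemble them yourself.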

> This topic seems to have come up more than once.  I am hoping that 
> things have moved on from posts like this:
> http://groups.google.com/group/comp.lang.python/browse_thread/thread/5472ce95eb430002/434fa9b471009ab2?q=blocking&rnum=4#434fa9b471009ab2
> as I don't really want to have to write all that ugly 
> fork/dup/fcntl/exec code to achieve this when high-level libraries like 
> "subprocess" really should have corresponding methods.

subprocess doesn't have pty functionality.  It's hard to say
for sure who said what on that page, after the incredible mess
Google has made of their USENET archives, but I believe that's
why you see dup2 there: the author is using a pty library,
evidently pexpect.  As far as I know, things have not moved on
in this respect, and I'm not sure what kind of movement you
expected to see in the intervening month.  I don't think you
need ptys, though, so I wouldn't worry about it.

   Donn Cave, donn at u.washington.edu


