select() vs. pipes (was [Python-Dev] Re: PEP 324 (process module))

Wed Aug 4 22:03:51 CEST 2004

On Wed, 2004-08-04 at 15:01, Peter Astrand wrote:
> >
> > But if you comment out the "os.close(p_in)" line, it no longer busywaits
> > (the select timeout is reached on every iteration).  At least this is
> > the behavior under Linux.
> 
> This isn't strange. You are closing the (only) read-end of the pipe. When
> you do this, the pipe is broken. Consider this:
> 
> >>> import os
> >>> r, w = os.pipe()
> >>> os.close(r)
> >>> os.write(w, "a")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> OSError: [Errno 32] Broken pipe
> 
> select() only indicates that something has happened on this
> filedescriptor.

Right.  I understand that returning EOF on the reader side of the pipe
is the inverse of the "broken pipe" behavior you demonstrate above.
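
For the record, the reader-side case can be sketched as below. This is a minimal illustration, not code from the original thread: it assumes a POSIX system and Python 3, and the function name reader_state_after_writer_closes is made up for the example. Once the last write end is closed, select() reports the read end as ready immediately, and the read returns an empty string (EOF) rather than blocking.

```python
import os
import select


def reader_state_after_writer_closes(timeout=1.0):
    """Close the only write end of a pipe, then ask select() about the read end.

    Returns (is_ready, data). The name and return shape are illustrative only.
    """
    r, w = os.pipe()
    os.close(w)  # no writers remain, so the reader can only ever see EOF
    try:
        readable, _, _ = select.select([r], [], [], timeout)
        # An empty read on a "ready" descriptor means end-of-file.
        data = os.read(r, 1024) if readable else None
    finally:
        os.close(r)
    return bool(readable), data


is_ready, data = reader_state_after_writer_closes()
```

A loop that keeps selecting on such a descriptor without noticing the empty read will spin, since the descriptor stays "ready" forever.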

> > This is a little unfortunate because the normal dance when communicating
> > between parent and child in order to capture the output of a child
> > process seems to be:
> >
> > 1) In the parent process, create a set of pipes that will represent
> > stdin/stdout/stderr of the child.
> >
> > 2)  fork
> 
> The problem with your example was that it didn't fork...

I was all set to try to refute this, but after writing a minimal test
program to do what I want to do, I find that you're right.  That's good
news!  I'll need to revisit my workarounds in the program that caused me
to need to do this.  Thanks for the schooling.
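
A minimal test along those lines might look like the following. It is a sketch under assumptions (POSIX, Python 3, a hypothetical helper named read_child_output), not the actual test program mentioned above. The key point is that after the fork the parent closes its copy of the write end, so select() only wakes up when the child writes or exits.

```python
import os
import select


def read_child_output(message=b"hello from child\n", timeout=5.0):
    """Fork a child that writes `message` to a pipe; collect it in the parent.

    Hypothetical helper for illustration; error handling is kept minimal.
    """
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: keep only the write end, write, and exit without cleanup.
        try:
            os.close(r)
            os.write(w, message)
        finally:
            os._exit(0)

    # Parent: close our copy of the write end, otherwise EOF never arrives.
    os.close(w)
    chunks = []
    try:
        while True:
            readable, _, _ = select.select([r], [], [], timeout)
            if not readable:
                break  # timed out with nothing to read
            data = os.read(r, 4096)
            if not data:
                break  # EOF: the child closed its end (or exited)
            chunks.append(data)
    finally:
        os.close(r)
        os.waitpid(pid, 0)  # reap the child
    return b"".join(chunks)
```

With the write end left open in the parent, the same loop would only ever end by hitting the timeout, because the pipe would still have a writer.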

> So, there is no problem with using select() on pipes when communicating
> with a subprocess. It works great. Take a look at (my) process.py's
> communicate() method for some inspiration.

I've actually looked at it and it's quite nice, but it doesn't do one
thing that I'd like to see as part of a process stdlib library.  The use
case I'm thinking of is one where a long-running program needs to
monitor the output of many other potentially long-running processes,
doing other things in the meantime.  This kind of program tends to use
select as a part of a mainloop where there might be other things going
on (like handling network communications to/from sockets, updating a
GUI, etc).  Also, the output from a child's stderr and stdout may never
end, because the child process(es) may never exit.

In popen5, "communicate" is terminal.  It calls select until there's no
more data to get back and then unconditionally waits for the subprocess
to finish, blocking the entire time.  This isn't useful for the type of
program I describe above.  Of course, it wasn't meant to be, but having
an API that could help with this sort of thing would be nice as well,
although probably out of scope for PEP 324.
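
To make the use case concrete, here is a rough sketch of the kind of non-terminal step I mean. It is not part of process.py or the PEP; it assumes a POSIX system and Python 3, uses the standard subprocess.Popen to start the children, and the names poll_once and run_demo are hypothetical. Each call reads whatever is ready and returns, so the caller's main loop stays in control and nothing waits for a child to exit.

```python
import os
import select
import subprocess
import sys


def poll_once(streams, timeout=0.1):
    """One main-loop step: read whatever is ready, never wait for an exit.

    `streams` maps fd -> (name, bytearray). Descriptors that reach EOF are
    removed from the mapping. (Illustrative helper, not a stdlib API.)
    """
    if not streams:
        return
    readable, _, _ = select.select(list(streams), [], [], timeout)
    for fd in readable:
        data = os.read(fd, 4096)
        if data:
            streams[fd][1].extend(data)
        else:
            del streams[fd]  # EOF on this child's stdout


def run_demo():
    """Monitor two short-lived children; a real program might never stop."""
    cmds = {
        "a": [sys.executable, "-c", "print('one')"],
        "b": [sys.executable, "-c", "print('two')"],
    }
    procs = {name: subprocess.Popen(cmd, stdout=subprocess.PIPE)
             for name, cmd in cmds.items()}
    streams = {p.stdout.fileno(): (name, bytearray())
               for name, p in procs.items()}
    buffers = {name: buf for name, buf in streams.values()}
    while streams:
        poll_once(streams)
        # Other main-loop work (sockets, GUI updates, ...) would go here.
    for p in procs.values():
        p.stdout.close()
        p.wait()
    return {name: bytes(buf) for name, buf in buffers.items()}
```

In a real main loop the child descriptors would simply be added to the same select() call as the sockets and other event sources, rather than polled in a loop of their own.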

- C