Python 2.2.1 and select()

Derek Martin code at pizzashack.org
Mon Mar 24 22:03:56 EDT 2008


On Mon, Mar 24, 2008 at 05:52:54PM -0700, Noah wrote:
> On Mar 24, 2:58 pm, Derek Martin <c... at pizzashack.org> wrote:
> > If and only if the total amount of output is greater than the
> > specified buffer size, then reading on this file hangs indefinitely.

> I think this is more of a limitation with the underlying clib.
> Subprocess buffering defaults to block buffering instead of
> line buffering. 

That's an interesting thought, but I guess I'd need you to elaborate
on how the buffering mode would affect the operation of select().  I
really don't see how your explanation can cover this, given the
following:

1. The subprocess used to test, in both the case where it worked, and
the case where it did not, was the very same shell script -- not a
compiled program (well, bash technically).  As far as I'm aware, there
haven't been any significant changes to the buffering mode defaults in
glibc...  But I could easily be mistaken.

2. By default, STDERR is always unbuffered, whether or not STDOUT is a
terminal device.

3. The actual subproc I care about is a perl script.

4. Most importantly, the whole point of using select() is that it
should only return a list of file objects which are ready for reading
or writing.  In this case, in both the working case (Python 2.4+ on
Red Hat) and the non-working case (Python 2.2.1 on Debian 3.1),
select() returns the file object corresponding to the subprocess's
STDOUT, which *should* mean that there is data ready to be read on
that file descriptor.  However, the actual read blocks, and both the
parent and the child go to sleep.

This should be impossible.  That is the very problem select() is
designed to solve...
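
A minimal sketch of that guarantee, using a bare os.pipe() instead of a
real subprocess.  The helper name is_readable() is made up for
illustration, and the code is written for a current Python 3 on a
Unix-like system, not for 2.2.1.  It reads with os.read() on the
descriptor itself, since the descriptor is what select() reports on.

```python
import os
import select

def is_readable(fd, timeout=0):
    """Return True if select() reports fd as ready for reading."""
    readable, _, _ = select.select([fd], [], [], timeout)
    return fd in readable

r, w = os.pipe()
before = is_readable(r)      # nothing written yet, so not ready
os.write(w, b"hello")
after = is_readable(r)       # data is waiting in the pipe now
data = os.read(r, 8192)      # returns what is available, up to 8192 bytes
os.close(r)
os.close(w)
```

With a zero timeout, select() acts as a poll: it reports the pipe as
ready only once something has been written to it.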

Moreover, we've set the buffer size to 8k.  If your scenario were
correct, then at the very least, as soon as the process wrote 8k to
STDOUT, there should be data ready to read.  Assuming full buffering
is enabled for the pipe that connects STDOUT of the subprocess to the
parent, the call to select() should block until one of the following
conditions occurs:

 - 8k of data is written by the child into the pipe

 - any amount of data is written to STDERR

 - the child process terminates

The last point is important; if the child process only has 4k of data
to write to STDOUT, and never writes anything to STDERR, then the
buffer will never fill.  However, the program will terminate, at which
point (assuming there was no explicit call to close() previously) the
operating system will close all open file descriptors, and flush all
of the child's I/O buffers.  At that point, the parent process, which
would be sleeping in select(), will wake up, read the 4k of data, and
(eventually) close its end of the pipe (an additional iteration
through the select() loop will be required, I believe).
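
The "4k and then exit" case can be sketched without a child process by
writing 4096 bytes into a pipe and closing the write end, which stands
in here for the child terminating.  The function name read_until_eof()
is hypothetical, and this assumes a current Python 3 on a Unix-like
system.

```python
import os
import select

def read_until_eof(fd, bufsize=8192):
    """Loop on select() and os.read() until EOF; return the size of each read."""
    sizes = []
    while True:
        select.select([fd], [], [])   # blocks until data or EOF is pending
        data = os.read(fd, bufsize)
        sizes.append(len(data))
        if not data:                  # empty read means the writer closed
            break
    return sizes

r, w = os.pipe()
os.write(w, b"x" * 4096)   # less than the 8k buffer size
os.close(w)                # stands in for the child exiting
sizes = read_until_eof(r)
os.close(r)
```

The first pass through the loop picks up the 4096 bytes, and a second
pass is needed to see the empty read that signals end-of-file.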

Should the program write output to STDERR before the 8k STDOUT buffer
is full, then again, the parent, sleeping in select(), will awaken, and
select will return the file object corresponding to the parent's end
of the pipe connecting to the child's STDERR.  Again, all of this is the
essence of what select() does.  It is supposed to guarantee that any
file descriptors (or objects) it returns are in fact ready for data to
be read or written.
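
For reference, a sketch of the kind of loop being described, watching
both the child's STDOUT and STDERR.  This is not the code from the
original report, which is not shown in this message.  It assumes a
current Python 3 on a Unix-like system and the subprocess module
(which does not exist in 2.2.1), and the name drain() is made up.  It
reads with os.read() on the descriptors that select() returns.

```python
import os
import select
import subprocess
import sys

def drain(cmd, bufsize=8192):
    """Run cmd and collect its stdout and stderr through a select() loop."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_fd = proc.stdout.fileno()
    err_fd = proc.stderr.fileno()
    chunks = {out_fd: [], err_fd: []}
    open_fds = [out_fd, err_fd]
    while open_fds:
        readable, _, _ = select.select(open_fds, [], [])
        for fd in readable:
            data = os.read(fd, bufsize)   # at most bufsize bytes, no waiting to fill
            if data:
                chunks[fd].append(data)
            else:
                open_fds.remove(fd)       # EOF: child closed this stream
    proc.stdout.close()
    proc.stderr.close()
    status = proc.wait()
    return b"".join(chunks[out_fd]), b"".join(chunks[err_fd]), status

# Child writes more than 8k to stdout and a little to stderr.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write('o' * 20000); sys.stderr.write('e' * 10)"]
out, err, status = drain(child)
```

The child here deliberately writes more than the 8k buffer size to
STDOUT, since that is the case that hangs in the report.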

So, unless I'm missing something, I'm pretty certain that buffering
mode has nothing to do with what's going on here.  I think there are
only a few possibilities:

1. My implementation of the select() loop is subtly broken.  This
   seems like the most likely case to me; however I've been over it a
   bunch of times, and I can't find anything wrong with it.  It's
   undeniable that select is returning a file object, and that reads
   on that file object immediately after the call to select block.  I
   can't see how this could be possible, barring a bug somewhere else.

2. select.select() is broken in the version of Python I'm using.  

3. The select() system call is somehow broken in the Linux kernel I'm
   using.  I tend to rule this out, because I'm reasonably certain
   someone would have noticed this before I did.  The kernel in
   question is being used on thousands of machines (I'm not
   exaggerating) which run a variety of network-oriented programs.  I
   can't imagine that none of them uses select() (though perhaps it's
   possible that none use it in quite the manner I'm using it here).
   But it may be worth looking at...  I could write an implementation
   of a select() loop in C and see how that works.

If you can see any flaw in my analysis, by all means point it out!
Thanks for your response.

-- 
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D
