non-blocking IO EAGAIN on write

Sat Jul 24 07:25:09 EDT 2010

In article <mailman.1105.1279945954.1673.python-list at python.org>,
 Kushal Kumaran <kushal.kumaran+python at gmail.com> wrote:

> In general, after select has told you a descriptor is ready, the 
> first write after that should always succeed.

I used to think that too.  Over the last few years, I've been 
maintaining a large hunk of cross-platform C++ code which makes heavy 
use of select(), with both UDP and TCP sockets.  I've seen lots of 
strange behavior.

For the moment, assume we're talking about a single-threaded program.  
This simplifies things a lot.

If you write (pseudo-code):

select(fd)
write(fd)

when the select indicates fd is ready, it's not really saying, "The 
following i/o call will succeed".  What it's saying is, "The following 
i/o call won't block".  It could return an error, as long as it returns 
it immediately.

Consider, for example, a write on a TCP connection.  You are sitting in 
a select(), when the other side closes the connection.  The select() 
should return, and the write should then immediately fail.  If you're 
tempted to say that the select() should return some sort of error, 
consider the case where the remote end closes the connection after the 
select() returns but before your process gets to execute the following 
write() call.
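
A minimal sketch of the first scenario in Python, assuming a Unix-like
system.  socket.socketpair() is only a stand-in for a real TCP
connection, and write_after_peer_close is a made-up name.  On a
socketpair the failure typically shows up on the very first write (as
EPIPE); on a real TCP connection the error may not surface until a
later write.

```python
import select
import socket


def write_after_peer_close():
    """Close the peer, then select() for writing and attempt one write.

    Returns (ready, error): whether select() reported the descriptor
    writable, and the OSError raised by send(), or None if it succeeded.
    """
    a, b = socket.socketpair()
    try:
        b.close()  # the "other side" closes the connection

        # select() still reports the descriptor as ready: a write
        # will not block.  It says nothing about whether it succeeds.
        _, writable, _ = select.select([], [a], [], 1.0)
        ready = bool(writable)

        try:
            a.send(b"hello")
            error = None
        except OSError as exc:  # e.g. BrokenPipeError (EPIPE)
            error = exc
        return ready, error
    finally:
        a.close()


if __name__ == "__main__":
    print(write_after_peer_close())
```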

We also saw a case where (due to what we consider a kernel bug) a 
received UDP packet with a checksum error would wake up the select(); 
the kernel would *then* notice the checksum error and discard the 
packet, and thus the following read() would block.

The bottom line is: if you really want to make sure you never block in 
an I/O call, put your descriptor into non-blocking mode and treat 
select() as a *hint*, a way to ask the kernel, "Tell me when you think 
it might be a good idea to try polling this descriptor again".
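
In Python, that advice might look something like the sketch below.
The function name send_all_nonblocking and the timeout parameter are
made up for illustration.  The point is only that EAGAIN after a
"ready" select() is treated as normal, not as a bug.

```python
import select
import socket


def send_all_nonblocking(sock, data, timeout=5.0):
    """Send all of data on sock, treating select() purely as a hint."""
    sock.setblocking(False)  # never block inside send()
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        # Ask the kernel when it might be worth trying again.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            raise TimeoutError("socket not writable within timeout")
        try:
            sent += sock.send(view[sent:])
        except (BlockingIOError, InterruptedError):
            # EAGAIN/EWOULDBLOCK or EINTR: the hint was wrong this
            # time, so go back to select() and try again.
            continue
    return sent


if __name__ == "__main__":
    a, b = socket.socketpair()
    print(send_all_nonblocking(a, b"hello"))
    print(b.recv(16))
    a.close()
    b.close()
```

Real errors such as a closed connection still propagate as OSError, so
the caller has to deal with those either way.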