select.select and socket.setblocking

Sat Jan 3 16:25:59 EST 2009

Bryan Olson <fakeaddress at nowhere.org> wrote:

> There are cases where a socket can select() as readable, but not be 
> readable by the time of a following recv() or accept() call. All such 
> cases with which I'm familiar call for a non-blocking socket.

I used to believe that if select() said data was ready for reading, a 
subsequent read/recv/recvfrom() call could not block.  It could return an 
error, but it could not block.  I was confident of this until just a few 
months ago when reality blew up in my face.

The specific incident involved a bug in the Linux kernel.  If you received 
a UDP packet with a checksum error, select() would return when the 
packet arrived, *before* the checksum was checked.  By the time you did the 
recv(), the packet had been discarded and the recv() would block.

This led me on a big research quest (including some close readings of 
Stevens, which appeared to say that this couldn't happen).  The more I 
read, the more I (re)discovered just how vague and poorly written the 
Berkeley Socket API docs are :-)

The bottom line is that Bryan is correct -- regardless of what the various 
man pages and textbooks say, in the real world, it is possible for a read() 
to block after select() says the descriptor is ready.  The right way to 
think about select() is to treat it as a heuristic which can make a polling 
loop more efficient, but should never be relied upon to predict the future.
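
To make that concrete, here is a minimal sketch of the pattern in Python.  
It assumes Python 3.3 or later, where a would-block condition surfaces as 
BlockingIOError; older versions raise socket.error with EAGAIN or 
EWOULDBLOCK instead.  The helper name recv_if_ready and the socketpair 
used to exercise it are made up for illustration.

```python
import select
import socket


def recv_if_ready(sock, bufsize=4096, timeout=1.0):
    """Return received bytes, or None if nothing could be read.

    select() is only a hint here.  The socket must already be
    non-blocking so that recv() cannot hang when the hint is wrong.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None  # timed out: select() reported nothing to read
    try:
        return sock.recv(bufsize)
    except BlockingIOError:
        # select() said "ready", but nothing was there when we read.
        return None


# A local socket pair stands in for a real connection.
a, b = socket.socketpair()
b.setblocking(False)
a.send(b"ping")
first = recv_if_ready(b)                 # data is waiting
second = recv_if_ready(b, timeout=0.1)   # nothing left to read
a.close()
b.close()
```

The important part is the setblocking(False) call: with it, a wrong 
"ready" from select() costs one exception instead of a hung thread.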

Neither the negative nor the positive result is guaranteed.  There's no 
guaranteed response time: just because select() hasn't returned yet doesn't 
mean a descriptor couldn't be read without blocking, from another thread, 
right now.  And just because it has returned doesn't mean that, by the time 
you get around to reading, there will still be anything there.
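
The second half of that is easy to reproduce without any kernel bug.  The 
sketch below is only an illustration under stated assumptions: Python 3.3 
or later, a local socketpair, and a plain extra recv() standing in for 
another thread or process that shares the descriptor and gets there first.

```python
import select
import socket

a, b = socket.socketpair()
b.setblocking(False)
a.send(b"x")

# select() reports the socket as readable, which is true at this instant.
readable, _, _ = select.select([b], [], [], 1.0)
hint_said_ready = b in readable

# Some other reader consumes the data before we act on the hint.
stolen = b.recv(16)

# Acting on the now-stale hint: nothing is left to read.
try:
    b.recv(16)
    went_stale = False
except BlockingIOError:
    went_stale = True  # a blocking socket would have hung here instead

a.close()
b.close()
```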