sending a file through sockets

Sat Jul 6 06:06:19 EDT 2002

brueckd at tbye.com wrote:
 > On Fri, 5 Jul 2002, Bryan Olson wrote:

 >>No, cutting the cord will not do the same thing.  You only get the zero-
 >>byte successful read on graceful shutdown.
 >
 > Actually, you get 0-byte successful read when you simply call close() too.

I assume you mean when the other side calls close.  Yes, that's a clean
shutdown (of its writing end).

 > Try it. Also note that the socket object destructor calls close(), so
 > those sockets also get closed cleanly.

Right, that's the one gotcha I noted.  The system will try to shut down
cleanly, so a local error at the sender could result in a graceful
shutdown of his socket.
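
To make the gotcha concrete, here is a minimal sketch. It assumes a local
socketpair() as a stand-in for a real TCP connection, and the names
read_until_eof and demo are made up for illustration. The receiver sees
the same zero-byte read whether the sender finished or gave up halfway,
because the sender's socket is closed cleanly in both cases.

```python
import socket

def read_until_eof(sock):
    """Collect bytes until recv() returns b'' (the zero-byte read)."""
    parts = []
    while True:
        data = sock.recv(4096)
        if not data:
            return b"".join(parts)
        parts.append(data)

def demo(finish_normally):
    # socketpair() is only a convenient local substitute for a connection.
    sender, receiver = socket.socketpair()
    sender.sendall(b"first half,")
    if finish_normally:
        sender.sendall(b"second half")
    # Either way the sender's socket is closed cleanly here, so the
    # receiver cannot tell a complete transfer from an abandoned one.
    sender.close()
    try:
        return read_until_eof(receiver)
    finally:
        receiver.close()
```

Both demo(True) and demo(False) return normally; only the amount of data
differs, and nothing on the receiving side signals which case occurred.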

 > Finally, I just tried killing the
 > server process and the client side immediately came back with a 0-byte
 > successful read.

That's the system doing the shutdown for you.  If you think it's too
dangerous, well, the code you posted didn't actually check the length.
The explicitly sent length only protected against the server sending
more data than that.
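
For comparison, a sketch of a receiver that does check the length. The
framing (an 8-byte big-endian length followed by the payload) and the
names recv_exact, send_blob and recv_blob are assumptions for the example,
not the code from the earlier post.

```python
import socket
import struct

def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer shuts down first."""
    chunks = []
    remaining = n
    while remaining > 0:
        data = sock.recv(min(remaining, 65536))
        if not data:
            # Zero-byte read: the peer closed its writing end too early.
            raise EOFError("connection closed with %d bytes missing" % remaining)
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)

def send_blob(sock, payload):
    # Hypothetical framing: 8-byte big-endian length, then the payload.
    sock.sendall(struct.pack("!Q", len(payload)) + payload)

def recv_blob(sock):
    (length,) = struct.unpack("!Q", recv_exact(sock, 8))
    return recv_exact(sock, length)
```

With this, a short transfer raises EOFError instead of passing for a
complete one, so the length protects in both directions.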

 > ...and is not small source of errors in HTTP agents, and is also why
 > HTTP 1.1 discourages this method.

I'm prepared to believe you that this was a consideration in HTTP/1.1,
but can you cite it? I see a different reason in RFC 2068, "In order to
remain persistent, all messages on the connection must have a self-
defined message length (i.e., one not defined by closure of the
connection), as described in section 4.4."

 > It's especially annoying to deal with
 > if you're writing web proxies and caches and the origin server is using
 > closed sockets to mean end-of-transmission.

What's annoying is proxying from a source that delineates the end by
marker (connection close or other) to a protocol that puts the content
length up front.  Even if you don't trust shutdown to mark the end of
content, requiring the length beforehand is problematic.  Sometimes the
sender doesn't know.
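
One way around both problems is to frame the data in pieces, which is
roughly the idea behind the chunked transfer coding in HTTP/1.1. The
sketch below is not that wire format; the 4-byte length prefix and the
names send_chunks and recv_chunks are made up for illustration. The sender
never needs the total length, and the receiver still gets an explicit end
marker instead of relying on connection close.

```python
import struct

def send_chunks(sock, pieces):
    """Send each non-empty piece with a 4-byte length prefix."""
    for piece in pieces:
        if piece:
            sock.sendall(struct.pack("!I", len(piece)) + piece)
    # A zero-length chunk marks the end of the content.
    sock.sendall(struct.pack("!I", 0))

def recv_chunks(sock):
    """Yield chunks until the zero-length end marker arrives."""
    def read_exact(n):
        buf = b""
        while len(buf) < n:
            data = sock.recv(n - len(buf))
            if not data:
                # Connection closed before the end marker was seen.
                raise EOFError("connection closed mid-stream")
            buf += data
        return buf

    while True:
        (size,) = struct.unpack("!I", read_exact(4))
        if size == 0:
            return
        yield read_exact(size)
```

A close before the end marker raises EOFError, so a truncated stream is
distinguishable from a finished one.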

 >>[...]

 > Ahh... there's the problem. ;-)    ------'
 > Works on Unices..

I tried searching for what it does, and what I found is a Linux patch
(2.02) with the comment: "connect() to INADDR_ANY means loopback
(BSD'ism)."  So, yes, it works on many systems; count me surprised. It
caught my eye because I didn't think INADDR_ANY makes sense for
connect().

 > FWIW, latest CVS version of sockets has timeout support. Yay!

Actually the sockets in the Python 2.2 (Windows) library I have support
timeouts and select().  The problem is that the protocol libraries don't
use them.
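
As a rough sketch of what the protocol libraries could do, here is a
receive with a timeout built on select(). The function name
recv_with_timeout is hypothetical, and the code is written for current
Python 3, where TimeoutError is a built-in and sock.settimeout() gives a
similar effect.

```python
import select
import socket

def recv_with_timeout(sock, nbytes, seconds):
    """Wait up to `seconds` for data, then do a single recv()."""
    readable, _, _ = select.select([sock], [], [], seconds)
    if not readable:
        raise TimeoutError("no data within %.2f seconds" % seconds)
    # May return b'' if the peer has shut down its writing end.
    return sock.recv(nbytes)
```

The caller decides what a timeout means: retry, report an error, or drop
the connection.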

 >>Try your pull-the-plug idea on the Python network protocol libraries.
 >>Most TCP stacks will detect the dead connection in about two hours.
 >
 > Again, a call to shutdown is not needed to get the 0-byte successful recv
 > call.

The remote side has to do the shutdown, though it may not be a
deliberate call from the application that does it.

 > The other side detects it immediately, for other problems you tend
 > to get a socket.error exception.

On pull-the-plug, a recv won't detect the connection loss until the local
system tries a keepalive probe, which typically does not happen until the
connection has been idle for two hours.
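
For reference, a sketch of asking the stack for keepalive probes from
Python. SO_KEEPALIVE is the standard option; the tuning options are
platform-specific, so the code sets only those the socket module exposes.
The helper name enable_keepalive and the default values are assumptions.

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive and shorten its timers where supported."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Not every platform defines these; skip any that are missing.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # failed probes before giving up
    )
    for name, value in tuning:
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
```

Even with shorter timers, an application-level timeout remains the more
portable way to notice a dead peer.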

 > What you _do_ need timeouts for is when
 > you are reading data from a buggy sender that puts your connection on a
 > shelf and then forgets about it, i.e. the connection is still open and
 > ready for use but the sender side simply fails to send anything.

Lots of connections go silent due to failures other than sender bugs,
such as system crashes, power failures, and the ever-popular loss of
modem connections.  And then of course there are malicious senders.

--Bryan