sending a file through sockets

Donn Cave donn at u.washington.edu
Tue Jul 9 12:44:24 EDT 2002


Quoth Bryan Olson <fakeaddress at nowhere.org>:
| Donn Cave wrote:
...
|> They certainly ought to
|> be much more robust than the present example needed to be, since it
|> really was more introductory.  What perhaps is really harder than it
|> looks, is deciding "how much is enough".  Part of my job is to decide
|> how robust an application needs to be, and while it may be good to
|> err on the side of safety, that doesn't excuse me from making the call.
|
| For an internet application, it turns out to be more than most people
| think.  It's a rough network out there.

I'm going to try one last time to clarify my point here: it _depends_
_on_ _the_ _application_.  Some applications need to be as bomb-proof
as possible, but many others don't.  That isn't because they're not
"professional", or not "real", it's because the risk is a tolerable
trade-off for the complexity of a bomb-proof solution.  Python's network
client modules are that kind of trade-off.  If you can develop alternative
implementations that are as portable, as simple to use and maintain, but
are more robust, then I'm sure we're all ears.

| [...]
|> The signal issue with httplib.py is another matter.  It isn't so much
|> about sockets, as C stdio via file objects.  I wouldn't have written
|> it that way myself, because of liabilities like this.  recv(2) will
|> fail with EINTR, and socketobject.recv() will accordingly raise an exception,
|> so there is no ambiguity.  fileobject.read() ignores the error.  The
|> example used recv(), so it didn't put its foot in that hole to start with.
|> But that's another trade-off - the authors of those modules apparently
|> felt that the simplicity of a file object solution was worth the lossage.
|
| I have to disagree.  The trouble with signals is their global-ness.  The
| modules don't handle the exception on EINTR.  One module using an alarm
| can break others.  On the other side, the sockets module (on some
| platforms) calls signal to ignore SIGPIPE.  That could come as a
| surprise to other modules or users, and it's possible someone else might
| set it back.

There are more problems than that with signals.  Python does what
it can with them, but as an interpreted language it is at a natural
disadvantage.  If it were my job to write the socket clients in the
distribution, I wouldn't handle EINTR either - some applications might
want a signal to interrupt a read, and unless the API is going to support
alternative options on this matter, letting the error through is the
best choice.  That's one
of the issues you're going to have to deal with in your improved
socket clients, right?  More complicated API.
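
To make the trade-off concrete, here is a minimal sketch of what such an
option might look like.  The helper name recv_with_policy and its
retry_on_eintr flag are hypothetical, not part of any library module.
On current Python versions (3.5 and later) the interpreter already
retries interrupted system calls itself unless the signal handler
raises, so the except branch is rarely reached there.

```python
def recv_with_policy(sock, bufsize, retry_on_eintr=True):
    """Receive from sock, letting the caller choose the EINTR policy.

    Hypothetical helper for illustration only.
    """
    while True:
        try:
            return sock.recv(bufsize)
        except InterruptedError:
            # The call was interrupted by a signal (EINTR).
            if not retry_on_eintr:
                # The application wants the signal to break the read.
                raise
            # Otherwise loop around and try the recv again.
```

Every caller now has to pick a policy, which is exactly the extra API
surface in question.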

What is "I have to disagree" about?  Are you saying that you have
a plan to make file objects work fine with sockets and signals?

I don't think I agree with ignoring SIGPIPE, but we seem to be wandering
from the topic of introductory socket examples.

| I welcome you to send a byte count.  I didn't even say it was bad
| advice; just that it's not really needed, and since the protocol
| didn't check it, we might as well take it out.  To me, this one looked
| like a job for HTTP/0.9.

If you ever get your hands on an actual application that sends a byte
count but doesn't look at it on the other end - leave it alone.
A couple of revisions down the line, some instance of the other end
will check that byte count, and you won't have to break all the
others when you put it in.

| For a flexible protocol I would not recommend sending a single byte
| count as a prefix, because in many cases the sender doesn't know how
| much data he'll be transmitting. It's a huge pain to have to buffer the
| entire transmission to see how big it is. So we'd want a chunked
| representation, or maybe a second control connection.

Even better, let's use XML and make the server parse everything.
Ha, ha.  If you want to divide your data into several sends,
that's OK, just send a byte count with each and some indication
that either this one is or is not the last.
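
A minimal sketch of that kind of framing, assuming a made-up header of
a 4-byte big-endian length followed by a 1-byte "last chunk" flag.  The
header layout and the names send_chunk, recv_exact and recv_message are
illustrative only, not taken from any existing protocol or module.

```python
import socket
import struct

# Assumed header: unsigned 32-bit length, then one flag byte (1 = last chunk).
HEADER = struct.Struct("!IB")

def send_chunk(sock, data, last=False):
    """Send one chunk prefixed with its byte count and the 'last' flag."""
    sock.sendall(HEADER.pack(len(data), 1 if last else 0) + data)

def recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than asked for."""
    parts = []
    while n > 0:
        part = sock.recv(n)
        if not part:
            raise EOFError("connection closed in the middle of a chunk")
        parts.append(part)
        n -= len(part)
    return b"".join(parts)

def recv_message(sock):
    """Collect chunks until one arrives with the 'last' flag set."""
    chunks = []
    while True:
        length, last = HEADER.unpack(recv_exact(sock, HEADER.size))
        chunks.append(recv_exact(sock, length))
        if last:
            return b"".join(chunks)
```

The sender never needs to know the total size up front, and the
receiver can tell a complete transfer from a dropped connection.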

	Donn Cave, donn at u.washington.edu


