[Python-Dev] Socket timeout patch

Mon, 03 Jun 2002 13:22:16 -0400

[Addressing only points that need attention]

> > - Also please don't put a space between the open parenthesis and the
> >   first argument (you do this almost consistently in test_timeout.py).
> 
>   Couldn't really figure out what you were seeing here. I read that
> you saw something like func( a, b), which I don't see in my local
> copy.

test_timeout.py from the SF page has this.  I'm glad you fixed this
already in your own copy.

> I do have something like this for list comprehension:
> 
>     [ x.find('\n') for x in self._rbuf ]
> 
>   Er, but I though there were supposed to be surrounding spaces at the
> edges...

I prefer to see that as

    [x.find('\n') for x in self._rbuf]

>   I screwed up the write. New the write is:
> 
[...]
> 
>   which is pretty much the same as the old. The read should be ok
> though. I could really use someone with a win compiler to test this
> for me.

I'll review it more when you next upload it.

> > - It also looks like you've broken the semantics of size<0 in read().
> 
>   Maybe I'm misunderstanding the code, but I thought that a size < 0
> meant to read until there are no more? The statement:
> 
>     while size < 0 or buf_len < size:
> 
>   accomplishes the same thing as what's in the current socket.py
> implementation.  If you look closely, the 'if >= 0' branch *always* returns,
> meaning that the < 0 is equiv to while 1. Due to shortcutting, the same
> thing happens in the above statement. Maybe a comment would make it clearer?

I was referring to this piece of code:

!         if buf_len > size:
!             self._rbuf.append (data[size:])
!             data = data[:size]

Here data[size:] gives you the last byte of the data and data[:size]
chops off the last byte.

> > - Maybe changing the write buffer to a list makes sense too?
> 
>   I could do this. Then just do a join before the flush. Is the append
> /that/ much faster?

Depends on how small the chunks are you write.  Roughly, repeated list
append is O(N log N), while repeated string append is O(N**2).

> > - Since this code appears independent from the timeoutsocket code,
> >   maybe we can discuss this separately?
> 
>   The point of this code was to keep from losing data when an exception
> occurs (as timothy, if I remember correctly, pointed out). Hence the reason
> for keeping a lot more data around in instance variables instead of local
> variables. So the windows version might (in obscure cases) be affected
> by the timeout changes. That's what this patch was addressing.

OK, but given the issues the first version had, I recommand that the
code gets more review and that you write unit tests for all cases.

> > - Please don't introduce more static functions with a 'Py' name
> >   prefix.
> 
>   Only did this in one place, with PyTimeout_Err. The reason was that the
> other Error functions used the Py prefix, so it was done for consistency. I
> can change that.. or change the naming scheme with the others if you like.

I like to do code cleanup that doesn't change semantics (like
renamings) as a separate patch and checkin.  You can do this before or
after the timeout changes, but don't merge it into the timeout
changes.  I still like the static names that you introduce not to
start with Py.

> > - I believe that you can't reliably maintain a "sock_blocking" flag;
> >   there are setsockopt() or ioctl() calls that can make a socket
> >   blocking or non-blocking.  Also, there's no way to know whether a
> >   socket passed to fromfd() is in blocking mode or not.
> 
>   Well, upon socket creation (in init_sockobject), the socket is
> set to blocking mode. I think that handles fromfd, right? Doesn't
> every initialization means have to call that function?

OK, it looks like you call internal_setblocking(s, 0) to set the
socket in nonblocking mode.  (Hm, I don't see any calls to set the
socket in blocking mode!)

So do I understand that you are now always setting the socket in
non-blocking mode, even when there is no timeout specified, and that
you look at the sock_blocking flag to decide whether to do timeouts or
just pass the nonblocking behavior to the user?

This is a change in semantics, and could interfere with existing
applications that pass the socket's file descriptor off to other
code.  I think I'd be happier if the behavior wasn't changed at all
until a timeout is set for a socket -- then existing code won't
break.

>   The real problem would be someone using an ioctl or setsockopt (Can
> you even do blocking stuff through setsockopt?).

Yes, setblocking() makes a call to setsockopt(). :-)

> Ugh.  The original timeoutsocket didn't really deal with anything
> like that.  Erm, seems like an interface problem here - using ioctl
> kinda breaks the socket object interface. Perhaps we should be doing
> some sort of getsockopt to figure out the blocking mode and update
> our state accordingly? That would be an extra call for each thing to
> the interface though.

I only really care for sockets passed in to fromfd().  E.g. someone
can currently do:

  s1 = socket(AF_INET, SOCK_STREAM)
  s1.setblocking(0)

  s2 = fromfd(s1.fileno())
  # Now s2 is non-blocking too

I'd like this to continue to work as long as s1 doesn't set a timeout.

>   One solution is to set/unset blocking mode right before doing each
> call to be sure of the state and based on the internally stored value
> of the blocking attribute... but... then that kind of renders ioctl
> useless.

Don't worry so much about ioctl, but do worry about fromfd.

>   Another solution might be to set the blocking mode to on everytime
> someone sets a timeout. That would change the blocking/socket
> interaction already described a bit but not drastically. Also easy
> to implement.  That sends the message: Don't use ioctls when using
> timeouts.

I like this.

>   Hmm.. Will need to think about this more. Any insight would be
> helpful or some wisdom about how you usually handle this sort of
> thing.

See above.  Since we don't know what people out there are doing, I
don't want to break existing code.  We do know that existing code
doesn't use (this form of) timeout, so we can exploit that knowledge.

> > - What is s->errorhandler() for?  AFAICT, this is always equal to
> >   PySocket_Err!
> 
>   This was always in the module. Not sure why it was put there intially.
> I used it to be consistent.

Argh, you're right.  MAL added this; I'll ask him why.

>   If you made it this far, it's time for coffee.

When can I expect a new version?

--Guido van Rossum (home page: http://www.python.org/~guido/)