More urllib timeout issues.

Steve Holden steve at holdenweb.com
Sat Apr 28 11:18:24 EDT 2007


John Nagle wrote:
>    I thought I had all the timeout problems with urllib worked around,
> but no.
> 
>    socket.setdefaulttimeout is useful, but not always effective.
> I'm setting that to 15 seconds.
> If the host end won't open the connection within 15 seconds,
> urllib times out.  But if the host end opens the connection,
> then never sends anything, urllib waits for many minutes before
> timing out.  Any idea how to deal with this?  And don't just
> say "use urllib2" unless you KNOW it works better there and
> can explain why.  I finally have M2Crypto and urllib playing
> well together, and don't want to mess with that.
> 
>    For some weird reason, several UK academic sites have this
> behavior, including "soton.ac.uk".  If you try to open that
> in a browser, the browser just sits there, and eventually,
> after several minutes, displays "The site is taking too
> long to respond".
> 
>    What's the current status in this area?  Some patches to sockets
> were proposed a while back.  There's a long history of trouble
> in this area, and some fixes, but nothing that just works.
> The socket module has two timeout settings (socket.setdefaulttimeout and
> sock.settimeout), the M2Crypto module has two (sock.set_socket_read_timeout
> and sock.set_socket_write_timeout), and none of them play well together
> or with the urllib/urllib2/httplib level and the blocking/non-blocking
> socket distinction.
> 
>    What we really should have is something like this:
> 
> Sockets should have
> 	set_socket_connect_timeout
> 	set_socket_read_timeout
> 	set_socket_write_timeout
> 
> which set an upper limit on how long a socket can go with a request for
> a connect, read or write pending but without progress on the connection.
> This needs to be independent of select poll timeouts, and these timeouts
> should work on blocking sockets.
> 
> The existing socket function
> 
> 	settimeout
> 
> should set all of the above, and
> 
> 	socket.setdefaulttimeout
> 
> should set the default value for settimeout to be used on new sockets.
> 
> SSL and M2Crypto, which wrap socket functionality,
> should understand all the above functions.
> 
> httplib, urllib, and urllib2 objects should understand
> 
> 	settimeout
> 
> Making the connect/read/write timeout distinction at that level
> probably isn't worth the trouble.
> 	
> Then we'd have a reasonable network timeout system.
> We have about half of the above now, but it's not consistent.
> 
> Comments?
> 
The only comments I'll make for now are

1) There is work afoot to build timeout arguments into the network
libraries for 2.6, and I know Facundo Batista has been involved. You
might want to Google for that work or email Facundo about it.

2) The main reason why socket.setdefaulttimeout is unsuitable for many
purposes is that it is not thread-safe: the default is process-wide, so
all threads must use the same default timeout or have it change at
random according to the whim of the last thread to alter it.

3) This is important and sensible work and, if properly followed
through, will likely lead to serious quality improvements in the
network libraries.
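
To make point 2 concrete, here is a minimal sketch (not from the original
thread). It assumes only the standard socket and threading modules;
make_socket and demo are hypothetical names used for illustration. A second
thread changes the process-wide default behind the first thread's back, while
a timeout set with settimeout on the socket object itself is unaffected.

```python
import socket
import threading


def make_socket(timeout):
    """Create a TCP socket whose timeout is set on the object itself."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)  # affects only this socket
    return sock


def demo():
    """Show the global default being altered by another thread."""
    results = {}
    original = socket.getdefaulttimeout()
    try:
        # This thread wants a 15 second default...
        socket.setdefaulttimeout(15)

        def worker():
            # ...but another thread changes the process-wide default.
            socket.setdefaulttimeout(60)

        t = threading.Thread(target=worker)
        t.start()
        t.join()

        # A new socket inherits whatever the default is right now.
        inherited = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        results["inherited"] = inherited.gettimeout()
        inherited.close()

        # An explicit per-socket timeout does not depend on the default.
        explicit = make_socket(15)
        results["explicit"] = explicit.gettimeout()
        explicit.close()
    finally:
        # Restore the default so the demo has no lasting side effects.
        socket.setdefaulttimeout(original)
    return results


if __name__ == "__main__":
    print(demo())
```

No network traffic is involved; the sockets are created and closed without
connecting. Where you control socket creation, setting the timeout per
socket sidesteps the shared default, though that does not help with
libraries that create their sockets internally.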

regards
  Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd           http://www.holdenweb.com
Skype: holdenweb      http://del.icio.us/steve.holden



