urllib2 timeout not working - stalls for an hour or so

Peter Otten __peter__ at web.de
Fri Sep 2 09:04:29 EDT 2016


Sumeet Sandhu wrote:

> Hi,
> 
> I use urllib2 to grab google.com webpages on my Mac over my Comcast home
> network.
> 
> I see about 1 error for every 50 pages grabbed. Most exceptions are
> ssl.SSLError, very few are socket.error and urllib2.URLError.
> 
> The problem is - after a first exception, urllib2 occasionally stalls for
> up to an hour (!), at either the urllib2.urlopen or response.read stages.
> 
> Apparently the urllib2 and socket timeouts are not effective here - how do
> I fix this?
> 
> ----------------
> import urllib2
> import socket
> from sys import exc_info as sysExc_info
> timeout = 2
> socket.setdefaulttimeout(timeout)
> 
>     try :
>         req = urllib2.Request(url,None,headers)
>         response = urllib2.urlopen(req,timeout=timeout)
>         html = response.read()
>     except :
>         e = sysExc_info()[0]
>         open(logfile,'a').write('Exception: %s \n' % e)
> < code that follows this: after the first exception, I try again a few
> times >

I'd use separate try...except blocks for response = urlopen() and 
response.read(). If the problem originates with read(), you could try 
replacing it with select.select([response.fileno()], [], [], timeout) calls 
in a loop.
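
A minimal sketch of how that might look, assuming Python 2 and the same
urllib2 setup as in the original post; fetch(), url, headers and logfile are
placeholder names standing in for the poster's own surrounding code:

import select
import socket
import urllib2

timeout = 2
socket.setdefaulttimeout(timeout)

def fetch(url, headers, logfile):
    # Open the URL in its own try...except so urlopen() failures are
    # distinguishable from read() failures.
    try:
        req = urllib2.Request(url, None, headers)
        response = urllib2.urlopen(req, timeout=timeout)
    except (urllib2.URLError, socket.error) as e:
        open(logfile, 'a').write('Exception in urlopen: %r\n' % (e,))
        return None

    # Read in chunks, waiting at most `timeout` seconds for data before
    # each read, so a stalled connection cannot block for an hour.
    chunks = []
    while True:
        ready, _, _ = select.select([response.fileno()], [], [], timeout)
        if not ready:
            open(logfile, 'a').write('Timed out reading %s\n' % url)
            return None
        try:
            # ssl.SSLError is a subclass of socket.error in Python 2,
            # so it is caught here as well.
            chunk = response.read(8192)
        except socket.error as e:
            open(logfile, 'a').write('Exception in read: %r\n' % (e,))
            return None
        if not chunk:
            break
        chunks.append(chunk)
    return ''.join(chunks)

One caveat: over HTTPS, select() only sees the raw socket, so data buffered
inside the SSL layer may not show up as readable; the per-read timeout set
via setdefaulttimeout() remains the backstop there.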
