urllib timeout issues

supercooper supercooper at gmail.com
Tue Mar 27 15:21:55 EDT 2007


I am downloading images using the script below. Sometimes it runs for
10 minutes, sometimes for 2 hours, before timing out with the following
error:

Traceback (most recent call last):
  File "ftp_20070326_Downloads_cooperc_FetchLibreMapProjectDRGs.py", line 108, in ?
    urllib.urlretrieve(fullurl, localfile)
  File "C:\Python24\lib\urllib.py", line 89, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook, data)
  File "C:\Python24\lib\urllib.py", line 222, in retrieve
    fp = self.open(url, data)
  File "C:\Python24\lib\urllib.py", line 190, in open
    return getattr(self, name)(url)
  File "C:\Python24\lib\urllib.py", line 322, in open_http
    return self.http_error(url, fp, errcode, errmsg, headers)
  File "C:\Python24\lib\urllib.py", line 335, in http_error
    result = method(url, fp, errcode, errmsg, headers)
  File "C:\Python24\lib\urllib.py", line 593, in http_error_302
    data)
  File "C:\Python24\lib\urllib.py", line 608, in redirect_internal
    return self.open(newurl)
  File "C:\Python24\lib\urllib.py", line 190, in open
    return getattr(self, name)(url)
  File "C:\Python24\lib\urllib.py", line 313, in open_http
    h.endheaders()
  File "C:\Python24\lib\httplib.py", line 798, in endheaders
    self._send_output()
  File "C:\Python24\lib\httplib.py", line 679, in _send_output
    self.send(msg)
  File "C:\Python24\lib\httplib.py", line 646, in send
    self.connect()
  File "C:\Python24\lib\httplib.py", line 630, in connect
    raise socket.error, msg
IOError: [Errno socket error] (10060, 'Operation timed out')


I have searched this forum extensively and tried to avoid timing out,
but to no avail. Does anyone have any idea why I keep getting a
timeout? I thought setting the socket timeout would fix it, but it didn't.

Thanks.

<--- CODE --->

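import os
import socket
import string
import sys
import time
import urllib
from time import strftime, localtime

# AddPrintMessage, StartFinishMessage and Timer are helper functions
# defined elsewhere in the script (not shown here).

images = [['34095e3','Clayton'],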
['35096d2','Clearview'],
['34095d1','Clebit'],
['34095c3','Cloudy'],
['34096e2','Coalgate'],
['34096e1','Coalgate SE'],
['35095g7','Concharty Mountain'],
['34096d6','Connerville'],
['34096d5','Connerville NE'],
['34096c5','Connerville SE'],
['35094f8','Cookson'],
['35095e6','Council Hill'],
['34095f5','Counts'],
['35095h6','Coweta'],
['35097h2','Coyle'],
['35096c4','Cromwell'],
['35095a6','Crowder'],
['35096h7','Cushing']]

exts = ['tif', 'tfw']
envir = 'DEV'
# URL of our image(s) to grab
url = 'http://www.archive.org/download/'
logRoot = '//fayfiler/seecoapps/Geology/GEOREFRENCED IMAGES/TOPO/Oklahoma UTMz14meters NAD27/'
logFile = os.path.join(logRoot, 'FetchLibreDRGs_' + strftime('%m_%d_%Y_%H_%M_%S', localtime()) + '_' + envir + '.log')

# Local dir to store files in
fetchdir = logRoot
# Entire process start time
start = time.clock()

msg = envir + ' - ' + "Script: " + os.path.join(sys.path[0], sys.argv[0]) + \
      ' - Start time: ' + strftime('%m/%d/%Y %I:%M:%S %p', localtime()) + \
      '\n--------------------------------------------------------------------------------------------------------------\n\n'
AddPrintMessage(msg)
StartFinishMessage('Start')

# Loop thru image list, grab each tif and tfw
for image in images:
    # Try and set socket timeout default to none
    # Create a new socket connection for every time through list loop
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(('archive.org', 80))
    s.settimeout(None)

    s2 = time.clock()
    msg = '\nProcessing ' + image[0] + ' --> ' + image[1]
    AddPrintMessage(msg)
    print msg
    for ext in exts:
        fullurl = url + 'usgs_drg_ok_' + image[0][:5] + '_' + image[0][5:] + '/o' + image[0] + '.' + ext
        localfile = fetchdir + image[0] + '_' + string.replace(image[1], ' ', '_') + '.' + ext
        urllib.urlretrieve(fullurl, localfile)
    e2 = time.clock()
    msg = '\nDone processing ' + image[0] + ' --> ' + image[1] + '\nProcess took ' + Timer(s2, e2)
    AddPrintMessage(msg)
    print msg
    # Close socket connection, only to reopen with next run thru loop
    s.close()

end = time.clock()
StartFinishMessage('Finish')
msg = '\n\nDone! Process completed in ' + Timer(start, end)
AddPrintMessage(msg)
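
A note on the timeout handling above, offered as a sketch rather than a confirmed fix. The socket `s` created inside the loop is a separate object from the connection that `urllib.urlretrieve` opens internally, so `s.settimeout(None)` should have no effect on the download itself. The module-wide `socket.setdefaulttimeout()` is what applies to sockets created afterwards, including the ones urllib opens. Error 10060 is the Winsock "connection timed out" code (WSAETIMEDOUT), so the failure may also come from the operating system giving up on the connect attempt, in which case retrying is likely more useful than a longer timeout. The helper name `fetch_with_retry`, its parameters, and the 60-second value are all assumptions made for illustration; they are not part of the original script.

```python
import socket
import time

try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2, as in the script above

# The module-wide default applies to every socket created afterwards,
# including the ones the urllib machinery opens internally.
socket.setdefaulttimeout(60.0)  # 60 seconds is an arbitrary, assumed value


def fetch_with_retry(fullurl, localfile, retries=3, delay=5.0, fetch=urlretrieve):
    """Hypothetical helper: call fetch(fullurl, localfile), retrying on I/O errors."""
    for attempt in range(1, retries + 1):
        try:
            return fetch(fullurl, localfile)
        except IOError:
            # Matches the IOError shown in the traceback. On Python 3,
            # IOError is an alias of OSError, which also covers socket
            # timeouts and URLError.
            if attempt == retries:
                raise  # out of attempts: let the caller see the last error
            time.sleep(delay)  # brief pause before trying again
```

In the download loop, the call `urllib.urlretrieve(fullurl, localfile)` could then be replaced by `fetch_with_retry(fullurl, localfile)`, and the per-iteration socket `s` dropped entirely.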



