urllib timeout issues

Thu Mar 29 09:35:13 EDT 2007

On Mar 27, 4:50 pm, "Gabriel Genellina" <gagsl-... at yahoo.com.ar>
wrote:
> On Tue, 27 Mar 2007 17:41:44 -0300, supercooper <supercoo... at gmail.com>
> wrote:
>
>
>
> > On Mar 27, 3:13 pm, "Gabriel Genellina" <gagsl-... at yahoo.com.ar>
> > wrote:
> >> On Tue, 27 Mar 2007 16:21:55 -0300, supercooper <supercoo... at gmail.com>
> >> wrote:
>
> >> > I am downloading images using the script below. Sometimes it will go
> >> > for 10 mins, sometimes 2 hours before timing out with the following
> >> > error:
>
> >> >     urllib.urlretrieve(fullurl, localfile)
> >> > IOError: [Errno socket error] (10060, 'Operation timed out')
>
> >> > I have searched this forum extensively and tried to avoid timing out,
> >> > but to no avail. Anyone have any ideas as to why I keep getting a
> >> > timeout? I thought setting the socket timeout did it, but it didn't.
>
> >> You should do the opposite: time out *early*, rather than waiting 2
> >> hours, and handle the error (maybe using a queue to hold pending requests)
>
> >> --
> >> Gabriel Genellina
>
> > Gabriel, thanks for the input. So are you saying there is no way to
> > realistically *prevent* the timeout from occurring in the first
>
> Exactly. The error is out of your control: maybe the server is down,
> unresponsive, or overloaded, a proxy is having problems, or there is
> some other network problem.
>
> > place?  And by timing out early, do you mean to set the timeout for x
> > seconds and if and when the timeout occurs, handle the error and start
> > the process again somehow on the pending requests?  Thanks.
>
> Exactly!
> Another option: Python is cool, but there is no need to reinvent the
> wheel. Use wget instead :)
>
> --
> Gabriel Genellina
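
For what it's worth, the "time out early, handle the error, and keep a
queue of pending requests" idea described above might look roughly like the
sketch below. It is only an illustration, not what anyone in this thread
actually ran: it assumes Python 3's urllib.request (the thread uses the
older Python 2 urllib), and the names fetch_with_timeout and download_all,
the 30-second timeout, and the limit of 3 attempts are all made up for the
example.

```python
import collections
import shutil
import urllib.request


def fetch_with_timeout(url, localfile, timeout=30):
    """Download one URL to localfile (hypothetical helper).

    A stalled connect or read raises an OSError subclass (URLError or a
    socket timeout) once it has blocked for more than `timeout` seconds.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(localfile, 'wb') as out:
            shutil.copyfileobj(response, out)


def download_all(jobs, fetch=fetch_with_timeout, max_attempts=3):
    """Work through (url, localfile) jobs, re-queueing the ones that fail.

    Returns the jobs that still failed after max_attempts tries.
    """
    # Each queue entry carries its own attempt counter.
    pending = collections.deque((url, localfile, 1) for url, localfile in jobs)
    failed = []
    while pending:
        url, localfile, attempt = pending.popleft()
        try:
            fetch(url, localfile)
        except OSError:
            if attempt < max_attempts:
                # Put the job at the back of the queue and move on.
                pending.append((url, localfile, attempt + 1))
            else:
                failed.append((url, localfile))
    return failed
```

Calling download_all with a list of (url, localfile) pairs returns whatever
could not be fetched, so a long run is never stuck on one dead connection
and the leftovers can be retried later.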

Gabriel...thanks for the tip on wget...it's awesome! I even built it on
my Mac. It is working like a champ for hours on end...

Thanks!

chad

import os, shutil, string

images = [['34095d2','Nashoba'],
['34096c8','Nebo'],
['36095a4','Neodesha'],
['33095h7','New Oberlin'],
['35096f3','Newby'],
['35094e5','Nicut'],
['34096g2','Non'],
['35096h6','North Village'],
['35095g3','Northeast Muskogee'],
['35095g4','Northwest Muskogee'],
['35096f2','Nuyaka'],
['34094e6','Octavia'],
['36096a5','Oilton'],
['35096d3','Okemah'],
['35096c3','Okemah SE'],
['35096e2','Okfuskee'],
['35096e1','Okmulgee Lake'],
['35095f7','Okmulgee NE'],
['35095f8','Okmulgee North'],
['35095e8','Okmulgee South'],
['35095e4','Oktaha'],
['34094b7','Old Glory Mountain'],
['36096a4','Olive'],
['34096d3','Olney'],
['36095a6','Oneta'],
['34097a2','Overbrook']]

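# Path prefix of the files wget saves; the trailing 'o' is the first
# letter of every downloaded file name (e.g. o34095d2.tif).
wgetDir = 'C:/Program Files/wget/o'
exts = ['tif', 'tfw']
url = 'http://www.archive.org/download/'
home = '//fayfiler/seecoapps/Geology/GEOREFRENCED IMAGES/TOPO/Oklahoma UTMz14meters NAD27/'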

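for image in images:
    for ext in exts:
        # e.g. .../usgs_drg_ok_34095_d2/o34095d2.tif
        fullurl = (url + 'usgs_drg_ok_' + image[0][:5] + '_' + image[0][5:]
                   + '/o' + image[0] + '.' + ext)
        # Let wget retry up to 10 times, appending its output to log.log.
        os.system('wget %s -t 10 -a log.log' % fullurl)
        # Move the file to the share, adding the quad name to the file name.
        shutil.move(wgetDir + image[0] + '.' + ext,
                    home + 'o' + image[0] + '_' + image[1].replace(' ', '_') + '.' + ext)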
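
One possible refinement of the loop above, offered only as a sketch: run
wget through subprocess so its exit status can be checked before the file
is moved. The helper names build_url, wget_command and download are made up
for illustration, and the 60-second -T timeout is an arbitrary assumption;
-t, -T and -a are standard wget options.

```python
import subprocess


def build_url(base, quad_id, ext):
    """Build the download URL for one quad file, as the loop above does."""
    return '%susgs_drg_ok_%s_%s/o%s.%s' % (
        base, quad_id[:5], quad_id[5:], quad_id, ext)


def wget_command(fullurl, tries=10, timeout=60, logfile='log.log'):
    """Return the wget argument list (no shell quoting needed)."""
    return ['wget', '-t', str(tries), '-T', str(timeout),
            '-a', logfile, fullurl]


def download(fullurl):
    """Run wget (assumed to be on PATH); True if it exited with status 0."""
    return subprocess.call(wget_command(fullurl)) == 0
```

With this, the loop body could call download(fullurl) and only do the
shutil.move when it returns True, so a failed download does not stop the
script with a missing-file error.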