n00b confusion re: local variable referenced before assignment error

Diez B. Roggisch deets at nospam.web.de
Fri Jun 19 12:30:48 EDT 2009


Wells Oliver wrote:
> Writing a class which essentially spiders a site and saves the files 
> locally. On a URLError exception, it sleeps for a second and tries again 
> (on 404 it just moves on). The relevant bit of code, including the 
> offending method:
> 
> class Handler(threading.Thread):
>         def __init__(self, url):
>                 threading.Thread.__init__(self)
>                 self.url = url
> 
>         def save(self, uri, location):
>                 try:
>                         handler = urllib2.urlopen(uri)
>                 except urllib2.HTTPError, e:
>                         if e.code == 404:
>                                 return
>                         else:
>                                 print "retrying %s (HTTPError)" % uri
>                                 time.sleep(1)
>                                 self.save(uri, location)
>                 except urllib2.URLError, e:
>                         print "retrying %s" % uri
>                         time.sleep(1)
>                         self.save(uri, location)
> 
>                 if not os.path.exists(os.path.dirname(location)):
>                         os.makedirs(os.path.dirname(location))
> 
>                 file = open(location, "w")
>                 file.write(handler.read())
>                 file.close()
> 
> ...
> 
> But what I am seeing is that after a retry (on catching a URLError 
> exception), I see bunches of "UnboundLocalError: local variable 
> 'handler' referenced before assignment" errors on line 38, which is the 
> "file.write(handler.read())" line..

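Your code binds the name handler only if urllib2.urlopen succeeds. 
When it raises, the except branch retries via self.save(), and once that 
recursive call returns, the outer call carries on past the try-except 
and accesses handler unconditionally. The name was never assigned in 
that call, so of course that fails.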
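
To see the effect in isolation, here is a minimal sketch with no network 
access. The function name read_or_fail and the simulated failure are 
made up for illustration, and the syntax is Python 3:

```python
def read_or_fail(should_fail):
    try:
        if should_fail:
            # Stand-in for urlopen raising before the assignment happens.
            raise IOError("simulated urlopen failure")
        handler = "response body"  # bound only when the try body completes
    except IOError:
        pass  # error swallowed, much like the retry branches above
    # Reached in both cases; handler is unbound if the assignment was skipped.
    return handler


print(read_or_fail(False))
try:
    read_or_fail(True)
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```

The second call fails in the same way as the quoted save() method: the 
except branch finishes normally and execution falls through to code that 
assumes the name exists.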

You need to put the file handling after the urlopen call, inside the 
try-except, so that it runs only when the fetch succeeded.

Also note that Python has no tail-call optimization, so your method 
adds a stack frame for every failed attempt and will at some point 
exceed the recursion limit if there are many errors.
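
A rough illustration of that limit, assuming a fetch that never 
succeeds. retry_recursively and hits_recursion_limit are hypothetical 
names used only for this sketch:

```python
import sys


def retry_recursively(attempt=1):
    # Hypothetical stand-in for save(): the "fetch" always fails, so every
    # call retries by calling itself and adds one more stack frame.
    return retry_recursively(attempt + 1)


def hits_recursion_limit():
    try:
        retry_recursively()
    except RuntimeError:
        # Python 3.5+ raises RecursionError, a subclass of RuntimeError.
        return True
    return False


print("recursion limit:", sys.getrecursionlimit())
print("limit reached:", hits_recursion_limit())
```

With a one-second sleep per retry it takes a while to get there, but a 
server that stays down long enough will trigger it.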

You should consider writing it as a while loop instead, breaking out of 
it once the page has been fetched.
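
Something along these lines might do, as a sketch rather than a drop-in 
replacement. It is written for Python 3, where urllib2 was split into 
urllib.request and urllib.error. The opener, delay and max_tries 
parameters are my additions (they are not in your code) so that the 
function can be exercised without a network and can give up eventually. 
The file is opened in binary mode because read() returns bytes in 
Python 3.

```python
import os
import time
import urllib.error
import urllib.request


def save(uri, location, opener=urllib.request.urlopen, delay=1.0, max_tries=None):
    """Fetch uri and write it to location, retrying in a loop on errors.

    Returns True if the file was written, False on a 404 or when
    max_tries (hypothetical parameter) is exhausted.
    """
    tries = 0
    while True:
        tries += 1
        try:
            handler = opener(uri)
        except urllib.error.HTTPError as e:
            # HTTPError is a subclass of URLError, so it must be caught first.
            if e.code == 404:
                return False  # missing page: just move on
            print("retrying %s (HTTPError)" % uri)
        except urllib.error.URLError:
            print("retrying %s" % uri)
        else:
            break  # fetch succeeded, handler is bound: leave the loop
        if max_tries is not None and tries >= max_tries:
            return False
        time.sleep(delay)

    # Only reached after a successful fetch.
    directory = os.path.dirname(location)
    if directory and not os.path.exists(directory):
        os.makedirs(directory)

    with open(location, "wb") as out:
        out.write(handler.read())
    handler.close()
    return True
```

The file handling sits after the loop, so it can only run once handler 
exists, and no stack frames pile up however often the fetch fails.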

Diez


