How to test a URL request in a "while True" loop

MRAB python at mrabarnett.plus.com
Wed Dec 30 20:08:04 EST 2009


Brian D wrote:
> Thanks MRAB as well. I've printed all of the replies to retain with my
> pile of essential documentation.
> 
> To follow up with a complete response, I'm ripping out of my mechanize
> module the essential components of the solution I got to work.
> 
> The main body of the code passes a URL to the scrape_records function.
> The function attempts to open the URL five times.
> 
> If the URL is opened, a values dictionary is populated and returned to
> the calling statement. If the URL cannot be opened, a fatal error is
> printed and the module terminates. There's a little sleep call in the
> function to leave time for any errant connection problem to resolve
> itself.
> 
> Thanks to all for your replies. I hope this helps someone else:
> 
> import urllib2, time
> from mechanize import Browser
> 
> def scrape_records(url):
>     maxattempts = 5
>     br = Browser()
>     user_agent = ('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; '
>                   'rv:1.9.0.16) Gecko/2009120208 Firefox/3.0.16 '
>                   '(.NET CLR 3.5.30729)')
>     br.addheaders = [('User-agent', user_agent)]
>     for count in xrange(maxattempts):
>         try:
>             print url, count
>             br.open(url)
>             break
>         except urllib2.URLError:
>             print 'URL error', count
>             # Pretend a failed connection was fixed
>             if count == 2:
>                 url = 'http://www.google.com'
>             time.sleep(1)
>             pass

The 'pass' isn't necessary: the except block already contains other
statements, so it does nothing here.

>     else:
>         print 'Fatal URL error. Process terminated.'
>         return None
>     # Scrape page and populate valuesDict
>     valuesDict = {}
>     return valuesDict
> 
> url = 'http://badurl'
> valuesDict = scrape_records(url)
> if valuesDict == None:

When comparing against a singleton such as None, use "is" or "is not"
instead of "==" or "!=". "is" tests identity, whereas "==" can be
overridden by a class's __eq__ method.
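
To illustrate why the distinction can matter, here is a minimal sketch.
The AlwaysEqual class is made up for this example and is not part of
the code above; it only shows that "==" asks the object, while "is"
checks identity. It should run unchanged on Python 2 and Python 3.

```python
class AlwaysEqual(object):
    """Hypothetical class whose __eq__ claims equality with anything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

# "==" calls AlwaysEqual.__eq__, which returns True even for None.
equals_none = (obj == None)

# "is" compares identity; obj is not the None singleton.
is_none = (obj is None)
```

With such an object, a test written as "if obj == None:" would take the
branch even though obj is a real object; "if obj is None:" would not.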

>     print 'Failed to retrieve valuesDict'
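
Putting the comments together, the retry logic could be sketched
roughly as below. This is only an illustration under some assumptions:
open_with_retries, make_flaky_opener and the example URL are
hypothetical names, and the opener is a stand-in for br.open so the
sketch can run without a network. It catches IOError because
urllib2.URLError is a subclass of it (on Python 3, IOError is an alias
of OSError, which urllib.error.URLError subclasses).

```python
import time


def open_with_retries(opener, url, max_attempts=5, delay=0.0):
    """Call opener(url) up to max_attempts times.

    Returns the opener's result, or None if every attempt failed.
    """
    for attempt in range(max_attempts):
        try:
            result = opener(url)
            break  # success: leave the loop, skipping the else clause
        except IOError:
            # Give a transient connection problem time to clear up.
            time.sleep(delay)
    else:
        # Reached only if the loop ran out without hitting 'break'.
        return None
    return result


def make_flaky_opener(failures):
    """Hypothetical opener that fails 'failures' times, then succeeds."""
    state = {'calls': 0}

    def opener(url):
        state['calls'] += 1
        if state['calls'] <= failures:
            raise IOError('simulated connection failure')
        return 'contents of ' + url

    return opener


# Fails twice, then succeeds on the third of five attempts.
page = open_with_retries(make_flaky_opener(2), 'http://example.invalid')

# Never succeeds within five attempts, so None comes back.
gave_up = open_with_retries(make_flaky_opener(10), 'http://example.invalid')

if gave_up is None:
    status = 'Failed to retrieve page'
```

Note there is no 'pass' in the except block, and the caller tests the
result with "is None".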



