[urllib2 + Tor] How to handle 404?

Chris Rebert clp at rebertia.com
Fri Nov 7 03:28:34 EST 2008


On Fri, Nov 7, 2008 at 12:05 AM, Gilles Ganault <nospam at nospam.com> wrote:
> Hello
>
> I'm using the urllib2 module and Tor as a proxy to download data
> from the web.
>
> Occasionally, urllib2 returns 404, probably because of some issue
> with the Tor network. This code doesn't solve the issue, as it just
> loops through the same error indefinitely:
>
> =====
> for id in rows:
>        url  = 'http://www.acme.com/?code=' + id[0]
>        while True:
>                try:
>                        req = urllib2.Request(url, None, headers)
>                        response = urllib2.urlopen(req).read()
>                except urllib2.HTTPError, e:
>                        print 'Error code: ', e.code
>                        time.sleep(2)
>                        continue
                else:  # should align with the `except`
                        break
        handle_success(response)  # should align with the `url =` line
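
If the 404s never clear up for a given URL, the loop above will still
spin forever, so you may want to cap the number of retries. Here is a
minimal sketch of that idea, not tested against Tor. It is written for
Python 3, where urllib2 was split into urllib.request and urllib.error.
The names fetch_with_retry, max_attempts, delay, opener and sleep are
made up for illustration; the opener and sleep parameters are only there
so the function can be exercised without a network connection.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, headers=None, max_attempts=5, delay=2.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Return the response body, retrying on HTTPError.

    Hypothetical helper: gives up and re-raises the last HTTPError
    once max_attempts requests have failed.
    """
    for attempt in range(1, max_attempts + 1):
        req = urllib.request.Request(url, None, headers or {})
        try:
            # Success: return immediately, which also ends the retry loop.
            return opener(req).read()
        except urllib.error.HTTPError as e:
            print('Attempt %d failed with error code %s' % (attempt, e.code))
            if attempt == max_attempts:
                raise  # out of retries: let the caller see the final error
            sleep(delay)  # wait a bit before trying again
```

In your loop you would then call response = fetch_with_retry(url, headers)
and wrap that call in its own try/except if you want to skip the codes
that keep failing instead of aborting the whole run.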

Cheers,
Chris
-- 
Follow the path of the Iguana...
http://rebertia.com

> =====
>
> Any idea of what I should do to handle this error properly?
>
> Thank you.
> --
> http://mail.python.org/mailman/listinfo/python-list
>


