Question about using urllib2 to load a url
Kushal Kumaran
kushal.kumaran at gmail.com
Mon Apr 2 00:31:59 EDT 2007
On Apr 2, 2:52 am, "ken" <ken.carl... at gmail.com> wrote:
> Hi,
>
> i have the following code to load a URL.
> My question is: if I try to load an invalid URL
> ("http://www.heise.de/"), will I get an IOError? Or will it wait
> forever?
>
Depends on why the URL is invalid. If the URL refers to a non-existent
domain, the DNS lookup will fail and you will get an
"urllib2.URLError: <urlopen error (-2, 'Name or service not
known')>". If the name resolves but the host is not reachable, the
connect call will (eventually) time out with an
"urllib2.URLError: <urlopen error (113, 'No route to host')>". If the
host exists but is not running a web server, you will get an
"urllib2.URLError: <urlopen error (111, 'Connection refused')>". If a
web server is running but the requested page does not exist, you will
get an "urllib2.HTTPError: HTTP Error 404: Not Found".
The URL you gave above meets none of these conditions, so opening it
returns a valid handle to read from.
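Since HTTPError is a subclass of URLError, the two cases can be told apart in a single except clause by testing the more specific class first. A minimal sketch (using the Python 3 module name urllib.error; in Python 2 these classes live in urllib2):

```python
import urllib.error  # Python 2: these classes are in urllib2


def describe(exc):
    # HTTPError is a subclass of URLError, so it must be tested first
    if isinstance(exc, urllib.error.HTTPError):
        # the server answered, but with an error status (e.g. 404)
        return 'HTTP error %d: %s' % (exc.code, exc.reason)
    if isinstance(exc, urllib.error.URLError):
        # no usable response at all: DNS failure, refused connection, ...
        return 'connection failed: %s' % (exc.reason,)
    return 'unexpected: %r' % (exc,)


# the 404 case: the server responded, but with an error status
e404 = urllib.error.HTTPError('http://example.invalid/x', 404,
                              'Not Found', {}, None)
print(describe(e404))  # -> HTTP error 404: Not Found
```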
If, at any time, an error response fails to reach your machine, the
code will have to wait for a timeout. It should not have to wait
forever.
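How long that timeout lasts can be bounded explicitly. urlopen() itself takes no timeout argument in this version of urllib2, but its sockets honour the interpreter-wide default, so something like this caps the wait (the 10-second figure is just an example, not from the original post):

```python
import socket

# urllib2's sockets honour the module-wide default timeout, so a
# bounded wait can be enforced globally before calling urlopen():
socket.setdefaulttimeout(10)  # seconds; 10 is an arbitrary example
```

After this, a connect that would otherwise hang raises a socket.timeout (surfaced by urllib2 as a URLError) once the limit is reached.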
> Thanks for any help.
>
> opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
> urllib2.install_opener(opener)
>
> txheaders = {'User-agent': 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3'}
>
> try:
>     req = Request(url, txdata, txheaders)
>     handle = urlopen(req)
> except IOError, e:
>     print e
>     print 'Failed to open %s' % url
>     return 0
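For reference, here is a self-contained version of that snippet with the undefined names (cj, txdata) filled in. It uses the Python 3 spellings (urllib2 and cookielib became urllib.request, urllib.error and http.cookiejar); the structure is the same in Python 2, and catching URLError also covers the HTTPError case:

```python
import http.cookiejar   # Python 2: cookielib
import urllib.error     # Python 2: these classes are in urllib2
import urllib.request   # Python 2: urllib2


def fetch(url, timeout=5):
    """Return an open response handle, or None if the URL cannot be opened."""
    cj = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cj))
    txheaders = {'User-agent': 'Mozilla/5.0 (X11; U; Linux i686)'}
    req = urllib.request.Request(url, data=None, headers=txheaders)
    try:
        return opener.open(req, timeout=timeout)
    except urllib.error.URLError as e:  # also catches HTTPError
        print('Failed to open %s: %s' % (url, e))
        return None
```

On a machine with nothing listening on the port, fetch('http://localhost:1/') should print a "Connection refused" failure and return None, matching the (111, 'Connection refused') case above.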
--
Kushal