[Tutor] urllib2.urlopen()

Sun Oct 14 04:50:54 CEST 2012

On 14/10/12 12:45, Ray Jones wrote:
> On 10/13/2012 05:09 PM, Brian van den Broek wrote:
>> On 13 October 2012 19:44, Ray Jones<crawlzone at gmail.com>  wrote:
>>> I am attempting to capture url headers and have my script make decisions
>>> based on the content of those headers.
>>>
>>> Here is what I am using in the relevant portion of my script:
>>>
>>> try:
>>>      urllib2.urlopen('http://myurl.org')
>>> except urllib2.HTTPError, e:

Well, in this case, for that URL, the connection succeeds without
authentication, so no HTTPError is raised and your except block never
runs. It might help if you test with a URL that actually fails :)

>>> In the case of authentication error, I can print e.info() and get all
>>> the relevant header information. But I don't want to print.

Then don't.

If you can do `print e.info()`, then you can also do `info = e.info()`
and inspect the info programmatically.
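
Something along these lines, perhaps. This is only a rough sketch: the
names `describe_failure` and `check` are my own invention, and the
decisions it makes (401 versus 404 versus anything else) are guesses at
what your script might care about. The import dance at the top is just
so the same code runs on Python 3, where urllib2 was split up.

```python
try:
    import urllib2                        # Python 2, as in your script
    HTTPError, urlopen = urllib2.HTTPError, urllib2.urlopen
except ImportError:                       # Python 3 moved these names
    from urllib.error import HTTPError
    from urllib.request import urlopen


def describe_failure(code, headers):
    """Turn a status code plus headers into a decision (hypothetical helper)."""
    if code == 401:
        # WWW-Authenticate names the scheme the server wants, e.g. Basic.
        return 'needs auth: %s' % headers.get('WWW-Authenticate', 'unknown scheme')
    if code == 404:
        return 'not found'
    return 'failed with HTTP %d' % code


def check(url):
    """Open url; on HTTPError, inspect e.info() instead of printing it."""
    try:
        urlopen(url)
    except HTTPError as e:        # "as" works from Python 2.6 onwards
        info = e.info()           # the same object you were printing
        return describe_failure(e.code, info)
    return 'ok'
```

You would then call `check('http://myurl.org')` and branch on the
result. Note that this only catches HTTPError; a failure to connect at
all raises URLError instead, which the sketch lets propagate.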

[...]
> Thanks for the response. I experimented some, but I am not even sure
> what kinds of things to try. I mostly tried things like
> E.__getattribute__() or print E.strerror, but nothing seemed to give me
> what I was looking for.

Normally you would look up the documentation for HTTPError and see what
attributes it is documented to have:

http://docs.python.org/library/urllib2.html#urllib2.HTTPError

but unfortunately the docs are rather sparse. In this case, I strongly
recommend the "urllib2 missing manual":

http://www.voidspace.org.uk/python/articles/urllib2.shtml
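
Failing that, you can ask the object itself what it carries, which is
roughly what you were groping for with __getattribute__. A small sketch
follows; `public_attributes` is a hypothetical helper of mine, and the
HTTPError is built by hand (with a plain dict standing in for the real
headers object) purely so the example runs without a network.

```python
try:
    from urllib2 import HTTPError         # Python 2
except ImportError:
    from urllib.error import HTTPError    # Python 3


def public_attributes(obj):
    """Names on obj that do not start with an underscore (hypothetical helper)."""
    return [name for name in dir(obj) if not name.startswith('_')]


# Built by hand so this runs offline; normally urlopen() raises it for you.
# The dict is only a stand-in for the real headers object.
err = HTTPError('http://myurl.org', 401, 'Unauthorized',
                {'WWW-Authenticate': 'Basic realm="example"'}, None)

# The attributes you will most often want to branch on.
for name in ('code', 'msg', 'hdrs'):
    print('%s = %r' % (name, getattr(err, name)))
```

Calling `public_attributes(e)` on a real exception inside your except
block will show the full list of names available to you.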

-- 
Steven