urllib2 documentation

Mike C. Fletcher mcfletch at rogers.com
Sat Oct 2 23:00:00 EDT 2004


Carlos Ribeiro wrote:
...

>Well, I agree that some parts of the library are not very well
>documented. But urllib2 is *so* simple to use for almost all
>situations that it goes almost without the need for documentation. If
>all that you need is to read a URL, do it like this:
>
>urlopen("http://somesite.com/somepage")
>  
>
...

>But
>that's not something that really should worry most users.
>  
>
What you've described, and what the examples demonstrate, however, is 
only the most trivial use case for urllib2, one no different from 
urllib.  The urllib2 module is *intended* to allow you to work within 
esoteric/complex environments, including those requiring http 
authentication, https negotiation and the like. That is, the whole 
*point* of urllib2 is that it handles those situations where urllib 
doesn't provide the needed features. But the documentation never 
explains how to accomplish the things that are beyond urllib. AFAICS, 
all the urllib2 examples demonstrate is passing in a data argument, 
which urllib lets you do as well.

The documentation explains what objects are there, but not how to use 
them.  I can gather I must use an HTTPSHandler object to open an https 
url, but where do I put it?  How do I associate an HTTPBasicAuthHandler 
with a request?  What do I do to give it my authentication parameters? 
(You can figure that out if you think to click on the "password manager" 
explanation link and intuit that you can call add_password on the 
AuthHandler too.)
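
For what it's worth, here is a minimal sketch of the two ways of handing 
credentials to the handler. The realm, URL and credentials are made up for 
illustration, and the try/except import is only an assumption on my part so 
that the same lines also run on Python 3, where these classes live in 
urllib.request.

```python
try:
    import urllib2  # Python 2, the module discussed here
except ImportError:
    # Assumption for portability: Python 3 keeps the same classes here.
    import urllib.request as urllib2

# Hypothetical realm, URL and credentials, for illustration only.
REALM = "Example Realm"
URL = "http://www.example.com/private/"

# Way 1: fill in a password manager, then hand it to the handler.
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
# A realm of None means "use these credentials whatever realm the server names".
password_mgr.add_password(None, URL, "username", "password")
auth_handler = urllib2.HTTPBasicAuthHandler(password_mgr)

# Way 2: let the handler create its own password manager and call
# add_password on the handler itself, which forwards to that manager.
other_handler = urllib2.HTTPBasicAuthHandler()
other_handler.add_password(REALM, URL, "username", "password")
```

Neither handler does anything until it is part of an opener; this only 
covers the "give it my authentication parameters" part of the question.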

Eventually you either figure out, or find sample code that tells you, 
that those rather cryptic build_opener and install_opener functions have 
something to do with making it all work.
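
A rough sketch of what those two functions do, as far as I can tell: 
build_opener chains your handlers with the default ones and returns an 
opener, and install_opener makes that opener the one urlopen uses. The host 
and credentials are placeholders, the network calls are left commented out, 
and the try/except import is again only an assumption so the lines also run 
on Python 3.

```python
try:
    import urllib2  # Python 2, the module discussed here
except ImportError:
    # Assumption for portability: Python 3 keeps the same functions here.
    import urllib.request as urllib2

# Placeholder realm, host and credentials, for illustration only.
auth_handler = urllib2.HTTPBasicAuthHandler()
auth_handler.add_password("realm", "http://www.example.com/",
                          "username", "password")

# build_opener combines the handlers you pass with the default ones.
opener = urllib2.build_opener(auth_handler)

# Option 1: use the opener directly and leave urlopen alone.
#     response = opener.open("http://www.example.com/private/")

# Option 2: install it, so that plain urlopen calls go through it.
urllib2.install_opener(opener)
#     response = urllib2.urlopen("http://www.example.com/private/")

# Inspect which handlers ended up in the chain.
handler_names = [type(h).__name__ for h in opener.handlers]
```

As far as I know, build_opener already includes an HTTPSHandler among its 
defaults when Python was built with SSL support, so in the common case you 
don't have to put one anywhere yourself.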

There's actually sample code in the docstrings of the module which is at 
least as useful as the example in the official docs, btw, which looks 
like this:

    import urllib2

    # set up authentication info
    authinfo = urllib2.HTTPBasicAuthHandler()
    authinfo.add_password('realm', 'host', 'username', 'password')

    proxy_support = urllib2.ProxyHandler({"http" : "http://ahad-haam:3128"})

    # build a new opener that adds authentication and caching FTP handlers
    opener = urllib2.build_opener(proxy_support, authinfo,
                                  urllib2.CacheFTPHandler)

    # install it
    urllib2.install_opener(opener)

    f = urllib2.urlopen('http://www.python.org/')

and is sufficient to get you going.  It explains *where* you put your 
handlers, gives examples of how to use the objects (i.e. it actually 
documents add_password, rather than the internal customisation point 
http_error_401 for the AuthHandler), and gives a concrete example of the 
constructor arguments for the proxy.

So, my suggestion (which is why this is copied to docs@): copy the 
examples from the module docstring into the examples section of the docs.

Anyway, enough ranting by old farts too lazy to write better 
documentation themselves (that's me, btw, though in my defense, these 
days I use Twisted for this kind of stuff),
Mike

________________________________________________
  Mike C. Fletcher
  Designer, VR Plumber, Coder
  http://www.vrplumber.com
  http://blog.vrplumber.com



