[Python-Dev] urllib2 doesn't grok URLs w/ user/passwd

Tue Dec 30 14:28:52 EST 2003

> On Tuesday 30 December 2003 04:03 pm, Skip Montanaro wrote:
> > SF seems to be down for some unscheduled reason.  Posting here just so I
> > don't completely forget about it should I exit my web browser before SF is
> > back up...
> >
> > urllib2.urlopen("http://foo@www.python.org/") fails (at least in part)
> > because it fails to separate the username and password from the hostname.
> > Trying to open http://foo:bar@www.python.org/ reveals other shortcomings in
> > its url parsing.  It seems to me the syntactic bits shouldn't be difficult
> > to resolve using urllib.splituser().  I'm much less clear what to do with
> > the username and password once they've been separated from the hostname.
> 
> Presumably they need to be kept somewhere and sent in the Authorization
> header in case the server returns a 401 error and challenge (or a proxy 
> returns a 407 error and challenge) -- or maybe the Authorization header
> (with the base 64 encoding of user:pass) can be sent even as part of the
> first request to speed things up (assuming an authorization scheme of
> Basic).

This is what the ever-popular old urllib does.
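
A minimal sketch of that "send it on the first request" approach, for
illustration only. It is written against the modern urllib.parse module
rather than the urllib/urllib2 of this thread, and the helper name
basic_auth_from_url is hypothetical, not something either module provides.
It splits the user:password part off the network location and builds the
Basic credentials value described above.

```python
import base64
from urllib.parse import urlsplit, unquote


def basic_auth_from_url(url):
    """Return (url_without_userinfo, authorization_value_or_None).

    Hypothetical helper: strips "user:password@" from the URL and, if
    credentials were present, builds a Basic Authorization header value.
    """
    parts = urlsplit(url)
    # Everything after the last "@" is the host[:port] portion.
    host = parts.netloc.rpartition("@")[2]
    clean_url = parts._replace(netloc=host).geturl()

    if parts.username is None:
        return clean_url, None

    user = unquote(parts.username)
    password = unquote(parts.password or "")  # a missing password becomes ""
    raw = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return clean_url, "Basic " + token


clean_url, auth = basic_auth_from_url("http://foo:bar@www.python.org/")
headers = {}
if auth is not None:
    # Sent up front, without waiting for a 401 challenge.
    headers["Authorization"] = auth
```

Sending the header unconditionally saves a round trip, but it assumes the
server wants Basic and it hands the password to any host named in the URL.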

> RFC 2617, I believe.  urllib2's architecture delegates authorization
> to separate components, of course, so I guess the userid and password
> should just be handed over to such components if they're present, but I
> haven't looked into that in detail.

Me neither.
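
For reference, one way the "hand it over to the auth components" idea might
look. This is a sketch under assumptions, not what urllib2 does: it uses the
present-day urllib.request names (the password manager and Basic auth handler
classes carry the same names in urllib2), and opener_for is a hypothetical
helper. The credentials are only used if the server answers with a 401
challenge.

```python
import urllib.request
from urllib.parse import urlsplit, unquote


def opener_for(url):
    """Return (opener, url_without_userinfo, password_manager).

    Hypothetical helper: moves any user:password found in the URL into a
    password manager and lets HTTPBasicAuthHandler answer 401 challenges.
    """
    parts = urlsplit(url)
    host = parts.netloc.rpartition("@")[2]  # drop "user:password@"
    clean_url = parts._replace(netloc=host).geturl()

    mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    if parts.username is not None:
        # realm=None means "use these credentials for any realm at this URL".
        mgr.add_password(None, clean_url,
                         unquote(parts.username),
                         unquote(parts.password or ""))

    handler = urllib.request.HTTPBasicAuthHandler(mgr)
    return urllib.request.build_opener(handler), clean_url, mgr


opener, clean_url, mgr = opener_for("http://foo:bar@www.python.org/")
# opener.open(clean_url) would perform the request; it is not called here.
```

A proxy's 407 challenge would need a ProxyBasicAuthHandler with its own
password manager; that is left out of the sketch.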

--Guido van Rossum (home page: http://www.python.org/~guido/)