[issue2583] urlparse normalize URL path

Senthil report at bugs.python.org
Fri May 16 05:48:40 CEST 2008


Senthil <orsenthil at users.sourceforge.net> added the comment:

Just try it this way.
>>> print urlparse.urljoin('http://site.com/', 'path/../path/.././path/./')
http://site.com/path/
>>>

The difference is the inital '/' in the second argument.
Human interpretation is:
Go to http://site.com/ and 1) go to path directory 2) go to one-level
above (/../) which results in site.com again 3) go to path directory 4)
go to one-level above (..) (results site.com )5) Stay in the same
directory (.) 6) goto path 7) stay there (.) 
Final result is http://www.site.com/path/

When you start the path with a '/'
>>> print urlparse.urljoin('http://site.com/', '/path/../path/.././path/./')
http://site.com/path/../path/.././path/./

The RFC (1808) suggests the following.
urlparse.urljoin('http://a/b/c/d','/./g') = <URL:http://a/./g>
The argument is taken as a complete path for the server.


The way to use this would be, this way:

>>> print urlparse.urljoin('http://site.com/', 'path/../path/.././path/./')
http://site.com/path/
>>>

This is not a bug and can be closed.

----------
nosy: +orsenthil

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2583>
__________________________________


More information about the Python-bugs-list mailing list