[issue2583] urlparse normalize URL path
Senthil
report at bugs.python.org
Fri May 16 05:48:40 CEST 2008
Senthil <orsenthil at users.sourceforge.net> added the comment:
Just try it this way.
>>> print urlparse.urljoin('http://site.com/', 'path/../path/.././path/./')
http://site.com/path/
>>>
The difference is the initial '/' in the second argument.
Read the way a human would, the relative path means:
Go to http://site.com/ and 1) go to the path directory 2) go up one
level (..), which brings you back to site.com 3) go to the path
directory 4) go up one level (..), which again brings you back to
site.com 5) stay in the same directory (.) 6) go to path 7) stay
there (.)
The final result is http://site.com/path/
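The walk above can be sketched as a small stack-based loop. This is only an illustration of the step-by-step reading, not the code urlparse uses; resolve_segments is a hypothetical helper name, and the sketch assumes a relative path with no query or fragment.

```python
def resolve_segments(path):
    """Hypothetical helper: resolve '.' and '..' in a relative path, step by step."""
    stack = []
    for segment in path.split('/'):
        if segment == '..':
            # Go up one level; at the top there is nothing left to drop.
            if stack:
                stack.pop()
        elif segment and segment != '.':
            # An ordinary directory name: go into it.
            stack.append(segment)
        # '.' and empty segments mean "stay where you are".
    resolved = '/'.join(stack)
    # Keep the trailing slash when the input ended with one.
    if stack and path.endswith('/'):
        resolved += '/'
    return resolved

print('http://site.com/' + resolve_segments('path/../path/.././path/./'))
# prints http://site.com/path/
```

Each '..' undoes the directory before it, so only the last 'path' survives.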
When you start the path with a '/', the result is different:
>>> print urlparse.urljoin('http://site.com/', '/path/../path/.././path/./')
http://site.com/path/../path/.././path/./
RFC 1808 gives the following example:
urlparse.urljoin('http://a/b/c/d','/./g') = <URL:http://a/./g>
An argument that starts with '/' is taken as a complete path on the
server, so its '.' and '..' segments are left as they are.
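A minimal sketch of that rule, assuming the second argument is a bare path that starts with '/'. join_absolute_path is a hypothetical name used only for illustration; it is not how urljoin is implemented, it only models the "replace the base path verbatim" behaviour described in RFC 1808.

```python
try:
    from urllib.parse import urlsplit, urlunsplit  # Python 3
except ImportError:
    from urlparse import urlsplit, urlunsplit      # Python 2

def join_absolute_path(base, path):
    """Hypothetical helper: treat a path starting with '/' as the complete path."""
    if not path.startswith('/'):
        raise ValueError("this sketch only covers paths that start with '/'")
    scheme, netloc, _old_path, _query, _fragment = urlsplit(base)
    # The base path is discarded and the new path is used unchanged,
    # so '.' and '..' segments are not resolved.
    return urlunsplit((scheme, netloc, path, '', ''))

print(join_absolute_path('http://a/b/c/d', '/./g'))
# prints http://a/./g
```

Only the scheme and host of the base URL survive; everything after the host comes from the second argument.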
To get the normalized result, pass the path without the leading '/':
>>> print urlparse.urljoin('http://site.com/', 'path/../path/.././path/./')
http://site.com/path/
>>>
This is not a bug and can be closed.
----------
nosy: +orsenthil
__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2583>
__________________________________