is there a bug in urlunparse/urlunsplit
Rob Williscroft
rtw at freenet.co.uk
Sun May 18 16:37:19 EDT 2008
Alex wrote in news:09764c57-03ce-4ccb-a26d-e851899dcc7c at a23g2000hsc.googlegroups.com in comp.lang.python:
> Hi all.
>
> Is there a bug in the urlunparse/urlunsplit functions?
> Look at this fragment (I know it is quite silly):
>
> urlunparse(urlparse('www.example.org','http'))
> ---> 'http:///www.example.org'
> ^^^^^
Try these 3:
urlparse('www.example.org','http')
urlparse('http://www.example.org','http')
urlparse('//www.example.org','http')
The 1st returns www.example.org as the path part;
with the other 2 it's the location (domain) part.
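A quick way to see the difference is to print the netloc and path of each result. This is only a sketch: it assumes Python 3, where the function lives in urllib.parse (in the Python 2 of this thread it was the urlparse module), and the variable names are made up for illustration.

```python
# Assumes Python 3; in Python 2 use "from urlparse import urlparse".
from urllib.parse import urlparse

# Hypothetical names for the three calls above.
bare = urlparse('www.example.org', 'http')            # no '//' prefix
full = urlparse('http://www.example.org', 'http')     # scheme and '//'
slashes = urlparse('//www.example.org', 'http')       # '//' only

for result in (bare, full, slashes):
    # The default scheme 'http' is applied in all three cases;
    # only where the host name ends up differs.
    print(result.scheme, repr(result.netloc), repr(result.path))

# bare:    netloc is '' and path is 'www.example.org'
# full:    netloc is 'www.example.org' and path is ''
# slashes: netloc is 'www.example.org' and path is ''
```

Without a leading '//' there is nothing to mark the string as a network location, so it is treated as a relative path.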
Although it may not be immediately obvious that the result
is correct, consider the following HTML fragment:
<img src="aaa.gif">
<img src="http://anothersite.com/bbb.gif">
If you were to use urlparse to parse the src attributes
you would want:
( '', '', 'aaa.gif', '','','' )
( 'http', 'anothersite.com', '/bbb.gif', '','','' )
Which AIUI is what urlparse does.
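The two src values can be checked directly. Again a sketch assuming Python 3's urllib.parse; the variable names are illustrative only.

```python
# Assumes Python 3; in Python 2 use "from urlparse import urlparse".
from urllib.parse import urlparse

# Convert the results to plain 6-tuples:
# (scheme, netloc, path, params, query, fragment)
relative_src = tuple(urlparse('aaa.gif'))
absolute_src = tuple(urlparse('http://anothersite.com/bbb.gif'))

print(relative_src)  # ('', '', 'aaa.gif', '', '', '')
print(absolute_src)  # ('http', 'anothersite.com', '/bbb.gif', '', '', '')
```

The relative reference keeps its text in the path slot, which is the same thing that happens to 'www.example.org' when it is passed without '//'.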
Rob.
--
http://www.victim-prime.dsl.pipex.com/