[issue44375] urllib.parse.urlparse is not parsing the url properly

Gnanesh report at bugs.python.org
Thu Jun 10 07:52:24 EDT 2021


Gnanesh <gnaneshkunal at outlook.com> added the comment:

Hey neethu,

When a URL has no scheme, it needs to start with "//" for urlparse to recognize the netloc.

Try:
> urlparse('//www.cwi.nl:80')

ParseResult(scheme='', netloc='www.cwi.nl:80', path='', params='', query='', fragment='')


Here's the relevant passage from the docs (https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlparse):
> Following the syntax specifications in RFC 1808, urlparse recognizes a netloc only if it is properly introduced by ‘//’. Otherwise the input is presumed to be a relative URL and thus to start with a path component.
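
To see the difference side by side, here is a minimal sketch. Note that ensure_netloc_prefix is a hypothetical helper written for this comment, not part of urllib; it assumes the input is always meant to start with a host, which will not hold for genuinely relative paths.

```python
from urllib.parse import urlparse

with_slashes = urlparse('//www.cwi.nl:80')
without_slashes = urlparse('www.cwi.nl:80')

# With the leading "//", the host and port end up in netloc.
print(with_slashes.netloc)    # www.cwi.nl:80
print(with_slashes.hostname)  # www.cwi.nl
print(with_slashes.port)      # 80

# Without it, netloc stays empty. How the remaining text is split
# between scheme and path has differed between Python versions.
print(repr(without_slashes.netloc))  # ''


def ensure_netloc_prefix(url):
    """Hypothetical helper: prepend '//' when the input contains no '//'.

    Assumes the caller knows the input starts with a host name.
    """
    return url if '//' in url else '//' + url


print(urlparse(ensure_netloc_prefix('www.cwi.nl:80')).netloc)  # www.cwi.nl:80
```

If the input may already carry a scheme (for example http://www.cwi.nl:80), the helper leaves it untouched.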

----------
nosy: +Gnanesh

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue44375>
_______________________________________

