[issue20271] urllib.parse.urlparse() accepts wrong URLs

Karthikeyan Singaravelan report at bugs.python.org
Wed Mar 27 12:16:05 EDT 2019


Karthikeyan Singaravelan <tir.karthi at gmail.com> added the comment:

See also issue36338 for a possible security issue with a host value such as "benign.com[attacker.com]" (the spam[::1] format): attacker.com is parsed as the host name because the presence of [ and ] is taken to mean an IPv6 address, without validating that the value inside [] is actually a valid IPv6 address.

As a data point, the input "http://[::1]spam" raises an exception in Java, golang and Ruby, and a browser's JS console reports an invalid URL. I too would like an exception to be raised, but I am not sure at which level.

Ruby seems to use a regex: https://github.com/ruby/ruby/blob/trunk/lib/uri/rfc3986_parser.rb#L6
Java parseURL: http://hg.openjdk.java.net/jdk/jdk/file/c4c225b49c5f/src/java.base/share/classes/java/net/URLStreamHandler.java#l124
golang: https://github.com/golang/go/blob/50bd1c4d4eb4fac8ddeb5f063c099daccfb71b26/src/net/url/url.go#L587

See also https://url.spec.whatwg.org/#host-parsing

If input starts with U+005B ([), then:

    If input does not end with U+005D (]), validation error, return failure.

    Return the result of IPv6 parsing input with its leading U+005B ([) and trailing U+005D (]) removed.
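
For illustration, the two quoted steps could be sketched in Python roughly as below. This is only a sketch and not a proposed patch: parse_bracketed_host is a hypothetical helper name, and ipaddress.IPv6Address from the standard library stands in for the spec's IPv6 parser, which may not accept exactly the same set of inputs.

```python
import ipaddress


def parse_bracketed_host(host):
    """Hypothetical helper: validate a host of the form "[...]".

    Returns an ipaddress.IPv6Address on success and raises ValueError
    otherwise (AddressValueError is a subclass of ValueError).
    """
    if not host.startswith("["):
        # The quoted steps only apply to input that starts with "[".
        raise ValueError("host does not start with '[': %r" % (host,))
    if not host.endswith("]"):
        # "If input does not end with U+005D (]), validation error".
        raise ValueError("bracketed host does not end with ']': %r" % (host,))
    # Strip the leading "[" and trailing "]" and validate what is inside.
    return ipaddress.IPv6Address(host[1:-1])
```

With this check, "[::1]" is accepted, while "[::1]spam", "[attacker.com]" and "benign.com[attacker.com]" are all rejected with ValueError.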

----------
nosy: +xtreak
versions: +Python 3.6, Python 3.7, Python 3.8

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue20271>
_______________________________________

