[issue19451] urlparse accepts invalid hostnames
Terry J. Reedy
report at bugs.python.org
Sat Nov 2 00:39:40 CET 2013
Terry J. Reedy added the comment:
The 3.4 urllib.parse.urlparse doc says "The module has been designed to match the Internet RFC on Relative Uniform Resource Locators. It supports the following URL schemes: <list of 24, including 'file:'>".
To me, 'support' means 'accept every valid URL for the particular scheme' but not necessarily 'reject every URL that is invalid for the particular scheme'.
The other RFC references are these:
"Following the syntax specifications in RFC 1808, urlparse recognizes a netloc only if it is properly introduced by ‘//’." and
"The fragment is now parsed for all URL schemes (unless allow_fragment is false), in accordance with RFC 3986."
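As a rough illustration of the two quoted statements, here is a minimal sketch against urllib.parse in Python 3. The example URLs are made up for illustration, and note that the keyword in the function signature is spelled allow_fragments.

```python
from urllib.parse import urlparse

# Without a leading '//', the host-like text is treated as part of the path.
no_slashes = urlparse("www.example.com/index.html")

# With '//', the same text is recognized as the netloc.
with_slashes = urlparse("//www.example.com/index.html")

# By default the fragment is split off into its own field...
default_frag = urlparse("http://www.example.com/index.html#section")

# ...and it stays in the path when allow_fragments is false.
kept_frag = urlparse(
    "http://www.example.com/index.html#section", allow_fragments=False
)

print(repr(no_slashes.netloc), no_slashes.path)
print(repr(with_slashes.netloc), with_slashes.path)
print(repr(default_frag.fragment), default_frag.path)
print(repr(kept_frag.fragment), kept_frag.path)
```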
I currently see this, at best, as a request to deprecate 'over-acceptance', to be removed in the future. But if there are URLs in the wild that use underscores in hostnames, then practicality says that this should be closed as invalid.
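To make the 'over-acceptance' concrete, here is a small sketch. urlparse returns a hostname containing an underscore without complaint, so any stricter check would have to be layered on top by the caller. The helper is_strict_hostname is hypothetical and not part of urllib.parse; it assumes the usual letters-digits-hyphen rule for hostname labels and is not a complete validator.

```python
import re
from urllib.parse import urlparse

# Hypothetical helper, not part of the standard library.
# Assumption: each label is 1-63 characters of letters, digits, or hyphens,
# and does not start or end with a hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

def is_strict_hostname(host):
    """Return True if every dot-separated label passes the rule above."""
    if not host or len(host) > 253:
        return False
    return all(_LABEL.fullmatch(label) for label in host.split("."))

# urlparse itself accepts the underscore and reports it in the hostname.
parsed = urlparse("http://my_host.example.com/path")
print(parsed.hostname)

# The caller-side check is what rejects it.
print(is_strict_hostname(parsed.hostname))
print(is_strict_hostname("my-host.example.com"))
```

Keeping such a check outside the parser means existing URLs with underscores continue to parse, which is the practicality concern raised above.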
----------
nosy: +terry.reedy
type: behavior -> enhancement
versions: -Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.5
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19451>
_______________________________________