[issue3991] urllib.request.urlopen does not handle non-ASCII characters
Bill Janssen
report at bugs.python.org
Mon Sep 29 22:47:32 CEST 2008
Bill Janssen <bill.janssen at gmail.com> added the comment:
As I read RFC 2396:

Section 1.5: "A URI is a sequence of characters from a very
limited set, i.e. the letters of the basic Latin alphabet, digits,
and a few special characters."

Section 2.4: "Data must be escaped if it does not have a representation
using an unreserved character; this includes data that does not
correspond to a printable character of the US-ASCII coded character
set, or that corresponds to any US-ASCII character that is disallowed,
as explained below."
So your URL string is invalid. You need to percent-escape the non-ASCII
characters before passing the URL to urlopen.
(RFC 2396 is what the HTTP RFC cites as its authority on URLs.)
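
For illustration, here is a minimal sketch of one way to do that escaping
with urllib.parse before calling urlopen. The helper name escape_url, the
choice of UTF-8 as the byte encoding, and the sets of "safe" characters are
assumptions made for this example, not something the RFC or the library
prescribes; the server has to expect the same encoding. The sketch leaves
the host name alone, so a non-ASCII host would need separate (IDNA)
handling.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def escape_url(url):
    """Percent-escape non-ASCII characters in the path and query of a URL.

    Hypothetical helper for illustration only. Assumes the characters
    should be encoded as UTF-8 (the default for urllib.parse.quote).
    """
    parts = urlsplit(url)
    # Keep "/" as the path separator and "%" so that escapes already
    # present in the input are not escaped a second time.
    path = quote(parts.path, safe="/%")
    # Keep the usual query delimiters intact.
    query = quote(parts.query, safe="=&%+")
    # Scheme, host, and fragment are passed through unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


escaped = escape_url("http://example.com/caf\u00e9?q=na\u00efve")
print(escaped)
```

The escaped string contains only ASCII characters and can then be handed
to urllib.request.urlopen.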
----------
nosy: +janssen
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue3991>
_______________________________________