[issue3991] urllib.request.urlopen does not handle non-ASCII characters
Daniel Diniz
report at bugs.python.org
Sun Feb 8 22:50:20 CET 2009
Daniel Diniz <ajaksu at gmail.com> added the comment:
I think Toshio's use case is important enough to deserve a fix (patch
attached) or a special-cased error message. IMO, newbies trying to fix
failures from urlopen may have a hard time navigating the maze of calls:
urlopen -> _opener -> open -> _open -> _call_chain -> http_open ->
do_open (and that's before leaving urllib!).
>>> from urllib.request import urlopen
>>> url = 'http://localhost/ñ.html'
>>> urlopen(url).read()
Traceback (most recent call last):
[...]
UnicodeEncodeError: 'ascii' codec can't encode character '\xf1' in
position 5: ordinal not in range(128)
If the newbie isn't completely lost by then, the obvious next attempt,
quoting the whole URL, fails too:
>>> from urllib.parse import quote
>>> urlopen(quote(url)).read()
Traceback (most recent call last):
[...]
ValueError: unknown url type: http%3A//localhost/%C3%B1.html
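For reference, a minimal sketch of a user-side workaround, assuming only the path component contains non-ASCII characters. This is not the attached patch. The helper name quote_url_path is hypothetical. It splits the URL first, so quote() never sees the ':' after the scheme:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def quote_url_path(url):
    """Percent-encode only the path of `url` (hypothetical helper).

    The scheme, host, query and fragment are passed through unchanged.
    Assumption: the path is not already percent-encoded, otherwise any
    existing '%' would be encoded a second time.
    """
    parts = urlsplit(url)
    # quote() encodes non-ASCII characters as UTF-8 by default and
    # leaves '/' alone, so the path structure is preserved.
    quoted_path = quote(parts.path, safe="/")
    return urlunsplit(
        (parts.scheme, parts.netloc, quoted_path, parts.query, parts.fragment)
    )


print(quote_url_path('http://localhost/ñ.html'))
# http://localhost/%C3%B1.html
```

The result can then be passed to urlopen(). A non-ASCII host name would need separate handling (IDNA encoding), which this sketch does not attempt.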
----------
keywords: +patch
nosy: +ajaksu2
Added file: http://bugs.python.org/file12986/non_ascii_path.diff
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue3991>
_______________________________________