[issue19977] Use "surrogateescape" error handler for sys.stdin and sys.stdout on UNIX for the C locale

STINNER Victor report at bugs.python.org
Fri Dec 13 18:03:54 CET 2013


STINNER Victor added the comment:

test_ls.py: a test script that creates files with invalid (undecodable) names and then tries to print those names to stdout.

Output with UTF-8 locale, UTF-8 terminal and Python 3.3 (unpatched 3.4 behaves the same):

ascii.txt
<UnicodeError 'invalid_utf8:\udcff.txt'>
<UnicodeError 'latin1:\udce9.txt'>
utf8:é€.txt

Output with C locale (ASCII), UTF-8 terminal and Python 3.3:

ascii.txt
<UnicodeError 'invalid_utf8:\udcff.txt'>
<UnicodeError 'latin1:\udce9.txt'>
<UnicodeError 'utf8:\udcc3\udca9\udce2\udc82\udcac.txt'>

Output with C locale (ASCII), UTF-8 terminal and patched Python 3.4:

ascii.txt
invalid_utf8:�.txt
latin1:�.txt
utf8:é€.txt

With the patch, you no longer get a Unicode error with LANG=C: the undecodable bytes are written back to stdout unchanged, so you get mojibake instead.
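
The three outputs above can be reproduced without touching the filesystem. Here is a minimal sketch, not the attached test_ls.py: the names RAW_NAMES, decode_filename, render and simulate are made up for illustration. It assumes filenames are decoded with the locale encoding plus "surrogateescape" (as os.fsdecode does) and that stdout uses the same encoding.

```python
# Hypothetical sketch, not the attached test_ls.py.
# Raw filenames as the kernel would return them (bytes).
RAW_NAMES = [
    b"ascii.txt",
    b"invalid_utf8:\xff.txt",
    b"latin1:\xe9.txt",
    "utf8:\xe9\u20ac.txt".encode("utf-8"),  # "utf8:é€.txt"
]


def decode_filename(raw, fs_encoding):
    """Decode like os.fsdecode: undecodable bytes become lone surrogates."""
    return raw.decode(fs_encoding, "surrogateescape")


def render(name, stdout_encoding, errors):
    """Return the bytes stdout would write, or a marker if encoding fails."""
    try:
        return name.encode(stdout_encoding, errors)
    except UnicodeError:
        return ("<UnicodeError %a>" % name).encode("ascii")


def simulate(encoding, stdout_errors):
    """Assume one locale encoding for both filenames and stdout."""
    return [
        render(decode_filename(raw, encoding), encoding, stdout_errors)
        for raw in RAW_NAMES
    ]


if __name__ == "__main__":
    cases = [
        ("UTF-8 locale, strict stdout", "utf-8", "strict"),
        ("C locale, strict stdout", "ascii", "strict"),
        ("C locale, surrogateescape stdout", "ascii", "surrogateescape"),
    ]
    for label, encoding, errors in cases:
        print(label)
        for line in simulate(encoding, errors):
            print("   ", line)
```

In the last case the surrogates are encoded back to the original bytes, so the byte strings equal RAW_NAMES. A UTF-8 terminal then shows utf8:é€.txt correctly and a replacement character for the lone 0xff and 0xe9 bytes.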

----------
Added file: http://bugs.python.org/file33124/test_ls.py

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19977>
_______________________________________

