[issue19977] Use "surrogateescape" error handler for sys.stdin and sys.stdout on UNIX for the C locale

STINNER Victor report at bugs.python.org
Mon Apr 28 01:57:39 CEST 2014


STINNER Victor added the comment:

> We should not overcomplicate this. I suggest that we simply use utf-8 under the C locale.

Please open a new issue if you would prefer UTF-8. You will have to solve various technical issues; I tried to list some of them in issues #19846 and #19847.

In short, you should always decode and encode "OS data" with the same encoding. Python's "file system encoding" is the locale encoding because in some places PyUnicode_DecodeLocale[AndSize]() is used (e.g. to decode the PYTHONWARNINGS environment variable). Another common place is PyUnicode_DecodeFSDefaultAndSize(), when it is called before the Python codec is loaded. See also the _Py_wchar2char() and _Py_char2wchar() functions, which use the locale encoding and are used in many places.
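
To illustrate the "same encoding" rule at the Python level, here is a minimal sketch. The helper name roundtrip() and the sample bytes are made up for this example; this is not the code used inside the interpreter. It only shows that undecodable bytes survive a decode/encode round trip with "surrogateescape" when both directions use the same codec, and that mixing codecs changes the bytes.

```python
def roundtrip(raw: bytes, decode_enc: str, encode_enc: str) -> bytes:
    """Decode raw OS bytes to str, then encode them back (hypothetical helper)."""
    # surrogateescape maps each undecodable byte 0xNN to the lone
    # surrogate U+DCNN, and maps it back to 0xNN when encoding.
    text = raw.decode(decode_enc, "surrogateescape")
    return text.encode(encode_enc, "surrogateescape")


# Sample data (assumption): a Latin-1 encoded name, invalid as ASCII and UTF-8.
raw = b"caf\xe9"

# Same encoding in both directions: the original bytes are preserved.
same_ascii = roundtrip(raw, "ascii", "ascii")
same_utf8 = roundtrip(raw, "utf-8", "utf-8")

# Different encodings: 0xE9 is decoded as U+00E9 by Latin-1 and then
# re-encoded as two bytes by UTF-8, so the data no longer matches.
mixed = roundtrip(raw, "latin-1", "utf-8")

print(same_ascii == raw, same_utf8 == raw, mixed == raw)
```

With mixed encodings, a file name read from the OS could no longer be passed back to the OS unchanged, which is the kind of problem a switch to UTF-8 under the C locale would have to address.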

I'm now closing the issue because the initial point (use the surrogateescape error handler) is implemented in Python 3.5, and backporting such a major change to the Python 3.4 branch is risky right now.

----------
resolution:  -> fixed
status: open -> closed

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19977>
_______________________________________

