[issue24968] Python 3 raises Unicode errors with the xxx.UTF-8 locale
Nick Coghlan
report at bugs.python.org
Tue Sep 1 02:05:15 CEST 2015
Nick Coghlan added the comment:
Looking again at the *specific* bug report here, I'm moving the resolution to "out of date", as it's actually the one we addressed in 3.5 by enabling surrogateescape by default on all of the standard streams when the OS claims the locale encoding is ASCII, not just stderr: http://bugs.python.org/issue19977
That allows us to at least correctly roundtrip data, even if the OS has given us bad encoding settings.
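As a rough illustration of what "roundtrip" means here, the following sketch decodes UTF-8 bytes as ASCII with the surrogateescape error handler and then encodes them back. The sample string is just an example, and this uses the codec machinery directly rather than the standard streams themselves.

```python
# Bytes that are valid UTF-8 but not ASCII, e.g. a filename handed over by the OS.
raw = "café".encode("utf-8")

# A strict ASCII decode rejects the non-ASCII bytes outright.
try:
    raw.decode("ascii")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With surrogateescape, each undecodable byte becomes a lone surrogate
# in the range U+DC80..U+DCFF instead of raising an error.
text = raw.decode("ascii", errors="surrogateescape")

# Encoding with the same codec and error handler restores the original bytes.
roundtripped = text.encode("ascii", errors="surrogateescape")

print(strict_failed, ascii(text), roundtripped == raw)
```

The decoded string is not meaningful text (it contains lone surrogates), but no bytes are lost on the way through.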
The problem with forcing UTF-8 more generally when the OS claims ASCII is that it may be the wrong thing to do and result in data corruption, especially on systems using East Asian codecs. Querying /etc/locale.conf [1] instead of relying on the nominal glibc locale settings should reliably give us correct encoding/locale information on modern Linux systems in cases like this one, where SSH has forwarded mismatched locale settings from a client system to a server shell session.
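A minimal sketch of what querying locale.conf could look like, purely for illustration: it is not something CPython does, and the function names (parse_locale_conf, codeset_of, configured_encoding) are hypothetical. It assumes the simple KEY=value form of the file, handles quoting only superficially, and checks LC_CTYPE before LANG.

```python
def parse_locale_conf(text):
    """Parse KEY=value lines, skipping blanks and '#' comments (simplified)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Assumption: only strip surrounding quotes, no shell unescaping.
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def codeset_of(locale_name):
    """Return the codeset part of e.g. 'ja_JP.eucJP' or 'sr_RS.UTF-8@latin'."""
    name = locale_name.split("@", 1)[0]      # drop any @modifier
    _, dot, codeset = name.partition(".")
    return codeset if dot and codeset else None


def configured_encoding(text):
    """Pick the codeset from LC_CTYPE if set, otherwise from LANG."""
    settings = parse_locale_conf(text)
    for key in ("LC_CTYPE", "LANG"):
        value = settings.get(key)
        if value:
            return codeset_of(value)
    return None


sample = 'LANG=ja_JP.eucJP\n# system-wide settings\nLC_CTYPE="en_US.UTF-8"\n'
print(configured_encoding(sample))
```

On a system that has the file, the text passed in would come from reading /etc/locale.conf; a locale name with no codeset (such as plain "C") yields None, leaving the caller to fall back to whatever it would otherwise do.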
Another issue with relevant background discussion is issue #23993, which speculated on extending the "default to surrogateescape" idea to all open() calls when glibc claims the locale encoding is ASCII.
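For comparison, code can already opt in to that behaviour per call by passing errors="surrogateescape" to open() explicitly. The sketch below copies a file containing non-ASCII bytes through an ASCII text layer without losing data; the file names and sample bytes are made up for the example.

```python
import os
import tempfile

raw = b"caf\xc3\xa9\n"  # UTF-8 bytes that a strict ASCII decode would reject

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.txt")
    dst = os.path.join(tmp, "out.txt")

    with open(src, "wb") as f:
        f.write(raw)

    # Read as ASCII text; undecodable bytes become lone surrogates.
    # newline="" disables newline translation so the bytes stay untouched.
    with open(src, encoding="ascii", errors="surrogateescape", newline="") as f:
        text = f.read()

    # Write back with the same codec and error handler.
    with open(dst, "w", encoding="ascii", errors="surrogateescape", newline="") as f:
        f.write(text)

    with open(dst, "rb") as f:
        copied = f.read()

print(copied == raw)
```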
[1] http://www.freedesktop.org/software/systemd/man/locale.conf.html
----------
resolution: not a bug -> out of date
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue24968>
_______________________________________