[issue19846] Setting LANG=C breaks Python 3 on Linux

Serhiy Storchaka report at bugs.python.org
Mon Dec 9 10:36:51 CET 2013


Serhiy Storchaka added the comment:

> And yet, in Python 2, people could do that, and Python didn't care.
> *That's* the regression I'm worried about. If it hadn't round-tripped
> cleanly in Python 2, I wouldn't care here either.

$ python2.7 -c "print u'\u20ac'"
€
$ LANG=C python2.7 -c "print u'\u20ac'"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 0: ordinal not in range(128)

And even worse:

$ python2.7 -c "print u'\u20ac'" >/dev/null
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 0: ordinal not in range(128)

What a wart!

Other programs can also produce wrong (or even completely nonsensical) output with the C locale.

$ LANG=C ls
???????????? ????????????????                                      ???????????????????? ??????????
?????????? ??????????????                                          ?????????????? ????????????????????????
?????????? ????????                                                ???????????? ??????????????
?????????????? ??????????????????                                  ???????? ????????????????????

Which is better: silently producing corrupted output, or raising an exception? If the former, then let's just set the "replace" or "backslashreplace" error handler for sys.stdout by default.
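
To make the trade-off concrete, here is a minimal Python 3 sketch of what each error handler would do to the euro sign on an ASCII-only stream. The helper name encode_for_stream is hypothetical, and an in-memory io.TextIOWrapper stands in for sys.stdout under the C locale; this illustrates the handlers only and is not a proposed patch.

```python
import io


def encode_for_stream(text, encoding="ascii", errors="strict"):
    """Write text through a TextIOWrapper and return the bytes it produced.

    The wrapper is an in-memory stand-in for sys.stdout (an assumption made
    for illustration); `errors` is the error handler being compared.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    stream.write(text)   # "strict" raises UnicodeEncodeError here
    stream.flush()
    data = raw.getvalue()
    stream.detach()      # keep the wrapper from closing the buffer
    return data


if __name__ == "__main__":
    for handler in ("strict", "replace", "backslashreplace"):
        try:
            print(handler, encode_for_stream("\u20ac", errors=handler))
        except UnicodeEncodeError as exc:
            print(handler, "raised", type(exc).__name__)
```

With "replace" the character is silently turned into "?", much like the ls output above, while "backslashreplace" keeps the information as the escape \u20ac. On Python 3.7 and later a comparable effect for the real stream can be obtained with sys.stdout.reconfigure(errors="backslashreplace").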

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19846>
_______________________________________

