[issue30338] LC_ALL=en_US + io.open() => LookupError: (osx)
Ronald Oussoren
report at bugs.python.org
Thu May 11 12:43:23 EDT 2017
Ronald Oussoren added the comment:
On macOS 10.12:
ronald$ LC_ALL=en_US python2.7 -c 'import locale; print(repr(locale.getpreferredencoding()))'
''
ronald$ LC_ALL=en_US python3.6 -c 'import locale; print(repr(locale.getpreferredencoding()))'
'UTF-8'
getpreferredencoding uses the CODESET path on macOS, which means the result above is explained by this session (python 2.7):
>>> import locale
>>> locale.setlocale(locale.LC_ALL, '')
'en_US'
>>> locale.nl_langinfo(locale.CODESET)
''
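The empty string matters because an encoding name has to be resolved through the codec registry before it can be used, and an empty name presumably cannot be resolved. A minimal sketch of that failure mode, independent of macOS (the helper name is_known_encoding is made up for illustration and is not part of any library):

```python
import codecs


def is_known_encoding(name):
    """Return True if the codec registry can resolve `name`.

    Hypothetical helper, for illustration only.
    """
    try:
        codecs.lookup(name)
    except LookupError:
        # codecs.lookup raises LookupError for unknown encoding names.
        return False
    return True


print(is_known_encoding('UTF-8'))  # True
print(is_known_encoding(''))       # False
```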
Note that _pyio uses locale.getpreferredencoding(), not locale.getpreferredencoding(False). The latter would use US-ASCII as the encoding:
>>> import locale
>>> locale.nl_langinfo(locale.CODESET)
'US-ASCII'
I guess the empty string for the encoding is explained by the following shell session that looks at the locale information:
$ LC_ALL=en_US.UTF-8 locale -ck LC_ALL | grep charmap
charmap="UTF-8"
$ LC_ALL=en_US locale -ck LC_ALL | grep charmap
charmap=
In python3 locale.getpreferredencoding (or rather, the same function in _bootlocale) was tweaked to deal with this problem:
if not result and sys.platform == 'darwin':
    # nl_langinfo can return an empty string
    # when the setting has an invalid value.
    # Default to UTF-8 in that case because
    # UTF-8 is the default charset on OSX and
    # returning nothing will crash the
    # interpreter.
    result = 'UTF-8'
Backporting this to 2.7 would IMHO be the best way to fix this issue.
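
As a rough sketch of what such a fallback amounts to, the check can be written as a small standalone function. This is not the actual 2.7 patch; the name codeset_with_fallback and its platform parameter are assumptions introduced here so the logic can be exercised on any system:

```python
import sys


def codeset_with_fallback(codeset, platform=sys.platform):
    """Return `codeset`, substituting 'UTF-8' for an empty value on macOS.

    Hypothetical helper mirroring the darwin check quoted above;
    `platform` is a parameter only so the behaviour can be tested
    without running on macOS.
    """
    if not codeset and platform == 'darwin':
        # nl_langinfo(CODESET) can come back empty for locales such
        # as en_US that do not name a charmap.
        return 'UTF-8'
    return codeset


print(codeset_with_fallback('', 'darwin'))          # UTF-8
print(codeset_with_fallback('US-ASCII', 'darwin'))  # US-ASCII
```

On a system that provides nl_langinfo, it would be called as codeset_with_fallback(locale.nl_langinfo(locale.CODESET)) after locale.setlocale(locale.LC_ALL, '').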
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue30338>
_______________________________________