[issue37198] _parse_localename fail to parse 'en_IL'

Eryk Sun report at bugs.python.org
Wed Jun 12 23:46:02 EDT 2019


Eryk Sun <eryksun at gmail.com> added the comment:

Windows prefers locale names based on RFC 4646 language tags, which use a hyphen instead of an underscore (e.g. "en-GB"). This name format doesn't include an encoding. The C runtime (not the Windows API) makes one exception, accepting a ".utf8" or ".utf-8" suffix (e.g. "en-GB.utf8"). If UTF-8 is not specified, setlocale implicitly uses the ANSI code page of the given locale (not to be confused with the ANSI code page of the system locale).

As noted in this issue, _parse_localename currently fails if the name has no encoding. hodai's PR 14027 addresses this case by looking for an underscore in the name. On Windows, it should also look for a hyphen. Also, instead of using None for the encoding in this case, on Windows we can look it up via ___lc_codepage_func [1]. This could be exposed as _locale._lc_codepage_func.
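To illustrate the parsing side, here is a minimal sketch of a parser that accepts either separator and an optional encoding. It is not the code in locale.py or in PR 14027; the name parse_locale_name, the regular expression, and the choice to normalize the hyphen to an underscore are all assumptions made for the example.

```python
import re

# Hypothetical pattern, for illustration only: a language code, an optional
# region after "_" or "-", and an optional ".encoding" suffix.
_NAME_RE = re.compile(
    r"^(?P<lang>[A-Za-z]{2,3})"
    r"(?:[-_](?P<region>[A-Za-z]{2}|\d{3}))?"
    r"(?:\.(?P<enc>[\w-]+))?$"
)


def parse_locale_name(name):
    """Split a locale name into (language_code, encoding).

    The encoding is None when the name does not carry one. The hyphen form
    is normalized to the underscore form (an assumption of this sketch).
    """
    m = _NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unknown locale: {name!r}")
    lang, region, enc = m.group("lang", "region", "enc")
    code = lang if region is None else f"{lang}_{region}"
    return code, enc
```

With this sketch, "en_IL" and "en-IL" both parse to the code "en_IL" with no encoding, and "en-IL.utf8" yields the encoding "utf8".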

[1] https://docs.microsoft.com/en-us/cpp/c-runtime-library/lc-codepage-func?view=vs-2019
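Until something like _locale._lc_codepage_func exists, the lookup can be tried from Python with ctypes. This is only a rough sketch: it assumes the process uses the Universal CRT and that ucrtbase.dll exports ___lc_codepage_func, and the helper name current_lc_codepage is made up for the example. On other platforms it just returns None.

```python
import ctypes
import sys


def current_lc_codepage():
    """Return the code page of the CRT's current locale, or None off Windows."""
    if sys.platform != "win32":
        return None
    # Assumption: the Universal CRT (ucrtbase.dll) is the C runtime in use.
    ucrt = ctypes.CDLL("ucrtbase")
    func = getattr(ucrt, "___lc_codepage_func")
    func.restype = ctypes.c_uint  # declared as returning UINT in [1]
    func.argtypes = []
    return func()
```

A caller could map the result to a codec name such as "cp%d" % codepage, though special values would need handling in real code.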

----------
nosy: +eryksun

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue37198>
_______________________________________