[Python-Dev] PEP 540: Add a new UTF-8 mode (v3)

Victor Stinner victor.stinner at gmail.com
Sun Dec 10 12:21:50 EST 2017


OK, I fixed the description of the effects of locale coercion (PEP 538).
Does it look good to you now, Naoki?

https://www.python.org/dev/peps/pep-0540/#relationship-with-the-locale-coercion-pep-538

The commit:

https://github.com/python/peps/commit/71cda51fbb622ece63f7a9d3c8fa6cd33ce06b58

diff --git a/pep-0540.txt b/pep-0540.txt
index 0a9cbc1e..c163916d 100644
--- a/pep-0540.txt
+++ b/pep-0540.txt
@@ -144,9 +144,15 @@ The POSIX locale enables the locale coercion (PEP 538) and the UTF-8
 mode (PEP 540). When the locale coercion is enabled, enabling the UTF-8
 mode has no (additional) effect.

-Locale coercion only impacts non-Python code like C libraries, whereas
-the Python UTF-8 Mode only impacts Python code: the two PEPs are
-complementary.
+The UTF-8 Mode has the same effects as locale coercion:
+``sys.getfilesystemencoding()`` returns ``'UTF-8'``,
+``locale.getpreferredencoding()`` returns ``'UTF-8'``, and the
+``sys.stdin`` and ``sys.stdout`` error handlers are set to
+``surrogateescape``. These changes only affect Python code. But locale
+coercion has additional effects: the ``LC_CTYPE`` environment variable
+and the ``LC_CTYPE`` locale are set to a UTF-8 locale like ``C.UTF-8``.
+As a side effect, non-Python code is also impacted by locale coercion.
+The two PEPs are complementary.

 On platforms where locale coercion is not supported like CentOS 7, the
 POSIX locale only enables the UTF-8 Mode. In this case, Python code uses
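
To compare the two mechanisms on a given machine, the values named in the
new paragraph can be printed from a small script. This is only a minimal
sketch, assuming an interpreter that implements both PEPs; the function
name ``encoding_report`` and the dictionary keys are hypothetical names
chosen for illustration, not part of either PEP.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the settings that the PEP text lists as affected.

    The key names are illustrative only (an assumption of this sketch).
    """
    return {
        # Settings visible to Python code.
        "filesystem_encoding": sys.getfilesystemencoding(),
        # False: do not temporarily switch the LC_CTYPE locale while asking.
        "preferred_encoding": locale.getpreferredencoding(False),
        # getattr() guards against sys.stdin/sys.stdout being None.
        "stdin_errors": getattr(sys.stdin, "errors", None),
        "stdout_errors": getattr(sys.stdout, "errors", None),
        # Settings that non-Python code (C libraries) can also observe.
        "LC_CTYPE_env": os.environ.get("LC_CTYPE"),
        # Passing None only queries the current LC_CTYPE locale.
        "LC_CTYPE_locale": locale.setlocale(locale.LC_CTYPE, None),
    }


if __name__ == "__main__":
    for name, value in encoding_report().items():
        print(f"{name}: {value!r}")
```

Running it once with ``LC_ALL=C`` and once with ``-X utf8`` should make
the difference visible: per the text above, only locale coercion is
expected to change the two ``LC_CTYPE`` entries.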

Victor


2017-12-10 5:47 GMT+01:00 INADA Naoki <songofacandy at gmail.com>:
> Now I'm OK to accept the PEP, except for one nitpick.
>
>>
>> Locale coercion only impacts non-Python code like C libraries, whereas
>> the Python UTF-8 Mode only impacts Python code: the two PEPs are
>> complementary.
>>
>
> This sentence seems a bit misleading.
> If UTF-8 mode is disabled explicitly, locale coercion affects Python code too:
> locale.getpreferredencoding() is UTF-8, open()'s default encoding is UTF-8,
> and stdio is UTF-8/surrogateescape.
>
> So shouldn't this sentence be: "Locale coercion impacts both Python code
> and non-Python code like C libraries, whereas ..."?
>
> INADA Naoki  <songofacandy at gmail.com>

