[Python-Dev] PEP 538: Coercing the legacy C locale to a UTF-8 based locale

Fri May 5 00:57:53 EDT 2017

On 5 May 2017 at 02:25, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Thu, 4 May 2017 11:24:27 +0900
> INADA Naoki <songofacandy at gmail.com> wrote:
>> Hi, Nick and all core devs who are interested in this PEP.
>>
>> I'm reviewing PEP 538 and I want to accept it this month.
>> It will greatly reduce the UnicodeError pain that server-side ops face.
>> Thank you Nick for working on this PEP.
>>
>> If anything about this PEP worries you, please post a comment
>> soon.  If you don't have enough time to read the entire PEP, feel free to
>> ask a question about whatever is worrying you.
>
> From my POV, it is problematic that the behaviour outlined in PEP 538
> (see Abstract section) varies depending on the adoption of another PEP
> (PEP 540).
>
> If we want to adopt PEP 538 before pronouncing on PEP 540, then PEP 538
> should remove all points conditional on PEP 540 adoption, and PEP 540
> should later be changed to adopt those removed points as PEP
> 540-specific changes.

While I won't be certain until I update the PEP and reference
implementation, I'm pretty sure Inada-san's suggestion will remove the
current dependency between the PEPs. The suggestion is to replace the
call to Py_SetStandardStreamEncoding with defaulting to surrogateescape
on the standard streams in the C.UTF-8 locale. It should also make the
"C.UTF-8 locale" and "C locale coerced to C.UTF-8" behaviours
indistinguishable at runtime (aside from the stderr warning in the
latter case).
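
As a rough illustration of what surrogateescape on the standard streams
means in practice, here is a minimal sketch. It uses in-memory
io.TextIOWrapper objects as stand-ins for stdin and stdout, so it is not
the reference implementation, and the sample bytes are made up for the
example.

```python
import io

# Bytes that are not valid UTF-8: 0xE9 is "é" in Latin-1.
raw_bytes = b"caf\xe9\n"

# Stand-in for stdin: decode as UTF-8 with the surrogateescape handler.
# newline="" disables newline translation so the bytes pass through as-is.
reader = io.TextIOWrapper(
    io.BytesIO(raw_bytes),
    encoding="utf-8",
    errors="surrogateescape",
    newline="",
)
text = reader.read()  # the undecodable byte becomes the lone surrogate U+DCE9

# Stand-in for stdout: encoding with the same handler restores the byte.
sink = io.BytesIO()
writer = io.TextIOWrapper(
    sink,
    encoding="utf-8",
    errors="surrogateescape",
    newline="",
)
writer.write(text)
writer.flush()
round_tripped = sink.getvalue()

# With the default strict handler the same input is rejected outright.
try:
    raw_bytes.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

The point of the handler is that arbitrary bytes survive a read/write
round trip unchanged instead of raising UnicodeDecodeError.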

It will then be up to Victor to state in PEP 540 how locale coercion
will interact with Python UTF-8 mode (with my recommendation being the
one currently in PEP 538: coercion should implicitly set the environment
variable, so the mode activation is inherited by subprocesses).
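
The reason for going through an environment variable is that child
processes pick it up automatically. A minimal sketch of that inheritance
follows. DEMO_UTF8_MODE is a hypothetical placeholder name used only for
this example, not the variable PEP 540 defines.

```python
import os
import subprocess
import sys

# Hypothetical variable name, chosen only for illustration.
os.environ["DEMO_UTF8_MODE"] = "1"

# The child receives a copy of the parent's environment by default,
# so it sees the setting without being told about it explicitly.
child_code = "import os; print(os.environ.get('DEMO_UTF8_MODE', 'unset'))"
child_output = (
    subprocess.check_output([sys.executable, "-c", child_code])
    .decode("ascii")
    .strip()
)
```

A setting held only in the parent interpreter's internal state would not
propagate this way.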

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia