[Python-ideas] PEP 540: Add a new UTF-8 mode

Victor Stinner victor.stinner at gmail.com
Thu Jan 5 12:10:04 EST 2017


2017-01-05 17:50 GMT+01:00 Victor Stinner <victor.stinner at gmail.com>:
> In its current shape, my PEP 540 leaves Python default unchanged, but
> adds two modes: UTF-8 and UTF-8 strict. The UTF-8 mode is more or less
> the UNIX mode generalized for all inputs and outputs: mojibake is a
> feature, just pass bytes unchanged. The UTF-8 strict mode is more
> extreme than the current "Strict Unicode mode" since it fails on
> *decoding* data from the operating system.
> (...)
> Now that I have a better view of what we have and what we want, the
> question is if the default behaviour should be changed and if yes,
> how.

A common request is that "Python just works" without having to pass a
command line option or set an environment variable. Maybe the default
behaviour should be left unchanged, but the behaviour with the POSIX
locale should change.

Maybe we can enable the UTF-8 mode (or "UNIX mode") of PEP 540
when the POSIX locale is used?

Said differently, the UTF-8 mode would be enabled not only by -X utf8
and PYTHONUTF8=1, but also by the common LANG=C setting and when Python
is started in an empty environment (no environment variables set).
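
To make the proposal concrete, here is a rough sketch of the decision
logic in pure Python. It is only an illustration, not the actual
implementation: the function names and the dict-based interface are
made up for this example, and the real check would happen in C during
interpreter startup. It assumes the usual POSIX precedence of LC_ALL
over LC_CTYPE over LANG.

```python
def effective_ctype_locale(environ):
    """Return the LC_CTYPE locale name implied by the environment.

    Hypothetical helper: follows the POSIX precedence rule where
    LC_ALL overrides LC_CTYPE, which overrides LANG.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(name)
        if value:  # unset and empty variables are both ignored
            return value
    # Nothing set (for example an empty environment): the C locale applies.
    return "C"


def utf8_mode_enabled(xoptions, environ):
    """Decide whether the proposed UTF-8 mode would be turned on.

    Hypothetical function: `xoptions` is a dict of -X options and
    `environ` is a dict of environment variables.
    """
    if "utf8" in xoptions:                  # python3 -X utf8
        return True
    if environ.get("PYTHONUTF8") == "1":    # PYTHONUTF8=1
        return True
    # New behaviour discussed here: the POSIX locale also enables the mode.
    return effective_ctype_locale(environ) in ("C", "POSIX")
```

With this sketch, utf8_mode_enabled({}, {}) and
utf8_mode_enabled({}, {"LANG": "C"}) both return True, whereas
utf8_mode_enabled({}, {"LANG": "fr_FR.UTF-8"}) returns False, so users
with a configured locale would see no change in behaviour.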

Victor

