Everything you did not want to know about Unicode in Python 3

Marko Rauhamaa marko at pacujo.net
Tue May 13 05:59:28 EDT 2014


Johannes Bauer <dfnsonfsduifb at gmx.de>:

> The only people who are angered by this now is people who always
> treated encodings sloppily and it "just worked". Well, there's a good
> chance it has worked by pure chance so far. It's a good thing that
> Python does this now more strictly as it gives developers *guarantees*
> about what they can and cannot do with text datatypes without having
> to deal with encoding issues in many places. Just one place: The
> interface where text is read or written, just as it should be.

I'm not angered by text. I'm just wondering if it has any practical use
that is not misuse...

For example, Py3 should not make any pretense that there is a "default"
encoding for strings. Locales are an abhorrent invention from the early
8-bit days. IOW, you should never input or output text without explicit
serialization.
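
A minimal sketch of what "explicit serialization" could look like in
practice, assuming UTF-8 as the agreed wire format. The helper names
write_text and read_text are hypothetical, not part of any library; the
point is only that the stream carries bytes and the encoding is named at
every boundary instead of being taken from the locale.

```python
import io


def write_text(stream, text, encoding="utf-8"):
    """Encode text explicitly and write the resulting bytes to a binary stream."""
    stream.write(text.encode(encoding))


def read_text(stream, encoding="utf-8"):
    """Read all bytes from a binary stream and decode them explicitly."""
    return stream.read().decode(encoding)


# A BytesIO stands in for a file opened in binary mode ("wb" / "rb").
buf = io.BytesIO()
write_text(buf, "na\u00efve caf\u00e9")
raw = buf.getvalue()      # what actually exists outside the VM: bytes

buf.seek(0)
roundtrip = read_text(buf)
```

The same effect is available with the built-in open() by always passing
encoding= explicitly rather than relying on the locale-derived default.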

I get the feeling that Py3 would like to present a world where strings
are first-class I/O objects that can exist in files, in filenames,
inside pipes. You say, "text is read or written." I'm saying text is
never read or written. It only exists as an abstraction (not even
Unicode) inside the virtual machine.
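
To illustrate the filename case, here is a small sketch under the
assumption of a POSIX-style system where a filename is an arbitrary byte
string. The byte string below is an invented example that is valid
Latin-1 but not valid UTF-8. Python can carry it through a str only by
smuggling the offending byte in as a lone surrogate via the
"surrogateescape" error handler; the resulting str is not encodable as
strict UTF-8.

```python
# Hypothetical filename as it might exist on disk: not valid UTF-8.
raw_name = b"caf\xe9.txt"

# Decoding with surrogateescape maps the undecodable byte 0xE9 to U+DCE9.
name = raw_name.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
back = name.encode("utf-8", errors="surrogateescape")

# A strict encode fails, because the str contains a lone surrogate.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

os.fsdecode() and os.fsencode() apply the same round trip on POSIX
systems, using the filesystem encoding instead of a hard-coded UTF-8.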


Marko


