Python 3.2 has some deadly infection

Marko Rauhamaa marko at pacujo.net
Fri Jun 6 13:11:02 EDT 2014


Steven D'Aprano <steve+comp.lang.python at pearwood.info>:

> On Fri, 06 Jun 2014 18:32:39 +0300, Marko Rauhamaa wrote:
>> Unicode, like ASCII, is a code. Representing text in Unicode is
>> encoding.
>
> A Unicode string as an abstract data type has no encoding.

Unicode itself is an encoding. See it in action here:

    72 101 108 108 111 44 32 119 111 114 108 100

> It is a Platonic ideal, a pure form like the real numbers.

Far from it. It is a mapping from symbols to integers. The symbols are
the Platonic ones.

The Unicode/ASCII encoding above represents the same "Platonic" string
as this EBCDIC one:

    200 133 147 147 150 107 64 166 150 153 147 132
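
Both numeric views of one string can be generated in Python. This is
only a minimal sketch: it assumes the EBCDIC variant meant is code page
037 (Python's "cp037" codec), and the variable names are illustrative.

```python
text = "Hello, world"

# Unicode code points; for these characters they coincide with ASCII.
code_points = [ord(ch) for ch in text]

# The same text as EBCDIC values. Assumption: cp037 is the EBCDIC
# code page intended; other EBCDIC variants differ in some positions.
ebcdic = list(text.encode("cp037"))

print(" ".join(str(n) for n in code_points))
print(" ".join(str(n) for n in ebcdic))

# Either sequence maps back to the same abstract string.
assert "".join(chr(n) for n in code_points) == text
assert bytes(ebcdic).decode("cp037") == text
```

The two lists differ number by number, yet both round-trip to the same
text, which is the sense in which each one is a code for it.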

> Unicode string like this:
>
> s = u"NOBODY expects the Spanish Inquisition!"
>
> should not be thought of as a bunch of bytes in some encoding,

Encoding is not tied to bytes or even computers. People can speak in
code, after all.


Marko


