Python 3.2 has some deadly infection

Ethan Furman ethan at stoneleaf.us
Fri Jun 6 09:24:35 EDT 2014


On 06/05/2014 11:30 AM, Marko Rauhamaa wrote:
>
> How text is represented is very different from whether text is a
> fundamental data type. A fundamental text file is such that ordinary
> operating system facilities can't see inside the black box (that is,
> they are *not* encoded as far as the applications go).

Of course they are.  It may be an ASCII-encoding of some flavor or 
other, or something really (to me) strange -- but an encoding is most 
assuredly in effect.

ASCII is *not* the state of "this string has no encoding" -- that would 
be Unicode; a Unicode string, as a data type, has no encoding.  To 
transport it, store it, etc., it must (usually?) be encoded into 
something -- UTF-8, ASCII, a Turkish encoding, or whatever subset is agreed upon 
and will hopefully contain all the Unicode characters needed for the 
string to be properly represented.
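
A minimal sketch of that distinction in Python 3. The sample string and the choice of codecs are my own, for illustration only; "iso8859_9" (Latin-5) stands in here for a Turkish encoding:

```python
# A str is a sequence of Unicode code points; it carries no encoding.
text = "İstanbul café"

# Encoding turns it into bytes for storage or transport.
utf8_bytes = text.encode("utf-8")
turkish_bytes = text.encode("iso8859_9")  # Latin-5, covers Turkish letters

# The same text yields different byte sequences under different encodings.
print(utf8_bytes)
print(turkish_bytes)

# ASCII is a much smaller subset and cannot represent this string.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)
```

Both encoded forms decode back to the same str, provided the reader uses the encoding the writer chose; the ASCII attempt fails because the agreed-upon subset does not contain all the characters needed.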

The realization that ASCII was, in fact, an encoding was a big paradigm 
shift for me, but a necessary one.
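
One way to see this, sketched in Python 3 with byte values I picked for the example: bytes are not text until some encoding is applied, and ASCII is just one such choice, defined only for values 0-127.

```python
raw = b"caf\xe9"  # four bytes; not text until an encoding is applied

# The last byte maps to a different character under each encoding.
as_latin1 = raw.decode("latin-1")    # 0xE9 is 'é' here
as_greek = raw.decode("iso8859_7")   # 0xE9 is a Greek letter here

# Under ASCII the byte 0xE9 is not defined at all.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Pure 7-bit data decodes fine, but a decoding step still takes place.
plain = b"hello".decode("ascii")

print(as_latin1 == as_greek, ascii_ok, plain)
```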

--
~Ethan~



