char 128? no... 256

Peter Hansen peter at engcorp.com
Wed Feb 12 13:35:38 EST 2003


Afanasiy wrote:
> 
> All of my devices can display the TradeMark symbol correctly.
> None of them can print the Unicode character 8482. I never use Unicode.
> The TradeMark symbol is being encoded to that Unicode value, 8482.
> I would like to decode that back to what I assume is iso-8859-1.
> However, encoding back to iso-8859-1 only allows characters under 256.

If you haven't caught the nuance yet: from Python's point of view,
Unicode is not an *encoded* form. It is the *decoded* form of any of
the many possible *encodings*, such as Latin-1 or CP1250.

Think of Unicode as the *base* form of the information, and of
anything else, including ASCII, as an *encoding* of it. Then you'll
be more likely to pick the correct function when choosing between
xxx.encode() and xxx.decode().

Maybe a mnemonic would help: "Unicode is *de* Code!"  :-)
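
To make the direction concrete, here is a minimal sketch. It assumes
Python 3, where str holds Unicode text and bytes holds an encoded form
(in the Python of this thread the equivalent types would be unicode and
str). CP1252 is used only as an example of an 8-bit encoding that
happens to contain the trademark sign; whether it is what your devices
actually use is an assumption on my part.

```python
# Assumes Python 3: str is Unicode (decoded) text, bytes is an encoded form.
TRADEMARK = "\u2122"  # TRADE MARK SIGN, code point 8482

code_point = ord(TRADEMARK)

# encode(): Unicode text -> bytes in a specific encoding.
# CP1252 is an example choice; it maps the trademark sign to one byte.
cp1252_bytes = TRADEMARK.encode("cp1252")

# decode(): bytes in a specific encoding -> Unicode text.
round_trip = cp1252_bytes.decode("cp1252")

# ISO-8859-1 (Latin-1) only covers code points below 256, so it has
# no byte for the trademark sign and encoding fails.
try:
    TRADEMARK.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
```

Here cp1252_bytes is the single byte b'\x99', round_trip equals the
original string, and latin1_ok ends up False, which is the "only
characters under 256" limit described in the quoted message.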

-Peter



