How to get an encoding a value?

Diez B. Roggisch deetsNOSPAM at web.de
Sat Oct 23 09:02:20 EDT 2004


> You mix up characters and glyphs which makes it confusing.
> There are no numeric values associated with glyphs in Unicode, but there
> are numeric values associated with abstract characters.
> (http://www.unicode.org/standard/WhatIsUnicode.html)


> Unicode provides a unique number for every character, no matter what the
> platform, no matter what the program, no matter what the language.
> 
> These numbers are called `code points'. (It says `unique' above, but later
> they relax that).
> 
> But you are right regarding the encodings. The Unicode code points can be
> encoded in different ways e.g. with the UTF-8 encoding.
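
To make the quoted point concrete, here is a small sketch of the difference between a code point and its encodings. It assumes Python 3 syntax, where str is a sequence of code points; the sample character and the three codecs are just picked for illustration.

```python
# Sketch (assumes Python 3): a code point is a number,
# an encoding turns that number into a byte sequence.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

code_point = ord(ch)                   # the number Unicode assigns to the character
utf8_bytes = ch.encode("utf-8")        # two bytes in UTF-8
utf16_bytes = ch.encode("utf-16-be")   # two bytes in big-endian UTF-16, no BOM
latin1_bytes = ch.encode("latin-1")    # one byte in Latin-1

print(hex(code_point))   # 0xe9
print(utf8_bytes)        # b'\xc3\xa9'
print(utf16_bytes)       # b'\x00\xe9'
print(latin1_bytes)      # b'\xe9'

# Decoding with the matching codec gives back the same code point.
assert utf8_bytes.decode("utf-8") == ch
```

The code point stays the same (0xe9) in all three cases; only the bytes differ.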

Just checked - yup, you're right: a character might in fact be composed of
several glyphs. So the two are closely related (especially in the common
Western languages), but not the same.
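
A related effect can be seen at the code point level. The sketch below (again assuming Python 3, with the standard unicodedata module) only shows that one visible character can be stored as one or as two code points; how many glyphs a font actually draws for it is a rendering matter and is not something this code can show.

```python
import unicodedata

# Sketch (assumes Python 3): the same visible character, stored two ways.
composed = "\u00e9"      # one code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # two code points: 'e' + COMBINING ACUTE ACCENT

print(len(composed), len(decomposed))   # 1 2
print(composed == decomposed)           # False

# Normalization converts between the two forms.
print(unicodedata.normalize("NFC", decomposed) == composed)   # True
print(unicodedata.normalize("NFD", composed) == decomposed)   # True
```

So comparing strings without normalizing them first can give surprising results even when they look identical on screen.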

Sheesh, that stuff is always a bit more complicated than one actually thinks
- I usually get the application side of it right, but the inner details
of Unicode are still foggy...

-- 
Regards,

Diez B. Roggisch


