Quick tutorial on using the unicodedata module?

Brian Quinlan brian at sweetapp.com
Tue Dec 31 19:35:52 EST 2002


> The unicodedata module docs come with no examples, so I'm left bumping
> around more-or-less in the dark.  I just encountered a web page with 
> no encoding information which contains an octal 205 byte.  It seems to
> display as an ellipsis, and my heuristic decoder function expresses it
> as u'\x85' and says the encoding is utf-8.  With those bits of
> information ("ellipsis", 0205, u'\x85') I can't seem to get any 
> unicodedata function to return anything useful.  
> Any suggestions would be appreciated.

Hi Skip, what information are you looking for?

Unicode defines a bunch of ellipses:

0xeaf LAO ELLIPSIS
0x1801 MONGOLIAN ELLIPSIS
0x2026 HORIZONTAL ELLIPSIS
0x22ee VERTICAL ELLIPSIS
0x22ef MIDLINE HORIZONTAL ELLIPSIS
0x22f0 UP RIGHT DIAGONAL ELLIPSIS
0x22f1 DOWN RIGHT DIAGONAL ELLIPSIS
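
For what it's worth, a list like the one above can be reproduced with unicodedata itself. This is only a rough sketch, written for Python 3 (where chr() covers all of Unicode); the variable names are made up for illustration:

```python
import unicodedata

# Code points copied from the list above.
code_points = [0x0EAF, 0x1801, 0x2026, 0x22EE, 0x22EF, 0x22F0, 0x22F1]

# unicodedata.name() maps a one-character string to its official name.
names = [unicodedata.name(chr(cp)) for cp in code_points]
for cp, name in zip(code_points, names):
    print(f"0x{cp:04x} {name}")

# unicodedata.lookup() goes the other way, from a name to the character.
ellipsis = unicodedata.lookup("HORIZONTAL ELLIPSIS")
```

Note that name() raises ValueError for characters that have no name unless you pass a default as the second argument.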

Which one are you looking for? 0x2026? Another tidbit is that there are
no valid single-byte UTF-8 sequences except for the ASCII characters, so
a lone 0x85 byte cannot be UTF-8.
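
Here is a small sketch of that point, again in Python 3. The choice of cp1252 (Windows-1252) is my assumption about the page, not something the page declares; in that code page byte 0x85 happens to be the horizontal ellipsis, which would match what you see in the browser:

```python
import unicodedata

raw = b"\x85"  # the octal 205 byte from the page

# A lone byte above 0x7f is not a valid UTF-8 sequence.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Assumption: the page is really Windows-1252 (cp1252).
decoded = raw.decode("cp1252")
name = unicodedata.name(decoded)

# U+0085 itself is an unnamed control character, which may be why
# unicodedata had nothing useful to say about u'\x85'.
control_name = unicodedata.name("\x85", "<no name>")
control_category = unicodedata.category("\x85")
```

If the cp1252 guess is right, the character you want is 0x2026 and not u'\x85'.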

Cheers,
Brian 





