Pure python implementation of string-like class

Ross Ridge rridge at csclub.uwaterloo.ca
Sat Feb 25 15:58:47 EST 2006


Steve Holden wrote:
> "Wider than UTF-16" doesn't make sense.

Ross Ridge wrote:
> It makes perfect sense.

Alan Kennedy wrote:
> UTF-16 is a "Unicode Transcription Format", meaning that it is a
> mechanism for representing all unicode code points, even the ones with
> ordinals greater than 0xFFFF, using series of 16-bit values.

It's an encoding format that supports encoding only 1,112,064 different
characters, making it a little more than 20 bits wide.  While this is
enough to encode all code points currently assigned by Unicode, it's
not sufficient to encode the private use area of ISO 10646-1 that
Akihiro Kayama wants to use.
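
As a rough check of that arithmetic, here is a minimal Python sketch.  The
constant and variable names are illustrative only, not from any library.  It
counts the code points UTF-16 can represent (everything up to U+10FFFF minus
the surrogate range, which cannot stand for characters on its own) and shows
that the largest possible surrogate pair decodes to exactly U+10FFFF, so
nothing above that can be expressed.

```python
import math

# Illustrative constants; the names are assumptions, not a library API.
MAX_CODE_POINT = 0x10FFFF                  # highest code point UTF-16 can reach
SURROGATE_COUNT = 0xDFFF - 0xD800 + 1      # 2,048 values reserved for surrogates

# Code points that UTF-16 can actually encode as characters.
encodable = (MAX_CODE_POINT + 1) - SURROGATE_COUNT

# Equivalent "width" in bits of that repertoire.
width_bits = math.log2(encodable)

# Largest value a surrogate pair can decode to:
# high surrogate 0xDBFF and low surrogate 0xDFFF each carry 10 payload bits.
high_payload = 0xDBFF - 0xD800             # 0x3FF
low_payload = 0xDFFF - 0xDC00              # 0x3FF
max_from_pair = 0x10000 + (high_payload << 10) + low_payload

print(encodable)                           # 1112064
print(width_bits)                          # slightly above 20
print(hex(max_from_pair))                  # 0x10ffff
```

Any value beyond 0x10FFFF, such as one drawn from a larger 31-bit code space,
has no surrogate-pair representation under this scheme.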

                                                   Ross Ridge



