[Python-Dev] UTF-16 code point comparison

Bill Tutt billtut@microsoft.com
Thu, 27 Jul 2000 11:29:34 -0700



> From: 	bckfnn@worldonline.dk [mailto:bckfnn@worldonline.dk] 


> - The BreakIterator does not handle surrogates. It does handle 
>   combining characters and it seems a natural place to put support
>   for surrogates.
> - The Collator class offers different levels of normalization before
>   comparing strings but does not seem to support surrogates. This class
>   seems a natural place for JavaSoft to put support for surrogates
>   during string comparison.

Neither of these is surprising, given that there aren't any officially
allocated characters in surrogate land yet (but they're coming fast). The
most likely first allocations are to the combined CJK glyph space, which
doesn't really have anything to collate. :)
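
As a rough illustration of why surrogates matter for comparison at all, the
sketch below splits a code point above U+FFFF into a UTF-16 surrogate pair and
shows that comparing raw 16-bit code units can disagree with comparing code
points. The helper names (to_surrogates, utf16_units) are made up for this
example, and 0x20000 and 0xFF5E are arbitrary sample values, not a statement
about what is or will be allocated.

```python
def to_surrogates(cp):
    """Split a code point in U+10000..U+10FFFF into (high, low) surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point: %#x" % cp)
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the 20-bit offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low


def utf16_units(code_points):
    """Return the list of 16-bit code units for a sequence of code points."""
    units = []
    for cp in code_points:
        if cp >= 0x10000:
            units.extend(to_surrogates(cp))
        else:
            units.append(cp)
    return units


# Hypothetical sample data: one BMP code point above the surrogate range,
# and one supplementary code point.
bmp = [0xFF5E]
supp = [0x20000]

# Code point order puts the BMP character first ...
code_point_order = bmp < supp
# ... but code unit order puts the surrogate pair first, because the
# surrogate range 0xD800..0xDFFF sorts below 0xE000..0xFFFF.
code_unit_order = utf16_units(bmp) < utf16_units(supp)
```

Here code_point_order and code_unit_order come out different, which is the
kind of mismatch a surrogate-aware comparison has to account for.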

Bill