[I18n-sig] Re: [Python-Dev] Unicode debate

Paul Prescod paul@prescod.net
Tue, 02 May 2000 11:25:33 -0500


Guido van Rossum wrote:
> 
> Aha, then we'll see u == v even though type(u) is type(v) and len(u)
> != len(v).  /F's world will collapse. :-)

There are many levels of equality that are interesting. I don't think we
would move to grapheme equivalence until "the rest of the world" (XML,
Java, W3C, SQL) did. 

If we were going to move to grapheme equivalence (some day), the right
way would be to normalize characters when the Unicode string is
constructed, so that equal strings always have equal lengths. This is
known as "early normalization":

http://www.w3.org/TR/charmod/#NormalizationApplication
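
A minimal sketch of the idea, assuming Python 3's str type and the
standard unicodedata.normalize() function. The make_text() helper is
hypothetical; it only stands in for "the construction of the Unicode
string" and is not an existing or proposed API. The choice of NFC as the
normalization form is likewise an assumption for illustration.

```python
import unicodedata

# Two spellings of the same grapheme "e with acute accent".
composed = "\u00e9"       # one code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"    # two code points: "e" + COMBINING ACUTE ACCENT


def make_text(raw, form="NFC"):
    """Hypothetical constructor hook: normalize once, on the way in."""
    return unicodedata.normalize(form, raw)


# Without normalization the strings compare unequal and differ in length.
raw_equal = composed == decomposed
raw_lengths = (len(composed), len(decomposed))

# With early normalization both inputs become the same code point
# sequence, so u == v implies len(u) == len(v).
u = make_text(composed)
v = make_text(decomposed)
normalized_equal = u == v
normalized_lengths = (len(u), len(v))
```

Because normalization happens once at construction, the == operator can
stay a plain code point comparison, and the u == v with len(u) != len(v)
case from the quoted message does not arise.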

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
It's difficult to extract sense from strings, but they're the only
communication coin we can count on. 
	- http://www.cs.yale.edu/~perlis-alan/quotes.html