[I18n-sig] Re: [Python-Dev] Unicode debate

M.-A. Lemburg mal@lemburg.com
Wed, 03 May 2000 10:15:29 +0200


Just van Rossum wrote:
> 
> [MAL vs. PP]
> >> > FYI: Normalization is needed to make comparing Unicode
> >> > strings robust, e.g. u"é" should compare equal to u"e\u0301".
> >>
> >> That's a whole 'nother debate at a whole 'nother level of abstraction. I
> >> think we need to get the bytes/characters level right and then we can
> >> worry about display-equivalent characters (or leave that to the Python
> >> programmer to figure out...).
> >
> >I just wanted to point out that the argument "slicing doesn't
> >work with UTF-8" is moot.
> 
> And failed...

Huh? The mere fact that it can take two (or more)
Unicode characters to represent a single character means
Unicode itself has the same slicing problems as e.g. UTF-8.
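
To illustrate, here is a minimal sketch. It assumes a Python whose
unicodedata module provides normalize() and whose plain string literals
are Unicode (both are assumptions on my part, not something this thread
relies on), and the helper name equal_normalized is made up for the
example:

```python
import unicodedata

precomposed = "\u00e9"   # "é" as one code point (LATIN SMALL LETTER E WITH ACUTE)
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT


def equal_normalized(a, b, form="NFC"):
    """Hypothetical helper: compare two strings after normalizing both."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# A plain comparison sees two different code point sequences.
plain_equal = (precomposed == decomposed)

# Comparing after normalization treats both spellings as the same text.
normalized_equal = equal_normalized(precomposed, decomposed)

# Slicing by code point cuts the decomposed form between base and accent,
# much like slicing UTF-8 by byte can cut inside a character.
base_only = decomposed[:1]

print(plain_equal, normalized_equal, len(decomposed), repr(base_only))
```

Note that the normalization is an explicit step the programmer asks for;
nothing here changes what the builtin == does.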

> [Refs about collation and decomposition]
>
> It's very deep stuff, which seems more appropriate for an extension than
> for builtin comparisons to me.

That's what I think too; I never argued for making this
builtin and automatic (don't know where people got this idea
from).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/