[I18n-sig] Re: [Python-Dev] Unicode debate
M.-A. Lemburg
mal@lemburg.com
Wed, 03 May 2000 10:15:29 +0200
Just van Rossum wrote:
>
> [MAL vs. PP]
> >> > FYI: Normalization is needed to make comparing Unicode
> >> > strings robust, e.g. u"é" should compare equal to u"e\u0301".
> >>
> >> That's a whole 'nother debate at a whole 'nother level of abstraction. I
> >> think we need to get the bytes/characters level right and then we can
> >> worry about display-equivalent characters (or leave that to the Python
> >> programmer to figure out...).
> >
> >I just wanted to point out that the argument "slicing doesn't
> >work with UTF-8" is moot.
>
> And failed...
Huh? The mere fact that two (or more) Unicode code points
can represent a single character means that Unicode itself
has the same problem as e.g. UTF-8: a slice can fall in the
middle of a character.
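
To make the point concrete, here is a minimal sketch. It assumes a
current Python 3 interpreter, where every str is a Unicode string;
the variable names are only illustrative. It shows a code point
slice splitting "e" from its combining accent, in the same way a
byte slice can split a UTF-8 sequence:

```python
composed = "\u00e9"       # "é" as a single code point
decomposed = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT

# Both display as the same character but differ as code point sequences.
same_without_normalizing = (composed == decomposed)
lengths = (len(composed), len(decomposed))

# Slicing by code point can cut a character in two.
word = "caf" + decomposed
head = word[:4]           # the base letters, accent lost
tail = word[4:]           # a lone combining accent

# Slicing UTF-8 bytes can cut a multi-byte sequence in two.
data = ("caf" + composed).encode("utf-8")
try:
    data[:4].decode("utf-8")
    utf8_slice_decodes = True
except UnicodeDecodeError:
    utf8_slice_decodes = False
```

In both cases the slice itself succeeds; the difference is only in
what kind of fragment is left behind.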
> [Refs about collation and decomposition]
>
> It's very deep stuff, which seems more appropriate for an extension than
> for builtin comparisons to me.
That's what I think too; I never argued for making this
builtin and automatic (don't know where people got this idea
from).
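
As a rough sketch of what an explicit, opt-in comparison could look
like (assuming a current Python 3 with the standard unicodedata
module; the helper name normalized_equal is hypothetical and not
part of any proposal in this thread):

```python
import unicodedata


def normalized_equal(a, b, form="NFC"):
    """Compare two strings after normalizing both to the same form.

    Hypothetical helper: callers ask for normalization explicitly,
    the builtin == operator is left untouched.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


composed = "\u00e9"       # "é" as a single code point
decomposed = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT

plain_result = (composed == decomposed)
normalized_result = normalized_equal(composed, decomposed)
```

The plain comparison stays a cheap code point comparison, and only
code that cares about canonical equivalence pays for normalization.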
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/