[Python-3000] String comparison

Thu Jun 7 20:34:16 CEST 2007

"Stephen J. Turnbull" <turnbull at sk.tsukuba.ac.jp> wrote:
> Josiah Carlson writes:
> 
>  > Maybe I'm missing something, but it seems to me that there might be a
>  > simple solution.  Don't normalize any identifiers or strings.
> 
> That's not a solution, that's denying that there's a problem.

For core Python, there is no problem.  The standard libraries don't have
any normalization issues, nor will they have any normalization issues.
The only place where normalization issues could arise is in third-party
code that has yet to be written.

With that said, from what I understand, there are three places where we
could potentially do normalization: identifiers, literals, and data.
Identifiers and literals have the best case for normalization, data the
worst (don't change my data without me telling you to!).  From Guido's
recent post, he seems to say more or less the same thing about
normalizing text read through the text IO layer.
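
As a minimal sketch of what is at stake (an illustration only, not code
from this thread), the same visible character can be spelled with
different code points, and without normalization the two spellings
compare unequal.  The variable names below are hypothetical; the only
library call used is the standard unicodedata.normalize().

```python
import unicodedata

# Two spellings of the same visible character "é".
composed = "\u00e9"      # single code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# Plain comparison works on code points, so the two strings differ.
print(composed == decomposed)   # False

# Normalizing both sides to the same form (NFC here) makes them equal.
nfc_composed = unicodedata.normalize("NFC", composed)
nfc_decomposed = unicodedata.normalize("NFC", decomposed)
print(nfc_composed == nfc_decomposed)   # True
```

Whether this step should happen automatically for identifiers, literals,
or data is exactly the question under discussion; the sketch only shows
the explicit, opt-in form.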

Since I don't expect to be reading much Unicode from disk (and/or I
expect to be reading bytes and decoding them to Unicode manually), being
able to disable normalization on data from text IO is fine.
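
The "read bytes and decode manually" approach might look roughly like
the following.  This is a sketch under assumptions: decode_text and its
form parameter are hypothetical names invented for illustration, not an
existing or proposed Python API.  Normalization happens only when the
caller asks for it.

```python
import unicodedata
from typing import Optional


def decode_text(raw: bytes, encoding: str = "utf-8",
                form: Optional[str] = None) -> str:
    """Decode bytes to str; normalize only if a form is given.

    Hypothetical helper: form is one of "NFC", "NFD", "NFKC", "NFKD",
    or None to leave the decoded data untouched.
    """
    text = raw.decode(encoding)
    if form is None:
        return text  # data is returned exactly as decoded
    return unicodedata.normalize(form, text)


# "e" + COMBINING ACUTE ACCENT, encoded as UTF-8 bytes.
raw = "e\u0301".encode("utf-8")

untouched = decode_text(raw)               # two code points, as stored
normalized = decode_text(raw, form="NFC")  # one code point after NFC

print(len(untouched), len(normalized))     # 2 1
```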

Regarding the rest of it, I've come to the point of exhaustion.  I no
longer have the energy to care what happens with Python 3.0 and Unicode
(identifiers, literals, data, types, etc.), but I hope Ka-Ping is able
to convince more people than I have.  Good luck with the decisions.

Good day,
 - Josiah