[Python-3000] PEP: Supporting Non-ASCII Identifiers

Thu Jun 7 17:29:39 CEST 2007

On 6/6/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Are you suggesting that this should be done on the fly
> when comparing strings? Or that all strings should be
> stored in canonicalised form?

Preferably the second; store them canonicalized.

> I can see some big cans of worms being opened up by
> either approach. Surprising results could include
> things like s1 == s2 but len(s1) <> len(s2), or
> len(s1 + s2) <> len(s1) + len(s2).

Yes, these are surprising, but that is the nature of Unicode.

People will get used to it, with the same pains they face now over
"1" + "1" == "11", or output that doesn't line up because one row had a
single-digit number.
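
For concreteness, here is a minimal sketch of both surprises. It assumes
NFC as the canonical form (nothing above fixes a particular form) and
uses a hypothetical helper, canon(), standing in for whatever the string
type would do on storage or comparison. It is written in Python 3 syntax,
so != replaces <>.

```python
import unicodedata

def canon(s):
    # Hypothetical helper: NFC is an assumption, not a decided form.
    return unicodedata.normalize("NFC", s)

composed = "\u00e9"      # e-acute as a single code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# Comparing on the fly: the strings are canonically equal,
# yet their raw lengths differ.
equal_on_the_fly = canon(composed) == canon(decomposed)
raw_lengths = (len(composed), len(decomposed))

# Storing canonicalized: concatenation can compose across the seam,
# so the length of the sum is not the sum of the lengths.
s1 = canon("e")
s2 = canon("\u0301")
len_of_sum = len(canon(s1 + s2))
sum_of_lens = len(s1) + len(s2)

print(equal_on_the_fly, raw_lengths, len_of_sum, sum_of_lens)
```

Under those assumptions the comparison is true while the raw lengths are
1 and 2, and the concatenation has length 1 where the parts add up to 2.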

-jJ