[Python-Dev] Re: [I18n-sig] Re: Unicode debate

Guido van Rossum guido@python.org
Wed, 03 May 2000 08:22:57 -0400


> [Guido]
> > When *comparing* 8-bit and Unicode strings, the presence of non-ASCII
> > bytes in either should make the comparison fail; when ordering is
> > important, we can make an arbitrary choice e.g. "\377" < u"\200".
> 
> [Toby]
> > I assume 'fail' means 'non-equal', rather than 'raises an exception'?
> 
> [Guido]
> > Yes, sorry for the ambiguity.

[Tim]
> Huh!  You sure about that?  If we're setting up a case where meaningful
> comparison is impossible, isn't an exception more appropriate?  The current
> 
> >>> 83479278 < "42"
> 1
> >>>
> 
> probably traps more people than it helps.

Agreed, but that's the rule we all currently live by, and changing it
is something for Python 3000.

I'm not real strong on this though -- I was willing to live with
exceptions from the UTF-8-to-Unicode conversion.  If we all agree that
it's better for u"\377" == "\377" to raise a precedent-setting
exception than to return false, that's fine with me too.  I do want
u"a" == "a" to be true though (and I believe we all already agree on
that one).
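
As a rough sketch of the two options under discussion, here is a model
written in present-day Python, where bytes stands in for the 8-bit
string and str for the Unicode string.  The function name mixed_equal
and its strict flag are hypothetical, invented for illustration; this
is not how the interpreter implements comparison.

```python
def mixed_equal(raw: bytes, text: str, strict: bool = False) -> bool:
    """Hypothetical model of comparing an 8-bit string with a Unicode string.

    ASCII-only operands compare by value.  If either side contains a
    non-ASCII byte or character, the result is either "not equal"
    (strict=False) or an exception (strict=True).
    """
    has_non_ascii = any(b > 0x7F for b in raw) or any(ord(c) > 0x7F for c in text)
    if has_non_ascii:
        if strict:
            raise ValueError("cannot compare non-ASCII 8-bit and Unicode strings")
        return False
    # Both sides are pure ASCII, so decoding cannot fail.
    return raw.decode("ascii") == text
```

With strict=False this models "fail means non-equal"; with strict=True
it models the exception-raising alternative.  In both modes the
u"a" == "a" case stays true.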

Note that it's not the first precedent -- you can already define
classes whose instances can raise exceptions during comparisons.
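
A minimal sketch of such a class follows.  The name Picky is
hypothetical, and the example uses the __eq__ and __lt__ hooks of
present-day Python rather than the __cmp__ method available when this
message was written.

```python
class Picky:
    """Hypothetical class that refuses to be compared with other types."""

    def __init__(self, value):
        self.value = value

    def _check(self, other):
        # Raise instead of silently answering "not equal".
        if not isinstance(other, Picky):
            raise TypeError("cannot compare Picky with %s" % type(other).__name__)

    def __eq__(self, other):
        self._check(other)
        return self.value == other.value

    def __lt__(self, other):
        self._check(other)
        return self.value < other.value
```

Comparing two Picky instances works as usual, while Picky(1) == 1
raises TypeError.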

--Guido van Rossum (home page: http://www.python.org/~guido/)