[Python-Dev] decoding errors when comparing strings

Paul Prescod paul@prescod.net
Sat, 15 Jul 2000 12:24:56 -0500


Fredrik Lundh wrote:
> 
> ...
> 
> but how about this one:
> 
>    >>> u"åäö" == "åäö"
>    Traceback (most recent call last):
>      File "<stdin>", line 1, in ?
>    UnicodeError: ASCII decoding error: ordinal not in range(128)

As soon as you find a character out of the ASCII range in one of the
strings, I think that you should report that the two strings are
unequal. We can't have exceptions popping out of dictionaries and other
"blind compare" situations. Is there any precedent for throwing an
exception on cmp?
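
A minimal sketch of that rule, written in present-day Python 3 syntax
rather than the 2000-era interpreter under discussion. The helper name
blind_equal is hypothetical and not part of any Python API; it only
models "treat an undecodable byte string as unequal instead of raising":

```python
def blind_equal(text, raw):
    """Compare a Unicode string with a byte string without ever raising.

    Hypothetical helper: the byte string is decoded as ASCII, mirroring
    the implicit conversion discussed in this thread.
    """
    try:
        decoded = raw.decode("ascii")
    except UnicodeDecodeError:
        # A byte outside the ASCII range was found: report "unequal"
        # rather than letting the exception escape the comparison.
        return False
    return text == decoded


# Pure-ASCII operands compare by value, as usual.
print(blind_equal("abc", b"abc"))
# Non-ASCII bytes make the comparison false instead of raising.
print(blind_equal("\xe5\xe4\xf6", b"\xe5\xe4\xf6"))
```

With this rule a dictionary lookup or list search that compares the two
kinds of string would see a plain "not equal" result.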

Also, is it really necessary to allow raw non-ASCII characters in source
code? We know that they aren't portable across editing environments, so
one person's happy face will be another person's left double-dagger.
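
For illustration only, and assuming escape sequences are the alternative
in mind, the same three characters can be spelled with nothing but ASCII
in the source file. The variable names are made up for this sketch, and
the syntax is present-day Python 3:

```python
# Assumption: the intended text is the three characters U+00E5, U+00E4
# and U+00F6. Every byte of this source file is plain ASCII.
escaped = "\xe5\xe4\xf6"

# The same string spelled with Unicode character names.
named = (
    "\N{LATIN SMALL LETTER A WITH RING ABOVE}"
    "\N{LATIN SMALL LETTER A WITH DIAERESIS}"
    "\N{LATIN SMALL LETTER O WITH DIAERESIS}"
)

# Both spellings denote the same code points, whatever the editor shows.
print(escaped == named)
print([hex(ord(ch)) for ch in escaped])
```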

-- 
 Paul Prescod - Not encumbered by corporate consensus
It's difficult to extract sense from strings, but they're the only
communication coin we can count on. 
	- http://www.cs.yale.edu/~perlis-alan/quotes.html