[Python-Dev] Unicode and comparisons
M.-A. Lemburg
mal@lemburg.com
Tue, 04 Apr 2000 11:26:53 +0200
Fredrik's bug report made me dive a little deeper into compares
and contains tests.
Here is a snapshot of what my current version does:
>>> '1' == None
0
>>> u'1' == None
0
>>> '1' == 'aäöü'
0
>>> u'1' == 'aäöü'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: UTF-8 decoding error: invalid data
>>> '1' in ('a', None, 1)
0
>>> u'1' in ('a', None, 1)
0
>>> '1' in (u'aäöü', None, 1)
0
>>> u'1' in ('aäöü', None, 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: UTF-8 decoding error: invalid data
The decoding errors occur because 'aäöü' is not a valid
UTF-8 string (Unicode comparisons coerce both arguments
to Unicode by interpreting normal strings as UTF-8
encodings of Unicode).
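The coercion rule can be sketched roughly as follows. This is only an illustration in present-day Python 3 terms, not the actual implementation: the helper names coerce_to_unicode and unicode_equal are hypothetical, bytes stands in for the 8-bit string type, and the byte values assume 'aäöü' was typed in a Latin-1 environment.

```python
# Assumption: 'aäöü' as an 8-bit string entered in a Latin-1 terminal.
LATIN1_BYTES = b'a\xe4\xf6\xfc'


def coerce_to_unicode(value):
    """Hypothetical helper: interpret 8-bit strings as UTF-8."""
    if isinstance(value, bytes):
        # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
        return value.decode('utf-8')
    return value


def unicode_equal(left, right):
    """Compare after coercing both sides; decoding errors propagate."""
    return coerce_to_unicode(left) == coerce_to_unicode(right)


try:
    unicode_equal('1', LATIN1_BYTES)
    raised = False
except UnicodeDecodeError:
    # 0xE4 starts a multi-byte UTF-8 sequence, but 0xF6 is not a
    # continuation byte, so decoding fails.
    raised = True
```

Non-string operands such as None pass through the helper untouched, which is why those comparisons simply yield false instead of raising.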
Question: is this behaviour acceptable or should I go
even further and mask decoding errors during compares
and contains tests too?
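To make the second option concrete, masking could look roughly like the sketch below. It is written in present-day Python 3 terms with bytes standing in for 8-bit strings; the names lenient_equal and lenient_contains are hypothetical, and treating an undecodable string as simply "not equal" is one possible policy, not a decision.

```python
def lenient_equal(left, right):
    """Hypothetical compare that masks UTF-8 decoding errors."""
    try:
        if isinstance(left, bytes):
            left = left.decode('utf-8')
        if isinstance(right, bytes):
            right = right.decode('utf-8')
    except UnicodeDecodeError:
        # Assumed policy: an undecodable 8-bit string compares unequal.
        return False
    return left == right


def lenient_contains(needle, items):
    """Hypothetical contains test built on the masking compare."""
    return any(lenient_equal(needle, item) for item in items)


# Assumption: 'aäöü' encoded as Latin-1, which is not valid UTF-8.
undecodable = b'a\xe4\xf6\xfc'
result_compare = lenient_equal('1', undecodable)
result_contains = lenient_contains('1', (undecodable, None, 1))
```

The trade-off is that masking hides genuinely broken data: a comparison against a wrongly encoded string silently reports "not equal" rather than pointing at the encoding problem.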
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/