Problem with sets and Unicode strings

Dennis Benzinger Dennis.Benzinger at gmx.net
Tue Jun 27 14:46:22 EDT 2006


Hi!

The following program in a UTF-8 encoded file:


# -*- coding: UTF-8 -*-

FIELDS = ("Fächer", )
FROZEN_FIELDS = frozenset(FIELDS)
FIELDS_SET = set(FIELDS)

print u"Fächer" in FROZEN_FIELDS
print u"Fächer" in FIELDS_SET
print u"Fächer" in FIELDS


gives this output:


False
False
Traceback (most recent call last):
   File "test.py", line 9, in ?
     print u"Fächer" in FIELDS
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)


Why do the first two print statements succeed while the third one fails 
with an exception?

Why does the use of set/frozenset remove the exception?
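
One way to narrow this down: the traceback suggests that the third test 
tries to decode the byte string "Fächer" as ASCII before comparing it with 
the unicode string. The sketch below does that decode by hand. It is only 
an approximation of what the interpreter may be doing, and the names text, 
raw and ascii_decode_error are made up for illustration. It uses the 
"except ... as" syntax, so it needs Python 2.6 or later.

```python
# -*- coding: UTF-8 -*-
# Sketch only: mimics by hand the implicit ASCII decode that (assumption)
# happens when a byte string is compared with a unicode string.
# text, raw and ascii_decode_error are hypothetical names.

text = u"F\xe4cher"          # same value as the literal u"Fächer"
raw = text.encode("utf-8")   # the bytes a plain "Fächer" holds in a UTF-8 file


def ascii_decode_error(data):
    """Return the error message from decoding data as ASCII, or None."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


message = ascii_decode_error(raw)
print(repr(raw))   # shows the two bytes 0xc3 0xa4 that encode the "ä"
print(message)     # same message as in the traceback above
```

If that is what happens in the tuple case, then the set and frozenset 
lookups presumably never get as far as an equality comparison, perhaps 
because the two objects hash differently. That part is a guess and not 
something the output above confirms.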


Thanks,
Dennis


