Problem with sets and Unicode strings

Dennis Benzinger Dennis.Benzinger at gmx.net
Wed Jun 28 12:48:13 EDT 2006


Robert Kern wrote:
> Dennis Benzinger wrote:
>> Serge Orlov wrote:
>>> On 6/27/06, Dennis Benzinger <Dennis.Benzinger at gmx.net> wrote:
>>>> Hi!
>>>>
>>>> The following program in a UTF-8 encoded file:
>>>>
>>>>
>>>> # -*- coding: UTF-8 -*-
>>>>
>>>> FIELDS = ("Fächer", )
>>>> FROZEN_FIELDS = frozenset(FIELDS)
>>>> FIELDS_SET = set(FIELDS)
>>>>
>>>> print u"Fächer" in FROZEN_FIELDS
>>>> print u"Fächer" in FIELDS_SET
>>>> print u"Fächer" in FIELDS
>>>>
>>>>
>>>> gives this output
>>>>
>>>>
>>>> False
>>>> False
>>>> Traceback (most recent call last):
>>>>    File "test.py", line 9, in ?
>>>>      print u"Fächer" in FIELDS
>>>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1:
>>>> ordinal not in range(128)
>>>>
>>>>
>>>> Why do the first two print statements succeed while the third one
>>>> fails with an exception?
>>> Actually all three statements fail to produce the correct result.
>>
>> So this is a bug in Python?
> 
> No.
> [...]

But I'd say it's not intuitive that for sets x in y can be false
(without raising an exception!) while doing the same with a tuple
raises an exception. Where is this difference documented?
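
A likely reason for the difference, sketched under assumptions: a set
lookup compares hash values first and only calls == on a hash match,
whereas a tuple lookup calls == on every element. If the byte string and
the unicode string hash differently, the failing comparison is never
attempted for the sets. The sketch below is written for Python 3, and
the Noisy class is a hypothetical stand-in for the str/unicode pair, not
anything from the original program.

```python
class Noisy:
    """Hypothetical object whose equality check always raises,
    standing in for a comparison that needs a failing implicit decode."""

    def __init__(self, hash_value):
        self._hash_value = hash_value

    def __hash__(self):
        return self._hash_value

    def __eq__(self, other):
        raise ValueError("comparison failed")


stored = Noisy(1)  # plays the role of the byte string in FIELDS
probe = Noisy(2)   # plays the role of the unicode string; different hash

# Set lookup: the hashes differ, so __eq__ is never called
# and the error never surfaces. The result is simply False.
in_set = probe in {stored}

# Tuple lookup: each element is compared with ==, so the error propagates.
try:
    in_tuple = probe in (stored,)
except ValueError:
    in_tuple = "raised"

print(in_set, in_tuple)
```

Under these assumptions the set test quietly answers False while the
tuple test raises, which matches the output of the original program.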


Thanks,
Dennis


