Problem with sets and Unicode strings

Jean-Paul Calderone exarkun at divmod.com
Thu Jun 29 15:50:06 EDT 2006


On Thu, 29 Jun 2006 21:19:30 +0200, Dennis Benzinger <dennis.benzinger at gmx.net> wrote:
>Robert Kern wrote:
>> Dennis Benzinger wrote:
>>> Ok, I understand.
>>> But isn't it a (minor) problem that using a set like this:
>>>
>>> # -*- coding: UTF-8 -*-
>>>
>>> FIELDS_SET = set(("Fächer", ))
>>>
>>> print u"Fächer" in FIELDS_SET
>>> print u"Fächer" == "Fächer"
>>>
>>> shadows the error of not setting sys.defaultencoding()?
>>
>> You can't set the default encoding. If you could, then scripts that run
>> on your machine wouldn't run on mine.
>> [...]
>
>As Serge Orlov wrote in one of his posts you _can_ set the default
>encoding (at least in site.py). See
><http://docs.python.org/lib/module-sys.html>

But doing so is not useful, so one should generally never do it.  You
cannot set the default encoding on computers you don't directly
control, so any software you write which depends on it will not be
easily distributable.  Additionally, if you want to use two packages
which each expect a different default encoding, modifying your own
site.py for them won't help, since there can be only one default
encoding per interpreter.  Only one of them will be able to work at a
time.

The default encoding is ascii and should always be ascii.  If you want
another encoding, specify it in a call to .encode() or .decode().
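
For instance, here is a minimal sketch of doing the conversion
explicitly.  It assumes the byte strings are UTF-8, as in the coding
line of Dennis's script; the names text, raw, decoded and fields are
only for illustration.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch: name the codec explicitly instead of relying on
# the default encoding.  UTF-8 is an assumption about the source bytes.

text = u"F\u00e4cher"            # unicode string ("Fächer")
raw = text.encode("utf-8")       # unicode -> bytes, codec given explicitly
decoded = raw.decode("utf-8")    # bytes -> unicode, codec given explicitly

# Keep only unicode objects in the set, so a membership test never has
# to fall back on an implicit ASCII conversion.
fields = set([decoded])
print(text in fields)
```

With both sides decoded up front, the membership test succeeds and
nothing depends on how site.py is configured on a given machine.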

Jean-Paul


