Is unicode.lower() locale-independent?

Torsten Bronger bronger at physik.rwth-aachen.de
Sat Jan 12 07:26:42 EST 2008


Hello!

Fredrik Lundh writes:

> Robert Kern wrote:
>
>>> However it appears from your bug ticket that you have a much
>>> narrower problem (case-shifting a small known list of English
>>> words like VOID) and can work around it by writing your own
>>> locale-independent casing functions. Do you still need to find
>>> out whether Python unicode casings are locale-dependent?
>>
>> I would still like to know. There are other places where .lower()
>> is used in numpy, not to mention the rest of my code.
>
> "lower" uses the informative case mappings provided by the Unicode
> character database; see
>
>     http://www.unicode.org/Public/4.1.0/ucd/UCD.html
>
> afaik, changing the locale has no influence whatsoever on Python's
> Unicode subsystem.

Slightly off-topic because it's not part of the Unicode subsystem,
but I was once irritated that the non-breaking space (code point
0xA0, I think) was included in string.whitespace.  I cannot
reproduce it on my current system anymore, but I am pretty sure it
occurred with a fr_FR.UTF-8 locale.  Is this possible?  And who is to
blame, or must my program cope with such things?
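
For anyone who wants to check this on their own system, here is a
rough sketch.  It assumes Python 3 (where string.whitespace is a fixed
ASCII constant), which is not the interpreter this thread was about,
so it cannot confirm or refute what I saw back then.  The helper names
(snapshot, try_locale, strip_ascii_whitespace) are made up for this
example, and the two locale names are only guesses at what might be
installed; missing locales are skipped.

```python
import locale
import string

NBSP = "\u00a0"  # NO-BREAK SPACE, code point 0xA0

# Explicit ASCII-only whitespace set, independent of any locale setting.
ASCII_WHITESPACE = " \t\n\r\x0b\x0c"


def snapshot():
    """Collect values that could, in principle, vary with the locale."""
    return {
        "nbsp_in_string_whitespace": NBSP in string.whitespace,
        "nbsp_isspace": NBSP.isspace(),
        "lower_VOID": "VOID".lower(),
        "lower_I": "I".lower(),
    }


def try_locale(name):
    """Switch LC_CTYPE to `name`; return False if it is not installed."""
    try:
        locale.setlocale(locale.LC_CTYPE, name)
    except locale.Error:
        return False
    return True


def strip_ascii_whitespace(text):
    """Strip only ASCII whitespace, leaving a non-breaking space alone."""
    return text.strip(ASCII_WHITESPACE)


before = snapshot()
saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
results = {}
for name in ("fr_FR.UTF-8", "tr_TR.UTF-8"):
    if try_locale(name):
        results[name] = snapshot()
locale.setlocale(locale.LC_CTYPE, saved)  # restore the original locale

for name, after in results.items():
    status = "unchanged" if after == before else "CHANGED"
    print(name, status, after)
```

If every installed locale reports "unchanged", that fits Fredrik's
statement that the locale does not affect the Unicode casing.  Note
that under Python 3 the non-breaking space counts as whitespace for
str.isspace() and a bare str.strip() even though it is not in
string.whitespace, so a program that must treat it as ordinary text
can pass an explicit character set, as strip_ascii_whitespace does.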

Bye,
Torsten.

-- 
Torsten Bronger, aquisgrana, europa vetus
                                      Jabber ID: bronger at jabber.org
               (See http://ime.webhop.org for further contact info.)


