Is unicode.lower() locale-independent?
Torsten Bronger
bronger at physik.rwth-aachen.de
Sat Jan 12 07:26:42 EST 2008
Hi there!
Fredrik Lundh writes:
> Robert Kern wrote:
>
>>> However it appears from your bug ticket that you have a much
>>> narrower problem (case-shifting a small known list of English
>>> words like VOID) and can work around it by writing your own
>>> locale-independent casing functions. Do you still need to find
>>> out whether Python unicode casings are locale-dependent?
>>
>> I would still like to know. There are other places where .lower()
>> is used in numpy, not to mention the rest of my code.
>
> "lower" uses the informative case mappings provided by the Unicode
> character database; see
>
> http://www.unicode.org/Public/4.1.0/ucd/UCD.html
>
> afaik, changing the locale has no influence whatsoever on Python's
> Unicode subsystem.
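
The claim above can be checked with a small sketch. It assumes
Python 3, where str plays the role of the unicode type discussed in
this thread. The helper name lower_under_locales and the list of
locale names are hypothetical choices for illustration, and locales
that are not installed on the system are simply skipped.

```python
import locale


def lower_under_locales(text, locale_names):
    """Return {locale_name: text.lower()} for every locale that can be set.

    Hypothetical helper for illustration; not part of any library.
    """
    results = {}
    # Calling setlocale() without a second argument only queries the setting.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        for name in locale_names:
            try:
                locale.setlocale(locale.LC_CTYPE, name)
            except locale.Error:
                continue  # locale not installed on this system; skip it
            results[name] = text.lower()
    finally:
        # Restore whatever LC_CTYPE setting was active before the experiment.
        locale.setlocale(locale.LC_CTYPE, saved)
    return results


# "C" is always available; the other two names are assumptions and may
# be missing on a given machine.
results = lower_under_locales("VOID", ["C", "tr_TR.UTF-8", "fr_FR.UTF-8"])
print(results)
```

If the lowercased string is the same for every locale that could be
set, including a Turkish one where C-level tolower() could treat "I"
differently, that is consistent with the statement that the locale
does not affect the Unicode case mappings.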
Slightly off-topic because it's not part of the Unicode subsystem,
but I was once irritated that the non-breaking space (code point
0xA0, I think) was included in string.whitespace. I cannot reproduce
it on my current system anymore, but I was pretty sure it occurred
with a fr_FR.UTF-8 locale. Is this possible? And who is to blame, or
must my program cope with such things?
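
For a program that has to cope with this, here is a minimal sketch
of the two notions of whitespace involved. It assumes Python 3,
where string.whitespace is a fixed ASCII constant; the
locale-dependent behaviour described above belongs to older
versions and is not reproduced here. The helper name
strip_ascii_whitespace is hypothetical.

```python
import string

NBSP = "\u00a0"  # NO-BREAK SPACE, code point 0xA0

# string.whitespace only lists ASCII whitespace characters in Python 3.
nbsp_in_string_whitespace = NBSP in string.whitespace

# str.isspace() consults the Unicode character database instead.
nbsp_is_unicode_space = NBSP.isspace()


def strip_ascii_whitespace(text):
    """Strip only ASCII whitespace, leaving non-breaking spaces intact.

    Hypothetical helper for illustration.
    """
    return text.strip(string.whitespace)


sample = " x" + NBSP + " "
print(repr(strip_ascii_whitespace(sample)))  # NBSP is kept
print(repr(sample.strip()))                  # NBSP is stripped as well
```

Passing an explicit character set to strip() makes the program
independent of whichever definition of whitespace the default
happens to use.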
Bye,
Torsten.
--
Torsten Bronger, aquisgrana, europa vetus
Jabber ID: bronger at jabber.org
(See http://ime.webhop.org for further contact info.)