convert Unicode to lower/uppercase?
Peter Otten
__peter__ at web.de
Mon Sep 22 03:39:24 EDT 2003
"Martin v. Löwis" wrote:
> jallan wrote:
>
>> But that really doesn't work properly. According to Unicode specs and
>> German usage the uppercase of "ß" is actually "SS", that is the single
>> character "ß" should uppercase to two characters.
>
> Can you cite exact chapter and verse of the Unicode specs that say so?
> According to the Unicode database,
>
> http://www.unicode.org/Public/UNIDATA/UnicodeData.txt
>
> ß has neither an uppercase mapping, nor a lowercase mapping.
It seems like UnicodeData.txt does not give the full story. Quoting from
http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt:
[...]
# (For compatibility, the UnicodeData.txt file only contains case mappings for
# characters where they are 1-1, and does not have locale-specific mappings.)
[...]
# <code>; <lower> ; <title> ; <upper> ; (<condition_list> ;)? # <comment>
[...]
# The German es-zed is special--the normal mapping is to SS.
# Note: the titlecase should never occur in practice. It is equal to titlecase(uppercase(<es-zed>))
00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S
[...]
Thus, to comply with the standard, "ß".upper() --> "SS" is required.
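
To make the record above concrete, here is a minimal sketch that decodes the quoted SpecialCasing.txt line into its lower, title and upper mappings. The helper name parse_special_casing is made up for illustration, and it only handles unconditional records like this one (no <condition_list>). The final comparison against str.upper() assumes a Python 3 interpreter, where the built-in applies the full case mappings.

```python
# The record for U+00DF, copied verbatim from the SpecialCasing.txt excerpt.
RECORD = "00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S"


def parse_special_casing(line):
    """Decode one unconditional record into (char, lower, title, upper).

    Hypothetical helper, not part of any library. Field order follows the
    header quoted above: <code>; <lower> ; <title> ; <upper> ; # <comment>
    """
    data = line.split("#", 1)[0]                      # drop the comment
    fields = [field.strip() for field in data.split(";")]
    code, lower, title, upper = fields[:4]

    def decode(points):
        # "0053 0073" -> "Ss": space-separated hex code points to a string
        return "".join(chr(int(point, 16)) for point in points.split())

    return decode(code), decode(lower), decode(title), decode(upper)


char, lower, title, upper = parse_special_casing(RECORD)
print(repr(char), repr(lower), repr(title), repr(upper))

# Assumption: Python 3, where str.upper() uses the full (1-to-many) mappings.
print(char.upper() == upper)
```

The point of the sketch is only that the upper field holds two code points (0053 0053), which is exactly the 1-to-2 mapping that UnicodeData.txt cannot express.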
> Also, in German, the uppercase mapping of ß is of ongoing debate.
My personal impression is that, even before the orthography reform in 1998,
the SZ variant was seldom used.
For the "official" rule see http://www.ids-mannheim.de/reform/a2-3.html.
Peter