Py 3.3, unicode / upper()

Chris Angelico rosuav at gmail.com
Wed Dec 19 21:01:33 EST 2012


On Thu, Dec 20, 2012 at 8:23 AM, Ian Kelly <ian.g.kelly at gmail.com> wrote:
> On Wed, Dec 19, 2012 at 1:55 PM,  <wxjmfauth at gmail.com> wrote:
>> Yes, it is correct (or can be considered as correct).
>> I do not wish to discuss the typographical problematic
>> of "Das Grosse Eszett". The web is full of pages on the
>> subject. However, I never succeeded to find an "official
>> position" from Unicode. The best information I found seem
>> to indicate (to converge), U+1E9E is now the "supported"
>> uppercase form of U+00DF. (see DIN).
>
> Is this link not official?
>
> http://unicode.org/cldr/utility/character.jsp?a=00DF
>
> That defines a full uppercase mapping to SS and a simple uppercase
> mapping to U+00DF itself, not U+1E9E.  My understanding of the simple
> mapping is that it is not allowed to map to multiple characters,
> whereas the full mapping is so allowed.

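Ahh, thanks, that explains why the other Unicode-aware language I
tried behaved differently from Python 3.3: it appears to apply the
simple mapping, which leaves U+00DF unchanged, rather than the full one.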

Pike v7.9 release 5 running Hilfe v3.5 (Incremental Pike Frontend)
> string s="Stra\u00dfe";
> upper_case(s);
(1) Result: "STRA\337E"
> lower_case(upper_case(s));
(2) Result: "stra\337e"
> String.capitalize(lower_case(s));
(3) Result: "Stra\337e"

The output is the equivalent of repr(), and it uses octal escapes
where possible (for brevity), so \337 is its representation of U+00DF
(decimal 223, octal 337). Upper-casing and lower-casing this character
both leave it unchanged.

> write("Original: %s\nLower: %s\nUpper: %s\n",s,lower_case(s),upper_case(s));
Original: Straße
Lower: straße
Upper: STRAßE
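
For comparison, here is a rough sketch of the same three operations in
Python, assuming 3.3 or later, where str.upper() applies the full
mapping. The variable names are mine and purely illustrative.

```python
# Same string as in the Pike session: "Stra" + U+00DF + "e".
s = "Stra\u00dfe"

# Full uppercase mapping: U+00DF expands to the two letters "SS".
upper = s.upper()

# Lower-casing the result does not bring U+00DF back.
round_trip = upper.lower()

# Lower-case, then capitalize the first letter only.
capitalized = s.lower().capitalize()

# ascii() keeps the output unambiguous on any terminal encoding.
print(ascii(upper))
print(ascii(round_trip))
print(ascii(capitalized))
```

Unlike the Pike session, the upper/lower round trip here changes the
string, since the expansion to "SS" cannot be undone by lower().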

It's worth noting, incidentally, that the unusual upper-case form of
the letter (U+1E9E) does lower-case to U+00DF in both Python 3.3 and
Pike 7.9.5:

> lower_case("Stra\u1E9Ee");
(9) Result: "stra\337e"

>>> ord("\u1e9e".lower())
223

So both of them are behaving in a compliant manner, even though
they're not quite identical.
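
The asymmetry between the two code points can be checked directly. This
is just a small sketch, again assuming Python 3.3 or later, with
variable names of my own choosing.

```python
import unicodedata

capital_sharp_s = "\u1e9e"  # the unusual upper-case form
small_sharp_s = "\u00df"    # the ordinary lower-case letter

# U+1E9E lower-cases to U+00DF...
lowered = capital_sharp_s.lower()

# ...but U+00DF does not upper-case to U+1E9E; the full mapping gives "SS".
uppered = small_sharp_s.upper()

print(unicodedata.name(capital_sharp_s))
print(ord(lowered))      # 223, as in the interactive session above
print(ascii(uppered))
```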

ChrisA


