Py 3.3, unicode / upper()

wxjmfauth at gmail.com wxjmfauth at gmail.com
Wed Dec 19 15:55:08 EST 2012


On Wednesday, 19 December 2012 15:52:23 UTC+1, Christian Heimes wrote:
> On 19.12.2012 15:23, wxjmfauth at gmail.com wrote:
> > But, this is not the problem.
> > I was surprised to discover this:
> >
> > >>> 'Straße'.upper()
> > 'STRASSE'
> >
> > I really, really do not know what I should think about that.
> > (It is a complex subject.) And the real question is why?
>
> It's correct. LATIN SMALL LETTER SHARP S doesn't have an upper case
> form. However the unicode database specifies an upper case mapping from
> ß to SS. http://codepoints.net/U+00DF
>
> Christian

-----

Yes, it is correct (or can be considered correct).
I do not wish to discuss the typographical problem
of "Das Grosse Eszett". The web is full of pages on the
subject. However, I never managed to find an "official
position" from Unicode. The best information I found seems
to indicate (to converge on) U+1E9E now being the "supported"
uppercase form of U+00DF (see DIN).
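
For reference, here is a minimal sketch of how the two code points
can be inspected from Python itself. It assumes Python 3.3 or later
(full case mappings); the variable names are only illustrative.

```python
import unicodedata

small = '\u00df'    # ß, the lowercase sharp s
capital = '\u1e9e'  # ẞ, the capital sharp s

# Official Unicode character names, as stored in the database.
small_name = unicodedata.name(small)
capital_name = unicodedata.name(capital)

# Uppercasing the small letter gives two characters, not U+1E9E.
upper_of_small = small.upper()

# Lowercasing the capital letter gives back U+00DF.
lower_of_capital = capital.lower()

print(small_name, '->', ascii(upper_of_small))
print(capital_name, '->', ascii(lower_of_capital))
```

So the mapping is not symmetric: U+1E9E lowercases to U+00DF, but
U+00DF does not uppercase to U+1E9E.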

What bothers me more is the implementation. The Unicode
documentation says roughly this: if something cannot be
honoured, there is no harm, but do not implement a workaround.
In that case, I'm not sure Python is doing the best thing.

Even if "wrong", this can be considered programmatically correct
or logically acceptable (Py3.2):

>>> 'Straße'.upper().lower().capitalize() == 'Straße'
True

while this will *always* be problematic (Py3.3):

>>> 'Straße'.upper().lower().capitalize() == 'Straße'
False
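
To make the Py3.3 result easier to follow, here is the same round
trip split into steps. This is only a sketch, assuming Python 3.3
or later; the variable names are mine. The last line uses
str.casefold(), which is new in 3.3 and is meant for caseless
comparison; it is a side note, not a way to recover the original
spelling.

```python
word = 'Straße'

upper = word.upper()          # ß is expanded to SS
lower = upper.lower()         # SS becomes ss; the ß is not restored
restored = lower.capitalize()

# The round trip does not give the original string back.
round_trip_ok = (restored == word)

# A caseless comparison still treats the two spellings as equal.
same_caseless = (word.casefold() == upper.casefold())

print(ascii(upper), ascii(lower), ascii(restored))
print(round_trip_ok, same_caseless)
```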

jmf



