[issue43221] German Text Conversion Using Upper() and Lower()

Eryk Sun report at bugs.python.org
Sun Feb 14 02:42:09 EST 2021


Eryk Sun <eryksun at gmail.com> added the comment:

Python's str.upper() and str.lower() use the standard Unicode character properties, as defined by the Unicode Consortium. The mapping of ß is discussed in their FAQ [1]:

    Q: Why does ß (U+00DF LATIN SMALL LETTER SHARP S) not uppercase to 
       U+1E9E LATIN CAPITAL LETTER SHARP S by default?

    A: In standard German orthography, the sharp s ("ß") used to be 
       exclusively uppercased to a sequence of two capital S characters. 
       This longstanding practice is reflected in the default case 
       mappings in Unicode. A capital form of ß is sometimes preferred
       for typographic reasons or to avoid ambiguity, such as in
       uppercase names as found in passports. It is encoded in the
       Unicode Standard as U+1E9E. While this character is not widely
       used, [it] is now recognized in the official orthography as an 
       optional uppercase form of ß in addition to "SS".  Because it is
       only an optional alternative, the original mapping to "SS" is
       retained in the Unicode character properties. 
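
For illustration, here is a minimal sketch of how the default mappings described in the FAQ answer show up in Python's str methods. The helper upper_with_capital_sharp_s is a hypothetical name, not part of Python; it is only one possible way to opt in to U+1E9E when that form is preferred.

```python
# Default Unicode case mappings as exposed by Python's str methods.
SHARP_S = "\u00df"          # ß LATIN SMALL LETTER SHARP S
CAPITAL_SHARP_S = "\u1e9e"  # ẞ LATIN CAPITAL LETTER SHARP S

# The default uppercase mapping expands ß to two capital S characters.
default_upper = "Straße".upper()

# The expansion is lossy: lowercasing again does not restore ß.
round_trip = default_upper.lower()

# casefold() is meant for caseless comparison and also folds ß to "ss".
folded = "Straße".casefold()


def upper_with_capital_sharp_s(text):
    """Hypothetical helper: uppercase text, mapping ß to U+1E9E instead of "SS"."""
    # Replace ß first; U+1E9E is already uppercase, so upper() leaves it alone.
    return text.replace(SHARP_S, CAPITAL_SHARP_S).upper()


print(default_upper)                         # STRASSE
print(round_trip)                            # strasse
print(folded)                                # strasse
print(upper_with_capital_sharp_s("Straße"))  # STRAẞE
print(CAPITAL_SHARP_S.lower() == SHARP_S)    # True
```

Comparing strings with casefold() rather than upper() or lower() sidesteps the question of which uppercase form was used, since both ß and U+1E9E fold to "ss".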

---
[1] http://unicode.org/faq/casemap_charprop.html#11

----------
nosy: +eryksun
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue43221>
_______________________________________

