[issue12737] str.title() is overzealous by upcasing combining marks inappropriately

Guido van Rossum report at bugs.python.org
Sun Oct 25 22:08:51 EDT 2020


Guido van Rossum <guido at python.org> added the comment:

Are you sure? Running Ezio's titletest.py, I get the output below (note that the UCD major version is now in the double digits, so the script's check for that misfires :-).

titletest.py: Please set your PYTHONIOENCODING envariable to utf8
WARNING: Your old UCD is out of date, expected 6.0.0 but got 13.0.0
titlecase of  'déme un café'  should be  'Déme Un Café'  not  'DéMe Un Café'
titlecase of  'i̇stanbul'  should be  'İstanbul'  not  'İStanbul'
titlecase of  'ᾲ στο διάολο'  should be  'Ὰͅ Στο Διάολο'  not  'ᾺΙ Στο ΔιάΟλο'
failed 3 out of 6 tests

Note that the test program specifically uses combining marks, which are an alternate way to spell some characters. What seems to be failing is the second "deme un cafe", the first "istanbul", and the (only) Greek phrase.
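
To make the "alternate spelling" point concrete, here is a minimal sketch (not part of titletest.py) that builds the same phrase once with precomposed characters and once with combining marks. The helper name title_nfc is hypothetical, and composing to NFC before calling title() is only an assumed workaround: it helps when a precomposed form exists, and leaves other sequences as they are.

```python
import unicodedata

# Two spellings of the same visible text: precomposed é (U+00E9)
# versus plain "e" followed by COMBINING ACUTE ACCENT (U+0301).
precomposed = "d\u00e9me un caf\u00e9"
decomposed = "de\u0301me un cafe\u0301"


def title_nfc(text):
    """Hypothetical helper: compose to NFC first, then title-case.

    Assumption: this only sidesteps the problem for sequences that
    have a precomposed equivalent; it is not a fix for str.title().
    """
    return unicodedata.normalize("NFC", text).title()


if __name__ == "__main__":
    # The decomposed spelling is longer by one code point per accent.
    print(len(precomposed), len(decomposed))
    # General category of the combining mark itself.
    print(unicodedata.category("\u0301"))
    # Plain title() on the decomposed spelling; the result may vary
    # with the Python version, so no expected output is given here.
    print(ascii(decomposed.title()))
    # Title-casing after NFC composition.
    print(ascii(title_nfc(decomposed)))
```

The combining accent is a nonspacing mark rather than a letter, which is presumably why title() treats the character after it as the start of a new word.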

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue12737>
_______________________________________

