[issue12737] str.title() is overzealous by upcasing combining marks inappropriately

Martin v. Löwis report at bugs.python.org
Sat Oct 1 16:42:35 CEST 2011


Martin v. Löwis <martin at v.loewis.de> added the comment:

>> As for terminology: I think the documentation should continue to
>> speak about "words" and "letters", and then define what is meant
>> in this context. It's not that the Unicode consortium invented
>> the term "letter", so we should use it more liberally than just
>> referring to the L* categories.
> 
> I really don't think it wise to have private definitions of these.
> 
> If Letter doesn't mean L?, things get too weird.  That's why 
> there are separate definitions of alphabetic, word, etc.

But I won't be using the word "Letter"; I'll be using "letter" (lower
case). Nobody will assume that this refers to the Unicode standard;
people would rather expect it to mean [A-Za-z] (i.e. not expect
non-ASCII characters to be considered at all). So elaboration is
necessary anyway. I'll take the risk of confusing the 10 people who
have ever read UTS#18 :-)
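
A minimal sketch of the two readings of "letter" contrasted above,
assuming Python 3. The helper names is_ascii_letter and
is_unicode_letter are hypothetical, chosen only for illustration; they
are not part of any proposed documentation wording or patch.

```python
import re
import unicodedata


def is_ascii_letter(ch):
    """The naive reading of "letter": only [A-Za-z] counts."""
    return re.fullmatch(r"[A-Za-z]", ch) is not None


def is_unicode_letter(ch):
    """The Unicode reading: General_Category is one of the L* categories."""
    return unicodedata.category(ch).startswith("L")


# U+00E9 is a precomposed e-acute; U+0301 is COMBINING ACUTE ACCENT,
# the kind of combining mark this issue is about.
for ch in ("a", "\u00e9", "\u0301"):
    print(
        "U+%04X" % ord(ch),
        unicodedata.category(ch),
        is_ascii_letter(ch),
        is_unicode_letter(ch),
    )
```

The two definitions already disagree on U+00E9, and neither counts the
combining mark U+0301 (category Mn) as a letter, which is why the
documentation has to spell out which meaning it intends.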

----------
title: str.title() is overzealous by upcasing combining marks inappropriately -> str.title() is overzealous by upcasing combining marks inappropriately

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12737>
_______________________________________

