[issue12266] str.capitalize contradicts oneself

Thu Jul 21 11:02:34 CEST 2011

Marc-Andre Lemburg <mal at egenix.com> added the comment:

Ezio Melotti wrote:
> 
> Ezio Melotti <ezio.melotti at gmail.com> added the comment:
> 
> Do you mean  "if (!Py_UNICODE_ISLOWER(*s)) {"  (with the '!')?

Sorry, here's the correct version:

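    /* First character: force to upper case unless it already is. */
    if (!Py_UNICODE_ISUPPER(*s)) {
        *s = Py_UNICODE_TOUPPER(*s);
        status = 1;
    }
    s++;
    /* Remaining characters: force to lower case unless they already are. */
    while (--len > 0) {
        if (!Py_UNICODE_ISLOWER(*s)) {
            *s = Py_UNICODE_TOLOWER(*s);
            status = 1;
        }
        s++;
    }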
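
For anyone who wants to experiment with this logic without rebuilding
the interpreter, here is a rough Python model of the loop above. It is
only a sketch: capitalize_model is a made-up name, the empty-string
check is an assumption about what happens before the loop, and
str.upper()/str.lower() stand in for the Py_UNICODE_TO* macros (in
recent Python versions they apply full case mappings and can return
more than one character, which the 1:1 macros cannot).

```python
def capitalize_model(text):
    """Model the proposed loop; return (new_text, status)."""
    if not text:
        # Assumption: empty input is handled before the loop is reached.
        return text, False
    out = []
    status = False
    first, rest = text[0], text[1:]
    # First character: convert unless it is already upper case.
    if not first.isupper():
        out.append(first.upper())
        status = True
    else:
        out.append(first)
    # Remaining characters: convert unless they are already lower case.
    for ch in rest:
        if not ch.islower():
            out.append(ch.lower())
            status = True
        else:
            out.append(ch)
    return "".join(out), status
```

Note that, as in the C version, status is set for every character that
fails the test, including uncased ones such as digits or spaces, even
when the conversion leaves the character unchanged.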

> This sounds fine to me, but with this approach all the uncased characters will go through a Py_UNICODE_TO* macro, whereas with the current code only the cased ones are converted.  I'm not sure this matters too much though.
> 
> OTOH if the non-lowercase cased chars are always either upper or titlecased, checking for both should be equivalent.

AFAIK, there are characters that don't have a case mapping at all.
It may also be the case that a character which is neither upper nor
lower case still has a lower/upper case mapping, e.g. for
typographical reasons.

Someone would have to check this against the current Unicode database.
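
One possible way to run that check from Python itself is sketched
below. It assumes that str.isupper()/str.islower() on a single
character classify it the same way the C macros do, and the function
name is made up for illustration. The results depend on the Unicode
database version bundled with the interpreter, so no output is listed
here.

```python
import sys
import unicodedata

def mapped_but_not_upper_or_lower():
    """Return (code point, category) pairs for characters that are
    neither upper nor lower case yet change under upper() or lower()."""
    found = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if ch.isupper() or ch.islower():
            continue
        if ch.upper() != ch or ch.lower() != ch:
            found.append((cp, unicodedata.category(ch)))
    return found

if __name__ == "__main__":
    for cp, cat in mapped_but_not_upper_or_lower():
        name = unicodedata.name(chr(cp), "<unnamed>")
        print("U+%04X %s %s" % (cp, cat, name))
```

Titlecase letters (category Lt) will show up in the list, since they
are neither upper nor lower case but have both mappings. Any entries
with a different category would be the "uncased but mapped" characters
in question.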

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12266>
_______________________________________