Case-insensitive string equality

Chris Angelico rosuav at gmail.com
Thu Aug 31 10:36:33 EDT 2017


On Fri, Sep 1, 2017 at 12:27 AM, Steve D'Aprano
<steve+python at pearwood.info> wrote:
>> Additionally: a proper "case insensitive comparison" should almost
>> certainly start with a Unicode normalization. But should it be NFC/NFD
>> or NFKC/NFKD? IMO that's a good reason to leave it in the hands of the
>> application.
>
> Normalisation is orthogonal to comparisons and searches. Python doesn't
> automatically normalise strings, as people have pointed out a bazillion times
> in the past, and it happily compares
>
> 'ö' LATIN SMALL LETTER O WITH DIAERESIS
>
> 'ö' LATIN SMALL LETTER O + COMBINING DIAERESIS
>
>
> as unequal. I don't propose to change that just so that we can get 'a'
> equals 'A' :-)

You may not, but others will. Which is just one of the reasons that
"case insensitive comparison" is not as simple as it initially seems,
and thus (IMO) is best NOT baked into the language.
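
To make the point concrete, here is a minimal sketch of what an application might do, assuming it has picked a normalization form for itself. The helper name caseless_equal and its form parameter are hypothetical, not anything in the stdlib; only unicodedata.normalize and str.casefold are real.

```python
import unicodedata

composed = "\u00f6"     # LATIN SMALL LETTER O WITH DIAERESIS
decomposed = "o\u0308"  # LATIN SMALL LETTER O + COMBINING DIAERESIS


def caseless_equal(a, b, form="NFC"):
    """Hypothetical helper: normalise, casefold, normalise again, compare.

    The second normalize() is there because casefolding can produce
    text that is no longer in normalised form.
    """
    def key(s):
        folded = unicodedata.normalize(form, s).casefold()
        return unicodedata.normalize(form, folded)
    return key(a) == key(b)


# Plain == sees two different code point sequences.
print(composed == decomposed)

# With normalization first, the two spellings compare equal.
print(caseless_equal(composed, decomposed))

# The choice of form changes the answer: FULLWIDTH LATIN CAPITAL LETTER A
# only matches plain 'a' under a compatibility form such as NFKC.
print(caseless_equal("\uff21", "a", form="NFC"))
print(caseless_equal("\uff21", "a", form="NFKC"))
```

Whether the fullwidth letter should match is exactly the kind of decision that depends on the application, which is why the form is a parameter here rather than a fixed choice.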

ChrisA


