Case-insensitive string equality

Chris Angelico rosuav at gmail.com
Fri Sep 1 11:41:03 EDT 2017


On Fri, Sep 1, 2017 at 11:22 PM, Steve D'Aprano
<steve+python at pearwood.info> wrote:
> On Fri, 1 Sep 2017 09:53 am, MRAB wrote:
>
>> What would you expect the result would be for:
>>
>>      "\N{LATIN SMALL LIGATURE FI}".case_insensitive_find("F")
>>
>>      "\N{LATIN SMALL LIGATURE FI}".case_insensitive_find("I")
>
> That's easy.
>
> -1 in both cases, since neither "F" nor "I" is found in either string. We can
> prove this by manually checking:
>
> py> for c in "\N{LATIN SMALL LIGATURE FI}":
> ...     print(c, 'F' in c, 'f' in c)
> ...     print(c, 'I' in c, 'i' in c)
> ...
> ﬁ False False
> ﬁ False False
>
>
> If you want some other result, then you're not talking about case sensitivity.

>>> "\N{LATIN SMALL LIGATURE FI}".upper()
'FI'
>>> "\N{LATIN SMALL LIGATURE FI}".lower()
'ﬁ'
>>> "\N{LATIN SMALL LIGATURE FI}".casefold()
'fi'

Aside from lower(), which returns the string unchanged, the case
conversion rules say that this contains two letters. So "F" exists in
the uppercased version of the string, and "f" exists in the casefolded
version.
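
The claim above can be checked directly. This is only a minimal sketch, and the variable names are made up for illustration; it records the length of the string before and after each conversion, and whether "F" or "f" turns up in the converted text.

```python
s = "\N{LATIN SMALL LIGATURE FI}"

# The original string is a single code point.
original_length = len(s)

# upper() and casefold() both expand it to two separate letters.
upper_length = len(s.upper())
folded_length = len(s.casefold())

# lower() leaves the ligature as it is.
lower_unchanged = s.lower() == s

# Membership tests against the converted forms.
f_in_upper = "F" in s.upper()
f_in_folded = "f" in s.casefold()

print(original_length, upper_length, folded_length)
print(lower_unchanged, f_in_upper, f_in_folded)
```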

So what's the definition of "case insensitive find"? The simplest
and most obvious form is:

def case_insensitive_find(self, other):
    return self.casefold().find(other.casefold())

which would clearly return 0 and 1 for the two original searches.
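
The definition above is written as if it were a str method. As a rough sketch, the same idea as a standalone function (the parameter names haystack and needle are assumptions, not part of any real API) can be run against the two original searches:

```python
def case_insensitive_find(haystack: str, needle: str) -> int:
    """Casefold both strings, then do an ordinary find()."""
    return haystack.casefold().find(needle.casefold())


ligature = "\N{LATIN SMALL LIGATURE FI}"

# The two searches from the original question.
result_f = case_insensitive_find(ligature, "F")
result_i = case_insensitive_find(ligature, "I")

# A character that is not present still gives -1, as with str.find().
result_missing = case_insensitive_find(ligature, "X")

print(result_f, result_i, result_missing, len(ligature))
```

Note that the returned index is an offset into the casefolded text, not into the original string: the second search returns 1 even though the original string has length 1. Whether that is acceptable depends on what the caller intends to do with the index.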

ChrisA


