Case-insensitive string equality

Steve D'Aprano steve+python at pearwood.info
Fri Sep 1 09:22:30 EDT 2017


On Fri, 1 Sep 2017 09:53 am, MRAB wrote:

> What would you expect the result would be for:
> 
>      "\N{LATIN SMALL LIGATURE FI}".case_insensitive_find("F")
> 
>      "\N{LATIN SMALL LIGATURE FI}".case_insensitive_find("I")

That's easy. 

-1 in both cases, since neither "F" nor "I", in upper or lower case, is found
in the string. We can prove this by manually checking:

py> for c in "\N{LATIN SMALL LIGATURE FI}":
...     print(c, 'F' in c, 'f' in c)
...     print(c, 'I' in c, 'i' in c)
...
ﬁ False False
ﬁ False False
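
The reason the check comes out that way is that the ligature is one code
point of its own, not an "f" followed by an "i". A quick sketch to confirm
that (the variable name is only for illustration):

```python
import unicodedata

ligature = "\N{LATIN SMALL LIGATURE FI}"

# The string is a single code point, U+FB01, with its own Unicode name.
print(len(ligature))               # 1
print(hex(ord(ligature)))          # 0xfb01
print(unicodedata.name(ligature))  # LATIN SMALL LIGATURE FI

# A plain search therefore finds neither letter, in either case.
print(ligature.find("F"), ligature.find("f"))  # -1 -1
print(ligature.find("I"), ligature.find("i"))  # -1 -1
```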


If you want some other result, then you're not talking about case sensitivity.

If anyone wants to propose "normalisation-insensitive matching", I'll ask you to
please start your own thread rather than derailing this one with an unrelated,
and much more difficult, problem.

The proposal here is *case insensitive* matching, not Unicode normalisation. If
you want to decompose the strings, you know how to:

py> import unicodedata
py> unicodedata.normalize('NFKD', "\N{LATIN SMALL LIGATURE FI}")
'fi'
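
If you do want the decomposed behaviour, you can layer it on top yourself.
Here is a rough sketch, assuming NFKD plus str.lower() is the matching you
are after. The helper name find_after_nfkd is made up for this example and
is not part of the proposal or of any library:

```python
import unicodedata

def find_after_nfkd(haystack, needle):
    """Hypothetical helper: NFKD-decompose both strings, lowercase, then search.

    The returned index refers to the decomposed haystack, not the original.
    """
    h = unicodedata.normalize('NFKD', haystack).lower()
    n = unicodedata.normalize('NFKD', needle).lower()
    return h.find(n)

ligature = "\N{LATIN SMALL LIGATURE FI}"
print(find_after_nfkd(ligature, "F"))  # 0
print(find_after_nfkd(ligature, "I"))  # 1
```

Note that the indexes point into the two-character decomposed string, which
is one reason this is a different problem from case-insensitive matching on
the original string.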


-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.



