Is there any way to say ignore case with "in"?

Paul McGuire ptmcg at austin.rr.com
Sun Apr 6 11:20:59 EDT 2008


On Apr 6, 8:53 am, "Martin v. Löwis" <mar... at v.loewis.de> wrote:
> >> I know I could use:-
>
> >>     if lower(string1) in lower(string2):
> >>         <do something>
>
> >> but it somehow feels there ought to be an easier (tidier?) way.
>
> > Easier?  You mean like some kind of mind meld?
>
> Interestingly enough, it shouldn't be (but apparently is) obvious that
>
>    a.lower() in b.lower()
>
> is a way of expressing "a is a substring of b, with case-insensitive
> matching". Can we be sure that these are really the same concepts,
> and if so, is
>
>   a.upper() in b.upper()
>
> also equivalent?
>
> It's probably a common assumption that, for any character c,
> c.lower()==c.upper().lower(). Yet,
>
> py> [i for i in range(65536) if unichr(i).upper().lower() !=
> unichr(i).lower()]
> [181, 305, 383, 837, 962, 976, 977, 981, 982, 1008, 1009, 1010, 1013,
> 7835, 8126]
>
> Take, for example, U+017F, LATIN SMALL LETTER LONG S. Its .lower() is
> the same character, as the character is already in lower case.
> Its .upper() is U+0053, LATIN CAPITAL LETTER S. Notice that the LONG
> is gone - there is no upper-case version of a "long s".
> Its .upper().lower() is U+0073, LATIN SMALL LETTER S.
>
> So should case-insensitive matching match the small s with the small
> long s, as they have the same upper-case letter?
>
> Regards,
> Martin

Another surprise (or maybe not so surprising) - this "upper != lower"
is not symmetric.  Using the inverse of your list comp, I get

>>> [i for i in range(65536) if unichr(i).lower().upper() !=
... unichr(i).upper()]
[304, 1012, 8486, 8490, 8491]

Instead of 15 exceptions to the rule, conversion to upper has only 5
exceptions.  So perhaps comparison of uppers is, while not foolproof,
less likely to encounter these exceptions?  Or at least, it is simpler
to code explicit tests for them.
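
For anyone repeating the two list comps on Python 3, here is a minimal
sketch, assuming chr() stands in for Python 2's unichr().  The helper
name round_trip_mismatches is hypothetical, made up for illustration.
The exact lists depend on the Unicode tables shipped with the
interpreter, so they may not match the 15 and 5 entries shown above.

```python
# Python 3 sketch: chr() replaces Python 2's unichr().
def round_trip_mismatches(first, second, limit=65536):
    """Return code points where second(first(c)) differs from second(c)."""
    return [i for i in range(limit)
            if second(first(chr(i))) != second(chr(i))]

# Same test as Martin's list comp: upper().lower() vs. lower()
upper_then_lower = round_trip_mismatches(str.upper, str.lower)

# Same test as the inverse list comp: lower().upper() vs. upper()
lower_then_upper = round_trip_mismatches(str.lower, str.upper)

# Martin's long s example: U+017F upper-cases to a plain "S"
long_s = chr(0x17F)
long_s_round_trip = long_s.upper().lower()

print(len(upper_then_lower), len(lower_then_upper))
print(repr(long_s), repr(long_s.upper()), repr(long_s_round_trip))
```

Comparing the two lengths on a given interpreter shows whether the
asymmetry still holds for its Unicode version.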

-- Paul


