"in" operator for strings

Alex Martelli aleaxit at yahoo.com
Thu Feb 1 06:55:32 EST 2001


"Magnus Lie Hetland" <mlh at idi.ntnu.no> wrote in message
news:95beco$e0u$1 at tyfon.itea.ntnu.no...
    [snip]
> This isn't quite logical... A string works like a sequence
> of characters, and sequence membership only works on
> single elements (in this case characters), not subsequences
> (in this case, substrings).

Right, and an extension of this is basically what's being
asked for (the original poster may not have thought of this
'obvious' generalization, but special-casing strings would
surely not be warranted).  Unfortunately, it doesn't
generalize well to other sequences -- i.e., now:

>>> print [1,2] in [6, 4, [1,2], 7]
1
>>> print [6,4] in [6, 4, [1,2], 7]
0

and having it return 1 in the second case too would make
'in' very ambiguous and confusing, alas.

Also, of course, this would throw any parallel between
"x in y" and "for x in y" out of the window unless the
latter starts looping on all *subsequences* -- eeep!-)

> But you can't do what you ask for, just like you can't write
>
>   [1, 2] in [1, 2, 3, 4]

Sure you can, it's a well-formed test and returns 0 since
[1,2] is not an item in the right-hand operand sequence.
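
To make the item-versus-subsequence distinction concrete, here is
a minimal sketch of the tests discussed so far.  The variable names
are mine, and it assumes a current Python, where the results print
as True/False rather than 1/0:

```python
# 'in' compares the left operand against each *item* of the right
# operand; it never looks for a run of adjacent items.
outer = [6, 4, [1, 2], 7]

item_hit = [1, 2] in outer           # [1, 2] is itself an item of outer
run_miss = [6, 4] in outer           # 6 and 4 are adjacent, but [6, 4] is no item
flat_miss = [1, 2] in [1, 2, 3, 4]   # well-formed test, simply false

# The parallel with looping: everything "for x in outer" visits
# also passes "x in outer".
parallel = all(x in outer for x in outer)

print(item_hit, run_miss, flat_miss, parallel)
```

If 'in' also matched subsequences, run_miss would become true and
the parallel with the for loop would no longer hold.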


> > try this:
> > >>> "Waldo" in "Ralph Waldo Emerson".split()
> > 1
>
> Probably a better idea.

Only if you're looking for words, not for any substring,
which is at least as frequent.
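
A small sketch of the difference, using the same example string
(the variable names are only for illustration):

```python
text = "Ralph Waldo Emerson"

# Word-level test: split on whitespace, then test item membership
# in the resulting list of words.
word_hit = "Waldo" in text.split()
word_miss = "ald" in text.split()        # "ald" is not a whole word

# Substring-level test via find, which returns -1 on failure.
sub_hit = text.find("ald") != -1
across_words = text.find("Waldo Em") != -1   # spans a word boundary

print(word_hit, word_miss, sub_hit, across_words)
```

The split-based test misses "ald" and could never match anything
that crosses a space, while the find-based test handles both.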

> And for those of us who can't really
> get used to calling methods directly on string literals,
> you *could* write:
>
>    "Waldo" in split("Ralph Waldo Emerson")
>
> It might be old-fashioned, but... So what :-)

So it doesn't work unless you "from string import *" (horrid
idea), "from string import split" (doubtful), or rewrite it
using an explicit string.split (probably best, but quite
verbose compared to using the split string-method -- and
you still have to add an import string to your module, which
is rarely needed now that we DO have string methods).


For general substring-matching, a class wrapper is not
too bad:

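class subsOf:
    # wraps a string so that 'in' tests for substrings
    def __init__(self, seq):
        self.seq = seq
    def __contains__(self, subseq):
        # find returns -1 when subseq does not occur in self.seq
        return self.seq.find(subseq) != -1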

As written, this only works for strings, and only
enables such idioms as

    if 'ald' in subsOf("Waldo"):
        print 'yep!'

but it's not too hard to generalize it to any sequence
type (at least if you're content to use elementary
algorithms for __contains__!-), implement __getitem__
to allow looping on all subsequences:-), etc.
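
One possible shape for that generalization, as a rough sketch
rather than a finished design.  The name SubsOf is hypothetical,
the matching is the elementary slice-by-slice scan, and looping is
done with __iter__ instead of __getitem__, which is the more usual
hook in a current Python.  It assumes the wrapped object supports
len() and slicing, and that the candidate has the same type (a
list slice never equals a tuple):

```python
class SubsOf:
    """Hypothetical wrapper: 'in' tests for contiguous subsequences."""

    def __init__(self, seq):
        self.seq = seq

    def __contains__(self, subseq):
        # Elementary scan: compare every slice of the right length.
        n, m = len(self.seq), len(subseq)
        for start in range(n - m + 1):
            if self.seq[start:start + m] == subseq:
                return True
        return False

    def __iter__(self):
        # Yield every non-empty contiguous subsequence, so that
        # "for x in SubsOf(y)" stays parallel to "x in SubsOf(y)".
        n = len(self.seq)
        for start in range(n):
            for stop in range(start + 1, n + 1):
                yield self.seq[start:stop]


run_hit = [6, 4] in SubsOf([6, 4, [1, 2], 7])
item_miss = [1, 2] in SubsOf([6, 4, [1, 2], 7])
str_hit = 'ald' in SubsOf("Waldo")
all_subs = list(SubsOf("abc"))
```

Note that the two earlier answers swap: [6, 4] is now found, while
[1, 2] is not, since it is an item rather than a run of items.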


Alex





