Efficient way of testing for substring being one of a set?

George Sakkis george.sakkis at gmail.com
Thu Apr 3 08:19:11 EDT 2008


On Apr 3, 8:03 am, Jeff <jeffo... at gmail.com> wrote:
> def foo(sample, strings):
>         for s in strings:
>                 if sample in s:
>                         return True
>         return False
>
> This was an order of magnitude faster for me than using str.find or
> str.index.  That was finding rare words in the entire word-list (w/
> duplicates) of War and Peace.

If you test against the same substrings over and over again, an
alternative would be to build a regular expression:

import re
search = re.compile('|'.join(re.escape(x)
                             for x in substrings)).search
p = search(somestring)
if p is not None:
  print 'Found', p.group()


George



More information about the Python-list mailing list