How to check if any item from a list of strings is in a big string?

John Machin sjmachin at lexicon.net
Thu Jul 9 23:07:58 EDT 2009


On Jul 10, 12:53 pm, Nobody <nob... at nowhere.com> wrote:
> On Thu, 09 Jul 2009 18:36:05 -0700, inkhorn wrote:
> > For one of my projects, I came across the need to check if one of many
> > items from a list of strings could be found in a long string.
>
> If you need to match many strings or very long strings against the same
> list of items, the following should (theoretically) be optimal:
>
>         r = re.compile('|'.join(map(re.escape,list_items)))
>         ...
>         result = r.search(string)

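"Theoretically optimal" holds only if the search mechanism builds a
DFA or similar out of the list of strings. AFAIK Python's re module
doesn't: it is a backtracking matcher that tries the alternatives one
at a time at each position in the string.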
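
For reference, here is the quoted approach as a self-contained sketch. The sample items and the helper name build_matcher are made up for illustration; only re.compile, re.escape and search come from the original snippet.

```python
import re

def build_matcher(list_items):
    # Escape each item so regex metacharacters are matched literally,
    # then join the items into a single alternation pattern.
    return re.compile('|'.join(map(re.escape, list_items)))

# Hypothetical sample data, not from the original post.
list_items = ['spam', 'eggs', 'a.b']
r = build_matcher(list_items)

m = r.search('green eggs and ham')
print(m.group(0) if m else None)   # eggs
print(r.search('axb'))             # None: the '.' in 'a.b' is literal
```

Note that an empty list produces the pattern '', which matches every string, so that case probably wants a separate check.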

Try this Aho-Corasick implementation:
http://hkn.eecs.berkeley.edu/~dyoo/python/ahocorasick/
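
If an extension module isn't an option, the idea can be sketched in pure Python. This is only a minimal illustration of the Aho-Corasick technique, not the API of the module linked above; the names build_automaton and contains_any are invented for the example, and it only answers "is any item present?" rather than reporting which one.

```python
from collections import deque

def build_automaton(words):
    # Hypothetical helper: builds a trie plus failure links.
    goto = [{}]      # goto[state][char] -> next state
    fail = [0]       # failure link for each state
    out = [False]    # True if some word ends at this state
    for word in words:
        if not word:
            continue  # assumption: empty strings are ignored
        state = 0
        for ch in word:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[state][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append(False)
            state = nxt
        out[state] = True
    # Breadth-first pass: depth-1 states keep fail == 0; deeper states
    # get the longest proper suffix that is also a prefix in the trie.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # A word ending at the suffix state also ends here.
            out[nxt] = out[nxt] or out[fail[nxt]]
            queue.append(nxt)
    return goto, fail, out

def contains_any(text, automaton):
    # Single left-to-right scan of text; never re-reads a character.
    goto, fail, out = automaton
    state = 0
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        if out[state]:
            return True
    return False

automaton = build_automaton(['he', 'she', 'his', 'hers'])
print(contains_any('ushers', automaton))   # True
print(contains_any('xyz', automaton))      # False
```

The automaton is built once per list of items and can then be reused on as many strings as needed, which is the case the quoted post was about.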


