Most efficient method to search text?
Tim Peters
tim.one at comcast.net
Wed Oct 16 12:42:41 EDT 2002
[Michael Hudson]
> Here's a way to quickly (I hope! Haven't done any benchmarks) tell if
> one of a bunch of words is contained in a chunk of text, assuming the
> words are known beforehand [...]
[Jeff Epler]
> Is there any reason to suppose that this is more efficient than using
> re.compile("|".join(re.escape(words))).match?
Time it.
> I haven't looked at the implementation of sre, but it should be able to
> generate a simple DFA for this RE and execute something very much like
> your 'match' function, but at C speeds.
re doesn't build a DFA. Alternatives are searched for one at a time, left
to right. See Friedl's "Mastering Regular Expression" book for long
discussions of the tradeoffs.
More information about the Python-list
mailing list