Most efficient method to search text?

Tim Peters tim.one at comcast.net
Thu Oct 17 00:28:45 EDT 2002


[Tim]
> re doesn't build a DFA.  Alternatives are searched for one at a
> time, left to right.  See Friedl's "Mastering Regular Expression" book
> for long discussions of the tradeoffs.

[Jeff Epler]
> I repent of the sin of suggesting this approach.  Oh well.

It's not a sin, but it's not nearly as straightforward as what you're
imagining.  Read the book <wink> -- Python/Perl regular expressions are
irregular in ways that go beyond the capabilities of DFAs.

Especially for purposes of building lexers, it might be useful if the re
package could recognize when a DFA approach was sufficient and practical,
and switch to a different scheme entirely then.  Or it might not.  Build
code to try it both ways, and let us know how it turns out ...





More information about the Python-list mailing list