Python and regexp efficiency.. again.. :)
Markus Stenberg
mstenber at cc.Helsinki.FI
Mon Dec 13 01:57:52 EST 1999
Yishai Beeri <yishai at platonix.com> writes:
> What percentage of the lines is expected to actually match?
Very few, preferably none. The real match definition, though, is
(expr|expr|expr|not expr), so the last alternative usually matches.
> What percentage of the lines match the commonstring but none of the tails?
Nearly all lines match the initial commonstring, but the next-level
sub-commonstrings (which my specialized automated regexp optimizer
detects) are rarer: there are roughly 100 different cases, and one of
them matches almost every time. The final non-common parts usually do
not match, except in the terminal case.
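To make the commonstring / sub-commonstring idea concrete, here is a rough sketch of that kind of prefix factoring. The strings and the names COMMON, GROUPS, build_flat and build_factored are made up for illustration; this is not the actual optimizer, just the general shape of what it produces.

```python
import re

# Hypothetical data: every pattern starts with COMMON, then one of a few
# sub-commonstrings, then a short non-common tail.
COMMON = "kernel: "
GROUPS = {
    "eth0 link ": ["up", "down"],
    "disk ": ["full", "error"],
}

def build_flat(common, groups):
    """Naive form: one long alternation, the prefix repeated in every branch."""
    full = [common + sub + tail for sub, tails in groups.items() for tail in tails]
    return re.compile("|".join(re.escape(s) for s in full))

def build_factored(common, groups):
    """Factored form: the prefix is written once, each sub-commonstring once."""
    branches = []
    for sub, tails in groups.items():
        alternatives = "|".join(re.escape(t) for t in tails)
        branches.append(re.escape(sub) + "(?:" + alternatives + ")")
    return re.compile(re.escape(common) + "(?:" + "|".join(branches) + ")")

FLAT = build_flat(COMMON, GROUPS)
FACTORED = build_factored(COMMON, GROUPS)
```

Both patterns accept the same lines; the factored one only lets a backtracking engine give up on a line after testing each shared prefix once, instead of re-scanning it for every branch. Whether that pays off in practice depends on the data, so it needs measuring.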
> Would it be helpful to look just for the tails and get rid of erroneous
> matches by then looking for the commonstring?
Possibly, yes. Hmm, I have to think about it. The main problem is that
last "not expr" part, since not-matching-something is much less trivial
than matching-something.
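A minimal sketch of how the tails-first idea might look, with the "not expr" branch pulled out of the regexp and handled in plain Python. All strings and names here (COMMON, TAIL_RE, EXCLUDED_RE, line_matches) are invented for illustration, and it assumes the commonstring sits at the start of the line.

```python
import re

# Hypothetical stand-ins for the real expressions.
COMMON = "kernel: "                                   # the commonstring
TAIL_RE = re.compile(r"link (?:up|down)|disk full")   # the positive tails
EXCLUDED_RE = re.compile(r"heartbeat ok")             # the expr inside "not expr"

def line_matches(line):
    """Emulate COMMON followed by (tail|tail|tail|not excluded)."""
    if TAIL_RE.search(line):
        # A tail was found; weed out erroneous hits that lack the commonstring.
        return line.startswith(COMMON)
    # No tail found: only the "not expr" branch can still match, and it
    # does so exactly when the excluded expression is absent.
    return line.startswith(COMMON) and EXCLUDED_RE.search(line) is None
```

For example, line_matches("kernel: eth0 link up") is true, line_matches("cron: link up") is false because the commonstring is missing, and line_matches("kernel: heartbeat ok") is false because the excluded expression is present. A negative lookahead, (?!...), could express the last branch inside the regexp instead, but testing the negation in code keeps the pattern itself simple.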
> Yishai
-Markus
--
The IBM Principle:
Machines should work. People should think.
The Truth About the IBM Principle:
Machines don't often work, people don't often think.