Most efficient method to search text?
Michael Hudson
mwh at python.net
Fri Oct 18 06:03:10 EDT 2002
bokr at oz.net (Bengt Richter) writes:
> On Thu, 17 Oct 2002 11:14:09 GMT, Michael Hudson <mwh at python.net> wrote:
>
> >Tim Peters <tim.one at comcast.net> writes:
> >
> >> Especially for purposes of building lexers, it might be useful if the re
> >> package could recognize when a DFA approach was sufficient and practical,
> >> and switch to a different scheme entirely then. Or it might not. Build
> >> code to try it both ways, and let us know how it turns out ...
> >
> >Indeed, my code canes re when the wordlist gets long. Here's some
>
> If this is easy to add to your test harness, I'd be interested to see what
> this search does in comparison, with the longer word lists (it's probably
> faster than has_word.py that I posted elsewhere, depending on relative lengths
> of word lists and strings, and internal vs external looping and allocation).
It's quick:
>>> robin.do_comp(1000)
compile... 3.42289197445
compile2... 2.49848008156
compile3... 0.696313977242
compile4... 4.04265594482
compile_re... 0.627331018448
compile_bengt... 0.00175499916077
test... 1.39854204655
test2... 2.93543899059
test3... 3.2231388092
test4... 2.15867292881
test_re... 8.38554108143
test_bengt... 0.437232971191
I'm sure Tim once said something along the lines of "Python doesn't
give much advice for getting good performance, beyond a not-so-subtle
hint to exploit dicts for all they're worth" but I can't find it now.
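
For anyone skimming the thread, here is a minimal sketch of what that hint can look like for whole-word searching. It is not the code timed above: the names build_index, find_words_dict and find_words_re are made up for illustration, and it assumes the words consist only of alphanumerics and underscores.

```python
import re

def build_index(words):
    # A dict keyed by word gives average O(1) membership tests,
    # however long the word list gets.
    return {word: True for word in words}

def find_words_dict(text, index):
    # Tokenize once, then do one dict lookup per token.
    return [tok for tok in re.findall(r"\w+", text) if tok in index]

def find_words_re(text, words):
    # The pure-re alternative: one big alternation over all the words.
    # re.escape guards against metacharacters in the word list.
    pattern = re.compile(r"\b(?:%s)\b" % "|".join(map(re.escape, words)))
    return pattern.findall(text)

words = ["spam", "eggs"]
text = "spam and ham and eggs"
hits_dict = find_words_dict(text, build_index(words))
hits_re = find_words_re(text, words)
```

Both functions return the same hits here; the difference is that the dict version does a constant amount of work per token, while the alternation has more to try as the word list grows.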
Cheers,
M.
hmm, look at that sig...
--
Premature optimization is the root of all evil.
-- Donald E. Knuth, Structured Programming with go to Statements