Most efficient method to search text?

Michael Hudson mwh at python.net
Fri Oct 18 06:03:10 EDT 2002


bokr at oz.net (Bengt Richter) writes:

> On Thu, 17 Oct 2002 11:14:09 GMT, Michael Hudson <mwh at python.net> wrote:
> 
> >Tim Peters <tim.one at comcast.net> writes:
> >
> >> Especially for purposes of building lexers, it might be useful if the re
> >> package could recognize when a DFA approach was sufficient and practical,
> >> and switch to a different scheme entirely then.  Or it might not.  Build
> >> code to try it both ways, and let us know how it turns out ...
> >
> >Indeed, my code canes re when the wordlist gets long.  Here's some
> 
> If this is easy to add to your test harness, I'd be interested to see what
> this search does in comparison, with the longer word lists (it's probably
> faster than has_word.py that I posted elsewhere, depending on relative lengths
> of word lists and strings, and internal vs external looping and allocation).

It's quick:

>>> robin.do_comp(1000)
compile...       3.42289197445
compile2...      2.49848008156
compile3...      0.696313977242
compile4...      4.04265594482
compile_re...    0.627331018448
compile_bengt... 0.00175499916077

test...          1.39854204655
test2...         2.93543899059
test3...         3.2231388092
test4...         2.15867292881
test_re...       8.38554108143
test_bengt...    0.437232971191

I'm sure Tim once said something along the lines of "Python doesn't
give much advice for getting good performance, beyond a not-so-subtle
hint to exploit dicts for all they're worth" but I can't find it now.
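
To illustrate the "exploit dicts" point: the code behind the timings above
is not reproduced in this message, so what follows is only a minimal sketch
of the two general approaches, not the benchmarked implementations. All
function names here are hypothetical, and it assumes words are separated by
whitespace.

```python
import re


def compile_words_dict(words):
    """Build a dict keyed by word, so each lookup is a single hash probe."""
    return dict.fromkeys(words, True)


def find_words_dict(table, text):
    """Return the tokens of text (split on whitespace) that are in table."""
    return [tok for tok in text.split() if tok in table]


def compile_words_re(words):
    """Build one alternation pattern that matches any listed word whole."""
    pattern = r"\b(?:%s)\b" % "|".join(re.escape(w) for w in words)
    return re.compile(pattern)


def find_words_re(regex, text):
    """Return every whole-word match of the alternation, in text order."""
    return regex.findall(text)


if __name__ == "__main__":
    words = ["spam", "eggs", "ham"]
    text = "green eggs and ham but no spamalot"
    print(find_words_dict(compile_words_dict(words), text))
    print(find_words_re(compile_words_re(words), text))
```

Building the dict is a single pass over the word list, and each token then
costs one hash lookup however long the list is, whereas a backtracking
regex engine may have to try many alternatives at each position. That is one
plausible reason for a gap like the one in the timings above, though the
sketch itself proves nothing about those numbers.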

Cheers,
M.
hmm, look at that sig...

-- 
  Premature optimization is the root of all evil.
       -- Donald E. Knuth, Structured Programming with go to Statements


