Regular expressions vs find?

Eric Hagemann ehagemann at home.com
Sun Jun 18 19:20:11 EDT 2000


[snip]
Also you might have better luck with the regular expression stuff (rather
than the find command) and precompiling the string
[snip]

I think I am the somebody who said this ;)

My mind was racing faster than my fingers.  The re stuff _is_ slower than
find: I have measured the slowdown to be a factor of 2 or 3, which the test
quoted below also shows.
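A minimal sketch of how such a measurement could be repeated on a current
Python, assuming the standard timeit module and the str.find method in place
of string.find. The helper names first_match_re and first_match_find are made
up for illustration, and the data is a scaled-down version of the quoted test.
No timings are claimed here; the ratio depends on the machine and the Python
version.

```python
import re
import timeit

# Scaled-down version of the quoted test data: many strings with no match,
# followed by one string that ends in 'dog'.
s = 'ado' * 1000
lines = [s] * 100 + [s + 'g']

pattern = re.compile('dog')  # precompiled, as in the quoted test


def first_match_re(lines):
    """Return the offset of the first regex match, or -1 if none is found."""
    for line in lines:
        m = pattern.search(line)
        if m:
            return m.start()
    return -1


def first_match_find(lines):
    """Return the offset of the first substring match, or -1 if none is found."""
    for line in lines:
        pos = line.find('dog')
        if pos > -1:
            return pos
    return -1


if __name__ == '__main__':
    # Each call scans every line up to the single match at the end.
    print('re  :', timeit.timeit(lambda: first_match_re(lines), number=20))
    print('find:', timeit.timeit(lambda: first_match_find(lines), number=20))
```

Both functions return the same offset, so the only thing being compared is
the cost of the search itself.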

My real thought was that most greps (or findstr on NT) do more than simple
string matching, so the re module might be what is wanted, or at least a
fairer comparison.  If all that is desired is simple matching, however, then
string.find is the way to go.
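To illustrate the difference, here is a small sketch with made-up sample
lines, written for a current Python. The regular expression selects lines
containing the whole word "dog" in any case, roughly what `grep -iw dog`
would do, while a plain substring search cannot express that.

```python
import re

# Made-up sample input, one entry per "line" of a file.
lines = [
    'the dog barked',
    'a Dog sat down',
    'hotdogs for lunch',
    'no match here',
]

# Plain substring search: case-sensitive, no notion of word boundaries.
find_hits = [line for line in lines if line.find('dog') > -1]

# Whole-word, case-insensitive search using a precompiled pattern.
word = re.compile(r'\bdog\b', re.IGNORECASE)
re_hits = [line for line in lines if word.search(line)]

if __name__ == '__main__':
    print(find_hits)
    print(re_hits)
```

The substring search picks up "hotdogs" and misses "Dog"; the pattern does
the opposite. When that extra matching power is not needed, the simpler
call is the one to use.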

Cheers
Eric




"David C. Ullrich" <ullrich at math.okstate.edu> wrote in message
news:01bfd96b$f7ddcea0$85a38ad1 at daves-dell...
> William Dandreta <wjdandreta at worldnet.att.net> wrote in article
> <oO935.12054$Xx5.557215 at bgtnsc06-news.ops.worldnet.att.net>...
> > Hi David,
> >
> > The message is titled Python GREP(... you can take a look if you like. It
> > was suggested that reg expr might be faster so give it a try.
>
> Sure enough, someone said "Also you might have better luck
> with the regular expression stuff (rather than the find command) and
> precompiling the string" and it seems clear from the context that
> "better luck" means "faster". I doubt it's so. I could be wrong, it's
> happened before, not that I can ever think of any examples.
> (Note that comparing string.find() to re.search() is not the same
> as comparing string.find() to the system's grep function...)
>
> Ok, try this. I spent a few minutes just now looking up how
> a trivial regex works, not gonna look up the docs on profiler.py
> today. Instead I made a huge example that I can "time" by counting
> "one, two, three...". You may want to start with less ado at
> first and then crank the numbers back up if it goes by too fast:
>
> import string
> import re
>
> r=re.compile('dog')
>
> s='ado'*10000
>
> l=[]
> for j in range(1000): l.append(s[:])
> l.append(s+'g')
>
> def testre():
>     localr = r
>     for str in l:
>         res = localr.search(str)
>         if res:
>             print res.start()
>             break
>
>
> def testfind():
>     find=string.find
>     for str in l:
>         res = find(str,'dog')
>         if res > -1:
>             print res
>             break
>
> I don't _think_ I'm cheating here. On the machine I
> tested this on calling testre() seems to take about
> six seconds while testfind() seems to be less than
> three.
>
> DU
>
> > Bill
> >
> > David C. Ullrich wrote in message
> > <01bfd93c$66c03180$2ace8ad1 at daves-dell>...
> > >
> > >
> > >William Dandreta <wjdandreta at worldnet.att.net> wrote in article
> > ><sy535.11434$Xx5.530580 at bgtnsc06-news.ops.worldnet.att.net>...
> > >> I recently read a message that suggested that using regular expressions
> > >> might be faster than the find function.
> > >
> > > I would have guessed that if anything find would be faster, although
> > >I certainly could be wrong. Did someone say that using a regular expression
> > >would be _faster_, or did they actually say it was more _powerful_ or some
> > >such?
> > >
>




