Regular expression to match a #

Bryan Olson fakeaddress at nowhere.org
Fri Aug 12 00:44:17 EDT 2005


John Machin wrote:
[...]
 > Observation: factoring out the compile step makes the difference much
 > more apparent.
 >
 >  >>> ["%.3f" % t.timeit() for t in t3, t4, t5, t6]
 > ['1.578', '1.175', '2.283', '1.174']
 >  >>> ["%.3f" % t.timeit() for t in t3, t4, t5, t6]
 > ['1.582', '1.179', '2.284', '1.172']
 >  >>>

To make it even more apparent, try:

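     import re
     import profile

     # Anchored pattern: without re.MULTILINE it can only match at position 0.
     startsz = re.compile('^z')

     # None of these strings contains 'z', so every search fails;
     # only the length of the searched string varies.
     # profile.run() evaluates the statement in __main__, so run this as a
     # script so that startsz and s are visible to it.
     for s in ('x' * 1000, 'x' * 100000, 'x' * 10000000):
         profile.run('startsz.search(s)')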

Profile report is below.
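
For anyone who wants to repeat the measurement without the profiler's own overhead, here is a minimal sketch using timeit. The helper name time_search and the best-of-N approach are my own choices, not part of the original test. Absolute numbers, and even whether the time grows with length at all, will depend on the interpreter version, so no expected output is given.

```python
import re
import timeit

startsz = re.compile('^z')   # same anchored pattern as in the post


def time_search(pattern, length, repeat=5):
    """Best-of-`repeat` wall time for one pattern.search() call on a
    string of `length` 'x' characters, which can never match '^z'."""
    s = 'x' * length
    timer = timeit.Timer(lambda: pattern.search(s))
    # Each repetition runs the search exactly once; keep the fastest run.
    return min(timer.repeat(repeat=repeat, number=1))


if __name__ == '__main__':
    for n in (1000, 100000, 10000000):
        print('%10d chars: %.6f s' % (n, time_search(startsz, n)))
```

If the reported time grows roughly in proportion to the length, the engine is retrying the anchor at every position, as the profile output suggests.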


 > Conclusion: search time depends on length of searched string.
 >
 > Meta-conclusion: Either I have to retract my
 > based-on-hope-rather-than-on-experimentation assertion, or redefine "not
 > dopey" to mean "surely nobody would search for ^x when match x would do,
 > so it would be dopey to optimise re for that" :-)

No question, there's some dopiness to searching for the
beginning of the string at places other than the beginning of
the string.
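
A small sketch of the workaround John's remark points at: use match() with an unanchored pattern instead of search() with '^'. The function names are hypothetical, and the equivalence shown assumes re.MULTILINE is not set.

```python
import re

anchored = re.compile('^z')   # anchor in the pattern, used with search()
plain = re.compile('z')       # no anchor, used with match()


def starts_with_z_search(s):
    # search() may try the pattern at positions other than 0.
    return anchored.search(s) is not None


def starts_with_z_match(s):
    # match() only ever tries the pattern at position 0.
    return plain.match(s) is not None


# Both give the same answer on every input; only the amount of work differs.
for sample in ('zebra', 'xyz', '', 'x' * 1000):
    assert starts_with_z_search(sample) == starts_with_z_match(sample)
```

For a literal prefix like this one, s.startswith('z') gives the same answer without involving re at all.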

The tricky part would be optimizing '$'.
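
One reason '$' is less straightforward than '^', as I read it: a left-to-right search does not know in advance where a match that ends at the end of the string has to begin, and '$' also matches just before a trailing newline. The sketch below only illustrates those semantics; the helper names are hypothetical and it says nothing about how the re engine works internally.

```python
import re

endz = re.compile('z$')            # end of string, or just before a final newline
endz_strict = re.compile(r'z\Z')   # only at the very end of the string


def ends_with_z_dollar(s):
    return endz.search(s) is not None


def ends_with_z_strict(s):
    return endz_strict.search(s) is not None


assert ends_with_z_dollar('xxz')
assert ends_with_z_dollar('xxz\n')       # '$' tolerates one trailing newline
assert not ends_with_z_strict('xxz\n')   # '\Z' does not
assert not ends_with_z_dollar('zxx')
```

For a literal suffix, s.endswith('z') sidesteps the question, much as match() does for '^'.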


--
--Bryan


          4 function calls in 0.003 CPU seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.000    0.000    0.000    0.000 :0(search)
         1    0.003    0.003    0.003    0.003 :0(setprofile)
         1    0.000    0.000    0.000    0.000 <string>:1(?)
         0    0.000             0.000          profile:0(profiler)
         1    0.000    0.000    0.003    0.003 profile:0(startsz.search(s))


          4 function calls in 0.002 CPU seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.002    0.002    0.002    0.002 :0(search)
         1    0.000    0.000    0.000    0.000 :0(setprofile)
         1    0.000    0.000    0.002    0.002 <string>:1(?)
         0    0.000             0.000          profile:0(profiler)
         1    0.000    0.000    0.002    0.002 profile:0(startsz.search(s))


          4 function calls in 0.228 CPU seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.228    0.228    0.228    0.228 :0(search)
         1    0.000    0.000    0.000    0.000 :0(setprofile)
         1    0.000    0.000    0.228    0.228 <string>:1(?)
         0    0.000             0.000          profile:0(profiler)
         1    0.000    0.000    0.228    0.228 profile:0(startsz.search(s))


