Newbie regular expression and whitespace question
Fredrik Lundh
fredrik at pythonware.com
Thu Sep 22 16:42:46 EDT 2005
Paul McGuire wrote:
> If you're absolutely stuck on using RE's, then others will have to step
> forward. Meanwhile, here's a pyparsing solution (get pyparsing at
> http://pyparsing.sourceforge.net):
so, let's see. using ...
from pyparsing import *
import re

data = """ ... table example from op ... """

def test1():
    LT = Literal("<")
    GT = Literal(">")
    collapsableSpace = GT + LT
    collapsableSpace.setParseAction( replaceWith("><") )
    return collapsableSpace.transformString(data)

def test2():
    return re.sub(r">\s+<", "><", data)
I get
> timeit -s "import test" "test.test1()"
100 loops, best of 3: 6.8 msec per loop
> timeit -s "import test" "test.test2()"
10000 loops, best of 3: 33.3 usec per loop
or in other words, five lines instead of one, and a 200x slowdown.
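
for reference, here is a minimal sketch of what the one-line version does. the table data from the original post is not reproduced here, so the sample string and the helper name collapse are made-up stand-ins, not the op's data:

```python
import re

# Hypothetical stand-in for the elided table data from the original post.
sample = "<table>\n  <tr>\n    <td>a b</td>   <td>c</td>\n  </tr>\n</table>"

def collapse(text):
    # Remove whitespace runs that sit between a ">" and the next "<".
    # Whitespace inside element text (such as "a b") is left alone.
    return re.sub(r">\s+<", "><", text)

result = collapse(sample)
print(result)
```

note that the pattern only touches whitespace between tags, so the space inside "a b" survives.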
but alright, maybe we should precompile the expressions to get a
fair comparison. adding
LT = Literal("<")
GT = Literal(">")
collapsableSpace = GT + LT
collapsableSpace.setParseAction( replaceWith("><") )

def test3():
    return collapsableSpace.transformString(data)

p = re.compile(r">\s+<")

def test4():
    return p.sub("><", data)
to the first program, I get
> timeit -s "import test" "test.test3()"
100 loops, best of 3: 6.73 msec per loop
> timeit -s "import test" "test.test4()"
10000 loops, best of 3: 27.8 usec per loop
that's a 240x slowdown. hmm.
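
the regex half of this comparison can also be timed from inside python with the timeit module. this is only a sketch under assumptions: the data string is a made-up stand-in for the op's table, best_usec is a hypothetical helper, and the numbers will differ from the ones above depending on machine and python version:

```python
import re
import timeit

# Hypothetical stand-in for the elided table data from the original post.
data = "<tr>\n  <td>1</td>\n  <td>2</td>\n</tr>\n" * 50

pattern = re.compile(r">\s+<")

def uncompiled():
    # Looks the pattern up in the re module's cache on every call.
    return re.sub(r">\s+<", "><", data)

def precompiled():
    # Reuses the pattern object compiled once at import time.
    return pattern.sub("><", data)

def best_usec(func, number=200, repeat=3):
    # Best of `repeat` runs, reported as microseconds per call.
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number * 1e6

t_uncompiled = best_usec(uncompiled)
t_precompiled = best_usec(precompiled)
print("uncompiled:  %.1f usec per loop" % t_uncompiled)
print("precompiled: %.1f usec per loop" % t_precompiled)
```

both functions return the same string; only the per-call pattern lookup differs.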
</F>