Python idiom: Multiple search-and-replace
Robin Becker
robin at jessikat.demon.co.uk
Fri Apr 14 15:51:44 EDT 2000
In article <5C%I4.645$rc9.190288896 at newsb.telia.net>, Fredrik Lundh
<effbot at telia.com> writes
>Randall Hopper <aa8vb at yahoo.com> wrote:
>> Is there a Python feature or standard library API that will get me less
>> Python code spinning inside this loop? re.multisub or equivalent? :-)
>
>haven't benchmarked it, but I suspect that this approach
>is more efficient:
>
>...
>
># based on re-example-5.py
>
>import re
>import string
>
>symbol_map = { "foo": "FOO", "bar": "BAR" }
>
>def symbol_replace(match, get=symbol_map.get):
> return get(match.group(1), "")
>
>symbol_pattern = re.compile(
> "(" + string.join(map(re.escape, symbol_map.keys()), "|") + ")"
> )
>
>print symbol_pattern.sub(symbol_replace, "foobarfiebarfoo")
>
>...
>
></F>
>
...
I'm trying to use the above idiom for the strings '(', ')' and '\\', but I
seem to have a speed problem, i.e.
import re
import string
from time import time

symbol_map = { '(': '\\(', ')': '\\)', '\\': '\\\\' }

def symbol_replace(match, get=symbol_map.get):
    return get(match.group(1), "")

PAT = "(" + string.join(map(re.escape, symbol_map.keys()), "|") + ")"
print PAT
symbol_pattern = re.compile( PAT )

def doit(str,n):
    N = xrange(n)
    t0 = time()
    for n in N:
        s=string.replace(str,'\\','\\\\')
        s=string.replace(s,'(','\(')
        s=string.replace(s,')','\)')
    t1 = time()
    print 'string %s %.4f' % (s,t1-t0)
    t0 = time()
    for n in N:
        t = symbol_pattern.sub(symbol_replace, str )
    t1 = time()
    print 're %s %.4f' % (t,t1-t0)

doit("a b (c) \\t",10000)
produces
(\\|\(|\))
string a b \(c\) \\t 0.4900
re a b \(c\) \\t 5.3900
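i.e. for this simple case the chained string.replace calls are about eleven
times faster than the re version (0.49 s against 5.39 s for 10000
iterations). Could I do better?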
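
One possible direction, sketched for a present-day Python 3 interpreter
rather than the Python of 2000, and not something proposed in this thread:
because every key here is a single character, the alternation and the
per-match Python callback can be avoided, either with a character class and
a template replacement or with str.translate. The function names below
(escape_replace, escape_class, escape_translate) are hypothetical, chosen
only for illustration, and no timings are claimed; which variant wins is
likely to depend on the interpreter version and the input.

```python
import re
import timeit

# Same mapping as in the post: escape parentheses and backslashes.
symbol_map = {"(": "\\(", ")": "\\)", "\\": "\\\\"}

def escape_replace(text):
    # Chained str.replace; the backslash has to be handled first so the
    # backslashes added for the parentheses are not doubled again.
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

# Character class instead of an alternation. The template r"\\\g<0>" means
# "a literal backslash followed by the matched character", so no Python
# function is called for each match.
char_class = re.compile(r"[()\\]")

def escape_class(text):
    return char_class.sub(r"\\\g<0>", text)

# str.translate with a table built from the single-character keys.
table = str.maketrans(symbol_map)

def escape_translate(text):
    return text.translate(table)

sample = "a b (c) \\t"

# Print each result with its elapsed time; the numbers are machine-dependent.
for func in (escape_replace, escape_class, escape_translate):
    elapsed = timeit.timeit(lambda: func(sample), number=10000)
    print("%-18s %s %.4f" % (func.__name__, func(sample), elapsed))
```

All three variants should give the same escaped string for the sample
input; only the single-character case is covered, so multi-character keys
would still need the alternation idiom quoted above.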
--
Robin Becker