Python idiom: Multiple search-and-replace

Robin Becker robin at jessikat.demon.co.uk
Fri Apr 14 15:51:44 EDT 2000


In article <5C%I4.645$rc9.190288896 at newsb.telia.net>, Fredrik Lundh
<effbot at telia.com> writes
>Randall Hopper <aa8vb at yahoo.com> wrote:
>> Is there a Python feature or standard library API that will get me less
>> Python code spinning inside this loop?   re.multisub or equivalent? :-)
>
>haven't benchmarked it, but I suspect that this approach
>is more efficient:
>
>...
>
># based on re-example-5.py
>
>import re
>import string
>
>symbol_map = { "foo": "FOO", "bar": "BAR" }
>
>def symbol_replace(match, get=symbol_map.get):
>    return get(match.group(1), "")
>
>symbol_pattern = re.compile(
>    "(" + string.join(map(re.escape, symbol_map.keys()), "|") + ")"
>    )
>
>print symbol_pattern.sub(symbol_replace, "foobarfiebarfoo")
>
>...
>
></F>
>
...
I'm trying to use the above idiom for the strings '(', ')' and '\\', but
I seem to have a speed problem, i.e.

import re
import string
from time import time

# Map each character to be escaped onto its backslash-escaped form.
symbol_map = { '(': '\\(', ')': '\\)', '\\': '\\\\' }
def symbol_replace(match, get=symbol_map.get):
    return get(match.group(1), "")
PAT = "(" + string.join(map(re.escape, symbol_map.keys()), "|") + ")"
print PAT
symbol_pattern = re.compile( PAT )

def doit(text, n):
    N = xrange(n)
    # Time three chained string.replace calls; the backslash must go first
    # so the backslashes added for the parentheses are not doubled.
    t0 = time()
    for i in N:
        s = string.replace(text, '\\', '\\\\')
        s = string.replace(s, '(', '\\(')
        s = string.replace(s, ')', '\\)')
    t1 = time()
    print 'string %s %.4f' % (s, t1-t0)
    # Time the single-pass regular expression substitution.
    t0 = time()
    for i in N:
        t = symbol_pattern.sub(symbol_replace, text)
    t1 = time()
    print 're     %s %.4f' % (t, t1-t0)


doit("a b (c) \\t", 10000)

This produces:
(\\|\(|\))
string a b \(c\) \\t 0.4900
re     a b \(c\) \\t 5.3900

i.e. for this simple case the string module is best, roughly 11 times
faster than the re version here. Could I do better?
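
For anyone repeating the comparison on a current interpreter, here is a
rough sketch in Python 3. It is not part of the original exchange. The
function names are made up for illustration, and the character-class
pattern and the str.translate variant are additional candidates I am
assuming are worth timing, not methods suggested in the thread. No timings
are claimed; they will vary by machine and Python version.

```python
import re
import timeit

# The three characters from the post, each mapped to its escaped form.
symbol_map = {"(": "\\(", ")": "\\)", "\\": "\\\\"}

# Variant 1: the quoted idiom, an alternation pattern plus a callback.
symbol_pattern = re.compile("(" + "|".join(map(re.escape, symbol_map)) + ")")

def escape_re_callback(text):
    return symbol_pattern.sub(lambda m: symbol_map[m.group(1)], text)

# Variant 2 (assumption, not from the thread): a character class with a
# template replacement, so no Python function is called per match.
class_pattern = re.compile(r"([()\\])")

def escape_re_template(text):
    # The template r"\\\1" means: a literal backslash, then group 1.
    return class_pattern.sub(r"\\\1", text)

# Variant 3: chained str.replace, backslash first so that the backslashes
# added for the parentheses are not doubled afterwards.
def escape_replace(text):
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

# Variant 4 (assumption, not from the thread): one pass with a translation
# table built from the same mapping.
escape_table = str.maketrans(symbol_map)

def escape_translate(text):
    return text.translate(escape_table)

if __name__ == "__main__":
    sample = "a b (c) \\t"
    candidates = (escape_replace, escape_re_callback,
                  escape_re_template, escape_translate)
    for func in candidates:
        elapsed = timeit.timeit(lambda: func(sample), number=10000)
        print("%-20s %s %.4f" % (func.__name__, func(sample), elapsed))
```

All four functions should return the same string for the same input, so
the only thing left to compare is the elapsed time printed on each line.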
-- 
Robin Becker


