Python idiom: Multiple search-and-replace

Siggy Brentrup bsb at winnegan.de
Wed Apr 12 10:37:07 EDT 2000


Randall Hopper <aa8vb at yahoo.com> writes:

> There's got to be a better way.  Is there a Python idiom I'm missing?
> 
> I want to do search-and-replace of multiple symbols on each line of a file.
> But the simple-minded code below takes a while.  It simply uses
> string.replace N times per line (N is 240 in this case).  There are 33994
> lines (1.1Meg).
> 
> Total time: 140.7 seconds.
> 
> I stopped to investigate.  What was slowing it up so much?
> 
> - Comment out the inner loop, and Python completes in 0.9 sec.  No prob
>   there. It can read and write the data very quickly.
> - It's not dictionary lookups.  Converted to a tuple-list and took 3 sec
>   longer.  
> - I tried a few other things, but they only made it take longer than
>   140 sec.
> 
> Is there a Python feature or standard library API that will get me less
> Python code spinning inside this loop?   re.multisub or equivalent? :-)
> 
> Thanks,
> 
> Randall
> 
> 
> ------------------------------------------------------------------------------
> 
>   symbol_map = { 'oldsym1' : 'newsym1', 'oldsym2' : 'newsym2', ... }
>   fp = open( net_path, "r" )
> 
>   while 1:
>     line = fp.readline()
>     if not line: break
> 
>     for old_sym in symbol_map.keys():
>       line = string.replace( line, old_sym, symbol_map[ old_sym ] )
> 
>     out_fp.write( line )
> 
> ------------------------------------------------------------------------------

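# Build the (old, new) pairs once, outside the per-line loop,
# so the inner loop does no dictionary lookups.
symbol_items = symbol_map.items()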

while 1:
    line = fp.readline()
    if not line: break

    for old, new in symbol_items:
        line = string.replace(line, old, new)

    out_fp.write(line)

springs to mind (untested).
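
As for the "re.multisub or equivalent" wish: one possible approach is to
compile all the old symbols into a single regular expression and let
re.sub look up each match in the map, so every line is scanned once
instead of N times. The following is only a sketch in current Python,
not something measured against the 140-second case. The names
make_replacer and replace_all are made up for illustration, and the
two-entry symbol_map stands in for the real 240-entry one.

```python
import re

def make_replacer(symbol_map):
    """Return a function that applies every replacement in one pass."""
    if not symbol_map:
        # An empty alternation would match the empty string everywhere.
        return lambda text: text

    # Longest keys first, so a symbol that is a prefix of another
    # (e.g. "sym1" vs "sym10") does not win the alternation.
    keys = sorted(symbol_map, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))

    def replace_all(text):
        # The callback maps each matched old symbol to its new symbol.
        return pattern.sub(lambda m: symbol_map[m.group(0)], text)

    return replace_all

# Hypothetical stand-in for the real symbol map.
symbol_map = {"oldsym1": "newsym1", "oldsym2": "newsym2"}
replace_all = make_replacer(symbol_map)
print(replace_all("call oldsym1 then oldsym2"))
```

Note that this does not behave identically to repeated string.replace
calls: text produced by one replacement is never replaced again, whereas
the sequential loop can cascade if a new symbol contains another old
symbol. Whether that matters depends on the symbol map.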

Regards
  Siggy

-- 
Siggy Brentrup - bsb at winnegan.de - http://www.winnegan.de/
****** ceterum censeo javascriptum esse restrictam *******



