Python idiom: Multiple search-and-replace
Siggy Brentrup
bsb at winnegan.de
Wed Apr 12 10:37:07 EDT 2000
Randall Hopper <aa8vb at yahoo.com> writes:
> There's got to be a better way. Is there a Python idiom I'm missing?
>
> I want to do search-and-replace of multiple symbols on each line of a file.
> But the simple-minded code below takes a while. It simply uses
> string.replace N times per line (N is 240 in this case). There are 33994
> lines (1.1Meg).
>
> Total time: 140.7 seconds.
>
> I stopped to investigate. What was slowing it up so much?
>
> - Comment out the inner loop, and Python completes in 0.9 sec. No prob
> there. It can read and write the data very quickly.
> - It's not dictionary lookups. Converted to a tuple-list and took 3 sec
> longer.
> - I tried a few other things but it only made it take longer than
> 140 sec.
>
> Is there a Python feature or standard library API that will get me less
> Python code spinning inside this loop? re.multisub or equivalent? :-)
>
> Thanks,
>
> Randall
>
>
> ------------------------------------------------------------------------------
>
> symbol_map = { 'oldsym1' : 'newsym1', 'oldsym2' : 'newsym2', ... }
> fp = open( net_path, "r" )
>
> while 1:
>     line = fp.readline()
>     if not line: break
>
>     for old_sym in symbol_map.keys():
>         line = string.replace( line, old_sym, symbol_map[ old_sym ] )
>
>     out_fp.write( line )
>
> ------------------------------------------------------------------------------
symbol_items = symbol_map.items()
while 1:
    line = fp.readline()
    if not line: break
    for old, new in symbol_items:
        line = string.replace(line, old, new)
    out_fp.write(line)
springs to mind (untested)
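On the "re.multisub or equivalent" question: there is no function by that name in the standard library, but a similar effect can be sketched with one compiled alternation and a callback passed to re.sub, so each line is scanned once instead of N times. The helper name make_replacer and the sample symbols below are made up for illustration, and whether this is actually faster for 240 symbols would need to be measured.

```python
import re

def make_replacer(symbol_map):
    """Return a function that applies every replacement in one pass.

    make_replacer is a hypothetical helper name, not a library API.
    """
    if not symbol_map:
        # Nothing to replace: return the line unchanged.
        return lambda line: line
    # Try longer symbols first so 'oldsym10' is not cut short by 'oldsym1'.
    keys = sorted(symbol_map, key=len, reverse=True)
    # re.escape keeps symbols containing regex metacharacters literal.
    pattern = re.compile("|".join(re.escape(k) for k in keys))

    def replace_all(line):
        # The callback looks up the replacement for whichever symbol matched.
        return pattern.sub(lambda m: symbol_map[m.group(0)], line)

    return replace_all

# Sample data, assumed for illustration only.
symbol_map = {"oldsym1": "newsym1", "oldsym2": "newsym2"}
replace_all = make_replacer(symbol_map)
print(replace_all("call oldsym1 then oldsym2\n"), end="")
```

One difference from the string.replace loop: text that has already been substituted is not scanned again, so a replacement that happens to contain another old symbol is left alone. If the symbols should only match as whole words, the pattern would also need word-boundary anchors.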
Regards
Siggy
--
Siggy Brentrup - bsb at winnegan.de - http://www.winnegan.de/
****** ceterum censeo javascriptum esse restrictam *******