Python idiom: Multiple search-and-replace
Randall Hopper
aa8vb at yahoo.com
Wed Apr 12 10:08:16 EDT 2000
There's got to be a better way. Is there a Python idiom I'm missing?
I want to do search-and-replace of multiple symbols on each line of a file.
But the simple-minded code below takes a while. It simply uses
string.replace N times per line (N is 240 in this case). There are 33994
lines (1.1Meg).
Total time: 140.7 seconds.
I stopped to investigate. What was slowing it up so much?
- Comment out the inner loop, and Python completes in 0.9 sec. No prob
there. It can read and write the data very quickly.
- It's not the dictionary lookups. Converting the map to a list of
tuples took 3 sec longer.
- I tried a few other things, but they only made it take longer than
140 sec.
Is there a Python feature or standard library API that will get me less
Python code spinning inside this loop? re.multisub or equivalent? :-)
Thanks,
Randall
------------------------------------------------------------------------------
import string

symbol_map = { 'oldsym1' : 'newsym1', 'oldsym2' : 'newsym2', ... }
fp = open( net_path, "r" )
while 1:
    line = fp.readline()
    if not line: break
    # Apply every replacement to the line, one symbol at a time
    for old_sym in symbol_map.keys():
        line = string.replace( line, old_sym, symbol_map[ old_sym ] )
    out_fp.write( line )
------------------------------------------------------------------------------
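One candidate for the kind of "multisub" asked about above is a single compiled regular expression that matches any of the old symbols, with a callback that looks up the replacement. This is only a minimal sketch, assuming the symbols are plain literal strings. The names make_multisub and multisub are made up for illustration, and the two-entry map stands in for the real 240-entry one. Note that it is not exactly equivalent to the loop above: each stretch of text is replaced at most once, and replaced text is not rescanned for further symbols.

```python
import re

# Hypothetical stand-in for the real symbol_map (about 240 entries).
symbol_map = {'oldsym1': 'newsym1', 'oldsym2': 'newsym2'}

def make_multisub(mapping):
    """Return a function that applies every replacement in one pass."""
    # Try longer symbols first so a symbol is not shadowed by one of
    # its own prefixes (e.g. 'ab' must be tried before 'a').
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape makes each symbol match literally, not as a pattern.
    pattern = re.compile('|'.join(re.escape(k) for k in keys))

    def multisub(text):
        # The callback maps each matched symbol to its replacement.
        return pattern.sub(lambda m: mapping[m.group(0)], text)

    return multisub

multisub = make_multisub(symbol_map)
result = multisub('net oldsym1 to oldsym2\n')
```

In the read loop, the inner for loop would then become a single call such as line = multisub( line ), so the per-symbol scanning happens inside the regex engine rather than in Python code. Whether that is actually faster for a given symbol set would need to be measured.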
--
Randall Hopper
aa8vb at yahoo.com