Optimization help needed: Search and Replace using dictionary of parameters
Jason Orendorff
jason at jorendorff.com
Mon Dec 31 13:37:32 EST 2001
> Before I start writing code, any ideas what is the fastest way of
> doing it ?:
> regex- or string -functions ? Map or readlines() ?
sed could be much faster (albeit less flexible) than Python
for this task.
Your data structure could be slow.
Use nested dictionaries instead:
    x = {
        'filename1': { 'parameter': 'value',
                       'parameter2': 'value',
                       'parameter3': 'value' },
        'filename2': { 'parameter2': 'value',
                       'parameter4': 'value',
                       'parameter6': 'value' },
        'filename3': { 'parameter5': 'value',
                       'parameter7': 'value',
                       'parameter8': 'value' },
        'filename4': { 'parameter': 'value',
                       'parameter3': 'value' }
    }
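With that layout, finding a value takes two dictionary lookups: one by
filename, one by parameter name. Here is a minimal sketch of such a lookup.
The names params and get_value and the sample data are made up for
illustration, and the syntax is present-day Python 3:

```python
# Hypothetical sample data, shaped like the nested dictionary above.
params = {
    'filename1': {'PARAM_A': 'alpha', 'PARAM_B': 'beta'},
    'filename2': {'PARAM_B': 'gamma'},
}

def get_value(filename, param, default=None):
    """Return the value of param for filename, or default if either is missing."""
    # Two hash lookups: the outer dict by filename, the inner dict by parameter.
    return params.get(filename, {}).get(param, default)
```

Using .get() at both levels means an unknown filename or parameter gives
the default back instead of raising KeyError. Whether that is what you want
depends on your data.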
As for string vs re, it depends. Just use whichever one is easier
for your particular situation. But take a special look at re.sub().
    import re

    # lookup is a nested dictionary like x above;
    # filename is the name of the file being processed.
    def get_replacement(match):
        param = match.group(1)
        return lookup[filename][param]

    for line in file.xreadlines():
        # Find and replace all tags that are set off in a certain way...
        line = re.sub(r'<<([A-Z0-9_]+)>>', get_replacement, line)
        out.write(line)
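For something you can run on its own, here is a rough sketch of the same
re.sub() callback technique. The helper name replace_tags and the choice to
leave unknown tags untouched are my assumptions, not requirements, and the
syntax is present-day Python 3:

```python
import re

# Same tag syntax as in the fragment above: <<NAME>>
TAG = re.compile(r'<<([A-Z0-9_]+)>>')

def replace_tags(text, values):
    """Replace each <<NAME>> tag in text with values['NAME'].

    Assumption: a tag with no entry in values is left unchanged.
    """
    def get_replacement(match):
        # group(1) is the name between the brackets; group(0) is the whole tag.
        return values.get(match.group(1), match.group(0))
    return TAG.sub(get_replacement, text)
```

For example, replace_tags('Dear <<NAME>>', {'NAME': 'Jason'}) gives
'Dear Jason'. Compiling the pattern once keeps that work out of the
per-line loop.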
To read lines from a file, the fastest thing is probably:
    # Convoluted, but speedy
    x = file.readlines(16000)
    while x:
        for line in x:
            blah_blah_blah(line)
        x = file.readlines(16000)
But it is usually plenty fast enough to do:
    # Quick and obvious
    for line in file.xreadlines():
        blah_blah_blah(line)
Or:
    # Quick and even more obvious, but new in Python 2.2
    for line in file:
        blah_blah_blah(line)
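If you do want the chunked readlines() approach, one way to hide the
convoluted part is to wrap it in a generator. This is only a sketch: the
name iter_lines_chunked is made up, the syntax is present-day Python 3, and
I have not measured whether it beats plain iteration there.

```python
def iter_lines_chunked(f, sizehint=16000):
    """Yield lines from the open file f, reading roughly sizehint bytes at a time."""
    lines = f.readlines(sizehint)
    while lines:
        # Hand out the lines of the current chunk one by one.
        for line in lines:
            yield line
        # An empty list means end of file and ends the loop.
        lines = f.readlines(sizehint)
```

The calling code then looks just like the simple versions:
for line in iter_lines_chunked(file): blah_blah_blah(line)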
## Jason Orendorff http://www.jorendorff.com/