Beginner's performance problem

Alex Martelli aleax at aleax.it
Tue Apr 9 12:53:51 EDT 2002


Mark Charsley wrote:

> For reasons best not gone into, I needed to correct the case of a whole
> bunch of SQL code. As such I created the little script below...
> 
> It reads in a source file, then for each line it does a case-insensitive
> search for each TableName and ColumnName (as contained in the "names"
> collection), checks that it's not just matching a substring, and then
> corrects the case of the word if necessary.


Something like (warning, untested):


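import re

# Map each upper-cased name to its correctly cased form.
substitutionDictionary = {}
for correctName in names:
    substitutionDictionary[correctName.upper()] = correctName

def substitutor(matchObject):
    # Look up the matched text by its upper-cased form.
    return substitutionDictionary[matchObject.group().upper()]

# Group the alternation so that \b applies to every name, not just the
# first and the last; re.escape guards against regex metacharacters.
correctorRE = re.compile(
    r'\b(?:' + '|'.join(map(re.escape, names)) + r')\b', re.I)

newLines = []
for originalLine in theFile.readlines():
    correctedLine = correctorRE.sub(substitutor, originalLine)
    newLines.append(correctedLine)

someFile.writelines(newLines)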


looks likely to go MUCH faster, since each line is scanned once by a
single compiled pattern rather than once per name.  In general, in
scripting languages, reasoning and programming more abstractly can lead
to faster code, as well as higher productivity, than getting into minute
details.  Regular expressions often help with that (although they're
easy to fall in love with and overuse): here they offer case-insensitive
matching, "bulk matching" with the | operator, and "bulk substitution"
with the sub method.
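
For anyone who wants to try the idea in isolation, here is a minimal
sketch with made-up data.  The names list and the fix_case helper are
assumptions invented for illustration, not anything from Mark's script.
It shows the two points that matter: the alternatives are grouped so the
word-boundary check applies to each name, and the replacement is looked
up by the upper-cased match.

```python
import re

# Hypothetical table and column names, invented for this sketch.
names = ["CustomerOrders", "OrderId", "Order"]

# Upper-cased name -> correctly cased name.
canonical = {name.upper(): name for name in names}

# Group the alternatives so \b applies to each one; escape them in case
# a name contains a regex metacharacter.
pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(name) for name in names) + r")\b",
    re.IGNORECASE,
)

def fix_case(line):
    """Return line with every whole-word name restored to its canonical case."""
    return pattern.sub(lambda m: canonical[m.group().upper()], line)

print(fix_case("select ORDERID from customerorders where reorder = 1"))
# prints: select OrderId from CustomerOrders where reorder = 1
```

Note that "reorder" is left alone: "order" appears inside it, but there
is no word boundary in front of it, so the pattern does not match there.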


Alex



