help make it faster please
bearophileHUGS at lycos.com
Thu Nov 10 13:43:04 EST 2005
This can be faster; it avoids repeating the same work for every line:
from string import maketrans, ascii_lowercase, ascii_uppercase

def create_words(afile):
    stripper = """'[",;<>{}_&?!():[]\.=+-*\t\n\r^%0123456789/"""
    mapper = maketrans(stripper + ascii_uppercase,
                       " "*len(stripper) + ascii_lowercase)
    countDict = {}
    for line in afile:
        for w in line.translate(mapper).split():
            if w:
                if w in countDict:
                    countDict[w] += 1
                else:
                    countDict[w] = 1
    word_freq = countDict.items()
    word_freq.sort()
    for word, freq in word_freq:
        print word, freq

create_words(file("test.txt"))
If you can load the whole file in memory then it can be made a little
faster...
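
A rough sketch of that whole-file idea, not a measured result: read the file once, translate and split the entire text in a single pass, and let a counter do the tallying. It is written in Python 3 syntax, so it uses str.maketrans and print() where the code above uses string.maketrans and the print statement. The helper name count_words and the use of collections.Counter are assumptions added here for illustration; they are not part of the original code.

```python
import string
from collections import Counter

# Same characters as the stripper string above; the backslash is escaped
# so that Python 3 does not warn about an invalid escape sequence.
STRIPPER = """'[",;<>{}_&?!():[]\\.=+-*\t\n\r^%0123456789/"""

# Build the translation table once: punctuation and digits become spaces,
# uppercase letters become lowercase.
MAPPER = str.maketrans(STRIPPER + string.ascii_uppercase,
                       " " * len(STRIPPER) + string.ascii_lowercase)


def count_words(text):
    """Count words in a whole string with one translate/split pass.

    Hypothetical helper, not from the original post.
    """
    return Counter(text.translate(MAPPER).split())


def create_words(path):
    # Assumes the whole file fits comfortably in memory.
    with open(path) as f:
        counts = count_words(f.read())
    for word, freq in sorted(counts.items()):
        print(word, freq)
```

Calling create_words("test.txt") would print the same word/frequency pairs as the version above. Whether it is actually faster depends on the file size and the Python version, so time both before settling on one.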
Bear hugs,
bearophile