help make it faster please

bonono at gmail.com
Thu Nov 10 13:16:41 EST 2005


Why rebuild wordlist and sort it after processing each word? It seems
that can be done once, after the inner for loop (for w in words).
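
Roughly what I mean, as a sketch rather than a drop-in replacement. I am
assuming a current Python 3 here (so w.lower() and dict.get() instead of
string.lower() and has_key()), the name count_words is made up, and it
returns the output lines instead of printing them so it is easy to check.
The per-line counting is kept as in your function, which I assume is
what you want.

```python
# Characters stripped from the end of a word (same set as spl_set in the
# original post, with the backslash escaped explicitly).
SPL_SET = '[",;<>{}_&?!():-\\.=+*\t\n\r]+'


def count_words(lines):
    """Return 'word count' strings, grouped per input line (hypothetical helper)."""
    output = []
    for content in lines:
        count_dict = {}
        for w in content.split():
            w = w.lower()
            # Drop a single trailing punctuation character, as the original does.
            if w[-1] in SPL_SET:
                w = w[:-1]
            if w:
                count_dict[w] = count_dict.get(w, 0) + 1
        # Sort once per line, after every word in that line has been counted,
        # instead of re-sorting the key list on each word.
        for word in sorted(count_dict):
            output.append('%s %d' % (word, count_dict[word]))
    return output
```

Calling count_words(f.readlines()) and printing each entry should give
the same listing as before, but with only one sort per line of input.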

pkilambi at gmail.com wrote:
> I wrote this function, which does the following:
> after reading lines from a file, it splits them and counts the word
> occurrences using a hash table. For some reason this is quite slow. Can
> someone help me make it faster?
> f = open(filename)
> lines = f.readlines()
> def create_words(lines):
>     cnt = 0
>     spl_set = '[",;<>{}_&?!():-[\.=+*\t\n\r]+'
>     for content in lines:
>         words=content.split()
>         countDict={}
>         wordlist = []
>         for w in words:
>             w=string.lower(w)
>             if w[-1] in spl_set: w = w[:-1]
>             if w != '':
>                 if countDict.has_key(w):
>                     countDict[w]=countDict[w]+1
>                 else:
>                     countDict[w]=1
>             wordlist = countDict.keys()
>             wordlist.sort()
>         cnt += 1
>         if countDict != {}:
>             for word in wordlist: print (word+' '+str(countDict[word])+'\n')



