letter frequency counter / your thoughts..

castironpi at gmail.com
Wed May 7 13:57:41 EDT 2008


On May 7, 12:30 pm, Paul Melis <p... at floorball-flamingos.nl> wrote:
> umpsu... at gmail.com wrote:
> > Here is my code for a letter frequency counter.  It seems bloated to
> > me and any suggestions of what would be a better way (keep in mind
> > I'm a beginner) would be greatly appreciated..
>
> Not bad for a beginner I think :)
>
> > def valsort(x):
> >    res = []
> >    for key, value in x.items():
> >            res.append((value, key))
> >    return res
>
> > def mostfreq(strng):
> >    dic = {}
> >    for letter in strng:
> >            if letter not in dic:
> >                    dic.setdefault(letter, 1)
> >            else:
> >                    dic[letter] += 1
> >    getvals = valsort(dic)
> >    getvals.sort()
> >    length = len(getvals)
> >    return getvals[length - 3 : length]
>
> > thanks much!!
>
> Slightly shorter:
>
> def mostfreq(strng):
>      dic = {}
>      for letter in strng:
>          if letter not in dic:
>              dic[letter] = 0
>          dic[letter] += 1
>      # Swap letter, count here as we want to sort on count first
>      getvals = [(pair[1],pair[0]) for pair in dic.iteritems()]
>      getvals.sort()
>      return getvals[-3:]
>
> I'm not sure if you wanted the function mostfreq to return the 3 most
> frequent letters or the first 3 letters? It seems to do the latter. The
> code above does the former, i.e. it returns the letters with the
> highest frequency.
>
> Paul

I think I'd try to get a deque on disk.  Constant indexing.  Store
disk addresses in b-trees.  How long does 'less than' take?  Is a
sector small, and what's inside?
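
For reference, the counting approach in the quoted code can be sketched
in current Python roughly as follows. This is only a sketch under some
assumptions: it needs Python 3 (collections.Counter did not exist when
this thread was written), and the name most_frequent and its parameter n
are hypothetical, not taken from either poster's code.

```python
from collections import Counter


def most_frequent(text, n=3):
    """Return the n most common characters in text as (char, count) pairs.

    Hypothetical helper for illustration; the result is ordered from the
    highest count to the lowest.
    """
    # Counter tallies each character in one pass over the string.
    counts = Counter(text)
    # most_common(n) returns the n entries with the largest counts.
    return counts.most_common(n)


if __name__ == "__main__":
    print(most_frequent("aaaabbbccd"))
```

Unlike the quoted version, which returns (count, letter) pairs sorted in
ascending order, this returns (letter, count) pairs with the most
frequent letter first.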
