letter frequency counter / your thoughts..

Paul Rubin http
Wed May 7 17:50:41 EDT 2008


umpsumps at gmail.com writes:
> def valsort(x):
> 	res = []
> 	for key, value in x.items():
> 		res.append((value, key))
> 	return res

Note: all code below is untested and may have errors ;-)

I think the above is misnamed because it doesn't actually sort.
Anyway, you could write it as a list comprehension:

    def valsort(d):
       return [(value, key) for (key, value) in d.items()]
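
As a rough sketch of how the pair-swapping could be combined with an actual sort so the name fits: this assumes the argument is a dict and a Python 3 interpreter, and sorted_by_value is a made-up helper name, not something from the original post.

```python
def valsort(d):
    # Swap each (key, value) pair so the value comes first.
    return [(value, key) for (key, value) in d.items()]

def sorted_by_value(d):
    # Hypothetical helper: sort the swapped pairs, smallest value first.
    return sorted(valsort(d))

counts = {"a": 3, "b": 1, "c": 2}
print(sorted_by_value(counts))  # [(1, 'b'), (2, 'c'), (3, 'a')]
```

Because the tuples start with the value, plain tuple comparison orders them by count and breaks ties by key.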


> def mostfreq(strng):
> 	dic = {}
> 	for letter in strng:
> 		if letter not in dic:
> 			dic.setdefault(letter, 1)
> 		else:
> 			dic[letter] += 1

I would write that with defaultdict from the collections module:

    from collections import defaultdict
    def mostfreq(strng):
       dic = defaultdict(int)
       for letter in strng:
           dic[letter] += 1

Alternatively with regular dicts, you could say:

    def mostfreq(strng):
       dic = {}
       for letter in strng:
           dic[letter] = dic.get(letter, 0) + 1
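
To make the comparison concrete, here is a small self-contained sketch of both counting styles, each returning its dictionary so the results can be checked against each other. The function names are invented for illustration, and the code assumes Python 3.

```python
from collections import defaultdict

def count_letters_defaultdict(strng):
    # Missing keys start at int() == 0, so += 1 works the first time too.
    dic = defaultdict(int)
    for letter in strng:
        dic[letter] += 1
    return dict(dic)

def count_letters_get(strng):
    # Plain dict: get() supplies 0 when the letter has not been seen yet.
    dic = {}
    for letter in strng:
        dic[letter] = dic.get(letter, 0) + 1
    return dic

print(sorted(count_letters_get("banana").items()))
# [('a', 3), ('b', 1), ('n', 2)]
print(count_letters_defaultdict("banana") == count_letters_get("banana"))
# True
```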

> 	newd = dic.items()
> 	getvals = valsort(newd)
> 	getvals.sort()
> 	length = len(getvals)
> 	return getvals[length - 3 : length]

Someone else suggested the heapq module, which is a good approach
though it might be considered a little bit high-tech.  If you
want to use sorting (conceptually simpler), you could use the
built-in sorted function instead of the in-place list.sort() method:

      # return the second element of a 2-tuple.  Note how we
      # use tuple unpacking: this is really a function of one argument
      # (the tuple) but we specify the arg as (a,b) so the tuple
      # is automatically unpacked on entry to the function.
      # this is a limited form of the "pattern matching" found in
      # languages like ML.
      def snd((a,b)): return b

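      # with reverse=True the most frequent letters come first,
      # so the top three are the first three items
      return sorted(dic.iteritems(), key=snd, reverse=True)[:3]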
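
For comparison, here is a sketch of the sorting approach next to the heapq one, written for Python 3. Tuple parameters such as snd((a,b)) were removed in Python 3, so operator.itemgetter(1) stands in as the key function. The function names and sample counts are made up for illustration.

```python
import heapq
from operator import itemgetter

def top_three_sorted(dic):
    # Sort all (letter, count) pairs by count, largest first, keep three.
    return sorted(dic.items(), key=itemgetter(1), reverse=True)[:3]

def top_three_heapq(dic):
    # nlargest picks the three biggest counts without a full sort.
    return heapq.nlargest(3, dic.items(), key=itemgetter(1))

counts = {"a": 5, "b": 2, "c": 7, "d": 1}
print(top_three_sorted(counts))  # [('c', 7), ('a', 5), ('b', 2)]
print(top_three_heapq(counts))   # [('c', 7), ('a', 5), ('b', 2)]
```

For a 26-letter alphabet the difference hardly matters; heapq mostly pays off when the dictionary is large and only a few items are wanted.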



More information about the Python-list mailing list