Sorting dictionary by value

Christophe Delord christophe.delord at free.fr
Fri Mar 22 14:48:32 EST 2002


To make it run faster, you can rewrite your algorithm so that it counts
every word in a single pass:


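import string, sys

# Read the whole file and split it into whitespace-separated words.
a = string.split(open(sys.argv[1],'r').read())

# freq maps each word to the number of times it has been seen.
freq = {}

for i in a:
     freq[i] = freq.get(i,0) + 1

# Sort the (word, count) pairs by count, lowest first.
items = freq.items()
items.sort(lambda i,j: cmp(i[1],j[1]))
print items
for (i,n) in items:
     print i, n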


freq is an associative array (a dictionary) giving the number of
occurrences of each word.
items is the list of (word, count) pairs taken from freq.
The comparison function passed to sort makes it order the items by their
count rather than by the word.

freq.get(i,0) is the number of occurrences of i seen so far, or 0 if the
word i is new.
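
For readers on Python 3, where string.split, cmp and the comparison
argument to sort are no longer available, the same idea can be sketched as
follows. The function names count_words and sort_by_value and the sample
sentence are only illustrative assumptions, not part of the original
script.

```python
def count_words(text):
    """Count how often each whitespace-separated word occurs in text."""
    freq = {}
    for word in text.split():
        # freq.get(word, 0) is the count so far, or 0 for a new word.
        freq[word] = freq.get(word, 0) + 1
    return freq


def sort_by_value(freq):
    """Return the (word, count) pairs ordered by count, lowest first."""
    return sorted(freq.items(), key=lambda item: item[1])


# Made-up sample text, used here instead of reading sys.argv[1].
sample = "the cat and the dog and the bird"
items = sort_by_value(count_words(sample))
for word, n in items:
    print(word, n)
```

Passing reverse=True to sorted would list the most frequent words first.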

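This may be faster, since it makes a single pass over the word list
instead of rescanning the whole list for every new word.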
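
One way to check this is to time both ways of counting on the same input.
The sketch below is written for Python 3; count_nested mirrors the nested
loops of the quoted script, count_single_pass mirrors the freq.get loop,
and the word list is made-up test data. The timings will vary from machine
to machine, so none are shown here.

```python
import timeit


def count_nested(words):
    """Count words by rescanning the whole list for every new word."""
    known = []
    output = {}
    for w in words:
        if w not in known:
            times = 0
            for other in words:
                if other == w:
                    times = times + 1
            known.append(w)
            output[w] = times
    return output


def count_single_pass(words):
    """Count words in one pass using dict.get."""
    freq = {}
    for w in words:
        freq[w] = freq.get(w, 0) + 1
    return freq


# Made-up data: 2000 words drawn from 200 distinct ones.
words = ["w%d" % (i % 200) for i in range(2000)]

# Both approaches must agree before their speed is compared.
assert count_nested(words) == count_single_pass(words)

t_nested = timeit.timeit(lambda: count_nested(words), number=1)
t_single = timeit.timeit(lambda: count_single_pass(words), number=1)
print("nested loops: %.4fs   single pass: %.4fs" % (t_nested, t_single))
```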

Christophe.


Artur Skura wrote:

> Duncan Booth wrote:
> 
>>Artur Skura <arturs at iidea.pl> wrote in 
>>news:slrna9lqj1.9n1.arturs at aph.waw.pdi.net:
>>
>>>Is there an idiom in Python as to sorting dictionary by value,
>>>not keys? I came up with some solutions which are so inefficient
>>>that I'm sure there must be a simple way.
>>>
>>How do you know they are inefficient? Have you profiled your application 
>>and found this to be a bottleneck?
>>
> 
> No, and it seems the problem is not with sorting.
> I wanted to write a compact word counting script (well, in shell
> it can be done in 5 lines or so), just for fun.
> 
> and
> 
> import string,sys
> 
> 
> a = string.split(open(sys.argv[1],'r').read())
> 
> known = []
> times = 0
> output = {}
> 
> for i in a:
>     if i not in known:
>         for l in a:
>             if l == i:
>                 times = times + 1
>         known.append(i)
>         output[i] = times
>     times = 0
> 
> 
> items = [ (output[k], k) for k in output ]
> items.sort()
> items.reverse()
> items = [ (k, v) for (v, k) in items ]
> for k in items:
>     print k[0], k[1]
> 
> it seems it's slow not because of sorting...
> 
> Regards,
> Artur
> 


-- 
Christophe Delord
http://christophe.delord.free.fr/



