Sorting dictionary by value

Bengt Richter bokr at oz.net
Thu Mar 28 21:22:22 EST 2002


On Thu, 28 Mar 2002 14:48:28 +0000, philh at comuno.freeserve.co.uk (phil hunt) wrote:

>On Wed, 27 Mar 2002 23:02:20 -0500, Peter Hansen <peter at engcorp.com> wrote:
>>Jim Dennis wrote:
>>> 
>>>  The core loop is something like:
>>> 
>>>         freq = {}
>>>         for line in file:
>>>                 for word in line.split():
>>>                         if word in freq:        freq[word] += 1
>>>                         else:                           freq[word] = 1
>>> 
>>>  (assuming Python 2.2 for file iteration and dictionary membership
>>>  support using "in".  I *really* like those 2.2 features!  They make
>>>  my pseudo-code so executable!)
>>
>>Something like   freq[word] = freq.get(word, 0) + 1
>>
>>would probably be faster, and it's a little simpler I think,
>>although I could see an argument that it's less readable.
>>Also doesn't depend on 2.2.
>
>IIRC in Awk you can just say:   freq[word] ++ and it works 
>correctly even when there is no pre-existing index of word in freq.
>
>IMO it's a pity Python isn't like that.
>
Since freq[word] += 1 succeeds far more often than it fails (in proportion
to the final frequency ;-), the exception path is rarely taken, so it should
be cheap to write

    try:
        freq[word] += 1
    except KeyError:
        freq[word] = 1

Do I misremember a timbot post to that effect?

Or is "try:" more expensive than all the rest ("if word in freq",
"if freq.has_key(word)", and "freq.get(word,0)")?
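
One rough way to check rather than guess is to time the variants side by
side. The following is only a sketch under assumptions: the function names
and the sample word list are made up for illustration, it uses the standard
timeit module, and it leaves out has_key() because that method no longer
exists in Python 3. The numbers will depend on the interpreter version and
on how often words repeat in the input.

```python
import timeit

def count_in(words):
    # Membership test first, then increment or initialise.
    freq = {}
    for word in words:
        if word in freq:
            freq[word] += 1
        else:
            freq[word] = 1
    return freq

def count_get(words):
    # dict.get() supplies 0 when the word has not been seen yet.
    freq = {}
    for word in words:
        freq[word] = freq.get(word, 0) + 1
    return freq

def count_try(words):
    # Assume the key exists; handle the first occurrence via KeyError.
    freq = {}
    for word in words:
        try:
            freq[word] += 1
        except KeyError:
            freq[word] = 1
    return freq

# Hypothetical sample input: few distinct words, many repeats.
words = "the quick brown fox jumps over the lazy dog the end".split() * 1000

if __name__ == "__main__":
    for fn in (count_in, count_get, count_try):
        seconds = timeit.timeit(lambda: fn(words), number=20)
        print(fn.__name__, round(seconds, 4))
```

With input like this, nearly every lookup hits an existing key, which is the
case the try/except version is betting on. An input made mostly of distinct
words would raise KeyError far more often and may shift the comparison.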

Regards,
Bengt Richter


