Sorting dictionary by value
Bengt Richter
bokr at oz.net
Thu Mar 28 21:22:22 EST 2002
On Thu, 28 Mar 2002 14:48:28 +0000, philh at comuno.freeserve.co.uk (phil hunt) wrote:
>On Wed, 27 Mar 2002 23:02:20 -0500, Peter Hansen <peter at engcorp.com> wrote:
>>Jim Dennis wrote:
>>>
>>> The core loop is something like:
>>>
>>> freq = {}
>>> for line in file:
>>>     for word in line.split():
>>>         if word in freq: freq[word] += 1
>>>         else: freq[word] = 1
>>>
>>> (assuming Python 2.2 for file iteration and dictionary membership
>>> support using "in." I *really* like those 2.2 features! They make
>>> my pseudo-code so executable!)
>>
>>Something like freq[word] = freq.get(word, 0) + 1
>>
>>would probably be faster, and it's a little simpler I think,
>>although I could see an argument that it's less readable.
>>Also doesn't depend on 2.2.
>
>IIRC in Awk you can just say: freq[word] ++ and it works
>correctly even when there is no pre-existing index of word in freq.
>
>IMO it's a pity Python isn't like that.
>
Since successful freq[word]+=1 is the rule (in proportion to the final frequency ;-)
it should be cheap to write
    try:
        freq[word] += 1
    except KeyError:
        freq[word] = 1
Do I misremember a timbot post to that effect?
Or is "try:" more expensive than all the rest ("if word in freq",
"if freq.has_key(word)", and "freq.get(word,0)") ?
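
One way to settle the question is simply to measure. The following is a rough sketch, not a definitive benchmark: it is written for current Python 3 rather than 2.2, the function names and the sample word list are made up for illustration, and has_key() is left out because it no longer exists in Python 3. The answer probably depends on the ratio of repeated words to new ones, so the numbers should be checked against real input.

```python
import timeit


def count_try(words):
    """Count words, treating a missing key as the exceptional case."""
    freq = {}
    for word in words:
        try:
            freq[word] += 1
        except KeyError:
            freq[word] = 1
    return freq


def count_in(words):
    """Count words with an explicit membership test."""
    freq = {}
    for word in words:
        if word in freq:
            freq[word] += 1
        else:
            freq[word] = 1
    return freq


def count_get(words):
    """Count words using dict.get() with a default of 0."""
    freq = {}
    for word in words:
        freq[word] = freq.get(word, 0) + 1
    return freq


# Hypothetical sample input: few distinct words, many repeats,
# so the KeyError branch is taken only rarely.
words = "the quick brown fox jumps over the lazy dog the end".split() * 1000

for fn in (count_try, count_in, count_get):
    seconds = timeit.timeit(lambda: fn(words), number=20)
    print(f"{fn.__name__}: {seconds:.3f}s")
```

All three functions build the same dictionary; only the handling of a word's first occurrence differs. Swapping in a word list with mostly unique words would show how each variant behaves when the miss case dominates.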
Regards,
Bengt Richter