counting occurrences
Tim Peters
tim.one at home.com
Sat Aug 4 01:18:38 EDT 2001
[Grant Griffin]
> I have long used dictionaries to counting occurrences of
> something (usually strings). It goes something like this:
>
> counter = {}
> for s in strings:
> counter[s] = counter.get(s, 0) + 1
> ...
> This then usually leads to a need to "sort keys by
> value"--something like this:
>
> itemlist = [(value, key) for (key, value) in counter.items()]
> itemlist.sort()
> itemlist.reverse()
> ...
> Then, "itemlist" contains sorted (value, key) pairs--usually to
> be printed or something.
>
> For me, this sort of thing comes up so often that it qualifies as
> an "idiom".
Sure.
> However, the sorting thing is a pain ...
> so I mentioned to a friend that it would sure be nice if Python
> dictionaries had an "items_sorted_by_value" method.
Won't happen -- you can easily write a function for it yourself in Python,
and it's not going to run significantly faster if it's coded by us in C
instead. Dicts have no internal ordering, so the best efficient thing
anyone can do here is materialize a list in full, sort it, and pull it apart
again.
Note that in Python 2.2, you can *subclass* from dicts, like here in
grantdict.py:
class Grant(dictionary):
def items_by_value_backwards(self):
items = [(v, k) for k, v in self.items()]
items.sort()
items.reverse()
return [(k, v) for v, k in items]
g = Grant()
f = open("grantdict.py")
for line in f:
for ch in line:
g[ch] = g.get(ch, 0) + 1
for ch, count in g.items_by_value_backwards()[:10]:
print "%6d %r" % (count, ch)
It counts the number of letters in itself, then prints the 10 most frequent:
93 ' '
20 'e'
18 't'
18 'r'
18 'i'
16 'n'
15 's'
15 '\n'
12 'c'
12 ')'
> But, among other things, my friend (who has been using Python for a
> *long* time) said that he has never wanted such a thing, because he
> doesn't see counting occurrences "as a dictionary operation".
Well, no sane person wants it built in <wink>, but of course dicts are
natural for this.
> So how do you think he does it?
Slowly and inappropriately.
> tim:-help-me-out-here<wink>-ly y'rs,
more-worried-about-your-friend-ly y'rs - tim
More information about the Python-list
mailing list