Learning Python via a little word frequency program

rent rentlong at gmail.com
Fri Jan 11 05:27:55 EST 2008


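import collections

names = "freddy fred bill jock kevin andrew kevin kevin jock"

# Count occurrences; missing keys default to int() == 0.
freq = collections.defaultdict(int)
for name in names.split():
    freq[name] += 1

# Start from alphabetical order; the stable sort below then keeps
# names with equal counts in ascending order.
keys = sorted(freq)
keys.sort(key=freq.get, reverse=True)
for k in keys:
    print "%-10s: %d" % (k, freq[k])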
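
For readers on Python 3, where print is a function and dict.keys() returns a view with no sort() method, the same table can be sketched with collections.Counter and a single tuple sort key. This is an illustrative alternative, not code posted in this thread; the helper name frequency_table is hypothetical.

```python
from collections import Counter


def frequency_table(names):
    """Return (name, count) pairs, highest count first, ties by name."""
    counts = Counter(names.split())
    # Negating the count lets one ascending sort put large counts first,
    # while the name in second position breaks ties alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


names = "freddy fred bill jock kevin andrew kevin kevin jock"
for name, count in frequency_table(names):
    print("%-10s: %d" % (name, count))
```

Counter.most_common() would also give descending counts, but it does not order equal counts by name, which is why the sketch sorts explicitly.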

On Jan 9, 6:58 pm, Andrew Savige <ajsav... at yahoo.com.au> wrote:
> I'm learning Python by reading David Beazley's "Python Essential Reference"
> book and writing a few toy programs. To get a feel for hashes and sorting,
> I set myself this little problem today (not homework, BTW):
>
>   Given a string containing a space-separated list of names:
>
>     names = "freddy fred bill jock kevin andrew kevin kevin jock"
>
>   produce a frequency table of names, sorted descending by frequency,
>   then ascending by name. For the above data, the output should be:
>
>     kevin     : 3
>     jock      : 2
>     andrew    : 1
>     bill      : 1
>     fred      : 1
>     freddy    : 1
>
> Here's my first attempt:
>
> names = "freddy fred bill jock kevin andrew kevin kevin jock"
> freq = {}
> for name in names.split():
>     freq[name] = 1 + freq.get(name, 0)
> deco = zip([-x for x in freq.values()], freq.keys())
> deco.sort()
> for v, k in deco:
>     print "%-10s: %d" % (k, -v)
>
> I'm interested to learn how more experienced Python folks would solve
> this little problem. Though I've read about the DSU Python sorting idiom,
> I'm not sure I've strictly applied it above ... and the -x hack above to
> achieve a descending sort feels a bit odd to me, though I couldn't think
> of a better way to do it.
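
As a rough illustration of the two ideas in the paragraph above, the sketch below spells out decorate-sort-undecorate step by step and then shows one way to get a descending sort without negating anything, by relying on the stability of list.sort. It assumes Python 3; the variable names dsu_order and stable_order are made up for the example.

```python
names = "freddy fred bill jock kevin andrew kevin kevin jock"

freq = {}
for name in names.split():
    freq[name] = freq.get(name, 0) + 1

# Decorate-sort-undecorate spelled out: pair each name with its sort key,
# sort the pairs, then strip the keys off again.
decorated = [((-count, name), name) for name, count in freq.items()]
decorated.sort()
dsu_order = [name for _key, name in decorated]

# Alternative without negation: sort by the secondary key (name) first,
# then by the primary key (count). list.sort is stable, also with
# reverse=True, so names with equal counts keep their alphabetical order.
stable_order = sorted(freq)
stable_order.sort(key=freq.get, reverse=True)

print(dsu_order)
print(stable_order)
```

Both lists come out in the same order. The two-pass form trades one extra sort for not having to invent a negated key, which also helps when the key is not numeric.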
>
> I also have a few specific questions. Instead of:
>
> for name in names.split():
>     freq[name] = 1 + freq.get(name, 0)
>
> I might try:
>
> for name in names.split():
>     try:
>         freq[name] += 1
>     except KeyError:
>         freq[name] = 1
>
> Which is preferred?
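
To make the comparison concrete, here is a small sketch that runs both quoted idioms plus collections.defaultdict side by side and checks that they agree. It assumes Python 3, and the three function names are hypothetical.

```python
from collections import defaultdict


def count_with_get(words):
    freq = {}
    for word in words:
        # One lookup with a default, no exception handling.
        freq[word] = freq.get(word, 0) + 1
    return freq


def count_with_try(words):
    freq = {}
    for word in words:
        try:
            freq[word] += 1
        except KeyError:
            # First time this word is seen.
            freq[word] = 1
    return freq


def count_with_defaultdict(words):
    freq = defaultdict(int)  # missing keys start at int() == 0
    for word in words:
        freq[word] += 1
    return dict(freq)


words = "freddy fred bill jock kevin andrew kevin kevin jock".split()
print(count_with_get(words) == count_with_try(words) == count_with_defaultdict(words))
```

All three produce the same dictionary, so the choice is mostly about readability. As a general rule of thumb rather than a measured result, try/except tends to pay off when most keys repeat (the exception is rare), while get() or defaultdict avoids exception overhead when most keys are new.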
>
> Ditto for:
>
> deco = zip([-x for x in freq.values()], freq.keys())
>
> versus:
>
> deco = zip(map(operator.neg, freq.values()), freq.keys())
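
The two spellings build the same pairs, as the sketch below checks on a small hand-written dictionary. It assumes Python 3, where zip() and map() return lazy iterators, so the result is passed to sorted() instead of calling deco.sort() as in the quoted Python 2 code.

```python
import operator

freq = {"kevin": 3, "jock": 2, "andrew": 1}

# values() and keys() iterate in matching order as long as the dict is not
# modified in between, so zipping them pairs each count with its name.
by_comprehension = sorted(zip([-x for x in freq.values()], freq.keys()))
by_map = sorted(zip(map(operator.neg, freq.values()), freq.keys()))

print(by_comprehension)
print(by_map)
```

Since the results are identical, this comes down to taste; the list comprehension needs no import and is often considered the more readable of the two.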
>
> Finally, I might replace:
>
> for v, k in deco:
>     print "%-10s: %d" % (k, -v)
>
> with:
>
> print "\n".join("%-10s: %d" % (k, -v) for v, k in deco)
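
A short sketch of the join variant, assuming Python 3 and a hand-written deco list in place of the sorted pairs above. Building the string first and printing it once gives the same text as the loop, and the string can also be returned or compared in a test.

```python
deco = [(-3, "kevin"), (-2, "jock"), (-1, "andrew")]

# Format each (negated count, name) pair, then join the lines.
lines = ["%-10s: %d" % (k, -v) for v, k in deco]
table = "\n".join(lines)
print(table)
```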
>
> Any feedback on good Python style, performance tips, good books
> to read, etc. is appreciated.
>
> Thanks,
> /-\



