[Tutor] Counting & Sorting Instances In File
Jeff Shannon
jeff@ccvcorp.com
Wed May 14 21:58:02 2003
Michael Barrett wrote:
> The hard (?) part is the sorting. That's the part I need help with, so assume a dictionary of 'Email': count values. If you can think of a better way of storing the data in memory for my sort, that'd be appreciated as well. Thanks again.
>
Doing your counting *is* best done with a dictionary, especially if you
make use of the get() method --
# start with an empty dictionary; get() supplies 0 for unseen addresses
emailcount = {}
for line in logfile:
    email = parse_email_addr(line)
    emailcount[email] = emailcount.get(email, 0) + 1
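If you want to try the counting step on its own, here is a minimal sketch. The sample log lines and the body of parse_email_addr() are made up for illustration (I'm assuming each line carries the address in a "from=" field); substitute whatever parsing your real log format needs.

```python
def parse_email_addr(line):
    """Hypothetical parser: assumes the address follows 'from=' on each line."""
    for field in line.split():
        if field.startswith("from="):
            return field[len("from="):]
    return None  # no address found on this line

# Made-up sample data standing in for the real log file.
logfile = [
    "May 14 21:50:01 from=alice@example.com size=1200",
    "May 14 21:51:17 from=bob@example.com size=800",
    "May 14 21:52:40 from=alice@example.com size=450",
]

emailcount = {}
for line in logfile:
    email = parse_email_addr(line)
    if email is None:
        continue  # skip lines that carry no address
    # get() returns 0 the first time an address is seen
    emailcount[email] = emailcount.get(email, 0) + 1
```

With this sample data, emailcount ends up holding a count of 2 for alice@example.com and 1 for bob@example.com.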
Once you have that dictionary of email:count values, you can convert
it to a list and use the list's built-in sort() method. When each
element of the list is itself a list (or a tuple), elements are compared
item by item, starting with the first, so the best thing to do is to
make sure that the first item is your count.
# first, use a list comprehension to create a list of (count, email) pairs
email_list = [ (value, key) for key, value in emailcount.items() ]
# now sort the list
email_list.sort()
# to sort from highest count to lowest count, reverse the list
email_list.reverse()
# now print the first 25
for count, email in email_list[:25]:
    print("%25s %d" % (email, count))
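One side effect worth knowing about: because reverse() flips the whole list, addresses with the same count come out in reverse alphabetical order. If that matters to you, one possible variation (a sketch of an alternative, not a correction to the listing above) is to store the negated count, so that a plain ascending sort already puts the biggest count first and ties fall back to the address from A to Z. The sample dictionary and the name "ranked" are made up for illustration.

```python
# Made-up counts; carol and bob are tied on purpose.
emailcount = {
    "carol@example.com": 3,
    "alice@example.com": 5,
    "bob@example.com": 3,
}

# Negate the count so that ascending order means highest count first.
email_list = [(-count, email) for email, count in emailcount.items()]
email_list.sort()

# Undo the negation to get back plain (email, count) pairs.
ranked = [(email, -negcount) for negcount, email in email_list]

for email, count in ranked[:25]:
    print("%25s %d" % (email, count))
```

Here alice comes first with 5, followed by bob and then carol, both with 3. No call to reverse() is needed.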
If there's any of this that doesn't make sense, I can explain in a
little more detail...
Jeff Shannon
Technician/Programmer
Credit International