[Tutor] Counting & Sorting Instances In File
Bob Gailer
bgailer@alum.rpi.edu
Thu May 15 09:07:01 2003
At 06:59 PM 5/14/2003 -0700, Jeff Shannon wrote:
>Michael Barrett wrote:
>
>> The hard (?) part is the sorting. That's the part I need help with,
>> so assume a dictionary of 'Email': count values. If you can think of a
>> better way of storing the data in memory for my sort, that'd be
>> appreciated as well. Thanks again.
>
>Doing your counting *is* best done with a dictionary, especially if you
>make use of the get() method --
>
>emailcount = {}
>for line in logfile:
>    email = parse_email_addr(line)
>    emailcount[email] = emailcount.get(email, 0) + 1
>
>Once you have that dictionary of email:count values, you can convert that
>to a list and use the list's built-in sort() method. When sorting a list
>where each element is another list (or a tuple), the first element of the
>nested list is used to sort on
Oh? The whole nested list is compared, not just its first element. Consider:
>>> l = [[3,4], [3,2]]
>>> l.sort()
>>> l
[[3, 2], [3, 4]]
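To spell out what that means for the (count, email) pairs, here is a small sketch. The addresses and counts are made up for illustration. Sequences compare element by element, so the second element breaks ties in the first, and reverse() flips that tie-break as well.

```python
# Made-up (count, email) pairs, purely for illustration.
pairs = [(3, "carol@example.com"), (5, "bob@example.com"), (3, "alice@example.com")]

# Tuples compare element by element: count first, then email on a tie.
pairs.sort()
ascending = list(pairs)

# reverse() flips the whole order, so tied counts come out
# in reverse alphabetical order of address.
pairs.reverse()
descending = list(pairs)

print(ascending)
print(descending)
```

So with equal counts the addresses are sorted too, which may or may not be the order you want in the final report.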
>, so the best thing to do is to ensure that that first element is your count.
>
># first, use a list comprehension to create a list of (count, email) pairs
>email_list = [ (value, key) for key, value in emailcount.items() ]
># now sort the list
>email_list.sort()
># to sort from highest count to lowest count, reverse the list
>email_list.reverse()
># now print the first 25
>for count, email in email_list[:25]:
>    print("%25s %d" % (email, count))
>
>If there's any of this that doesn't make sense, I can explain in a little
>more detail...
>
>Jeff Shannon
>Technician/Programmer
>Credit International
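For anyone reading this on a current Python, the same count-then-sort idea can be sketched with collections.Counter and a sort key. This is only one possible approach, not what Jeff posted; the sample addresses are invented and stand in for whatever parse_email_addr() would return.

```python
from collections import Counter

# Invented sample data standing in for addresses parsed from a log file.
emails = ["a@example.com", "b@example.com", "a@example.com",
          "c@example.com", "b@example.com", "a@example.com"]

# Counter does the get(email, 0) + 1 bookkeeping for us.
emailcount = Counter(emails)

# Sort by descending count, then by address so ties come out alphabetically.
email_list = sorted(emailcount.items(), key=lambda item: (-item[1], item[0]))

# Print the top 25 (fewer if there are not that many addresses).
for email, count in email_list[:25]:
    print("%25s %d" % (email, count))
```

Negating the count in the key gives highest-first order without a separate reverse(), so ties stay in ascending address order.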
Bob Gailer
bgailer@alum.rpi.edu
303 442 2625