[Tutor] Counting & Sorting Instances In File

Bob Gailer bgailer@alum.rpi.edu
Thu May 15 09:07:01 2003


At 06:59 PM 5/14/2003 -0700, Jeff Shannon wrote:

>Michael Barrett wrote:
>
>>    The hard (?) part is the sorting.  That's the part I need help with, 
>> so assume a dictionary of 'Email': count values.  If you can think of a 
>> better way of storing the data in memory for my sort, that'd be 
>> appreciated as well.  Thanks again.
>
>Doing your counting *is* best done with a dictionary, especially if you 
>make use of the get() method --
>
>for line in logfile:
>    email = parse_email_addr(line)
>    emailcount[email] = emailcount.get(email, 0) + 1
>
>Once you have that dictionary of email:count values, you can convert that 
>to a list and use the list's built-in sort() method.  When sorting a list 
>where each element is another list (or a tuple), the first element of the 
>nested list is used to sort on

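Oh? The first element is compared first, but when two first elements are 
equal the comparison falls through to the second element. Consider: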
 >>> l = [[3,4], [3,2]]
 >>> l.sort()
 >>> l
[[3, 2], [3, 4]]
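
The same tie-breaking applies to (count, email) tuples, which matters when 
several addresses share a count. Here is a small sketch of the effect; the 
addresses are made-up sample data, and negating the count is just one 
possible way to keep tied emails in alphabetical order, not something from 
the quoted code:

```python
# Hypothetical sample data: (count, email) pairs with a tie on the count.
pairs = [(3, "bob@example.com"), (5, "carol@example.com"), (3, "alice@example.com")]

# Plain sort: count ascending, then email ascending among equal counts.
ascending = list(pairs)
ascending.sort()

# sort + reverse: count descending, but tied emails are reversed as well.
descending = list(pairs)
descending.sort()
descending.reverse()

# Sorting on the negated count gives count descending while tied emails
# stay in ascending order; the sign is flipped back afterwards.
decorated = [(-count, email) for count, email in pairs]
decorated.sort()
by_count_then_email = [(-neg, email) for neg, email in decorated]
```

With sort() followed by reverse(), the two addresses with count 3 come out 
as bob, then alice. With the negated count they come out as alice, then bob.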

>, so the best thing to do is to ensure that that first element is your count.
>
># first, use a list comprehension to create a list of (count, email) pairs
>email_list = [ (value, key) for key, value in emailcount.items() ]
># now sort the list
>email_list.sort()
># to sort from highest count to lowest count, reverse the list
>email_list.reverse()
># now print the first 25
>for count, email in email_list[:25]:
>    print "%25s     %d"  % (email, count)
>
>If there's any of this that doesn't make sense, I can explain in a little 
>more detail...
>
>Jeff Shannon
>Technician/Programmer
>Credit International
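
Putting the quoted pieces together, a minimal end-to-end sketch might look 
like the following. The body of parse_email_addr() and the sample log lines 
are assumptions for illustration (the original parser was not shown), and 
top_emails() is a hypothetical wrapper name. print is written with 
parentheses so the same line runs on both older and newer Pythons.

```python
def parse_email_addr(line):
    # Hypothetical stand-in for the parser in the quoted code: assumes the
    # address is the last whitespace-separated field on the line.
    return line.split()[-1]

def top_emails(lines, limit=25):
    # Count occurrences with a dictionary and get(), as in the quoted code.
    emailcount = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        email = parse_email_addr(line)
        emailcount[email] = emailcount.get(email, 0) + 1
    # Build (count, email) pairs so the count is compared first.
    email_list = [(count, email) for email, count in emailcount.items()]
    email_list.sort()
    email_list.reverse()  # highest count first
    return email_list[:limit]

# Made-up log lines for illustration only.
sample_log = [
    "msg1 a@example.com",
    "msg2 b@example.com",
    "msg3 a@example.com",
    "",
    "msg4 c@example.com",
    "msg5 a@example.com",
    "msg6 c@example.com",
]

for count, email in top_emails(sample_log):
    print("%25s     %d" % (email, count))
```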

Bob Gailer
bgailer@alum.rpi.edu
303 442 2625