[Tutor] sorting objects on two attributes

Eric Abrahamsen eric at abrahamsen.com
Tue Mar 4 03:50:25 CET 2008


Well I expected to learn a thing or two, but I didn't expect not to  
understand the suggestions at all! :) Thanks to everyone who  
responded, and sorry for the typo (it was meant to be object_id  
throughout, not content_type).

So far Michael's solution works and is most comprehensible to me – ie  
I stand a fighting chance of figuring it out.

I didn't realize you could make direct use of SortedDicts in django,  
so that's worth a try, too.

Itertools.groupby is totally impenetrable to me, but works as well! Is  
there any consensus about which might go faster? I will go right now  
and google until my brain is full.

Thanks again,

Eric


On Mar 4, 2008, at 12:56 AM, Michael H. Goldwasser wrote:

>
>
> Hello Eric,
>
>  Your basic outlook is fine, but you can do it much more efficiently
>  with a single sort.   Here's the way I'd approach the task  
> (untested):
>
>  # --------------------------------------------------
>  # first compute the latest date for each id group; uses O(n) time
>  newest = {}
>  for q in queryset:
>      id,date = q.object_id, q.submit_date
>      if id not in newest or date > newest[id]:
>          newest[id] = date
>
>  # now sort based on the following decorator as a key
>  data.sort(reverse=True, key=lambda x: (newest[x.object_id],  
> x.object_id, x.submit_date))
>  # --------------------------------------------------
>
>  In essence, you compute the max date within each group, but your
>  approach (i.e., building explicit sublists and then repeatedly
>  calling max on those sublists) is far more time-consuming than the
>  above dictionary based approach.
>
>  Note well that using a tuple as a decorator key is more efficient
>  than calling sort separately for each subgroup.   The
>  lexicographical order of the following
>
>    (newest[x.object_id], x.object_id, x.submit_date)
>
>  should produce the order that you desire. The reason for the middle
>  entry is to ensure that items groups by object_id in the case that
>  two different groups achieve the same maximum date.  It wasn't clear
>  to me whether you wanted elements within groups from oldest to
>  newest or newest to oldest. I believe that the code I give produces
>  the ordering that you intend, but you may adjust the sign of the
>  decorate elements if necessary.
>
> With regard,
> Michael
>
>       +-----------------------------------------------
>       | Michael Goldwasser
>       | Associate Professor
>       | Dept. Mathematics and Computer Science
>       | Saint Louis University
>       | 220 North Grand Blvd.
>       | St. Louis, MO 63103-2007
>
>
> On Monday March 3, 2008, Eric Abrahamsen wrote:
>
>>   I have a grisly little sorting problem to which I've hacked  
>> together a
>>   solution, but I'm hoping someone here might have a better  
>> suggestion.
>>
>>   I have a list of objects, each of which has two attributes,  
>> object_id
>>   and submit_date. What I want is to sort them by content_type,  
>> then by
>>   submit_date within content_type, and then sort each content_type  
>> block
>>   according to which block has the newest object by submit_date.  
>> (This
>>   sequence of sorting might not be optimal, I'm not sure). I'm  
>> actually
>>   creating a list of recent comments on blog entries for a python- 
>> based
>>   web framework, and want to arrange the comments according to blog
>>   entry (content_type), by submit_date within that entry, with the
>>   entries with the newest comments showing up on top.
>>
>>   I don't believe a single cmp function fed to list.sort() can do  
>> this,
>>   because you can't know how two objects should be compared until you
>>   know all the values for all the objects. I'd be happy to be proven
>>   wrong here.
>>
>>   After some false starts with dictionaries, here's what I've got.
>>   Queryset is the original list of comments (I'm doing this in  
>> django),
>>   and it returns a list of lists, though I might flatten it  
>> afterwards.
>>   It works, but it's ghastly unreadable and if there were a more
>>   graceful solution I'd feel better about life in general:
>>
>>
>>   def make_com_list(queryset):
>>       ids = set([com.object_id for com in queryset])
>>       xlist = [[com for com in queryset if com.object_id == i] for  
>> i in
>>   ids]
>>       for ls in xlist:
>>           ls.sort(key=lambda x: x.submit_date)
>>       xlist.sort(key=lambda x: max([com.submit_date for com in x]),
>>   reverse=True)
>>       return xlist
>>
>>   I'd appreciate any hints!
>>
>>   Thanks,
>>   Eric
>>
>



More information about the Tutor mailing list