[Tutor] Sorting against another list

Christopher Smith csmith@blakeschool.org
Fri, 13 Jul 2001 22:47:49 -0500


Remco Gerlich wrote:
> On  0, Sharriff Aina <NHYTRO@compuserve.com> wrote:
> > I have a list that is created dynamically, how do I sort this list so
that
> > it matches amother "mater template" list?
> > 
> > example:
> > [ 'aa', 'qq', 'ttt', 'hh']  # master list
> > 
> > my dynamic list cotains all or only a few of the elements in the master
> > list:
> > 
> > ['ttt', 'hh', 'aa'] ### i would like to sort it to produce ['aa''ttt',
> > 'hh']  accoriding to the sequense in the master list
> > 
> > or [ 'hh', 'qq' 'ttt'] ### would like ['qq', 'ttt', 'hh']
> >  
> > I would like to sort these example list so that they at least follow
the
> > order of the master list.
> 
> Hmm. Ok, we have two lists: 'master' and 'dynamic'.
> 
> You can give an argument to .sort(), an alternative way to compare two
> items. You want to compare them using the order of the master list. The
> easiest way to spell that is this:
> 
> def compare_using_master_list(a, b):
>    # Compare the indices of the two items in the master list
>    return cmp(master.index(a), master.index(b))
> 
> dynamic.sort(compare_using_master_list)
> 

Or to be able to allow *any* list to be used as your master,

        def cmp_using(l):
                return lambda a,b: cmp(l.index(a), l.index(b))
                
        dynamic.sort(cmp_using(master))

<cut>

>    
> def compare_using_master_cache(a, b):
>    # Compare the cached index values of a and b
>    return cmp(cachedict[a], cachedict[b])
>    
> dynamic.sort(compare_using_master_cache)
> 

You can do the same sort of trick here, too.

> This should be faster with all but the most trivial lists (but I didn't
test
> it).
> 

I tested it and it was significantly faster--about 3x for n=100,
but then because the list lookup approach is O(n^2) it gets
significantly slower as n increases.

Here are the times for using the dictionary approach
        n       seconds         method
#       10      0.00168800354   Sort(master(d))
#       100     0.010548114777  Sort(master(d))
#       1000    0.184627056122  Sort(master(d))
#       3000    0.680903911591  Sort(master(d))

...and using the list approach
#       10      0.0014672279358 Sort(master(l))
#       100     0.031841278076  Sort(master(l))
#       1000    3.28976583481   Sort(master(l))
#       3000    34.4554972649   Sort(master(l))

Slightly faster than the dictionary approach was the approach
below which adds indices to the master list and then appends
these indices onto the dynamic list, sorts it, and then removes
the indices.  It uses the zip(l1,l2) function which is equivalent
to map(None,l1,l2).

#       10      0.0011219978332 sortwrt
#       100     0.0105338096619 sortwrt
#       1000    0.136965751648  sortwrt
#       3000    0.47709608078   sortwrt

def sortwrt(l,master): #return l sorted wrt master
        '''Sort a list using a master list for the sort order.'''
        #
        # Indices are zipped into the master and then both
        # lists are sorted.  Then the elements of the list get
        # the indices linked in, a sort is performed on the indices
        # and then the indices are removed.  All modifications to
        # the list are done in place.  If an item does not appear
        # in the master, a statement is printed and a normally sorted
        # list is returned.
        #
        #       Christopher P. Smith, 7/13/2001
        #
        ml=master+[]
        ml=zip(ml,range(len(ml)))
        ml.sort()
        l.sort()
        i=j=0
        imax,jmax=len(l),len(ml)
        while i<imax and j<jmax:
                while i<imax and l[i]==ml[j][0]:
                        l[i]=(ml[j][1],ml[j][0])
                        i+=1
                if i<imax:
                        while j<jmax and l[i]<>ml[j][0]:
                                j+=1

	if j==jmax: #this is an error condition...perhaps raise an exception
		print l[i],"not found in master."
		imax=i #this is how many modifications will need to be undone
	else:
		l.sort()
        # 
        # Now remove the sorting keys
        #
        # Note: must modify in place rather than doing something like
        #  'l=[i[1] for i in l]' 
        # which would create a local l and not affect the original
        #
        for k in range(imax):
                l[k]=l[k][1]
        
/c