Tough sorting problem: or, I'm confusing myself

Raymond Hettinger python at rcn.com
Sun Apr 11 15:39:25 EDT 2010


On Apr 9, 8:03 am, david jensen <dmj.... at gmail.com> wrote:
> Hi all,
>
> I'm trying to find a good way of doing the following:
>
> Each n-tuple in combinations( range( 2 ** m ), n ) has a corresponding
> value n-tuple (call them "scores" for clarity later). I'm currently
> storing them in a dictionary, by doing:
>
> ----
> res={}
> for i in itertools.combinations( range( 2**m ) , n):
>     res[ i ] = getValues( i )    # getValues() is computationally
> expensive
> ----
>
> For each (n-1)-tuple, I need to find the two numbers that have the
> highest scores versus them. I know this isn't crystal clear, but
> hopefully an example will help: with m=n=3:
>
> Looking at only the (1, 3) case, assuming:
> getValues( (1, 2, 3) ) == ( -200, 125, 75 )    # this contains the
> highest "other" score, where 2 scores 125
> getValues( (1, 3, 4) ) == ( 50, -50, 0 )
> getValues( (1, 3, 5) ) == ( 25, 300, -325 )
> getValues( (1, 3, 6) ) == ( -100, 0, 100 )    # this contains the
> second-highest, where 6 scores 100
> getValues( (1, 3, 7) ) == ( 80, -90, 10  )
> getValues( (1, 3, 8) ) == ( 10, -5, -5 )
>
> I'd like to return ( (2, 125), (6, 100) ).
>
> The most obvious (to me) way to do this would be not to generate the
> res dictionary at the beginning, but just to go through each
> combinations( range( 2**m), n-1) and try every possibility... this
> will test each combination n times, however, and generating those
> values is expensive. [e.g. (1,2,3)'s scores will be generated when
> finding the best possibilities for (1,2), (1,3) and (2,3)]
>
> What I'm doing now is ugly, and i think is where i'm confusing myself:
>
> ----
> best2={}
> for i in itertools.combinations( range( 2**m), n-1):
>     scorelist=[]
>     for j in range( 2**m ):
>         if j not in i:
>             k=list(i)
>             k.append(j)
>             k=tuple(sorted(k))    #gets the key for looking up the
> scores in res
>             scorelist.append((j,res[k][k.index(j)]))
>     best2[i]=sorted(scorelist,key=lambda x: -x[1])[:2]
> ----
>
> Am I missing an obviously better way?

The overall algorithm looks about right.
The inner-loop could be tighted-up a bit.
And you could replace the outer sort with a heap.

best2 = {}
for i in itertools.combinations(range( 2**m), n-1):
    scorelist = []
    for j in range( 2**m ):
        if j not in i:
            k = tuple(sorted(i + (j,)))
            scorelist.append((j, res[k][k.index(j)]))
    best2[i] = heapq.nlargest(2, scorelist,
key=operator.itemgetter(1))


Raymond



More information about the Python-list mailing list