Sorting by item_in_another_list

Steven Bethard steven.bethard at gmail.com
Thu Oct 26 03:48:27 EDT 2006


Cameron Walsh wrote:
> Which brings me to the question, would this solution:
> 
> B = set(B)
> A = B + list(x for x in A if x not in B)
> 
> be faster than this solution:
> 
> B = set(B)
> A.sort(key=B.__contains__, reverse=True)
> 
> My guess is yes, since while the __contains__ method is only run once 
> for each object, the list still does not require sorting.

You're probably going to need to time that for your particular data. 
Here's what it looks like on an artificial data set:

$ python -m timeit -s "A = range(100); B = range(40, 60, 2); B_set = 
set(B)" "A = B + list(x for x in A if x not in B_set)"
10000 loops, best of 3: 54.5 usec per loop

$ python -m timeit -s "A = range(100); B = range(40, 60, 2); B_set = 
set(B)" "A.sort(key=B_set.__contains__, reverse=True)"
10000 loops, best of 3: 39.7 usec per loop

That said, I'd probably still use the first solution -- it's more 
immediately obvious why that one works.

STeVe



More information about the Python-list mailing list