fastest way to find the intersection of n lists of sets

James Stroud jstroud at mbi.ucla.edu
Sun Apr 29 18:48:09 EDT 2007


Prateek wrote:
> I have 3 variable length lists of sets. I need to find the common
> elements in each list (across sets) really really quickly.
> 
> Here is some sample code:
> 
> # Doesn't make sense to union the sets - we're going to do
> intersections later anyway
> l1 = reduce(operator.add, list(x) for x in l1)
> l2 = reduce(operator.add, list(x) for x in l2)
> l3 = reduce(operator.add, list(x) for x in l3)
> 
> # Should I do this in two steps? Maybe by intersecting the two
> shortest lists first?
> s = frozenset(l1) & frozenset(l2) & frozenset(l3)
> 
> I'm assuming frozensets are (somehow) quicker than sets because
> they're immutable.
> 
> Any code suggestions? Maybe using something in the new fancy-schmancy
> itertools module?
> 
> Thanks,
> Prateek
> 

I don't understand why you cast to list. I would propose:

lists_of_sets = [l1, l2, l3]

reduce(set.intersection, (reduce(set.union, x) for x in lists_of_sets))

Since this is simplest, I'm guessing it should be fastest because I'm 
also guessing that set impelmentation is as optimized as list--I think 
this would hold true especially for large sets with sparse overlap 
between lists. So it might be reasonable consider the content of your 
sets and lists and write your code based on the content or on assuming a 
particular composition.

James



More information about the Python-list mailing list