fastest way to find the intersection of n lists of sets

Raymond Hettinger python at rcn.com
Mon Apr 30 03:28:40 EDT 2007


[Prateek]
> I have 3 variable length lists of sets. I need to find the common
> elements in each list (across sets) really really quickly.
. . .
> l1 = reduce(operator.add, list(x) for x in l1)
> l2 = reduce(operator.add, list(x) for x in l2)
> l3 = reduce(operator.add, list(x) for x in l3)
> s = frozenset(l1) & frozenset(l2) & frozenset(l3)

I would write it like this:

def multi_union(listofsets):
    result = set()
    for s in listofsets:
        result |= s
    return result

def multi_intersection(listofsets):
    return reduce(set.intersection, sorted(listofsets, key=len))

s = multi_intersection(map(multi_union, [l1, l2, l3]))


> I'm assuming frozensets are (somehow) quicker than sets because
> they're immutable.

Frozensets share the same implementation code as regular sets.
They are not more efficient.


> Any code suggestions? Maybe using something in the new fancy-schmancy
> itertools module?

The sets.py module shows how to implement set operations using
itertools.
In general, real set objects should do as well or better than anything
you can cook-up using itertools.

Real set objects have the advantage of knowing the hash values of
their
elements.  Accordingly, binary set operations can run without any
calls to element.__hash__().


Raymond Hettinger




More information about the Python-list mailing list