fastest way to find the intersection of n lists of sets

Raymond Hettinger python at rcn.com
Mon Apr 30 03:37:41 EDT 2007


[Prateek]
> The reason why I'm casting to a list first is because I found that
> creating a long list which I convert to a set in a single operation is
> faster (although probably less memory efficient - which I can deal
> with) than doing all the unions.

That would be a surprising result because set-to-set operations do
not have to recompute hash values.  Also, underneath-the-hood,
both approaches share the exact same implementation for inserting
new values one the hash value is known.

If you've seen an unfavorable speed comparison, then you most likely
had code that built new intermediate sets between step:

   common = s1 | s2 | s3 | s4 | s5 | s6 | s7 | s8 | s9

Instead, it is faster to build-up a single result set:

    common = set()
    for s in s1, s2, s3, s4, s5, s6, s7, s8, s9:
        common |= s


Raymond Hettinger




More information about the Python-list mailing list