finding items that occur more than once in a list

Bryan Olson fakeaddress at nowhere.org
Tue Mar 18 07:24:12 EDT 2008


Simon Forman wrote:
> Is there a more efficient way to do this?
> 
> def f(L):
>     '''Return a set of the items that occur more than once in L.'''
>     L = list(L)
>     for item in set(L):
>         L.remove(item)
>     return set(L)

That's neat, but quadratic time because list.remove() requires
a linear search. We can make an efficient variant by using
remove on a set rather than a list:

     def multiples(lst):
         singles = set(lst)
         mults = set()
         for x in lst:
             if x in singles:
                 singles.remove(x)
             else:
                 mults.add(x)
         return mults

Though probably better is:

     def multiples(lst):
         seen = set()
         mults = set()
         for x in lst:
             if x in seen:
                 mults.add(x)
             else:
                 seen.add(x)
         return mults


I've typically used dicts for such things, as in:

     def multiples(lst):
         h = {}
         for x in lst:
             h[x] = h.get(x, 0) + 1
         return set([x for x in h if h[x] > 1])


-- 
--Bryan



More information about the Python-list mailing list