Uniquifying a list?

Ben Finney bignose+hates-spam at benfinney.id.au
Tue Apr 18 20:22:57 EDT 2006


Tim Chase <tim at thechases.com> writes:

> Is there an obvious/pythonic way to remove duplicates from a 
> list (resulting order doesn't matter, or can be sorted 
> postfacto)?  My first-pass hack was something of the form
> 
>  >>> myList = [3,1,4,1,5,9,2,6,5,3,5]
>  >>> uniq = dict([k,None for k in myList).keys()
> 
> or alternatively
> 
>  >>> uniq = list(set(myList))
> 
> However, it seems like there's a fair bit of overhead 
> here...

It seems you haven't timed these to find out which one actually takes
longer; the first one is a syntax error as written (the tuple needs
parentheses, and the list comprehension is missing its closing bracket).

Rather than guessing about overhead we can't put our finger on, let's
actually measure it:

$ python2.4 -m timeit -c 'raw = [3,1,4,1,5,9,2,6,5,3,5]; uniq = dict([(k,None) for k in raw]).keys()'
10000 loops, best of 3: 45 usec per loop

$ python2.4 -m timeit -c 'raw = [3,1,4,1,5,9,2,6,5,3,5]; uniq = list(set(raw))'
10000 loops, best of 3: 21 usec per loop


The set method is about twice as fast in this example (21 usec versus
45 usec). It also seems to do what you want. Why do you believe you
even need to create a list from the set?

$ python2.4 -m timeit -c 'raw = [3,1,4,1,5,9,2,6,5,3,5]; uniq = set(raw)'
100000 loops, best of 3: 13.1 usec per loop
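
If you'd rather run the comparison from a script than from the shell,
the following is a rough sketch using the standard timeit module. It
assumes a current Python 3 rather than the python2.4 used above, so the
dict variant wraps the result in list() (in Python 3, keys() returns a
view rather than a list). The names DATA, candidates and best_usec are
only for illustration, and the numbers will depend on your machine and
Python version.

```python
import timeit

# The same input list as in the shell commands; it is rebuilt inside
# every timed statement, just as it is there.
DATA = "raw = [3,1,4,1,5,9,2,6,5,3,5]; "

# One statement per approach (labels are illustrative).
candidates = {
    "dict keys": DATA + "uniq = list(dict([(k, None) for k in raw]))",
    "list(set)": DATA + "uniq = list(set(raw))",
    "set only": DATA + "uniq = set(raw)",
}


def best_usec(stmt, number=10000, repeat=3):
    """Return the best per-loop time for stmt, in microseconds."""
    totals = timeit.repeat(stmt, number=number, repeat=repeat)
    return min(totals) / number * 1e6


results = {label: best_usec(stmt) for label, stmt in candidates.items()}

for label, usec in results.items():
    print("%-10s %.2f usec per loop" % (label, usec))
```

Taking the minimum over the repeats mirrors the "best of 3" that the
command-line tool reports.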

Why not just create a set and use that?
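
As a small sketch of what "just use the set" could look like: the
common things one does with a de-duplicated list (membership tests,
counting, iterating, sorting when order matters) all work on the set
directly. This assumes a current Python 3, and the variable names are
made up for the example.

```python
raw = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]

# Build the set once; duplicates are dropped on construction.
uniq = set(raw)

# Membership tests and len() work directly on the set.
has_four = 4 in uniq
count = len(uniq)

# Iterating works too, though the iteration order is arbitrary.
total = sum(item for item in uniq)

# Only when a particular order is needed, sort at that point;
# sorted() accepts any iterable and returns a new list.
ordered = sorted(uniq)

print(has_four, count, total, ordered)
```

A list is only needed if you want indexing or a fixed order, and
sorted() gives you one in a single step.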

-- 
 \       "It is forbidden to steal hotel towels. Please if you are not |
  `\        person to do such is please not to read notice."  -- Hotel |
_o__)                                         sign, Kowloon, Hong Kong |
Ben Finney



