Is there any advantage or disadvantage to using sets over list comps to ensure a list of unique entries?

Mon Jun 20 15:59:57 EDT 2011

On Mon, Jun 20, 2011 at 1:43 PM, deathweaselx86 <deathweasel at gmail.com> wrote:
> Howdy guys, I am new.
>
> I've been converting lists to sets, then back to lists again to get
> unique lists.
> e.g
>
> Python 2.5.2 (r252:60911, Jan 20 2010, 21:48:48)
> [GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> foo = ['1','2','3']
>>>> bar = ['2','5']
>>>> foo.extend(bar)
>>>> foo = list(set(foo))
>>>> foo
> ['1', '3', '2', '5']
>
> I used to use list comps to do this instead.
>>>> foo = ['1','2','3']
>>>> bar = ['2','5']
>>>> foo.extend([a for a in bar if a not in foo])
>>>> foo
> ['1', '2', '3', '5']
>
> A very long time ago, we all used dictionaries, but I'm not interested
> in that ever again. ;-)
> Is there any performance hit to using one of these methods over the
> other for rather large lists?

Yes.  In the second approach, "if a not in foo" is O(n) if foo is a
list, and since you're doing it m times, that's O(n * m).

The first approach is merely O(n + m).