unique-ifying a list

Fri Aug 7 17:41:07 EDT 2009

On Aug 7, 1:53 pm, kj <no.em... at please.post> wrote:
> Suppose that x is some list.  To produce a version of the list with
> duplicate elements removed one could, I suppose, do this:
>
>     x = list(set(x))
>
> but I expect that this will not preserve the original order of
> elements.
>
> I suppose that I could write something like
>
> def uniquify(items):
>     seen = set()
>     ret = []
>     for i in items:
>         if not i in seen:
>             ret.append(i)
>             seen.add(i)
>     return ret
>
> But this seems to me like such a commonly needed operation that I
> find it hard to believe one would need to resort to such self-rolled
> solutions.  Isn't there some more standard (and hopefully more
> efficient, as in "C-coded"/built-in) approach?
>

Honestly, removing duplicates is pretty rare at the application
level. Unless you're writing some kind of database, I don't see why
you'd do it. (My recommendation is not to write databases.)

If you want unique elements, use a set. If you want to order them,
sort a list of the items in the set.
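
As a minimal sketch of that suggestion (the sample list below is made up
for illustration, and the items are assumed to be hashable and mutually
comparable):

```python
# Hypothetical sample data; any hashable, mutually comparable items would do.
items = [3, 1, 3, 2, 1]

unique_items = set(items)             # duplicates dropped, order arbitrary
sorted_unique = sorted(unique_items)  # new list in ascending order

print(sorted_unique)  # [1, 2, 3]
```

Note that this gives sorted order, not the order in which the items
first appeared.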

If you want to preserve the order, then using a dict may be even
better.

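    # Map each item to the index of its first occurrence.
    orig_order = dict(reversed([reversed(i) for i in enumerate(items)]))
    # Sort the unique items by that index to restore the original order.
    unique_ordered = sorted(orig_order.keys(), key=lambda k: orig_order[k])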

Hints to understanding:
* enumerate generates (index, item) pairs.
* We reverse each pair so that we get an item -> index mapping.
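* We reverse the whole list of pairs so that the earliest occurrences
  come last. Later pairs override earlier ones in dict(), so each item
  ends up mapped to the index of its first occurrence.
* Sorting the keys by that index puts the unique items back in their
  original order.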
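
Put together, the steps above might be wrapped up roughly as follows.
This is only a sketch: the function name unique_in_order and the sample
data are made up for illustration, and the items are assumed to be
hashable.

```python
def unique_in_order(items):
    """Return the distinct items of `items`, ordered by first occurrence."""
    # (index, item) pairs from enumerate, flipped into (item, index) pairs.
    pairs = [(item, index) for index, item in enumerate(items)]
    # Feed the pairs to dict() back to front so that, for a repeated item,
    # the pair with the smallest index is seen last and therefore wins.
    first_index = dict(reversed(pairs))
    # Sorting the keys by their recorded index restores the original order.
    return sorted(first_index, key=first_index.get)


sample = ["b", "a", "b", "c", "a"]  # hypothetical input
print(unique_in_order(sample))  # ['b', 'a', 'c']
```

Because of the final sort this runs in O(n log n) time, whereas the
loop-and-set uniquify() quoted above makes a single pass; for hashable
items both should give the same result.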