[Baypiggies] hello and returning unique elements

Hy Carrinski hcarrinski at gmail.com
Fri Apr 8 22:10:04 CEST 2011


Thank you for demonstrating a simpler and faster solution. It is
equivalent to the one Andrew Dalke suggested back in 2006 in the
comments of the blog post I originally referenced. My performance
comparisons had been only against even older solutions (designed prior
to Python 2.4) based on dictionaries rather than sets. The advantages
are now clear: store a set of the objects already encountered, append
each new object to a list in the preferred order, and, better yet,
make the function a generator.
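
For the archive, here is a minimal sketch of how I understand the
difference. The names unique_dict and unique_gen are my own and
hypothetical: unique_dict only approximates the older dictionary-based
style and is not the code from the blog, and unique_gen is a variation
on Tim's unique3. Both assume the elements are hashable.

```python
from itertools import count, islice


def unique_dict(seq):
    """Older style (an approximation): dict keys stand in for a set."""
    seen = {}
    out = []
    for item in seq:
        if item not in seen:
            seen[item] = True
            out.append(item)
    return out


def unique_gen(seq):
    """Set-based generator: yields each element the first time it appears."""
    seen = set()
    for item in seq:
        if item not in seen:
            seen.add(item)
            yield item


data = [3, 1, 3, 2, 1]
assert unique_dict(data) == [3, 1, 2]
assert list(unique_gen(data)) == [3, 1, 2]

# Because the generator is lazy, it also works on an endless input:
# take the first three distinct values and stop.
endless = (n % 5 for n in count())
first_three = list(islice(unique_gen(endless), 3))
```

A list-returning version would never finish on the endless input,
which seems to me the main practical gain from the generator beyond
the small timing difference.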

Thanks,
Hy

On Fri, Apr 8, 2011 at 10:49 AM, Tim Hatch <tim at timhatch.com> wrote:
> By some quick experiments, it looks like you can double the speed from
> your version by not doing all the reversed/sorted stuff and just keeping
> track of it in a much less exciting way:
>
> def unique2(seq):
>     seen = set()
>     out = []
>     for i in seq:
>         if i not in seen:
>             out.append(i)
>             seen.add(i)
>     return out
>
> # yours
> $ python -m timeit -s 'from demo import unique' 'sum(unique([1, 1, 2, 3, 4]))'
> 100000 loops, best of 3: 5.07 usec per loop
> # mine
> $ python -m timeit -s 'from demo import unique2' 'sum(unique2([1, 1, 2, 3, 4]))'
> 100000 loops, best of 3: 2.65 usec per loop
>
> You can actually make it a smidge faster still by making it a generator,
> for at least this tiny case (as it becomes bigger, into the thousands of
> items, it's much more like unique2's timings):
>
> def unique3(seq):
>     seen = set()
>     for i in seq:
>         if i not in seen:
>             yield i
>             seen.add(i)
>
> $ python -m timeit -s 'from demo import unique3' 'sum(unique3([1, 1, 2, 3, 4]))'
> 1000000 loops, best of 3: 1.99 usec per loop
>
> Tim

