Merging ordered lists
Peter Otten
__peter__ at web.de
Tue Jun 3 04:22:55 EDT 2008
etal wrote:
> > def unique(items):
> > u = set(items)
> > if len(u) == len(items):
> > return items
> > result = []
> > for item in items:
> > if item in u:
> > result.append(item)
> > u.remove(item)
> > return result
>
> You did right by preserving the original (non-alphabetical) ordering,
> but I'm less enthusiastic about the shape of this function. My
> original function used 7 lines of code, and only 1 for the unique()
> step. This uses up to three container objects. Is it really an
> improvement?
Yes :)
Seriously, you are using O(n) containers and O(n) lookup where mine uses
O(1). For short lists it doesn't matter, but as the list length grows the
difference gets huge:
$ cat unique.py
def unique(items):
u = set(items)
if len(u) == len(items):
return items
result = []
for item in items:
if item in u:
result.append(item)
u.remove(item)
return result
def unique_lc(ref):
return [r for i, r in enumerate(ref) if r and r not in ref[i+1:]]
sample_a = range(1000)
sample_b = range(1000)
import random
for i in random.sample(sample_b, 10):
sample_b[i] = 0
$ python -m timeit -s"from unique import unique as u, sample_a as s" "u(s)"
10000 loops, best of 3: 52.8 usec per loop
$ python -m timeit -s"from unique import unique_lc as u, sample_a as
s" "u(s)"
10 loops, best of 3: 44.2 msec per loop
3 orders of magnitude for the shortcut.
$ python -m timeit -s"from unique import unique as u, sample_b as s" "u(s)"
1000 loops, best of 3: 646 usec per loop
$ python -m timeit -s"from unique import unique_lc as u, sample_b as
s" "u(s)"
10 loops, best of 3: 39 msec per loop
Still 2 orders of magnitude if my unique() does some actual work.
Peter
More information about the Python-list
mailing list