Appending a list's elements to another list using a list comprehension

Alex Martelli aleax at mac.com
Thu Oct 18 09:47:54 EDT 2007


Debajit Adhikary <debajit1 at gmail.com> wrote:
   ...
> How does "a.extend(b)" compare with "a += b" when it comes to
> performance? Does a + b create a completely new list that it assigns
> back to a? If so, a.extend(b) would seem to be faster. How could I
> verify things like these?

That's what the timeit module is for, but make sure that the snippet
you're timing has no side effects (since it's repeatedly executed).
E.g.:

brain:~ alex$ python -mtimeit -s'z=[1,2,3];b=[4,5,6]' 'a=z[:];a.extend(b)'
1000000 loops, best of 3: 0.769 usec per loop
brain:~ alex$ python -mtimeit -s'z=[1,2,3];b=[4,5,6]' 'a=z[:];a+=b'
1000000 loops, best of 3: 0.664 usec per loop
brain:~ alex$ python -mtimeit -s'z=[1,2,3];b=[4,5,6]' 'a=z[:];a.extend(b)'
1000000 loops, best of 3: 0.769 usec per loop
brain:~ alex$ python -mtimeit -s'z=[1,2,3];b=[4,5,6]' 'a=z[:];a+=b'
1000000 loops, best of 3: 0.665 usec per loop
brain:~ alex$ 

The repetition of the measurements shows them to be very steady, so now you
know that += is about 100 nanoseconds faster (on my laptop) than extend
(the reason is: it saves the tiny cost of looking up 'extend' on a; to
verify this, use much longer lists and you'll notice that while overall
times for both approaches increase, the difference between the two
approaches remains about the same for lists of any length).
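
The same check can also be run from inside Python instead of from the
shell.  What follows is only a minimal sketch, assuming Python 3 and the
standard timeit.repeat function; the helper name best_per_loop and the
list lengths are made up for illustration and are not part of the
measurements above.  It times both snippets at several list lengths so
that the gap between them can be compared as the lists grow.

```python
import timeit


def best_per_loop(stmt, n, repeat=3):
    """Best per-loop time in seconds for stmt, using lists of length n.

    best_per_loop is a hypothetical helper name, not part of timeit.
    """
    # Build the lists once in setup; the snippet copies z, so every
    # repetition starts from a list of the same length.
    setup = "z = list(range({0})); b = list(range({0}))".format(n)
    # Fewer loops for longer lists, to keep the total run time short.
    number = max(100, 200000 // n)
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number


if __name__ == "__main__":
    for n in (3, 300, 30000):
        t_extend = best_per_loop("a = z[:]; a.extend(b)", n)
        t_iadd = best_per_loop("a = z[:]; a += b", n)
        print("n=%6d  extend=%.3f usec  +=%.3f usec  gap=%+.3f usec"
              % (n, t_extend * 1e6, t_iadd * 1e6, (t_extend - t_iadd) * 1e6))
```

For long lists the gap may well be swamped by run-to-run jitter, so it
is worth raising repeat before reading much into the last column.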

But the key point to retain is: make sure that the snippet is free of
side effects, so that each of the MANY repetitions that timeit does is
repeating the SAME operation.  If we initialized a in the -s and then
just extended it in the snippet, we'd be extending a list that keeps
growing at each repetition -- a very different operation than extending
a list of a certain fixed starting length (here, serendipitously, we'd
end up measuring the same difference -- but in the general case, where
the timing difference between approaches DOES depend on the sizes of the
objects involved, our measurements would instead become meaningless).
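
To see the side effect directly, here is a small sketch, assuming
Python 3.5 or later (where timeit.timeit accepts a globals argument).
The dictionary names growing and fixed are illustrative only.  The lists
are handed to timeit through globals so that their lengths can be
inspected after the run.

```python
import timeit

number = 10000

# Side-effecting snippet: the list lives outside the snippet and is
# extended in place, so it keeps growing across repetitions.
growing = {"a": [1, 2, 3], "b": [4, 5, 6]}
timeit.timeit("a.extend(b)", globals=growing, number=number)
grown_len = len(growing["a"])

# Side-effect-free snippet: copy the auxiliary list z on every pass.
fixed = {"z": [1, 2, 3], "b": [4, 5, 6]}
timeit.timeit("a = z[:]; a.extend(b)", globals=fixed, number=number)
fixed_len = len(fixed["z"])

print("list extended in place ended with", grown_len, "items")
print("auxiliary list still has", fixed_len, "items")
```

The first list ends up with 3 + 3 * number items, which is to say that
the later repetitions were extending a far longer list than the early
ones, while the auxiliary list in the second case keeps its 3 items.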

Therefore, we initialize in -s an auxiliary list, and copy it in the
snippet.  That's better than the more natural alternative:

brain:~ alex$ python -mtimeit 'a=[1,2,3];a+=[4,5,6]'
1000000 loops, best of 3: 1.01 usec per loop
brain:~ alex$ python -mtimeit 'a=[1,2,3];a.extend([4,5,6])'
1000000 loops, best of 3: 1.12 usec per loop
brain:~ alex$ python -mtimeit 'a=[1,2,3];a+=[4,5,6]'
1000000 loops, best of 3: 1.02 usec per loop
brain:~ alex$ python -mtimeit 'a=[1,2,3];a.extend([4,5,6])'
1000000 loops, best of 3: 1.12 usec per loop

as in this "more natural alternative" we're also paying each time
through the snippet the cost of building the literal lists; this
overhead (which is a lot larger than the difference we're trying to
measure!) does not DISTORT the measurement but it sure OBSCURES it to
some extent (losing us about one significant digit worth of difference
in this case).  Remember, the WORST simple operation you can do in
measurement is gauging a small number delta as the difference of two
much larger numbers X and X+delta... so, make X as small as feasible to
reduce the resulting loss of precision!-)
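
As a rough illustration of that loss, the sketch below redoes the
subtraction on the figures quoted above (first run of each).  It assumes
that the last printed digit is the smallest change timeit's output can
show; the function name delta_and_relative_step and the "step" measure
are my own shorthand, not a formal error analysis.

```python
# Timings quoted above, in microseconds per loop.
copied = {"extend": 0.769, "iadd": 0.664}    # lists copied from the -s setup
literal = {"extend": 1.12, "iadd": 1.01}     # literal lists built in the snippet


def delta_and_relative_step(timings, step):
    """Return (delta, step / delta) for a pair of printed timings.

    step is the smallest change visible in the printed figures, so
    step / delta is a rough measure of how coarsely delta is resolved.
    """
    delta = timings["extend"] - timings["iadd"]
    return delta, step / delta


for label, timings, step in (("copied", copied, 0.001),
                             ("literal", literal, 0.01)):
    delta, rel = delta_and_relative_step(timings, step)
    print("%-8s delta = %.3g usec, printed step is about %.0f%% of delta"
          % (label, delta, rel * 100))
```

The delta itself is about the same in both cases, but with the larger
X the printed step is roughly ten times coarser relative to it.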

You can find more details on command-line use of timeit at
<http://docs.python.org/lib/node808.html> (see adjacent nodes in Python
docs for examples and details on the more advanced use of timeit inside
your own code) but I hope these indications may be of help anyway.


Alex


