When Python outruns C++ ...

Roy Smith roy at panix.com
Tue Apr 1 13:08:59 EST 2003


Mike Meyer  <mwm at mired.org> wrote:
> The trick to getting good performance out of interpreted languages is
> doing things to take advantage of those libraries.

This is a very true statement, and one which I think is
under-appreciated by a lot of people.

Over the past couple of years, I've been part of a group that wrote
about 18 KLOC in TCL.  One thing we found over and over again is that
if you care about speed, you need to throw away a lot of your
pre-conceptions about what's fast and what's not.  This is as true in
Python or Perl or Java as it is in TCL. To a certain extent, it's even
true of raw C++ vs. STL.

For example, let's say you're building up a list and at the end, you
want to know how many elements you added to the list.  The obvious
thing to do would be to keep track of how many elements you add as you
build the list.  It certainly seems like it should be more efficient
than building the list and then calling llength (that's TCL for
len()).  In fact, that's true in C or C++, but not true in most
interpreted languages.  It certainly wasn't true in TCL (we did
measurements).

I just tried it in Python, and (not surprisingly), found that it's not
true in Python either.  The program at the end of this post builds a 2
million element list two different ways.  One way, it counts the
elements as it appends them.  The other way, it just appends them and
calls len() on the result.  When I run it on my 700 MHz Pentium Linux
box, I get:

-----------------------------------------
The first list is 2000000 long
It took 7.678 seconds to build, 0.000 to count, 7.678 total
The second list is 2000000 long
It took 9.103 seconds to build and count it
-----------------------------------------

Once you've performed the experiment, the reason is obvious.  The
naive thought that llength, or len(), would have to walk the list to
count the elements in linear time is just plain wrong; in both cases
it's got the length stored somewhere and can return it in constant
time.  By building the list, you've already paid for the cost of
computing its length.
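A quick way to convince yourself of this (my own illustration, not part
of the original measurements) is to time len() on a small list and on a
list a thousand times larger.  If len() walked the list, the second
timing would be roughly a thousand times the first; because the length
is stored on the list object, the two come out essentially the same:

```python
import timeit

small = list(range(1000))
big = list(range(1000000))

# Time 100,000 calls of len() on each list.
t_small = timeit.timeit(lambda: len(small), number=100000)
t_big = timeit.timeit(lambda: len(big), number=100000)

# Constant time: t_big is about the same as t_small, not ~1000x bigger.
print("small: %.4f  big: %.4f" % (t_small, t_big))
```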

The bottom line is exactly what Mike said: in interpreted languages,
the more you can push down into the libraries and interpreter core,
the better off you are most of the time.  And profiling helps too :-)

--------------------------------------

#!/usr/bin/env python

import time

n = 2000 * 1000

t0 = time.time()
l = []
for i in range(n):
    l.append(i)

t1 = time.time()
length = len(l)
t2 = time.time()

print "The first list is %d long" % length
print "It took %3.3f seconds to build, %3.3f to count, %3.3f total" \
      % (t1-t0, t2-t1, t2-t0)

del l

t0 = time.time()
l = []
length = 0
for i in range(n):
    l.append(i)
    length += 1
t1 = time.time()

print "The second list is %d long" % length
print "It took %3.3f seconds to build and count it" % (t1 - t0)
