General Numerical Python question

Alex Martelli aleaxit at yahoo.com
Sun Oct 12 11:45:44 EDT 2003


2mc wrote:

> Generally speaking, if one had a list (from regular Python) and an
> array (from Numerical Python) that contained the same number of
> elements, would a While loop or a For loop process them at the same
> speed?  Or, would the array process faster?
> 
> I'm new to Python, so my question may expose my ignorance.  I
> appreciate anyone's effort to help me understand.

I don't know, I've never measured.  Let's find out together.

The best way to answer these performance questions, which may
easily vary a little depending on your platform and exact versions
involved, is to _measure_.  Python 2.3's standard library comes with
timeit.py, a little script that's made just for that.  I've copied it to my
~/bin/ directory and done a chmod +x (it starts with a shebang line
so that's sufficient), or in Windows you might set up a .bat or .cmd
file to call Python on it.  Anyway, it's easy to use: you specify zero
or more -s 'blahblah' arguments to set things up, then the specific
statement you want to time.  Watch...:

[alex at lancelot pop]$ timeit.py -s'import Numeric' -s'x=Numeric.arange(555)' 'for i in x: id(i)'
1000 loops, best of 3: 296 usec per loop
[alex at lancelot pop]$ timeit.py -s'import Numeric' -s'x=range(555)' 'for i in x: id(i)'
1000 loops, best of 3: 212 usec per loop
[alex at lancelot pop]$ timeit.py -s'import Numeric' -s'x=Numeric.arange(555)' 'for i in x: id(i)'
1000 loops, best of 3: 296 usec per loop
[alex at lancelot pop]$ timeit.py -s'import Numeric' -s'x=range(555)' 'for i in x: id(i)'
1000 loops, best of 3: 207 usec per loop
[alex at lancelot pop]$

So, in this specific case, looping over a list of ints is a bit faster than
looping over an otherwise equivalent Numeric.array -- about 210
microseconds versus about 300.
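
The same kind of measurement can also be driven from inside Python with the
timeit module, rather than through the command-line script.  What follows is
only a minimal sketch, not the script's own code: the helper name best_usec is
made up for illustration, it assumes a current Python 3 (hence
list(range(555)), since range no longer returns a list there), and it times
only the list case because Numeric may not be installed.  Where Numeric is
available, the setup string could presumably be swapped for the one used in
the transcript above.

```python
import timeit


def best_usec(stmt, setup="pass", number=1000, repeat=3):
    """Best per-loop time in microseconds (hypothetical helper).

    Mirrors the "1000 loops, best of 3" report: run `stmt` `number`
    times, do that `repeat` times, and keep the fastest run.
    """
    timer = timeit.Timer(stmt, setup=setup)
    totals = timer.repeat(repeat=repeat, number=number)  # seconds per run
    return min(totals) / number * 1e6


# Plain iteration over a list of 555 ints, as in the first measurement.
list_time = best_usec("for i in x: id(i)", setup="x = list(range(555))")
print("list: %.1f usec per loop" % list_time)
```

The figure printed will differ from the ones quoted here, since it depends on
the machine and the Python version.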

Similarly:

[alex at lancelot pop]$ timeit.py -s'import Numeric' -s'x=range(555)' 'for i in range(len(x)): x[i]=id(x[i])'
1000 loops, best of 3: 353 usec per loop
[alex at lancelot pop]$ timeit.py -s'import Numeric' -s'x=range(555)' 'for i in range(len(x)): x[i]=id(x[i])'
1000 loops, best of 3: 356 usec per loop
[alex at lancelot pop]$ timeit.py -s'import Numeric' -s'x=Numeric.arange(555)' 'for i in range(len(x)): x[i]=id(x[i])'
1000 loops, best of 3: 581 usec per loop
[alex at lancelot pop]$ timeit.py -s'import Numeric' -s'x=Numeric.arange(555)' 'for i in range(len(x)): x[i]=id(x[i])'
1000 loops, best of 3: 585 usec per loop

Here we're both accessing AND modifying each element by index, and the
list outperforms the array, about 350 microseconds to about 580.

So, measure operations of your interest, on platforms of your interest,
for roughly the kinds of list/array sizes you'll be using, and you'll KNOW
what performance issues you may be facing, rather than guessing.
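
As a rough illustration of measuring at several sizes, here is a sketch that
times both loop styles from this post over lists of different lengths.  The
sizes, the helper name usec_per_loop and the loop counts are arbitrary choices
for the example, not anything prescribed by timeit; it again assumes Python 3
and covers only plain lists.

```python
import timeit


def usec_per_loop(stmt, setup, number=200, repeat=3):
    """Best-of-`repeat` time for one execution of `stmt`, in microseconds."""
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
    return best / number * 1e6


sizes = [55, 555, 5555]  # illustrative sizes; use the ones you care about
results = {}
for n in sizes:
    setup = "x = list(range(%d))" % n
    results[n] = {
        # Plain iteration over the elements.
        "iterate": usec_per_loop("for i in x: id(i)", setup),
        # Access and modify each element by index.
        "index": usec_per_loop("for i in range(len(x)): x[i] = id(x[i])", setup),
    }

for n in sizes:
    print("n=%5d  iterate: %8.1f usec  index: %8.1f usec"
          % (n, results[n]["iterate"], results[n]["index"]))
```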

In most cases you'll conclude that the difference is not important enough --
a factor of 1.5 or more may seem large, but here we're just doing a trivial
operation on each item -- if we were doing more the looping overhead
would matter less.  AND some operations are available as ufuncs in
Numeric, cutting down loop overhead dramatically.  And in the end a
100 to 200 microseconds' difference may just not matter much, depending
on your application.  But anyway, you do get the ability to measure just
what you need to.


Alex




