Microbenchmark: Summing over array of doubles

Mon Aug 2 20:36:33 EDT 2004

On 2 Aug 2004, Yaroslav Bulatov wrote:

> You are right, how silly of me! Fixing the script now results in 130
> millis mean, 8.42 millis standard deviation, which is slower than
> numarray (104, 2.6 respectively). I wonder why numarray gives faster
> results on such a simple task?

The reason is that your C loop uses array indexing, whereas numarray's
loop increments the array pointer on each iteration and forgoes any
indexing.  Even though this translates into about the same number of
opcodes on x86, the latter is slightly faster because it skips the
(implied) bitshift operation.  Also don't forget to compile your C version
with -O3, as this can make a big difference in that little loop (maybe not
in your case though).
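
To make the difference concrete, here is a rough sketch of the two loop
styles.  The function names sum_indexed and sum_pointer are made up for
illustration; this is not numarray's actual source, only the general
shape of an indexed loop versus a pointer-increment loop.

```c
#include <assert.h>
#include <stddef.h>

/* Indexed form: every access is a[i], i.e. base address plus a
 * scaled index (i * sizeof(double)). */
double sum_indexed(const double *a, size_t n)
{
    double total = 0.0;
    size_t i;

    assert(a != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Pointer-increment form: the pointer itself walks the array, so no
 * index is scaled inside the loop body. */
double sum_pointer(const double *a, size_t n)
{
    double total = 0.0;

    assert(a != NULL || n == 0);
    while (n-- > 0)
        total += *a++;
    return total;
}
```

Both functions return the same sum for the same input.  Whether the
second is measurably faster depends on the compiler and flags; an
optimizing compiler may well emit the same machine code for both, so it
is worth timing each form at the optimization level you actually use.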

In principle, for simple access patterns like summation, etc., numarray 
will always be faster than your "typical" C implementation, since its core 
vector functions were designed to be very efficient.  Of course, an 
efficient C implementation will be as fast or faster, but you'd have to 
use those efficiency tricks in every vector operation you perform, which 
could be quite tedious without writing your own library.  (The above 
generalization, of course, only applies to very large arrays.)

Either way, your results are quite promising -- they show that Python with 
numarray is as good as or better than C when it comes to vector processing 
applications, all while being easier to use.  That's good publicity!