[Numpy-discussion] Array vectorization in numpy

Chad Netzer chad.netzer at gmail.com
Tue Jul 19 15:29:43 EDT 2011


On Tue, Jul 19, 2011 at 4:05 AM, Carlos Becker <carlosbecker at gmail.com> wrote:
> Hi, I started with numpy a few days ago. I was timing some array operations
> and found that numpy takes 3 or 4 times longer than Matlab on a simple
> array-minus-scalar operation.


Doing these kinds of timings correctly is a tricky issue, and the
method you used is at fault.  It is testing more than just the
vectorized array-minus-scalar operation: it is also timing the range()
call and list creation for the loop, as well as the creation and
destruction of the result array on every iteration, all of which add
constant overhead to the result (which is itself rather small and
therefore susceptible to overhead bias).  The Matlab loop range
equivalent, by contrast, is part of the language syntax itself and can
therefore be optimized better.  And depending on the type of garbage
collection Matlab uses, it may defer the destruction of the
temporaries until after the timing is done (i.e. when the loop exits),
whereas Python has to destroy the result object on each iteration the
way you've written it.
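To make the difference concrete, here is a small sketch (our own
illustration, not code from the original post) that times the same
subtraction both ways -- a hand-rolled time.time() loop, which includes
the loop machinery in the measurement, and timeit, which isolates the
statement:

```python
import time
import timeit
import numpy as np

m = np.ones([2000, 2000], float)

# Naive approach: the loop machinery (range(), iteration, and the
# creation/destruction of each temporary result) is timed along with
# the subtraction itself.
start = time.time()
for _ in range(10):
    k = m - 0.5
naive = (time.time() - start) / 10

# timeit approach: times only the statement, with setup kept separate
# and garbage collection disabled during the measurement.
t = timeit.Timer('k = m - 0.5',
                 setup='import numpy as np; m = np.ones([2000, 2000], float)')
isolated = min(t.repeat(repeat=5, number=1))

print(naive, isolated)
```

The gap between the two numbers is the overhead the naive method folds
into the result.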

First of all, use the 'timeit' module for timing:

%python
>>> import numpy as np
>>> import timeit
>>> t = timeit.Timer('k = m - 0.5', setup='import numpy as np; m = np.ones([2000,2000],float)')
>>> np.mean(t.repeat(repeat=100, number=1))
0.022081942558288575

That will at least give you a more accurate timing of just the
subtraction expression itself, and not the loop overhead.
Furthermore, you can reuse the m array for the result, rather than
allocating a new one, which will give you a better idea of just the
vectorized subtraction time:

>>> t=timeit.Timer('m -= 0.5', setup='import numpy as np;m = np.ones([2000,2000],float)')
>>> np.mean(t.repeat(repeat=100, number=1))
0.015955450534820555

Note that the value has dropped considerably.
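The drop comes from skipping the allocation (and later deallocation)
of a fresh 32 MB result array on every pass.  A small sketch (our own
illustration, not part of the original timings) that checks this
directly:

```python
import numpy as np

m = np.ones([4, 4], float)

# Out-of-place: a brand-new result array is allocated for k.
k = m - 0.5
print(k is m)  # False: k lives in a fresh buffer

# In-place: the subtraction writes into m's existing buffer, so no
# temporary is allocated.  The base address of the data is unchanged.
before = m.__array_interface__['data'][0]
m -= 0.5
after = m.__array_interface__['data'][0]
print(before == after)  # True: same memory, no new allocation
```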

In the end, what you are attempting to time is fairly simple, so any
extra overhead that is not actually the vectorized subtraction will
bias your results.  You have to be extremely careful with these timing
comparisons, or you may end up comparing apples to oranges.

At the least, try to give the vectorized code much more work to do;
for example, you are currently operating on only about 32 MB.  Try
about half a gig, and compare that with Matlab, in order to reduce the
proportion of overhead in your timings:

>>> t=timeit.Timer('m -= 0.5', setup='import numpy as np;m = np.ones([8092,8092],float)')
>>> np.mean(t.repeat(repeat=100, number=1))
0.26796033143997194

Try comparing these examples to your existing Matlab timings, and you
should find Python with numpy comparing favorably (or even beating
Matlab).  Of course, you could then improve your Matlab timings as
well; in the end they should be almost the same when done properly.
If not, by all means let us know.
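As an aside (our addition, not from the original exchange): the
augmented assignment 'm -= 0.5' is equivalent to calling the subtract
ufunc with its out= argument, which likewise writes the result into
the supplied array instead of allocating a temporary:

```python
import numpy as np

m = np.ones([2000, 2000], float)

# out=m makes the ufunc store its result in m's existing buffer,
# just like 'm -= 0.5'.  The return value is the same array object.
result = np.subtract(m, 0.5, out=m)
print(result is m)  # True
```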

-Chad


