[Numpy-discussion] Adding a 2D with a 1D array...

Thu Sep 10 04:49:13 EDT 2009

On Wednesday 09 September 2009 20:17:20 Dag Sverre Seljebotn wrote:
> Ruben Salvador wrote:
> > Your results are what I expected...but. This code is called from my main
> > program, and what I have in there (output array already created for both
> > cases) is:
> >
> > print "lambd", lambd
> > print "np.shape(a)", np.shape(a)
> > print "np.shape(r)", np.shape(r)
> > print "np.shape(offspr)", np.shape(offspr)
> > t = clock()
> > for i in range(lambd):
> >     offspr[i] = r[i] + a[i]
> > t1 = clock() - t
> > print "For loop time ==> %.8f seconds" % t1
> > t2 = clock()
> > offspr = r + a[:,None]
> > t3 = clock() - t2
> > print "Pythonic time ==> %.8f seconds" % t3
> >
> > The results I obtain are:
> >
> > lambd 80000
> > np.shape(a) (80000,)
> > np.shape(r) (80000, 26)
> > np.shape(offspr) (80000, 26)
> > For loop time ==> 0.34528804 seconds
> > Pythonic time ==> 0.35956192 seconds
> >
> > Maybe I'm not measuring properly, so, how should I do it?
>
> Like Luca said, you are not including the creation time of offspr in the
> for-loop version. A fairer comparison would be
>
> offspr[...] = r + a[:, None]
>
> Even fairer (one less temporary copy):
>
> offspr[...] = r
> offspr += a[:, None]
>
> Of course, see how the trend is for larger N as well.
>
> Also your timings are a bit crude (though this depends on how many times
> you ran your script to check :-)). To get better measurements, use the
> timeit module, or (easier) IPython and the %timeit command.

Oh well, the art of benchmarking :)

The timeit module normally gives you less jitter in timings because it
loops over the same operation repeatedly and averages the result.  However,
this has the drawback of filling your cache with the datasets (or part of
them), so, in the end, your measurements with timeit do not take into account
the time to transfer the data from main memory into the CPU caches, and that
may not be what you want to measure.
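
A rough sketch of the distinction, assuming the array shapes from Ruben's
output and random data (the helper name broadcast_add and the loop counts are
only illustrative). It takes single-call samples next to averaged looped
samples, so the two kinds of measurement can be compared side by side; it does
not flush the caches, so even the single calls are not guaranteed to be cold:

```python
import timeit
import numpy as np

# Shapes follow the thread; the data is random and purely illustrative.
n = 80000
rng = np.random.default_rng(0)
r = rng.random((n, 26))
a = rng.random(n)
offspr = np.empty_like(r)  # output array created outside the timed code


def broadcast_add(out, r, a):
    # In-place fill: no new output array is allocated while timing.
    out[...] = r
    out += a[:, None]


def stmt():
    broadcast_add(offspr, r, a)


# number=1: every sample is one isolated call.
single = timeit.repeat(stmt, number=1, repeat=5)

# number=20: every sample is the total for 20 back-to-back calls,
# so divide by 20 to get the mean time per call.
loops = 20
looped = [t / loops for t in timeit.repeat(stmt, number=loops, repeat=5)]

print("single-call samples  :", ["%.8f" % t for t in single])
print("mean per looped call :", ["%.8f" % t for t in looped])
```

Whether the two sets of numbers differ noticeably depends on the machine and on
how the array sizes compare with its cache sizes.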

In Ruben's case, I think what he is seeing are cache effects.  Maybe if he
runs the operation in a loop, he would finally see the difference show up
(although this may not be what he wants, of course ;-)
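
Something along these lines could be used to try that, as a sketch only: the
arrays are random stand-ins with the shapes Ruben printed, and the helper
names (for_loop, broadcast_inplace, best_per_call) are made up for the
example. Both versions write into a preallocated offspr, so neither pays for
creating the output array:

```python
import timeit
import numpy as np

# Same shapes as in the thread; contents are random stand-ins.
lambd = 80000
rng = np.random.default_rng(0)
r = rng.random((lambd, 26))
a = rng.random(lambd)
offspr = np.empty_like(r)


def for_loop(out, r, a):
    # Row-by-row version from the original post.
    for i in range(len(a)):
        out[i] = r[i] + a[i]


def broadcast_inplace(out, r, a):
    # Broadcasting version that reuses the existing output array.
    out[...] = r
    out += a[:, None]


def best_per_call(func, number=3, repeat=3):
    # Run func `number` times per sample and keep the best sample.
    samples = timeit.repeat(lambda: func(offspr, r, a),
                            number=number, repeat=repeat)
    return min(samples) / number


timings = {f.__name__: best_per_call(f) for f in (for_loop, broadcast_inplace)}
for name, seconds in timings.items():
    print("%s ==> %.8f seconds per call" % (name, seconds))
```

Raising number and lambd shows how the trend changes for larger runs.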

-- 
Francesc Alted