[SciPy-User] Speed-up simple function

Francesc Alted faltet at pytables.org
Tue Jan 11 07:39:50 EST 2011


On Monday 10 January 2011 18:57:18 Nicolau Werneck wrote:
> Great suggestion. I have tried just modifying the original numpy
> expression, replacing S**1.5 with S*sqrt(S), and just by doing that
> I already got a 2x speedup.
> 
> In the Cython version, using S*sqrt(S) gives a 7.3x speedup. Much
> better than using exponentiation. Using the approximate rsqrt will
> probably bring that closer to 10x.

Mmh, for double precision you cannot expect big speed-ups (at least 
until the new AVX instruction set is broadly available).  Here is an 
estimate of the speed-up you can get by using Numexpr+VML, which uses 
SSEx (SSE4 in my case):

>>> import numpy as np
>>> import numexpr as ne
>>> x = np.linspace(0, 1, 1000000)  # the sample count must be an integer
>>> timeit np.sqrt(x)
100 loops, best of 3: 6.69 ms per loop
>>> timeit ne.evaluate("sqrt(x)")
100 loops, best of 3: 4.37 ms per loop  # only 1.5x speed-up

With single precision things are different:

>>> x = np.linspace(0, 1, 1000000).astype('f4')  # 'f4' is 32-bit float
>>> timeit np.sqrt(x)
100 loops, best of 3: 4.61 ms per loop
>>> timeit ne.evaluate("sqrt(x)")
1000 loops, best of 3: 1.83 ms per loop  # 2.5x speed-up
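
The substitution discussed in the quoted message (S**1.5 replaced by
S*sqrt(S)) and the effect of precision can be checked with plain NumPy.
The script below is only a rough sketch: the helper names
(power_version, sqrt_version, best_time) are made up for illustration,
it does not use Numexpr or VML, and the timings it prints will depend
on the machine and the NumPy build.

```python
import timeit

import numpy as np


def power_version(s):
    # General exponentiation, as in the original expression.
    return s ** 1.5


def sqrt_version(s):
    # Same value for non-negative s, using one sqrt and one multiply.
    return s * np.sqrt(s)


def best_time(func, arg, repeat=3, number=5):
    # Best average time per call, in seconds, over `repeat` runs.
    timings = timeit.repeat(lambda: func(arg), repeat=repeat, number=number)
    return min(timings) / number


results = {}
for dtype in (np.float64, np.float32):
    s = np.linspace(0, 1, 1000000).astype(dtype)
    # Both forms should agree to within floating-point rounding.
    assert np.allclose(power_version(s), sqrt_version(s))
    results[np.dtype(dtype).name] = (
        best_time(power_version, s),
        best_time(sqrt_version, s),
    )

for name, (t_pow, t_sqrt) in results.items():
    print("%s: s**1.5 %.2f ms, s*sqrt(s) %.2f ms"
          % (name, t_pow * 1e3, t_sqrt * 1e3))
```

Note that the two forms are only interchangeable for non-negative
input, which is the case for the [0, 1] range used here.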

In my opinion, as newer processors pack more cores into them, 
multithreading will become a simpler (and cheaper) option for 
accelerating this sort of computation (as well as computations limited 
by memory bandwidth).  SIMD could help in getting more speed, of course, 
but as I see it, it is multithreading that will be key for computational 
problems in the near future (present?).
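
As a rough illustration of the multithreading idea, here is a minimal
sketch that splits an array into contiguous chunks and computes the
square root of each chunk in its own thread.  It relies on NumPy
releasing the GIL inside ufunc loops; threaded_sqrt and n_threads are
hypothetical names chosen for this example, and whether it actually
beats a single np.sqrt call depends on the number of cores and on
memory bandwidth.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def threaded_sqrt(x, n_threads=4):
    # Output buffer shared by all threads; each thread writes to its
    # own disjoint slice, so no locking is needed.
    out = np.empty_like(x)
    # Chunk boundaries: n_threads + 1 integer offsets from 0 to x.size.
    bounds = np.linspace(0, x.size, n_threads + 1).astype(int)

    def work(i):
        lo, hi = bounds[i], bounds[i + 1]
        # Write the result directly into the matching slice of `out`.
        np.sqrt(x[lo:hi], out=out[lo:hi])

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(work, range(n_threads)))
    return out


x = np.linspace(0, 1, 1000000)
y = threaded_sqrt(x)
```

The result can be compared against the single-threaded one with
np.allclose(y, np.sqrt(x)).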

-- 
Francesc Alted


