[SciPy-User] Speed-up simple function
Francesc Alted
faltet at pytables.org
Tue Jan 11 07:39:50 EST 2011
On Monday 10 January 2011 18:57:18 Nicolau Werneck wrote:
> Great suggestion. I have tried just modifying the original numpy
> expression, replacing S**1.5 with S*sqrt(S), and just by doing that
> I already got a 2x speedup.
>
> In the Cython version, using S*sqrt(S) gives a 7.3x speedup. Much
> better than using exponentiation. Using the approximate rsqrt will
> probably bring that closer to 10x.
Mmh, for double precision you cannot expect big speed-ups (at least
until the new AVX instruction set is broadly available). Here is an
estimate of the speed-up you can get by using Numexpr+VML, which uses
SSEx (SSE4 in my case):
>>> x = np.linspace(0,1,10**6)  # the sample count must be an integer in current NumPy
>>> timeit np.sqrt(x)
100 loops, best of 3: 6.69 ms per loop
>>> timeit ne.evaluate("sqrt(x)")
100 loops, best of 3: 4.37 ms per loop # only 1.5x speed-up
With single precision things are different:
>>> x = np.linspace(0,1,10**6).astype('f4')  # 'f4' is single precision (float32)
>>> timeit np.sqrt(x)
100 loops, best of 3: 4.61 ms per loop
>>> timeit ne.evaluate("sqrt(x)")
1000 loops, best of 3: 1.83 ms per loop # 2.5x speed-up
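For anyone who wants to repeat the comparison outside IPython, here is a
minimal sketch using the standard timeit module. The helper name
best_time_ms is made up for this example, and the Numexpr part only runs
if numexpr happens to be installed. Absolute numbers will depend on the
CPU and on whether Numexpr was built with VML, so do not expect to see
the same figures as above.

```python
import timeit

import numpy as np

try:
    import numexpr as ne  # optional; only used if it is installed
except ImportError:
    ne = None


def best_time_ms(func, repeat=3, number=10):
    """Best per-call time in milliseconds (hypothetical helper)."""
    times = timeit.repeat(func, repeat=repeat, number=number)
    return min(times) / number * 1e3


n = 10**6
arrays = {
    "float64": np.linspace(0, 1, n),               # double precision
    "float32": np.linspace(0, 1, n).astype("f4"),  # single precision
}

results = {}
for name, x in arrays.items():
    t_np = best_time_ms(lambda: np.sqrt(x))
    results[name] = t_np
    print("%s  np.sqrt:     %.2f ms" % (name, t_np))
    if ne is not None:
        # Pass the operand explicitly instead of relying on frame lookup.
        t_ne = best_time_ms(
            lambda: ne.evaluate("sqrt(x)", local_dict={"x": x})
        )
        print("%s  ne.evaluate: %.2f ms (%.1fx)" % (name, t_ne, t_np / t_ne))
```
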
In my opinion, as newer processors pack in more cores,
multithreading will become a simpler (and cheaper) option for
accelerating this sort of computation (as well as computations limited
by memory bandwidth). SIMD could help in getting more speed, of course,
but as I see it, it is multithreading that will be key for computational
problems in the near future (present?).
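To make the multithreading idea concrete, here is a rough sketch that
splits the array into chunks and computes each chunk in its own thread
with plain NumPy. It relies on NumPy ufuncs releasing the GIL while they
work on numeric arrays. The function name threaded_sqrt and the default
of 4 threads are just choices for this example, and this is not meant to
show how Numexpr does it internally. Whether it actually runs faster
depends on the number of cores and on memory bandwidth.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def threaded_sqrt(x, n_threads=4):
    """Compute sqrt(x) chunk by chunk, one chunk per thread (sketch)."""
    out = np.empty_like(x)
    # array_split slices the arrays, so each chunk is a view: writing
    # into a chunk of `out` fills the corresponding part of `out`.
    x_chunks = np.array_split(x, n_threads)
    out_chunks = np.array_split(out, n_threads)

    def work(pair):
        src, dst = pair
        np.sqrt(src, out=dst)  # the ufunc releases the GIL while computing

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() waits for all chunks and re-raises any worker exception.
        list(pool.map(work, zip(x_chunks, out_chunks)))
    return out


x = np.linspace(0, 1, 10**6)
y = threaded_sqrt(x)
```

The result should match np.sqrt(x) exactly, since each element is still
computed by the same ufunc; only the scheduling changes.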
--
Francesc Alted