[Numpy-discussion] numexpr with the new iterator

Mark Wiebe mwwiebe at gmail.com
Tue Jan 11 00:45:28 EST 2011


On Mon, Jan 10, 2011 at 11:35 AM, Mark Wiebe <mwwiebe at gmail.com> wrote:

> I'm a bit curious why the jump from 1 to 2 threads is scaling so poorly.
>  Your timings have improvement factors of 1.85, 1.68, 1.64, and 1.79.  Since
> the computation is trivial data parallelism, and I believe it's still pretty
> far off the memory bandwidth limit, I would expect a speedup of 1.95 or
> higher.


It looks like memory bandwidth is what limits the scalability. The slower
operations scale much better than the faster ones.  Below are some timings
of successively faster operations.  When the operation is slow enough, it
scales as I was expecting...

-Mark

Computing: 'cos(x**1.1) + sin(x**1.3) + tan(x**2.3)' with 20000000 points
Using numpy:
*** Time elapsed: 14.47
Using numexpr:
*** Time elapsed for 1 threads: 12.659000
*** Time elapsed for 2 threads: 6.357000
*** Ratio from 1 to 2 threads: 1.991348
Using numexpr_iter:
*** Time elapsed for 1 threads: 12.573000
*** Time elapsed for 2 threads: 6.398000
*** Ratio from 1 to 2 threads: 1.965145

Computing: 'x**2.345' with 20000000 points
Using numpy:
*** Time elapsed: 3.506
Using numexpr:
*** Time elapsed for 1 threads: 3.375000
*** Time elapsed for 2 threads: 1.747000
*** Ratio from 1 to 2 threads: 1.931883
Using numexpr_iter:
*** Time elapsed for 1 threads: 3.266000
*** Time elapsed for 2 threads: 1.760000
*** Ratio from 1 to 2 threads: 1.855682

Computing: '1*x+2*x+3*x+4*x+5*x+6*x+7*x+8*x+9*x+10*x+11*x+12*x+13*x+14*x' with 20000000 points
Using numpy:
*** Time elapsed: 9.774
Using numexpr:
*** Time elapsed for 1 threads: 1.314000
*** Time elapsed for 2 threads: 0.703000
*** Ratio from 1 to 2 threads: 1.869132
Using numexpr_iter:
*** Time elapsed for 1 threads: 1.257000
*** Time elapsed for 2 threads: 0.683000
*** Ratio from 1 to 2 threads: 1.840410

Computing: 'x+2.345' with 20000000 points
Using numpy:
*** Time elapsed: 0.343
Using numexpr:
*** Time elapsed for 1 threads: 0.348000
*** Time elapsed for 2 threads: 0.300000
*** Ratio from 1 to 2 threads: 1.160000
Using numexpr_iter:
*** Time elapsed for 1 threads: 0.354000
*** Time elapsed for 2 threads: 0.293000
*** Ratio from 1 to 2 threads: 1.208191
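
For anyone who wants to put rough numbers on the bandwidth argument, here is a minimal sketch that post-processes the numexpr timings above. It computes the 1-to-2 thread speedup, the Karp-Flatt "experimentally determined serial fraction" (0 means perfect scaling, 1 means no benefit from the second thread), and an implied memory traffic rate. The traffic figure rests on assumptions that are not stated in this message: float64 data, exactly one input array read and one output array written, and each timing covering a single pass over the data. Treat it as an order-of-magnitude guide only. The helper names are made up for illustration.

```python
# Timings copied from the numexpr runs above: (seconds with 1 thread, seconds with 2 threads).
TIMINGS = {
    "cos(x**1.1) + sin(x**1.3) + tan(x**2.3)": (12.659, 6.357),
    "x**2.345": (3.375, 1.747),
    "1*x+2*x+3*x+4*x+5*x+6*x+7*x+8*x+9*x+10*x+11*x+12*x+13*x+14*x": (1.314, 0.703),
    "x+2.345": (0.348, 0.300),
}

N_POINTS = 20_000_000
BYTES_PER_ELEMENT = 8  # assumption: float64 data
ARRAYS_TOUCHED = 2     # assumption: one input read plus one output written


def speedup(t1, t2):
    """Ratio of the 1-thread time to the 2-thread time."""
    return t1 / t2


def serial_fraction(s, threads=2):
    """Karp-Flatt metric: the fraction of the run that behaves as if serial."""
    return (1.0 / s - 1.0 / threads) / (1.0 - 1.0 / threads)


def effective_bandwidth_gb_s(seconds):
    """Implied memory traffic in GB/s, assuming a single pass over the data."""
    total_bytes = N_POINTS * BYTES_PER_ELEMENT * ARRAYS_TOUCHED
    return total_bytes / seconds / 1e9


for expr, (t1, t2) in TIMINGS.items():
    s = speedup(t1, t2)
    label = expr if len(expr) <= 40 else expr[:37] + "..."
    print(f"{label:40s} speedup {s:.3f}  "
          f"serial fraction {serial_fraction(s):.3f}  "
          f"2-thread traffic {effective_bandwidth_gb_s(t2):.2f} GB/s")
```

The pattern to look for is that the serial fraction stays near zero for the transcendental-heavy expressions and jumps sharply for 'x+2.345', where there is almost no arithmetic to hide behind the memory traffic.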