[Numpy-discussion] the difference between "+" and np.add?
Chris Barker - NOAA Federal
chris.barker at noaa.gov
Fri Nov 23 14:00:35 EST 2012
On Thu, Nov 22, 2012 at 6:20 AM, Francesc Alted <francesc at continuum.io> wrote:
> As Nathaniel said, there is not a difference in terms of *what* is
> computed. However, the methods that you suggested actually differ on
> *how* they are computed, and that has dramatic effects on the time
> used. For example:
>
> In []: arr1, arr2, arr3, arr4, arr5 = [np.arange(1e7) for x in range(5)]
>
> In []: %time arr1 + arr2 + arr3 + arr4 + arr5
> CPU times: user 0.05 s, sys: 0.10 s, total: 0.14 s
> Wall time: 0.15 s
> There are also ways to minimize the size of temporaries, and numexpr is
> one of the simplest:
But you can also use np.add (and friends) with the out argument to
reduce the number of temporaries. It can make a difference:
In [11]: def add_5_arrays(arr1, arr2, arr3, arr4, arr5):
   ....:     result = arr1 + arr2
   ....:     np.add(result, arr3, out=result)
   ....:     np.add(result, arr4, out=result)
   ....:     np.add(result, arr5, out=result)
   ....:     return result
In [13]: timeit arr1 + arr2 + arr3 + arr4 + arr5
1 loops, best of 3: 528 ms per loop
In [17]: timeit add_5_arrays(arr1, arr2, arr3, arr4, arr5)
1 loops, best of 3: 293 ms per loop
(don't have numexpr on this machine for a comparison)
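To check that the in-place version computes the same thing as the chained "+", here is a minimal standalone sketch. The array size (1e5) and the names `chained`, `in_place`, `same_values` and `inputs_untouched` are my own choices for illustration, not from the session above, and timings will vary with machine and NumPy version.

```python
import numpy as np

def add_5_arrays(arr1, arr2, arr3, arr4, arr5):
    # The first "+" allocates the one result buffer.
    result = arr1 + arr2
    # Each later addition writes into that same buffer via out=.
    np.add(result, arr3, out=result)
    np.add(result, arr4, out=result)
    np.add(result, arr5, out=result)
    return result

# Smaller arrays than in the timings above, just to check correctness.
arrays = [np.arange(1e5) for _ in range(5)]

chained = arrays[0] + arrays[1] + arrays[2] + arrays[3] + arrays[4]
in_place = add_5_arrays(*arrays)

# Same values either way, and the inputs are left unmodified.
same_values = np.array_equal(chained, in_place)
inputs_untouched = np.array_equal(arrays[0], np.arange(1e5))
print(same_values, inputs_untouched)
```

Only `result` is written to, so the input arrays are safe to reuse afterwards.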
NOTE: no point in going through all this unless this operation is
really a bottleneck in your code -- profile, profile, profile!
-Chris
PS: you can put a loop in the function to make it more generic:
In [18]: def add_n_arrays(*args):
   ....:     result = args[0] + args[1]
   ....:     for arr in args[2:]:
   ....:         np.add(result, arr, out=result)
   ....:     return result
In [21]: timeit add_n_arrays(arr1, arr2, arr3, arr4, arr5)
1 loops, best of 3: 317 ms per loop
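For repeating the comparison outside IPython, something like the following should work with the standard-library timeit module. This is only a sketch: the length check is a hypothetical guard I added, and the array size and repeat counts are arbitrary assumptions, so don't expect the numbers to match the ones above.

```python
import timeit
import numpy as np

def add_n_arrays(*args):
    # Hypothetical guard, not in the function above: it indexes args[0]
    # and args[1], so fewer than two arrays would fail with IndexError.
    if len(args) < 2:
        raise ValueError("need at least two arrays")
    result = args[0] + args[1]           # the only new allocation
    for arr in args[2:]:
        np.add(result, arr, out=result)  # accumulate in place
    return result

arrays = [np.arange(1e6) for _ in range(5)]

def chained():
    return arrays[0] + arrays[1] + arrays[2] + arrays[3] + arrays[4]

def in_place():
    return add_n_arrays(*arrays)

# Best of 3 single runs, in the spirit of IPython's "best of 3" report.
t_chained = min(timeit.repeat(chained, number=1, repeat=3))
t_in_place = min(timeit.repeat(in_place, number=1, repeat=3))
print("chained:  %.1f ms" % (t_chained * 1e3))
print("in place: %.1f ms" % (t_in_place * 1e3))
```

Whether the in-place version wins, and by how much, depends on array size, memory bandwidth and NumPy version, so measure on your own data.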
> In []: import numexpr as ne
>
> In []: %time ne.evaluate('arr1 + arr2 + arr3 + arr4 + arr5')
> CPU times: user 0.04 s, sys: 0.04 s, total: 0.08 s
> Wall time: 0.04 s
> Out[]:
> array([ 0.00000000e+00, 5.00000000e+00, 1.00000000e+01, ...,
> 4.99999850e+07, 4.99999900e+07, 4.99999950e+07])
>
> Again, the computations are the same, but how you manage memory is critical.
>
> --
> Francesc Alted
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov