[Numpy-discussion] [ANN] Nanny, faster NaN functions
Keith Goodman
kwgoodman at gmail.com
Fri Nov 19 15:19:56 EST 2010
On Fri, Nov 19, 2010 at 12:10 PM, <josef.pktd at gmail.com> wrote:
> What's the speed advantage of nanny compared to np.nansum that you
> have if the arrays are larger, say (1000,10) or (10000,100) axis=0 ?
Good point. In the small examples I showed so far, the speedup may have
been all overhead. Fortunately, that's not the case:
>> arr = np.random.rand(1000, 1000)
>> timeit np.nansum(arr)
100 loops, best of 3: 4.79 ms per loop
>> timeit ny.nansum(arr)
1000 loops, best of 3: 1.53 ms per loop
>> arr[arr > 0.5] = np.nan
>> timeit np.nansum(arr)
10 loops, best of 3: 44.5 ms per loop
>> timeit ny.nansum(arr)
100 loops, best of 3: 6.18 ms per loop
>> timeit np.nansum(arr, axis=0)
10 loops, best of 3: 52.3 ms per loop
>> timeit ny.nansum(arr, axis=0)
100 loops, best of 3: 12.2 ms per loop
np.nansum makes a copy of the input array, builds a boolean mask (a
second allocation), and then uses the mask to set the NaNs in the copy
to zero before summing. So not only is Nanny faster, it also uses less
memory.
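The copy-and-mask strategy described above can be sketched in a few
lines of pure NumPy (a rough approximation for illustration, not the
actual np.nansum source). A single-pass loop in C, which is what Nanny
does, avoids both temporaries:

```python
import numpy as np

def nansum_via_copy(arr, axis=None):
    """Sketch of the copy-and-mask approach: two extra allocations."""
    a = arr.copy()        # first temporary: a full copy of the input
    mask = np.isnan(a)    # second temporary: a boolean mask
    a[mask] = 0.0         # zero out the NaNs in the copy
    return a.sum(axis=axis)

arr = np.random.rand(4, 5)
arr[arr > 0.5] = np.nan
print(np.allclose(nansum_via_copy(arr), np.nansum(arr)))
print(np.allclose(nansum_via_copy(arr, axis=0), np.nansum(arr, axis=0)))
```

For a 1000 x 1000 float64 array, the copy and the mask together cost
roughly 9 MB of scratch memory per call, which is where much of the
time difference comes from.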