[Numpy-discussion] [ANN] Nanny, faster NaN functions

Keith Goodman kwgoodman at gmail.com
Fri Nov 19 15:19:56 EST 2010


On Fri, Nov 19, 2010 at 12:10 PM,  <josef.pktd at gmail.com> wrote:

> What's the speed advantage of nanny compared to np.nansum that you
> have if the arrays are larger, say (1000,10) or (10000,100) axis=0 ?

Good point. In the small examples I showed so far, the speedup might
have been all overhead. Fortunately, that's not the case:

>> arr = np.random.rand(1000, 1000)
>> timeit np.nansum(arr)
100 loops, best of 3: 4.79 ms per loop
>> timeit ny.nansum(arr)
1000 loops, best of 3: 1.53 ms per loop

>> arr[arr > 0.5] = np.nan
>> timeit np.nansum(arr)
10 loops, best of 3: 44.5 ms per loop
>> timeit ny.nansum(arr)
100 loops, best of 3: 6.18 ms per loop

>> timeit np.nansum(arr, axis=0)
10 loops, best of 3: 52.3 ms per loop
>> timeit ny.nansum(arr, axis=0)
100 loops, best of 3: 12.2 ms per loop
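
For anyone who wants to reproduce similar numbers outside of IPython,
here's a minimal standalone sketch (it assumes Nanny installs as a
package named "nanny" and exposes nansum; adjust the import if yours
differs):

import timeit

import numpy as np
import nanny as ny  # assumption: Nanny is importable as "nanny"

arr = np.random.rand(1000, 1000)
arr[arr > 0.5] = np.nan  # roughly half NaNs, as in the timings above

for label, func in [("np.nansum", np.nansum), ("ny.nansum", ny.nansum)]:
    # best of 3 repeats, 10 calls each, reported per call
    best = min(timeit.repeat(lambda: func(arr, axis=0), repeat=3, number=10))
    print("%-10s axis=0: %.2f ms per call" % (label, best / 10 * 1e3))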

np.nansum makes a copy of the input array, builds a mask of the NaN
locations (another array-sized allocation), and then uses the mask to
set the NaNs in the copy to zero before summing. So not only is nanny
faster, it also uses less memory.
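
To make the difference concrete, here is a rough sketch of the two
strategies (the function names are made up for illustration; the
second one is a plain Python loop just to show the idea, not how
Nanny is actually implemented):

import numpy as np

def nansum_copy_and_mask(arr):
    # Roughly the copy-and-mask approach described above: copy the
    # input, allocate a NaN mask, zero the NaNs in the copy, sum.
    a = np.array(arr, copy=True)   # first array-sized allocation
    mask = np.isnan(a)             # second array-sized allocation
    a[mask] = 0.0
    return a.sum()

def nansum_single_pass(arr):
    # One pass, no temporaries: skip NaNs as you go. Slow in pure
    # Python, but the same loop compiled is where the speed and
    # memory savings come from.
    total = 0.0
    for x in arr.flat:
        if x == x:                 # NaN != NaN, so NaNs are skipped
            total += x
    return total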


