[Numpy-discussion] [ANN] Nanny, faster NaN functions

Keith Goodman kwgoodman at gmail.com
Fri Nov 19 15:19:56 EST 2010


On Fri, Nov 19, 2010 at 12:10 PM,  <josef.pktd at gmail.com> wrote:

> What's the speed advantage of nanny compared to np.nansum that you
> have if the arrays are larger, say (1000,10) or (10000,100) axis=0 ?

Good point. In the small examples I showed so far, the speedup might
have been all overhead. Fortunately, that's not the case:

>> arr = np.random.rand(1000, 1000)
>> timeit np.nansum(arr)
100 loops, best of 3: 4.79 ms per loop
>> timeit ny.nansum(arr)
1000 loops, best of 3: 1.53 ms per loop

>> arr[arr > 0.5] = np.nan
>> timeit np.nansum(arr)
10 loops, best of 3: 44.5 ms per loop
>> timeit ny.nansum(arr)
100 loops, best of 3: 6.18 ms per loop

>> timeit np.nansum(arr, axis=0)
10 loops, best of 3: 52.3 ms per loop
>> timeit ny.nansum(arr, axis=0)
100 loops, best of 3: 12.2 ms per loop
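
For anyone who wants to reproduce similar numbers outside of IPython,
here's a minimal standalone sketch (it assumes Nanny installs as a
package named "nanny" and exposes nansum; adjust the import if yours
differs):

import timeit

import numpy as np
import nanny as ny  # assumption: Nanny is importable as "nanny"

arr = np.random.rand(1000, 1000)
arr[arr > 0.5] = np.nan  # roughly half NaNs, as in the timings above

for label, func in [("np.nansum", np.nansum), ("ny.nansum", ny.nansum)]:
    # best of 3 repeats, 10 calls each, reported per call
    best = min(timeit.repeat(lambda: func(arr, axis=0), repeat=3, number=10))
    print("%-10s axis=0: %.2f ms per call" % (label, best / 10 * 1e3))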

np.nansum makes a copy of the input array, builds a mask of the NaN
locations (another array-sized allocation), and then uses the mask to
set the NaNs in the copy to zero before summing. So not only is nanny
faster, it also uses less memory.
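
To make the difference concrete, here is a rough sketch of the two
strategies (the function names are made up for illustration; the
second one is a plain Python loop just to show the idea, not how
Nanny is actually implemented):

import numpy as np

def nansum_copy_and_mask(arr):
    # Roughly the copy-and-mask approach described above: copy the
    # input, allocate a NaN mask, zero the NaNs in the copy, sum.
    a = np.array(arr, copy=True)   # first array-sized allocation
    mask = np.isnan(a)             # second array-sized allocation
    a[mask] = 0.0
    return a.sum()

def nansum_single_pass(arr):
    # One pass, no temporaries: skip NaNs as you go. Slow in pure
    # Python, but the same loop compiled is where the speed and
    # memory savings come from.
    total = 0.0
    for x in arr.flat:
        if x == x:                 # NaN != NaN, so NaNs are skipped
            total += x
    return total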


