[Numpy-discussion] [ANN] Nanny, faster NaN functions

Fri Nov 19 14:19:57 EST 2010

On Fri, Nov 19, 2010 at 10:55 AM, Nathaniel Smith <njs at pobox.com> wrote:
> On Fri, Nov 19, 2010 at 10:33 AM, Keith Goodman <kwgoodman at gmail.com> wrote:
>> Nanny uses the magic of Cython to give you a faster, drop-in replacement for
>> the NaN functions in NumPy and SciPy.
>
> Neat!
>
> Why not make this a patch to numpy/scipy instead?

My guess is that having separate underlying functions for each dtype,
ndim, and axis would be a nightmare for a large project like NumPy,
but manageable for a focused project like Nanny.
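
To give a sense of the scale, here is a rough sketch that enumerates the
function names for the dtypes and ndims listed in the announcement. It
assumes one function per integer axis and the naming pattern of
nansum_2d_float64_axis1; it ignores axis=None and is only illustrative,
not how Nanny actually generates its functions.

```python
from itertools import product

# dtypes and ndims supported in this release, per the announcement
dtypes = ["int32", "int64", "float64"]
ndims = [1, 2, 3]

# One name per (ndim, dtype, axis); an n-d array has axes 0..n-1.
names = [
    f"nansum_{ndim}d_{dtype}_axis{axis}"
    for ndim, dtype in product(ndims, dtypes)
    for axis in range(ndim)
]

print(len(names))  # 18 functions for nansum alone
```

Every additional NaN function, dtype, or dimension multiplies that count.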

>> Nanny uses a separate Cython function for each combination of ndim, dtype, and
>> axis. You can get rid of a lot of overhead (useful in an inner loop, e.g.) by
>> directly importing the function that matches your problem::
>>
>>    >> arr = np.random.rand(10, 10)
>>    >> from nansum import nansum_2d_float64_axis1
>
> If this is really useful, then better to provide a function that finds
> the correct function for you?
>
> best_nansum = ny.get_best_nansum(ary[0, :, :], axis=1)
> for i in xrange(ary.shape[0]):
>    best_nansum(ary[i, :, :], axis=1)

That would be useful. nanny.nansum already finds the matching function,
but it returns the sum instead of the function itself.

>> - functions: nansum
>> - Operating systems: 64-bit (accumulator for int32 is hard coded to int64)
>> - dtype: int32, int64, float64
>> - ndim: 1, 2, and 3
>
> What does it even mean to do NaN operations on integers? (I'd
> sometimes find it *really convenient* if there were a NaN value for
> standard computer integers... but there isn't?)

Well, sometimes you write functions without knowing the dtype of the input.
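
As a small illustration using NumPy's own np.nansum (the helper name
total is made up for this example): an integer array cannot hold a NaN,
so the NaN-aware sum is just the ordinary sum there, and the caller does
not need to check the dtype first.

```python
import numpy as np

def total(arr):
    # Works for any dtype; for integers there are no NaNs to skip.
    return np.nansum(arr)

ints = np.array([1, 2, 3], dtype=np.int32)
floats = np.array([1.0, np.nan, 3.0])

print(total(ints))    # 6
print(total(floats))  # 4.0
```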