[Numpy-discussion] Revert the return of a single NaN for `np.unique` with floating point numbers?

Mon Aug 2 13:03:32 EDT 2021

Hi all,

In NumPy 1.21, the output of `np.unique` changed in the presence of
multiple NaNs.  Previously, every NaN was returned (each NaN was
considered unique); now only a single NaN is returned:

    a = np.array([1, 1, np.nan, np.nan, np.nan])

Before 1.21:

    >>> np.unique(a)
    array([ 1., nan, nan, nan])

Since 1.21:

    >>> np.unique(a)
    array([ 1., nan])

This change was requested in an old issue:

     https://github.com/numpy/numpy/issues/2111

And happened here:

     https://github.com/numpy/numpy/pull/18070

While it has a release note, I am not sure the change got the
attention it deserved.  That would be especially worrying if it is a
regression for anyone.
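
For anyone who needs one specific behaviour regardless of the NumPy
version, it can be built by hand by handling the NaNs separately.  A
minimal sketch, assuming float input that may be flattened; the helper
names `unique_one_nan` and `unique_all_nans` are made up for
illustration and are not NumPy API:

```python
import numpy as np


def unique_one_nan(arr):
    """Sorted unique values with at most one trailing NaN (hypothetical helper)."""
    arr = np.asarray(arr, dtype=float).ravel()
    nan_mask = np.isnan(arr)
    # No NaNs reach np.unique here, so the result is version independent.
    uniq = np.unique(arr[~nan_mask])
    if nan_mask.any():
        uniq = np.append(uniq, np.nan)
    return uniq


def unique_all_nans(arr):
    """Sorted unique values followed by every NaN from the input (hypothetical helper)."""
    arr = np.asarray(arr, dtype=float).ravel()
    nan_mask = np.isnan(arr)
    uniq = np.unique(arr[~nan_mask])
    # Append the NaNs untouched, mimicking the pre-1.21 result.
    return np.concatenate([uniq, arr[nan_mask]])


a = np.array([1, 1, np.nan, np.nan, np.nan])
one_nan = unique_one_nan(a)
all_nans = unique_all_nans(a)
```

Both helpers only pass NaN-free data to `np.unique`, so they should
not depend on which of the two behaviours the installed version has.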

Cheers,

Sebastian

PS: One additional note: this does not work for object arrays
(it cannot reasonably do so):

    >>> np.unique(a.astype(object))
    array([1.0, nan, nan, nan], dtype=object)
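
One way to see the difficulty, as far as I understand it: `np.isnan`
is not defined for object arrays, so there is no generic way to find
the NaNs, and a NaN never compares equal to itself.  A small sketch
(the variable names are only for illustration):

```python
import numpy as np

obj = np.array([1.0, np.nan, np.nan, np.nan], dtype=object)

# np.isnan has no loop for object dtype and raises TypeError.
try:
    np.isnan(obj)
    isnan_supported = True
except TypeError:
    isnan_supported = False

# The only generic hint is that a float NaN is not equal to itself,
# which is exactly why the entries cannot be merged by comparison.
nan_like = [x for x in obj if isinstance(x, float) and x != x]
```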