[Numpy-discussion] np.unique with structured arrays

Jaime Fernández del Río jaime.frio at gmail.com
Fri Aug 22 13:43:35 EDT 2014


Structured arrays are of VOID dtype, but with a non-None names attribute:

>>> V_.dtype.num
20
>>> V_.dtype.names
('v',)
>>> V_.view(np.void).dtype.num
20
>>> V_.view(np.void).dtype.names
>>>

The VOID comparison function falls back to the STRING (byte-wise) comparison
function if names is None, and does a proper field-by-field comparison
otherwise; see here:

https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/arraytypes.c.src#L2675
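
To make the difference between the two comparison paths concrete, here is a
small Python-level sketch. It does not call the C compare function itself; it
only contrasts field-wise equality with equality of the raw bytes for two
records that differ in the sign of a zero. The two-record array and the names
fields_equal, bytes_equal and plain_void are made up for this illustration.

```python
import numpy as np

# Two records that differ only in the sign of a zero.
V_ = np.zeros(2, dtype=[("v", np.float32, (3,))])
V_["v"] = [[0.5, 0.0, 1.0],
           [0.5, -0.0, 1.0]]

# Field-wise (floating point) equality: 0.0 == -0.0 holds.
fields_equal = bool(np.array_equal(V_["v"][0], V_["v"][1]))

# Byte-wise equality: the sign bit makes the two buffers differ.
bytes_equal = V_[0:1].tobytes() == V_[1:2].tobytes()

# Reinterpreting the same memory as plain VOID drops the field names.
plain_void = V_.view(np.dtype((np.void, V_.dtype.itemsize)))

print(fields_equal, bytes_equal, plain_void.dtype.names)
```

If the sort ends up on the byte-wise path, the two records above are distinct
as far as it is concerned, which would match the np.unique output reported.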

With a quick look at the source, the only fishy thing I see is that the
original array has the sort axis moved to the end of the shape tuple, and
is then copied into a contiguous array here:

https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/item_selection.c#L1151

But that new array should preserve the dtype unchanged, and hence the right
compare function should be called. If no one with a better understanding of
the internals spots it, I will try to further debug it over the weekend.
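
For what it is worth, the same expectation can be checked from Python. This is
only an analogue of what the C code does (np.swapaxes plus
np.ascontiguousarray standing in for the axis move and the copy), so it says
nothing about the actual code path in item_selection.c. The 2-D shape is
chosen just so that moving the axis really produces a non-contiguous array.

```python
import numpy as np

V_ = np.zeros((4, 2), dtype=[("v", np.float32, (3,))])

# Move the sort axis (axis 0 here) to the end, then force a contiguous copy.
moved = np.swapaxes(V_, 0, -1)
copied = np.ascontiguousarray(moved)

# The copy should keep the structured dtype, field names included.
same_dtype = copied.dtype == V_.dtype
print(same_dtype, copied.dtype.names, copied.flags["C_CONTIGUOUS"])
```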

Jaime


On Fri, Aug 22, 2014 at 7:54 AM, Eelco Hoogendoorn <
hoogendoorn.eelco at gmail.com> wrote:

> Oh yeah this could be. Floating point equality and bitwise equality are
> not the same thing.
> ------------------------------
> From: Jaime Fernández del Río <jaime.frio at gmail.com>
> Sent: 22-8-2014 16:22
>
> To: Discussion of Numerical Python <numpy-discussion at scipy.org>
> Subject: Re: [Numpy-discussion] np.unique with structured arrays
>
> I can confirm, the issue seems to be in sorting:
>
> >>> np.sort(V_)
> array([([0.5, 0.0, 1.0],), ([0.5, 0.0, -1.0],), ([0.5, -0.0, 1.0],),
>        ([0.5, -0.0, -1.0],)],
>       dtype=[('v', '<f4', (3,))])
>
> These I think are handled by the generic sort functions, and it looks like
> the comparison function being used is the one for a VOID dtype with no
> fields, so it is being done byte-wise, hence the problems with 0.0 and
> -0.0. Not sure where exactly the bug is, though...
>
> Jaime
>
>
>
> On Fri, Aug 22, 2014 at 6:20 AM, Nicolas P. Rougier <
> Nicolas.Rougier at inria.fr> wrote:
>
>>
>> Hello,
>>
>> I've found a strange behavior or I'm missing something obvious (or
>> np.unique is not supposed to work with structured arrays).
>>
>> I'm trying to extract unique values from a simple structured array but it
>> does not seem to work as expected.
>> Here is a minimal script showing the problem:
>>
>> import numpy as np
>>
>> V = np.zeros(4, dtype=[("v", np.float32, 3)])
>> V["v"] = [ [0.5,    0.0,   1.0],
>>            [0.5, -1.e-16,  1.0], # [0.5, +1.e-16,  1.0] works
>>            [0.5,    0.0,  -1.0],
>>            [0.5, -1.e-16, -1.0]] # [0.5, +1.e-16, -1.0]] works
>> V_ = np.zeros_like(V)
>> V_["v"][:,0] = V["v"][:,0].round(decimals=3)
>> V_["v"][:,1] = V["v"][:,1].round(decimals=3)
>> V_["v"][:,2] = V["v"][:,2].round(decimals=3)
>>
>> print(np.unique(V_))
>> [([0.5, 0.0, 1.0],) ([0.5, 0.0, -1.0],) ([0.5, -0.0, 1.0],) ([0.5, -0.0,
>> -1.0],)]
>>
>>
>> While I would have expected:
>>
>> [([0.5, 0.0, 1.0],) ([0.5, 0.0, -1.0],)]
>>
>>
>> Can anyone confirm?
>>
>>
>> Nicolas
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
>
>
> --
> (\__/)
> ( O.o)
> ( > <) This is Bunny. Copy Bunny into your signature and help him with his
> plans for world domination.
>


-- 
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his
plans for world domination.