[Numpy-discussion] empty_like for masked arrays

Nathaniel Smith njs at pobox.com
Wed Jul 17 11:18:07 EDT 2013


On Mon, Jul 15, 2013 at 2:33 PM, Gregorio Bastardo
<gregorio.bastardo at gmail.com> wrote:
> Hi,
>
> On Mon, Jun 10, 2013 at 3:47 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> Hi all,
>>
>> Is there anyone out there using numpy masked arrays, who has an
>> opinion on how empty_like (and its friends ones_like, zeros_like)
>> should handle the mask?
>>
>> Right now apparently if you call np.ma.empty_like on a masked array,
>> you get a new masked array that shares the original array's mask, so
>> modifying one modifies the other. That's almost certainly wrong. This
>> PR:
>>   https://github.com/numpy/numpy/pull/3404
>> makes it so instead the new array has values that are all set to
>> empty/zero/one, and a mask which is set to match the input array's
>> mask (so whenever something was masked in the original array, the
>> empty/zero/one in that place is also masked). We don't know if this is
>> the desired behaviour for these functions, though. Maybe it's more
>> intuitive for the new array to match the original array in shape and
>> dtype, but to always have an empty mask. Or maybe not. None of us
>> really use np.ma, so if you do and have an opinion then please speak
>> up...
>
> I recently joined the mailing list, so the message might not reach the
> original thread, sorry for that.
>
> I use masked arrays extensively, and would vote for the first option,
> as I use the *_like operations with the assumption that the resulting
> array has the same mask as the original. I think it's more intuitive
> than selecting between all masked or all unmasked behaviour. If it's
> not too late, please consider my use case.

The original submitter of that PR has been silent since then, so
nothing has happened so far.

So that's 2 votes for copying the mask and 3 against, I guess. That's
not very consensus-ful. If there's really a lot of confusion here,
then it's possible the answer is that np.ma.empty_like should just
raise an error or not be defined. Or can you all agree?
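
To make the two candidate behaviours concrete, here is a minimal sketch
of each. The helper names zeros_like_copy_mask and zeros_like_no_mask
are hypothetical, made up for this illustration; they are not numpy
functions and not what the PR implements. The sketch uses zeros rather
than empty so the values are predictable, and it builds the result
explicitly rather than relying on what np.ma.empty_like does today.

```python
import numpy as np


def zeros_like_copy_mask(a):
    """Hypothetical helper: same shape/dtype, mask copied from the input."""
    data = np.zeros_like(np.ma.getdata(a))
    # Copy the mask so the new array does not share it with the original.
    mask = np.ma.getmaskarray(a).copy()
    return np.ma.array(data, mask=mask)


def zeros_like_no_mask(a):
    """Hypothetical helper: same shape/dtype, nothing masked."""
    data = np.zeros_like(np.ma.getdata(a))
    return np.ma.array(data, mask=np.zeros(data.shape, dtype=bool))


orig = np.ma.array([1.0, 2.0, 3.0], mask=[False, True, False])

copied = zeros_like_copy_mask(orig)   # option 1: mask follows the input
unmasked = zeros_like_no_mask(orig)   # option 2: always an empty mask

# Masking an element of the new array must not touch the original's mask.
copied[0] = np.ma.masked
```

In both variants the result owns its mask, so modifying it leaves the
input untouched; the open question in this thread is only which of the
two initial masks the *_like functions should produce.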

-n


