[Numpy-discussion] Adding an nd generalization of np.ma.mask_rowscols

Hameer Abbasi einstein.edison at gmail.com
Fri Jan 17 10:29:46 EST 2020


IMHO, extending the masks of masked arrays like that is a weird API. I would prefer a more functional approach, where we take a 1-D or N-D boolean array as input in addition to a masked array, together with the axes over which to extend the mask.
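One possible reading of that functional approach, as a rough sketch only: the name mask_lines_where and its signature are made up for illustration and are not part of NumPy or of the PR. The boolean array is passed in separately, and a new masked array is returned rather than the input being modified.

```python
import numpy as np

def mask_lines_where(marr, where, axis):
    """Hypothetical helper, not a NumPy function.

    Return a new masked array that is additionally masked along every line
    (over each axis in `axis`) that passes through a True entry of `where`.
    """
    where = np.broadcast_to(np.asarray(where, dtype=bool), marr.shape)
    axes = (axis,) if isinstance(axis, int) else tuple(axis)
    extra = where.copy()
    for ax in axes:
        # Any True along `ax` marks the whole line through that position.
        extra |= where.any(axis=ax, keepdims=True)
    new_mask = np.ma.getmaskarray(marr) | extra
    return np.ma.array(np.ma.getdata(marr), mask=new_mask, copy=True)

data = np.ma.array(np.zeros((2, 2, 2), dtype=int))
where = np.zeros((2, 2, 2), dtype=bool)
where[0, 0, 0] = True
out = mask_lines_where(data, where, axis=(1, 2))
```

Here the input array is left untouched, and the caller decides which boolean array drives the extension.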

From: NumPy-Discussion <numpy-discussion-bounces+einstein.edison=gmail.com at python.org> on behalf of Eric Wieser <wieser.eric+numpy at gmail.com>
Reply to: Discussion of Numerical Python <numpy-discussion at python.org>
Date: Friday, 17. January 2020 at 11:40
To: Discussion of Numerical Python <numpy-discussion at python.org>
Subject: [Numpy-discussion] Adding an nd generalization of np.ma.mask_rowscols


Today, numpy has a np.ma.mask_rowcols function, which stretches masks along
the full length of an axis. For example, given the matrix::

>>> a2d = np.zeros((3, 3), dtype=int)
>>> a2d[1, 1] = 1
>>> a2d = np.ma.masked_equal(a2d, 1)
>>> print(a2d)
[[0 0 0]
 [0 -- 0]
 [0 0 0]]

The API allows::

>>> print(np.ma.mask_rowcols(a2d, axis=0))
[[0 0 0]
 [-- -- --]
 [0 0 0]]

>>> print(np.ma.mask_rowcols(a2d, axis=1))
[[0 -- 0]
 [0 -- 0]
 [0 -- 0]]

>>> print(np.ma.mask_rowcols(a2d, axis=None))
[[0 -- 0]
 [-- -- --]
 [0 -- 0]]

However, this function only works for 2D arrays.
It would be useful to generalize this to work on ND arrays as well.

Unfortunately, the current function is messy to generalize, because axis=0 means “spread the mask along axis 1”, and vice versa. Additionally, the name is not particularly good for an ND function.
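The axis convention can be checked against the existing function. A small sketch, using only np.ma.mask_rowcols and np.ma.getmaskarray as they exist in NumPy today; the variable names are just for illustration, and copies are passed in so the two calls cannot affect each other:

```python
import numpy as np

data = np.zeros((3, 3), dtype=int)
data[1, 1] = 1
a2d = np.ma.masked_equal(data, 1)

# axis=0 masks every *row* that contains a masked value, so the mask ends
# up spread along axis 1 (across the columns of that row).
rows_mask = np.ma.getmaskarray(np.ma.mask_rowcols(a2d.copy(), axis=0))

# axis=1 masks every *column* that contains a masked value, so the mask
# ends up spread along axis 0.
cols_mask = np.ma.getmaskarray(np.ma.mask_rowcols(a2d.copy(), axis=1))
```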

My proposal in PR 14998 <https://github.com/numpy/numpy/pull/14998> is to introduce a new function, mask_extend_axis, which fixes this shortcoming.
Given a 3D array::

>>> a3d = np.zeros((2, 2, 2), dtype=int)
>>> a3d[0, 0, 0] = 1
>>> a3d = np.ma.masked_equal(a3d, 1)
>>> print(a3d)
[[[-- 0]
  [0 0]]

 [[0 0]
  [0 0]]]

This, in my opinion, has clearer axis semantics:

>>> print(np.ma.mask_extend_axis(a3d, axis=0))
[[[-- 0]
  [0 0]]

 [[-- 0]
  [0 0]]]

>>> print(np.ma.mask_extend_axis(a3d, axis=1))
[[[-- 0]
  [-- 0]]

 [[0 0]
  [0 0]]]

>>> print(np.ma.mask_extend_axis(a3d, axis=2))
[[[-- --]
  [0 0]]

 [[0 0]
  [0 0]]]

Stretching over multiple axes remains possible:

>>> print(np.ma.mask_extend_axis(a3d, axis=(1, 2)))
[[[-- --]
  [-- 0]]

 [[0 0]
  [0 0]]]

# extending sequentially is not the same as extending in parallel
>>> print(np.ma.mask_extend_axis(np.ma.mask_extend_axis(a3d, axis=1), axis=2))
[[[-- --]
  [-- --]]

 [[0 0]
  [0 0]]]
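For readers who want to experiment before the PR lands, the semantics shown above can be approximated in a few lines. This is only an illustrative stand-in under the assumption that "in parallel" means each axis is extended from the original mask; mask_extend_axis_sketch is a made-up name and this is not the code from the PR.

```python
import numpy as np

def mask_extend_axis_sketch(a, axis):
    """Illustrative stand-in for the proposed function; not the PR's code."""
    mask = np.ma.getmaskarray(a)
    axes = (axis,) if isinstance(axis, int) else tuple(axis)
    new_mask = mask.copy()
    for ax in axes:
        # Each axis is extended from the *original* mask ("in parallel"),
        # not from the result of the previous axis.
        new_mask |= mask.any(axis=ax, keepdims=True)
    return np.ma.array(np.ma.getdata(a), mask=new_mask, copy=True)

data = np.zeros((2, 2, 2), dtype=int)
data[0, 0, 0] = 1
a3d = np.ma.masked_equal(data, 1)

parallel = mask_extend_axis_sketch(a3d, axis=(1, 2))
sequential = mask_extend_axis_sketch(mask_extend_axis_sketch(a3d, axis=1), axis=2)
```

With this sketch, the parallel call masks three elements of a3d[0], while the sequential calls mask all four, which is the difference illustrated by the last two examples.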

Questions for the mailing list then:
· Can you think of a better name than mask_extend_axis?
· Does my proposed meaning of axis make more sense to you than the one used by mask_rowcols?