[Numpy-discussion] slicing with boolean in numpy master

Sebastian Berg sebastian at sipsolutions.net
Wed Jun 26 13:16:17 EDT 2013


On Wed, 2013-06-26 at 12:52 -0400, josef.pktd at gmail.com wrote:
> On Wed, Jun 26, 2013 at 12:01 PM, Sebastian Berg
> <sebastian at sipsolutions.net> wrote:
> > On Wed, 2013-06-26 at 11:30 -0400, josef.pktd at gmail.com wrote:
> >> Is there a change in the behavior of boolean slicing in current master?
> >>
> >
> > Yes, but I think this is probably a bug in statsmodels. I would expect
> > you should be using "..." and not ":" here, because ":" requires the
> > dimension to actually exist, and I *expect* that your mask actually has
> > the same dimensionality as the array itself.
> >
> > I.e.:
> >
> > x = np.arange(16).reshape(4,4)
> > mask = np.ones_like(x, dtype=bool)
> > x[mask,:] # should NOT work, but this was buggy before current master.
> 
> Why should this not work?
> 
> How do you select rows that don't have nans in them?
> 
> mask = np.isfinite(x).all(1)
> x[mask, :]
> 
> or columns with switched axis.
> 
> >>> x[mask[:, None]]
> array([ 1.,  1.,  1.,  1.])
> ???
> 

I assume you wanted to write x[:, mask] there. Boolean masks do *not*
broadcast; instead, each mask eats away as many dimensions of the array
as the mask itself has.
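
As a rough sketch of that rule, applied to the question of selecting rows
(or columns) without NaNs: the array contents and the names row_mask,
col_mask and full_mask below are made up for illustration, they are not
taken from statsmodels.

```python
import numpy as np

# Hypothetical data: a 4x3 array with a single NaN in row 1, column 2.
x = np.arange(12, dtype=float).reshape(4, 3)
x[1, 2] = np.nan

# A 1-d mask consumes exactly one dimension, so a slice for the
# remaining dimension is still allowed.
row_mask = np.isfinite(x).all(axis=1)   # shape (4,), one entry per row
rows = x[row_mask, :]                   # rows without NaN, shape (3, 3)

col_mask = np.isfinite(x).all(axis=0)   # shape (3,), one entry per column
cols = x[:, col_mask]                   # columns without NaN, shape (4, 2)

# A 2-d mask consumes both dimensions; nothing is left to slice and
# the result is 1-d.
full_mask = np.isfinite(x)              # shape (4, 3)
flat = x[full_mask]                     # the 11 finite values, shape (11,)
```

So x[mask, :] is fine as long as mask is 1-d, which is the usual
row-selection case.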

Maybe these examples will help explain why the new behaviour is correct:

x = np.random.random((3,3))
mask = np.ones((3,3), dtype=np.bool_)

# Check slices:
x[:,:] # OK, result 2-d
x[:,:,:] # too many indices.

# replace first dimension with the mask:
x[mask[:,0], :] # OK, result 2-d
x[mask[:,0], :, :] # too many indices.

# replace *both* slices with a (single) mask:
x[mask] # OK, result 1-d (i.e. there is nothing more than the mask)
x[mask, :] # too many indices! But it still works in 1.7.

# In fact we can make this absurd:
x[mask, :, :, :, :, :] # Too many slices even without the mask!

The last case used to work before current master only because of a bug.
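
If you want to check this yourself, something like the following sketch
should do. It assumes a NumPy that already includes the change (i.e. newer
than 1.7), and raises_index_error is just a helper I made up for the
demonstration, not a NumPy function.

```python
import numpy as np

x = np.random.random((3, 3))
mask = np.ones((3, 3), dtype=bool)

def raises_index_error(index):
    """Return True if x[index] raises IndexError (hypothetical helper)."""
    try:
        x[index]
    except IndexError:
        return True
    return False

# x[mask[:, 0], :] -> the 1-d mask uses one dimension, ":" uses the other.
one_d_mask_ok = not raises_index_error((mask[:, 0], slice(None)))

# x[mask, :] -> the 2-d mask already uses both dimensions, so ":" is
# one index too many (assumes a NumPy with the new behaviour).
full_mask_plus_slice_fails = raises_index_error((mask, slice(None)))

# x[mask, ...] -> "..." may stand for zero dimensions, so this is
# allowed and gives the same 1-d result as x[mask].
ellipsis_result = x[mask, ...]
```

That is why "..." is the safe spelling when the mask may have the same
dimensionality as the array.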

- Sebastian
 

> (I have to check the usage in statsmodels, but I thought this is standard.)
> 
> Josef
> 
> >
> > - Sebastian
> >
> >> If not I have to find another candidate in numpy master.
> >>
> >> (py27d) E:\Josef\testing\tox\py27d\Scripts>python
> >> Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit
> >> (Intel)] on win32
> >> Type "help", "copyright", "credits" or "license" for more information.
> >> >>> import numpy as np
> >> >>> np.__version__
> >> '1.7.1'
> >> >>> x = np.ones((5,3))
> >> >>> mask = np.arange(5) < 4
> >> >>> x[mask, :]
> >> array([[ 1.,  1.,  1.],
> >>        [ 1.,  1.,  1.],
> >>        [ 1.,  1.,  1.],
> >>        [ 1.,  1.,  1.]])
> >>
> >>
> >> We get errors like the following when running the statsmodels tests
> >> with a current or recent numpy master, but not with numpy 1.7.1
> >>
> >> ======================================================================
> >> ERROR: Failure: IndexError (too many indices)
> >> ----------------------------------------------------------------------
> >> Traceback (most recent call last):
> >>   File "/Users/tom/python2.7/lib/python2.7/site-packages/nose/loader.py",
> >> line 518, in makeTest
> >>     return self._makeTest(obj, parent)
> >>   File "/Users/tom/python2.7/lib/python2.7/site-packages/nose/loader.py",
> >> line 577, in _makeTest
> >>     return MethodTestCase(obj)
> >>   File "/Users/tom/python2.7/lib/python2.7/site-packages/nose/case.py",
> >> line 345, in __init__
> >>     self.inst = self.cls()
> >>   File "/Users/tom/python2.7/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-macosx-10.8-x86_64.egg/statsmodels/emplike/tests/test_aft.py",
> >> line 19, in __init__
> >>     super(Test_AFTModel, self).__init__()
> >>   File "/Users/tom/python2.7/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-macosx-10.8-x86_64.egg/statsmodels/emplike/tests/test_aft.py",
> >> line 12, in __init__
> >>     self.mod1 = sm.emplike.emplikeAFT(endog, exog, data.censors)
> >>   File "/Users/tom/python2.7/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-macosx-10.8-x86_64.egg/statsmodels/emplike/aft_el.py",
> >> line 248, in __init__
> >>     self.uncens_endog = self.endog[np.bool_(self.censors), :].\
> >> IndexError: too many indices
> >>
> >> Thanks,
> >>
> >> Josef
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion at scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>
> >
> >
> 




