[Numpy-discussion] NA masks for NumPy are ready to test

Mark Wiebe mwwiebe at gmail.com
Fri Aug 19 11:14:39 EDT 2011


On Fri, Aug 19, 2011 at 7:55 AM, Bruce Southey <bsouthey at gmail.com> wrote:

> On 08/18/2011 04:43 PM, Mark Wiebe wrote:
>
> It's taken a lot of changes to get the NA mask support to its current
> point, but the code is ready for some testing now. You can read the
> work-in-progress release notes here:
>
>
> https://github.com/m-paradox/numpy/blob/missingdata/doc/release/2.0.0-notes.rst
>
>  To try it out, check out the missingdata branch from my GitHub account,
> linked below, and build in the standard way:
>
>  https://github.com/m-paradox/numpy
>
>  The things most important to test are:
>
>  * Confirm that existing code still works correctly. I've tested against
> SciPy and matplotlib.
> * Confirm that the performance of code not using NA masks is the same or
> better.
> * Try to do computations with the NA values, find places they don't work
> yet, and nominate unimplemented functionality important to you to be next on
> the development list. The release notes have a preliminary list of
> implemented/unimplemented functions.
> * Report any crashes, build problems, or unexpected behaviors.
>
>  In addition to adding the NA mask, I've also added features and made a
> few performance changes here and there, such as letting reductions like sum
> take several axes at once, e.g. axis=(1,2), instead of only a single axis
> or all of them. These changes affect various bugs like
> http://projects.scipy.org/numpy/ticket/1143 and
> http://projects.scipy.org/numpy/ticket/533.
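As a small illustration of the multi-axis reduction described above, here is a sketch using plain NumPy without NA masks. It assumes a NumPy build where `axis` accepts a tuple; the array `a` and its shape are made up for the example and do not come from the message.

```python
import numpy as np

# Hypothetical 2x3x4 example array, not taken from the message above.
a = np.arange(24).reshape(2, 3, 4)

# Reduce over the first and last axes in a single call.
multi = np.sum(a, axis=(0, 2))

# The same result obtained by chaining two single-axis reductions.
chained = np.sum(np.sum(a, axis=2), axis=0)

print(multi)    # [ 60  92 124]
print(chained)  # [ 60  92 124]
```

The single call avoids the intermediate array that the chained version creates.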
>
>  Thanks!
> Mark
>
>  Here's a small example run using NAs:
>
> >>> import numpy as np
> >>> np.__version__
> '2.0.0.dev-8a5e2a1'
> >>> a = np.random.rand(3,3,3)
> >>> a.flags.maskna = True
> >>> a[np.random.rand(3,3,3) < 0.5] = np.NA
> >>> a
> array([[[NA, NA,  0.11511708],
>         [ 0.46661454,  0.47565512, NA],
>         [NA, NA, NA]],
>
>        [[NA,  0.57860351, NA],
>         [NA, NA,  0.72012669],
>         [ 0.36582123, NA,  0.76289794]],
>
>        [[ 0.65322748,  0.92794386, NA],
>         [ 0.53745165,  0.97520989,  0.17515083],
>         [ 0.71219688,  0.5184328 ,  0.75802805]]])
> >>> np.mean(a, axis=-1)
> array([[NA, NA, NA],
>        [NA, NA, NA],
>        [NA,  0.56260412,  0.66288591]])
> >>> np.std(a, axis=-1)
> array([[NA, NA, NA],
>        [NA, NA, NA],
>        [NA,  0.32710662,  0.10384331]])
> >>> np.mean(a, axis=-1, skipna=True)
> /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2474:
> RuntimeWarning: invalid value encountered in true_divide
>   um.true_divide(ret, rcount, out=ret, casting='unsafe')
> array([[ 0.11511708,  0.47113483,         nan],
>        [ 0.57860351,  0.72012669,  0.56435958],
>        [ 0.79058567,  0.56260412,  0.66288591]])
> >>> np.std(a, axis=-1, skipna=True)
> /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2707:
> RuntimeWarning: invalid value encountered in true_divide
>   um.true_divide(arrmean, rcount, out=arrmean, casting='unsafe')
> /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2730:
> RuntimeWarning: invalid value encountered in true_divide
>   um.true_divide(ret, rcount, out=ret, casting='unsafe')
> array([[ 0.        ,  0.00452029,         nan],
>        [ 0.        ,  0.        ,  0.19853835],
>        [ 0.13735819,  0.32710662,  0.10384331]])
> >>> np.std(a, axis=(1,2), skipna=True)
> array([ 0.16786895,  0.15498008,  0.23811937])
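For readers without the missingdata branch, the skipna=True results above can be approximated by using NaN as a stand-in for NA together with `np.nanmean` and `np.nanstd`. This is a swapped-in technique, not the NA-mask API: NaN only works for floating-point data, and the sketch assumes a NumPy version that provides both functions. The values are copied from the example run.

```python
import warnings
import numpy as np

nan = np.nan  # stand-in for np.NA (an assumption; NaN is not a true mask)

# Same values as the 3x3x3 array printed in the example run.
a = np.array([
    [[nan, nan, 0.11511708],
     [0.46661454, 0.47565512, nan],
     [nan, nan, nan]],
    [[nan, 0.57860351, nan],
     [nan, nan, 0.72012669],
     [0.36582123, nan, 0.76289794]],
    [[0.65322748, 0.92794386, nan],
     [0.53745165, 0.97520989, 0.17515083],
     [0.71219688, 0.5184328, 0.75802805]],
])

with warnings.catch_warnings():
    # An all-NaN row has no values to average, so NumPy warns and returns nan.
    warnings.simplefilter("ignore", RuntimeWarning)
    mean_last = np.nanmean(a, axis=-1)

# Reduce over the last two axes at once, ignoring NaN entries.
std_planes = np.nanstd(a, axis=(1, 2))

print(mean_last)
print(std_planes)
```

The all-missing row yields nan here for the same reason it does in the skipna=True output above: there are zero values to divide by.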
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>  Hi,
> I had to rebuild my Python2.6 as a 'normal' version.
>
> Anyhow, Python2.4, 2.5, 2.6 and 2.7 all build and pass the numpy tests.
>

Thanks for running the tests!

>
> Curiously, the tests under Python2.7 give almost no warnings, while all
> the other Python2.x versions give lots of warnings. The Python2.6 and
> Python2.7 output is below. My expectation is that all versions should
> behave the same regarding printing messages.
>

The lack of deprecation warnings is because Python 2.7 hides
DeprecationWarning by default, so you need to add -Wd explicitly when you
run under 2.7. There was an idea to make this the default from within the
test suite execution code, but no one has stepped up and implemented that.
See here:

http://projects.scipy.org/numpy/ticket/1894
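
To make the effect of -Wd concrete, here is a sketch of a roughly equivalent in-process setting using the standard `warnings` module. The function `old_api` is hypothetical and only exists to emit a warning; this is not how the NumPy test runner is implemented.

```python
import warnings

def old_api():
    # Hypothetical function that emits a deprecation warning when called.
    warnings.warn("old_api is deprecated", DeprecationWarning, stacklevel=2)
    return 42

with warnings.catch_warnings(record=True) as caught:
    # Roughly what -Wd does: apply the "default" action to every warning
    # category, so DeprecationWarning is no longer ignored.
    warnings.simplefilter("default")
    result = old_api()

print(result)
print([str(w.message) for w in caught])
```

From the shell, the corresponding test run would be python -Wd -c "import numpy; numpy.test()".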


> Also the message 'Need pytz library to test datetime timezones' means that
> there are invalid tests that have to be rewritten (ticket 1939:
> http://projects.scipy.org/numpy/ticket/1939 ).
>

I did it this way because Python has no timezone objects built in; it just
provides the interface (datetime.tzinfo). If someone is willing to copy or
write timezone instances into the test suite to fix this I would be very
grateful!
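
As a rough idea of what such a timezone instance could look like, here is a minimal sketch of a fixed-offset `datetime.tzinfo` subclass that needs no pytz. The class name `FixedOffset` and the two instances are hypothetical, and it deliberately ignores daylight-saving rules, so it would only cover tests that need a constant UTC offset.

```python
from datetime import datetime, timedelta, tzinfo

class FixedOffset(tzinfo):
    """Hypothetical fixed-offset timezone for use in tests (no DST rules)."""

    def __init__(self, minutes, name):
        self._offset = timedelta(minutes=minutes)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # A fixed offset never applies a daylight-saving adjustment.
        return timedelta(0)

    def tzname(self, dt):
        return self._name

eastern = FixedOffset(-4 * 60, "EDT")
utc = FixedOffset(0, "UTC")

stamp = datetime(2011, 8, 19, 11, 14, 39, tzinfo=eastern)
as_utc = stamp.astimezone(utc)
print(as_utc.isoformat())  # 2011-08-19T15:14:39+00:00
```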

I think all these policies I keep breaking should be written down somewhere.
I don't think it's reasonable to call something a community/project policy
unless a particular wording of it in an easily discoverable official
document has been agreed upon by the community. I nominate this as a new
policy. ;)

Thanks,
Mark


>
> Bruce
>
> $ python2.6 -c "import numpy; numpy.test()"
> Running unit tests for numpy
> NumPy version 2.0.0.dev-93236a2
> NumPy is installed in /usr/local/lib/python2.6/site-packages/numpy
> Python version 2.6.6 (r266:84292, Aug 19 2011, 09:21:38) [GCC 4.5.1
> 20100924 (Red Hat 4.5.1-4)]
> nose version 1.0.0
> ......................../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_datetime.py:1313:
> UserWarning: Need pytz library to test datetime timezones
>   warnings.warn("Need pytz library to test datetime timezones")
> .........................................................................................................................../usr/local/lib/python2.6/unittest.py:336:
> DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they
> are platform specific. Use 'O' instead
>   callableObj(*args, **kwargs)
> ............................................................................................................................................................................................................./usr/local/lib/python2.6/site-packages/numpy/core/_internal.py:555:
> DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will
> become immutable in a future version
>   value.names = tuple(names)
> ...../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1912:
> DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will
> become immutable in a future version
>   dt.names = tuple(names)
> ...../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:804:
> DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they
> are platform specific. Use 'O' instead
>   return loads(obj)
> ..../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1046:
> DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as
> the mask instead
>   np.putmask(x,[True,False,True],-1)
> ../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1025:
> DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as
> the mask instead
>   np.putmask(x, mask, val)
> ................................................/usr/local/lib/python2.6/unittest.py:336:
> DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as
> the mask instead
>   callableObj(*args, **kwargs)
> ../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1057:
> DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as
> the mask instead
>   np.putmask(rec['x'],[True,False],10)
> /usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1061:
> DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as
> the mask instead
>   np.putmask(rec['y'],[True,False],11)
> .S/usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1395:
> DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will
> become immutable in a future version
>   dt.names = ['p','q']
> ..................................................................................................................................................................................................................................................................................................................................................................................../usr/local/lib/python2.6/site-packages/numpy/core/records.py:157:
> DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they
> are platform specific. Use 'O' instead
>   dtype = sb.dtype(formats, aligned)
> ........................................................./usr/local/lib/python2.6/site-packages/numpy/core/tests/test_regression.py:1426:
> DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will
> become immutable in a future version
>   ra.dtype.names = ('f1', 'f2')
> /usr/local/lib/python2.6/unittest.py:336: DeprecationWarning: Setting NumPy
> dtype names is deprecated, the dtype will become immutable in a future
> version
>   callableObj(*args, **kwargs)
> ............../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_regression.py:1017:
> DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will
> become immutable in a future version
>   a.dtype.names = b
> ......................................................................................................................./usr/local/lib/python2.6/pickle.py:1133:
> DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they
> are platform specific. Use 'O' instead
>   value = func(*args)
> ..........................................................................................K..................................................................................................K......................K..........................................................................................................S...................................../usr/local/lib/python2.6/site-packages/numpy/lib/_iotools.py:857:
> DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will
> become immutable in a future version
>   ndtype.names = validate(ndtype.names, defaultfmt=defaultfmt)
> /usr/local/lib/python2.6/site-packages/numpy/lib/_iotools.py:854:
> DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will
> become immutable in a future version
>   ndtype.names = validate([''] * nbtypes, defaultfmt=defaultfmt)
> /usr/local/lib/python2.6/site-packages/numpy/lib/_iotools.py:847:
> DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will
> become immutable in a future version
>   defaultfmt=defaultfmt)
> ......................................................................................................................................................................................./usr/local/lib/python2.6/site-packages/numpy/lib/format.py:358:
> DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they
> are platform specific. Use 'O' instead
>   dtype = numpy.dtype(d['descr'])
> /usr/local/lib/python2.6/site-packages/numpy/lib/format.py:449:
> DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they
> are platform specific. Use 'O' instead
>   array = cPickle.load(fp)
> .............................................................................................................................................................................................................................................................................................................................../usr/local/lib/python2.6/site-packages/numpy/ma/core.py:366:
> DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they
> are platform specific. Use 'O' instead
>   deflist.append(default_fill_value(np.dtype(currenttype)))
> ................/usr/local/lib/python2.6/site-packages/numpy/lib/npyio.py:1640:
> DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will
> become immutable in a future version
>   dtype.names = names
> ..............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
> ..........................................................................................................................................................................................................................
> ----------------------------------------------------------------------
> Ran 3064 tests in 22.795s
>
> OK (KNOWNFAIL=3, SKIP=2)
> $ python -c "import numpy; numpy.test()"
> Running unit tests for numpy
> NumPy version 2.0.0.dev-93236a2
> NumPy is installed in /usr/lib64/python2.7/site-packages/numpy
> Python version 2.7 (r27:82500, Sep 16 2010, 18:02:00) [GCC 4.5.1 20100907
> (Red Hat 4.5.1-3)]
> nose version 1.0.0
> ......................../usr/lib64/python2.7/site-packages/numpy/core/tests/test_datetime.py:1313:
> UserWarning: Need pytz library to test datetime timezones
>   warnings.warn("Need pytz library to test datetime timezones")
> ...........................................................................................................................................................................................................................................................................................................................................................................................................S..................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
> ..........................................................K..................................................................................................K......................K..........................................................................................................S..............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
> ..............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
> .......................................................................
> ----------------------------------------------------------------------
> Ran 3064 tests in 23.180s
>
> OK (KNOWNFAIL=3, SKIP=2)
>