[Numpy-discussion] Scalar coercion
Christopher Hanley
chanley at stsci.edu
Mon Mar 5 10:23:33 EST 2007
Hello Everyone,
Another behavior we might consider changing for 1.0.2, and one I believe
is related in theme, is the default type used in computations such as
the mean() method.
This is best illustrated with the following example:
sparty> python
Python 2.5 (r25:51908, Sep 21 2006, 13:33:15)
[GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-56)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as n
>>> n.__version__
'1.0.2.dev3568'
>>> a = n.ones((1000,1000),dtype=n.float32)*132.00005
>>> print a
[[ 132.00004578 132.00004578 132.00004578 ..., 132.00004578
132.00004578 132.00004578]
[ 132.00004578 132.00004578 132.00004578 ..., 132.00004578
132.00004578 132.00004578]
[ 132.00004578 132.00004578 132.00004578 ..., 132.00004578
132.00004578 132.00004578]
...,
[ 132.00004578 132.00004578 132.00004578 ..., 132.00004578
132.00004578 132.00004578]
[ 132.00004578 132.00004578 132.00004578 ..., 132.00004578
132.00004578 132.00004578]
[ 132.00004578 132.00004578 132.00004578 ..., 132.00004578
132.00004578 132.00004578]]
>>> a.min()
132.000045776
>>> a.max()
132.000045776
>>> a.mean()
133.96639999999999
>>>
Having the mean be greater than the max is a tad odd.
The mean is being computed with a single-precision accumulator, so
rounding error builds up as the running sum grows. I do understand that
I can force a double-precision calculation with the following command:
>>> a.mean(dtype=n.float64)
132.00004577636719
>>>
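To make the accumulator effect explicit, here is a minimal sketch (written for Python 3 and a current NumPy, not the 1.0.2 development build shown above). It assumes that np.cumsum with dtype=np.float32 keeps its running sum in single precision, and uses that as a stand-in for a single-precision accumulator. The names flat, mean_f32, mean_f64 and exact are illustrative only. The figure reported by a plain a.mean() may differ between NumPy versions, so the sketch compares the two accumulators directly instead.

```python
import numpy as np

# Same data as in the session above: one million float32 copies of 132.00005.
a = np.ones((1000, 1000), dtype=np.float32) * 132.00005
flat = a.ravel()

# Running sum kept in float32, element by element; the last cumsum entry
# is the total. This imitates a single-precision accumulator.
mean_f32 = float(np.cumsum(flat, dtype=np.float32)[-1]) / flat.size

# The same reduction with a double-precision accumulator.
mean_f64 = float(a.mean(dtype=np.float64))

# Every element is identical, so any single element is the true mean.
exact = float(flat[0])

print("float32 accumulator error:", abs(mean_f32 - exact))
print("float64 accumulator error:", abs(mean_f64 - exact))
```

Once the running float32 sum is large, each new element is rounded to the spacing of representable values near that sum, which is where the drift comes from. The float64 accumulator has enough bits to hold this particular sum without that loss.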
I realize that one reason for not doing all calculations in double
precision is performance. However, my users would rather have the
correct answer by default than arrive quickly at the wrong one.
In my opinion, we should swap the default behavior. All calculations
should be done in double precision. If you need the performance, you can
then go back and set data types explicitly.
Not having to worry about overflow would also be consistent with
numarray's behavior.
Thank you for considering my opinion,
Chris