[Numpy-discussion] numpy.mean still broken for large float32 arrays

Julian Taylor jtaylor.debian at googlemail.com
Thu Jul 24 07:56:17 EDT 2014


On Thu, Jul 24, 2014 at 1:33 PM, Fabien <fabien.maussion at gmail.com> wrote:
> Hi all,
>
> On 24.07.2014 11:59, Eelco Hoogendoorn wrote:
> > np.mean isn't broken; your understanding of floating point numbers is.
>
> I am quite new to python, and this problem is discussed over and over
> for other languages too. However, numpy's summation problem appears with
> relatively small arrays already:
>
> py>import numpy as np
> py>np.ones((4000,4000), np.float32).mean()
> 1.0
> py>np.ones((5000,5000), np.float32).mean()
> 0.67108864000000001
>
> A 5000*5000 image is not unusual anymore today.
>
> In IDL:
> IDL> mean(fltarr(5000L, 5000L)+1)
>         1.0000000
> IDL> mean(fltarr(7000L, 7000L)+1)
>         1.0000000
> IDL> mean(fltarr(10000L, 10000L)+1)
>        0.67108864
>
> I can't really explain why there are differences between the two
> languages (IDL uses 32-bit, single-precision, floating-point numbers)
>
> Fabien
>

Something as simple as summation is already an interesting algorithmic
problem: there are several ways to do it, with different speeds and
accuracies. E.g. Python's math.fsum is always correct to one ulp, but
it is very slow as it requires partial sorting. Then there is Kahan
summation, which has an accuracy of O(1) ulp but is about 4 times
slower than the naive sum.
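A minimal sketch of Kahan (compensated) summation in pure Python, compared
against math.fsum as the correctly rounded reference (the 4x-slower figure
above refers to a compiled implementation, not this Python loop):

```python
import math

def kahan_sum(values):
    """Compensated (Kahan) summation: carry the rounding error
    of each addition into the next step."""
    total = 0.0
    c = 0.0  # running compensation for lost low-order bits
    for x in values:
        y = x - c
        t = total + y
        c = (t - total) - y  # what was lost when adding y to total
        total = t
    return total

vals = [0.1] * 1_000_000
exact = math.fsum(vals)   # correctly rounded reference
naive = sum(vals)         # accumulates O(n) rounding error
comp = kahan_sum(vals)    # error stays within a couple of ulps

print(abs(naive - exact), abs(comp - exact))
```

Note that plain Kahan can still fail on badly conditioned inputs (large
cancelling terms); math.fsum handles those too, at the cost named above.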
In practice one of the better methods is pairwise summation, which is
pretty much as fast as naive summation but has an accuracy of
O(log n) ulp.
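A toy sketch of the pairwise idea (numpy's real implementation is in C and
considerably more clever about blocking). float16 is used here purely so the
effect is visible with only 5000 elements: its 11-bit significand means a
running sum of ones gets stuck at 2**11 = 2048, exactly analogous to the
float32 stall at 2**24 in the quoted example:

```python
import numpy as np

def naive_sum(a):
    # sequential accumulation in the array's own precision
    s = a.dtype.type(0)
    for x in a:
        s = a.dtype.type(s + x)
    return s

def pairwise_sum(a, block=8):
    """Recursively split the array, sum the halves, add the results.
    Partial sums stay small, so rounding error grows only O(log n)."""
    if len(a) <= block:
        return naive_sum(a)  # small leaves use the naive loop
    mid = len(a) // 2
    return a.dtype.type(pairwise_sum(a[:mid], block)
                        + pairwise_sum(a[mid:], block))

ones = np.ones(5000, dtype=np.float16)
print(naive_sum(ones))     # 2048.0 -- stuck: 2048 + 1 rounds back to 2048
print(pairwise_sum(ones))  # 5000.0 -- partial sums stay representable
```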
This is the method numpy 1.9 will use by default (and it is even a bit
faster than our old implementation of the naive sum):
https://github.com/numpy/numpy/pull/3685

but it has some limitations: it is applied per block of the buffer
size (8192 elements by default), and it does not work along the slow
axes due to limitations in the numpy iterator.
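For the case in the quoted example, a float64 accumulator sidesteps the
issue on any numpy version via mean's documented dtype argument, without
converting the array in memory. The 0.67108864 result is no accident: it
is exactly 2**24 / 25e6, because a naive float32 running sum of ones
stalls at 2**24:

```python
import numpy as np

a = np.ones((5000, 5000), dtype=np.float32)

# accumulate in double precision; the array itself stays float32
print(a.mean(dtype=np.float64))  # 1.0

# why the naive float32 sum stalls: at 2**24 = 16777216 the spacing
# between adjacent float32 values is 2, so adding 1.0 rounds back down
print(np.float32(2**24) + np.float32(1.0) == np.float32(2**24))  # True
print(2**24 / 25e6)  # 0.67108864 -- the broken mean from the example
```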
