[Numpy-discussion] large float32 array issue

Warren Weckesser warren.weckesser at enthought.com
Wed Nov 3 06:59:08 EDT 2010


On Wed, Nov 3, 2010 at 3:54 AM, Vincent Schut <schut at sarvision.nl> wrote:

> Hi, I'm running into a strange issue when using some pretty large
> float32 arrays. In the following code I create a large array filled with
> ones, and calculate mean and sum, first with a float64 version, then
> with a float32 version. Note the difference between the two. NB the
> float64 version is obviously right :-)
>
>
>
> In [2]: areaGrid = numpy.ones((11334, 16002))
> In [3]: print(areaGrid.dtype)
> float64
> In [4]: print(areaGrid.shape, areaGrid.min(), areaGrid.max(),
> areaGrid.mean(), areaGrid.sum())
> ((11334, 16002), 1.0, 1.0, 1.0, 181366668.0)
>
>
> In [5]: areaGrid = numpy.ones((11334, 16002), numpy.float32)
> In [6]: print(areaGrid.dtype)
> float32
> In [7]: print(areaGrid.shape, areaGrid.min(), areaGrid.max(),
> areaGrid.mean(), areaGrid.sum())
> ((11334, 16002), 1.0, 1.0, 0.092504406598019437, 16777216.0)
>
>
> Can anybody confirm this? And better: explain it? Am I running into an
> IEEE float 'feature' that has been hidden from me until now? Or is it a
> bug somewhere?
>
> Btw I'd like to use float32 arrays, as precision is not really an issue
> in this case, but memory usage is...
>
>
> This is using Python 2.7, numpy from git (yesterday's checkout), on Arch
> Linux 64-bit.
>
>

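The problem kicks in once the array of ones has more than 2**24 elements.
float32 has a 24-bit significand, so at 2**24 the spacing between adjacent
representable values is 2.0, and np.float32(2**24) + np.float32(1.0) rounds
back to np.float32(2**24):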


In [41]: b = np.ones(2**24, np.float32)

In [42]: b.size, b.sum()
Out[42]: (16777216, 16777216.0)

In [43]: b = np.ones(2**24+1, np.float32)

In [44]: b.size, b.sum()
Out[44]: (16777217, 16777216.0)

In [45]: np.spacing(np.float32(2**24))
Out[45]: 2.0

In [46]: np.float32(2**24) + np.float32(1)
Out[46]: 16777216.0
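
To see the stall step by step, here is a minimal sketch in plain Python. It
assumes that round-tripping a value through struct's "f" format is an adequate
stand-in for float32 rounding; the helper name to_float32 is made up for this
illustration and is not part of NumPy.

```python
import struct

def to_float32(x):
    # Hypothetical helper: round a Python float (a double) to the nearest
    # float32 value by packing and unpacking it as a 4-byte float.
    return struct.unpack("f", struct.pack("f", x))[0]

# Start just below 2**24 and keep adding 1.0, rounding to float32 each time,
# the way a running float32 accumulator would.
acc = to_float32(2**24 - 3)
history = []
for _ in range(6):
    acc = to_float32(acc + 1.0)
    history.append(acc)

print(history)
# [16777214.0, 16777215.0, 16777216.0, 16777216.0, 16777216.0, 16777216.0]
```

Once the accumulator reaches 2**24, each further + 1.0 lands exactly halfway
between two representable float32 values and rounds back down, so a running
float32 sum of ones cannot grow past 16777216.0.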


Warren
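
If the array has to stay float32 for memory reasons, one possible workaround
is to ask sum() and mean() for a float64 accumulator through their dtype
argument. The following is only a sketch: the array size is an illustrative
choice, just large enough to pass the 2**24 limit, rather than the original
(11334, 16002) grid.

```python
import numpy as np

n = 2**24 + 1  # illustrative size, one element past the float32 limit
a = np.ones(n, dtype=np.float32)  # storage stays float32

# Accumulate in float64 even though the elements are float32.
total = a.sum(dtype=np.float64)
mean = a.mean(dtype=np.float64)

print(a.dtype, total, mean)
```

The result is returned as float64, while the array itself keeps its float32
storage.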

