[Numpy-discussion] Random int64 and float64 numbers

Thu Nov 5 23:32:37 EST 2009

2009/11/5  <josef.pktd at gmail.com>:
> On Thu, Nov 5, 2009 at 10:42 PM, David Goldsmith
> <d.l.goldsmith at gmail.com> wrote:
>> On Thu, Nov 5, 2009 at 3:26 PM, David Warde-Farley <dwf at cs.toronto.edu>
>> wrote:
>>>
>>> On 5-Nov-09, at 4:54 PM, David Goldsmith wrote:
>>>
>>> > Interesting thread, which leaves me wondering two things: is it
>>> > documented
>>> > somewhere (e.g., at the IEEE site) precisely how many *decimal*
>>> > mantissae
>>> > are representable using the 64-bit IEEE standard for float
>>> > representation
>>> > (if that makes sense);
>>>
>>> IEEE-754 says nothing about decimal representations aside from how to
>>> round when converting to and from strings. You have to provide/accept
>>> *at least* 9 decimal digits in the significand for single-precision
>>> and 17 for double-precision (section 5.6). AFAIK implementations will
>>> vary in how they handle cases where a binary significand would yield
>>> more digits than that.
>>
>> I was actually more interested in the opposite situation, where the decimal
>> representation (which is what a user would most likely provide) doesn't have
>> a finite binary expansion: what happens then, something analogous to the
>> decimal "rule of fives"?
>
> Since according to my calculations there are only about
>
>>>> 4* 10**17 * 308
> 123200000000000000000L

More straightforwardly, it's not too far below 2**64.

> double-precision floats, there are huge gaps in the floating point
> representation of the real line.
> Any user input or calculation result just gets converted to the
> closest float.

Yes. But the "huge" gaps are only huge in absolute size; in fractional
error they're always about the same size. They are usually only an
issue if you're representing numbers in a small range with a huge
offset. To take a not-so-random example, you could be representing
times in days since November 17 1858, when what you care about are the
microsecond-scale differences in photon arrival times. Even then
you're probably okay as long as you compute directly (t[i] =
i*dt/86400 + start_t) rather than having some sort of running
accumulator (t[i]=t; t += dt/86400).

Professor Kahan's reasoning for using doubles for most everything is
that they generally have so much more precision than you actually need
that you can get away with being sloppy.

Anne

> Josef
>
>
>>
>> DG
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>