float("nan") in set or as key

Sun Jun 5 18:54:40 EDT 2011

In article <mailman.2438.1307133316.9059.python-list at python.org>
Chris Angelico  <rosuav at gmail.com> wrote:
>Uhh, noob question here. I'm way out of my depth with hardware
>floating point.
>
>Isn't a signaling nan basically the same as an exception?

Not exactly, but one could think of them as "very similar".

Elsethread, someone brought up the key distinction, which is
that in hardware that implements IEEE arithmetic, you have two
possibilities at pretty much all times:

 - op(args) causes an exception (and therefore does not deliver
   a result), or
 - op(args) delivers a result that may indicate "exception-like
   lack of result".

In both cases, a set of "accrued exceptions" flags accumulates the
new exception, and a set of "most recent exceptions" flags tells
you about the current exception.  A set of "exception enable"
flags -- which has all the same elements as "current" and
"accrued" -- tells the hardware which "exceptional results"
should trap.
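
Python's decimal module exposes a software analogue of two of those
three flag sets, which makes the distinction easy to poke at.  The
following is only a sketch: divide_by_zero is a name I made up for
illustration, and decimal has "accrued" flags (ctx.flags) and
"enable" flags (ctx.traps) but, as far as I know, nothing that
corresponds to the "most recent" set.

```python
import decimal

def divide_by_zero(trap_enabled):
    """Divide 1 by 0 in a private context.

    Returns (result, accrued_flag).  result is None when the
    operation trapped and therefore delivered no result.
    """
    ctx = decimal.Context()      # private context; global state untouched
    ctx.traps[decimal.DivisionByZero] = trap_enabled   # "enable" flag
    ctx.clear_flags()
    try:
        result = ctx.divide(decimal.Decimal(1), decimal.Decimal(0))
    except decimal.DivisionByZero:
        result = None            # case 1: exception, no result
    # The accrued flag is recorded whether or not the trap fired.
    return result, bool(ctx.flags[decimal.DivisionByZero])

print(divide_by_zero(True))
print(divide_by_zero(False))
```

With the trap enabled you get the first case (an exception and no
result); with it disabled you get the second (an Infinity result).
The accrued flag is set either way.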

A number is "NaN" if it has all-1-bits for its exponent and at
least one nonzero bit in its mantissa.  (All-1s exponent, all-0s
mantissa represents Infinity, of the sign specified by the sign
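bit.)  For IEEE double precision floating point, there are 52
mantissa bits, so there are (2^52-1) different NaN bit patterns
for each value of the sign bit.  One of those 52 bits (the most
significant, on most hardware) distinguishes signalling NaNs from
quiet ones; which polarity means "please signal on use" varies by
architecture.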
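
The layout just described can be checked from Python by
reinterpreting a float's eight bytes as an integer.  This is a
minimal sketch assuming the platform's float is an IEEE double
(true of essentially every CPython build); double_fields and
classify are hypothetical helper names, not library functions.

```python
import struct

def double_fields(x):
    """Split a float into its (sign, exponent, mantissa) bit fields."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    sign = bits >> 63                   # 1 bit
    exponent = (bits >> 52) & 0x7FF     # 11 bits
    mantissa = bits & ((1 << 52) - 1)   # 52 bits
    return sign, exponent, mantissa

def classify(x):
    """Apply the all-1s-exponent rule to tell NaN from Infinity."""
    sign, exponent, mantissa = double_fields(x)
    if exponent != 0x7FF:
        return "finite"
    return "infinity" if mantissa == 0 else "nan"

for value in (1.5, float("inf"), float("-inf"), float("nan")):
    print(value, double_fields(value), classify(value))
```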

A signalling NaN traps at (more or less -- details vary depending
on FPU architecture) load time.  However, there must necessarily
(for OS and thread-library level context switching) be a method
of saving the FPU state without causing an exception when loading
a NaN bit pattern, even if the NaN has the "signal" bit set.

>Which would imply that the hardware did support exceptions (if it
>did indeed support IEEE floating point, which specifies signalling nan)?

The actual hardware implementations (of which there are many) handle
the niggling details differently.  Some CPUs do not implement
Infinity and NaN in hardware at all, delivering a trap to the OS
on every use of an Inf-or-NaN bit pattern.  The OS then has to
emulate what the hardware specification says (if anything), and
make it look as though the hardware did the job.  Sometimes denorms
are also done in software.

Some implementations handle everything directly in hardware, and
some of those get it wrong. :-)  Often the OS has to fix up some
special case -- for instance, the hardware might trap on every NaN
and make software decide whether the bit pattern was a signalling
NaN, and if so, whether user code should receive an exception.

As I think John Nagle pointed out earlier, sometimes the hardware
does "support" exceptions, but rather loosely, where the hardware
delivers a morass of internal state and a vague indication that
one or more exceptions happened "somewhere near address <A>",
leaving a huge pile of work for software.

In Python, the decimal module gets everything either right or
close-to-right per the (draft? final? I have not kept up with
decimal FP standards) standard.  Internal Python floating point,
not quite so much.
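
To make that contrast concrete, here is a small sketch (the
function names are mine, for illustration only).  decimal has a
distinct signalling NaN that traps on use when InvalidOperation
is enabled, while a quiet NaN propagates silently.  Built-in
floats give you a quiet NaN from inf - inf, yet raise
ZeroDivisionError for 0.0 / 0.0 instead of delivering a NaN.

```python
import decimal
import math

def decimal_nan_use(text):
    """Add 1 to Decimal(text) with InvalidOperation trapping enabled."""
    ctx = decimal.Context()
    ctx.traps[decimal.InvalidOperation] = True
    try:
        result = ctx.add(decimal.Decimal(text), decimal.Decimal(1))
    except decimal.InvalidOperation:
        return "trapped"
    return "quiet NaN" if result.is_qnan() else "other"

def float_nan_behaviour():
    """Return (inf - inf is NaN?, NaN == NaN?, outcome of 0.0 / 0.0)."""
    nan = float("inf") - float("inf")   # delivers a quiet NaN, no exception
    try:
        0.0 / 0.0
        division = "nan"
    except ZeroDivisionError:
        division = "ZeroDivisionError"
    return math.isnan(nan), nan == nan, division

print(decimal_nan_use("sNaN"))
print(decimal_nan_use("NaN"))
print(float_nan_behaviour())
```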
-- 
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
email: gmail (figure it out)      http://web.torek.net/torek/index.html