[Python-Dev] Mixing float and Decimal -- thread reboot

Wed Mar 24 13:04:56 CET 2010

On Wed, 24 Mar 2010 10:47:26 pm Nick Coghlan wrote:
> Steven D'Aprano wrote:
> > On Wed, 24 Mar 2010 08:51:36 pm Mark Dickinson wrote:
> >>> I don't see how it can be so.  Aren't all of those entries
> >>> garbage? To compute a histogram of results for computations on a
> >>> series of cases would you not have to test each result for
> >>> NaN-hood, then hash on a proxy such as the string "Nan"?
> >
> > Not necessarily -- you could merely ignore any key which is a NaN,
> > or you could pass each key through this first:
> >
> > import math
> > def intern_nan(x, nan=float('nan')):
> >     # Fold every NaN into the one shared default object.
> >     if math.isnan(x):
> >         return nan
> >     return x
> >
> > thus ensuring that all NaN keys were the same NaN.
>
> Interning NaN certainly seems like it should be sufficient to
> eliminate the set/dict membership weirdness.

I didn't mean to suggest that Python should do that automatically! I 
meant that the developer could easily intern NaNs if needed.

I wouldn't want Python to intern NaNs automatically, because that would 
throw away information (at least potentially, depending on the C 
library). IEEE 754 allows a NaN to carry a payload in its significand 
bits. For example, Apple's SANE math library back in the 1980s exposed 
this payload: NaNs created by different failures had distinct, 
consistent payloads, allowing the programmer to tell how the NaN arose 
in the calculation.

E.g. INF-INF would give you a payload of 123 (or whatever it was), while 
log(-1) would give you a payload of 456. (I've made up the numbers, 
it's been far too many years for me to remember what they were.)
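
To make the idea of a payload concrete, here is a rough sketch that 
builds quiet NaNs with a chosen payload and reads it back by looking at 
the raw bits through the struct module. The helper names make_nan and 
nan_payload are invented for this illustration, the payload values are 
the same made-up numbers as above, and it assumes the platform stores 
floats as IEEE 754 binary64 and leaves quiet-NaN bits untouched when 
copying them around, which is not guaranteed everywhere.

```python
import math
import struct

# Bit layout of an IEEE 754 binary64 quiet NaN (assumed platform format):
# exponent all ones, top significand bit set, low 51 bits free for a payload.
QUIET_NAN_BITS = 0x7FF8000000000000
PAYLOAD_MASK = 0x0007FFFFFFFFFFFF

def make_nan(payload):
    """Hypothetical helper: build a quiet NaN carrying the given payload."""
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload does not fit in 51 bits")
    bits = QUIET_NAN_BITS | payload
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def nan_payload(x):
    """Hypothetical helper: extract the payload bits from a NaN."""
    if not math.isnan(x):
        raise ValueError("not a NaN")
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits & PAYLOAD_MASK

# Made-up payloads standing in for "INF-INF" and "log(-1)".
inf_minus_inf = make_nan(123)
log_of_negative = make_nan(456)
```

Both values are NaNs as far as math.isnan() is concerned, yet where the 
assumptions above hold, nan_payload() tells them apart. That distinction 
is exactly what automatic interning would erase.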

The point is, whether Python currently exposes these payloads or not, we 
shouldn't prohibit it. If programmers want to explicitly fold all NaNs 
into one, it is easy to do so themselves.

-- 
Steven D'Aprano