[Python-Dev] Not-a-Number (was PyObject_RichCompareBool identity shortcut)
Terry Reedy
tjreedy at udel.edu
Thu Apr 28 22:13:25 CEST 2011
On 4/28/2011 4:40 AM, Mark Shannon wrote:
> NaN is *not* a number (the clue is in the name).
The problem is that the committee itself did not believe or stay
consistent with that. In the text of the draft, they apparently refer to
NaN as an indefinite, unspecified *number*. Sort of like a random
variable with a uniform pseudo* distribution over the reals (* 0
everywhere with integral 1). Or a quantum particle present but smeared
out over all space. And that apparently is their rationale for NaN !=
NaN: an unspecified number will equal another unspecified number with
probability 0. The rationale for bool(NaN)==True is that an unspecified
*number* will be 0 with probability 0. If NaN truly indicated an
*absence* (like 0 and '') then bool(NaN) should be False.
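For anyone who wants to check the two behaviours just described, here is a minimal sketch for present-day CPython. The variable names are only illustrative.

```python
nan = float("nan")

# IEEE 754 comparison: NaN compares unequal to everything, itself included.
self_equal = (nan == nan)      # False
self_unequal = (nan != nan)    # True

# Truthiness: a float is falsy only when it is zero, so NaN is truthy.
nan_truthy = bool(nan)         # True
zero_truthy = bool(0.0)        # False

print(self_equal, self_unequal, nan_truthy, zero_truthy)
```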
I think the committee goofed -- badly. Statisticians used missing value
indicators long before the committee existed. They had no problem
thinking that the indicator, as an object, equaled itself. So one could
write (and I often did through the 1980s) the equivalent of
    for i, x in enumerate(datavec):
        if x == XMIS:  # singleton missing value indicator for BMDP
            datavec[i] = default
(Statistics packages have no concept of identity different from equality.)
If statisticians had made XMIS != XMIS, that obvious code would not have
worked, just as the equivalent test against NaN does not work in Python
today. Instead, the special-case circumlocution of "if isXMIS(x):" would
have been required, adding one more unnecessary function to the list of
builtins.
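To make the contrast concrete, here is a rough sketch in current Python. The helper names fill_missing_eq and fill_missing_isnan are hypothetical, and XMIS is stood in for by a plain object() sentinel, which is an assumption and not how BMDP represented it.

```python
import math

def fill_missing_eq(values, missing, default):
    """Replace entries that compare equal to `missing` (the plain == idiom)."""
    return [default if x == missing else x for x in values]

def fill_missing_isnan(values, default):
    """Replace float NaN entries; needs the special-case math.isnan test."""
    return [default if isinstance(x, float) and math.isnan(x) else x
            for x in values]

XMIS = object()        # hypothetical self-equal missing-value sentinel
nan = float("nan")

# A self-equal sentinel is found and replaced by the == idiom.
with_sentinel = fill_missing_eq([1.0, XMIS, 3.0], XMIS, 0.0)

# NaN never compares equal, so the same idiom leaves it in place.
with_nan_eq = fill_missing_eq([1.0, nan, 3.0], nan, 0.0)

# The isnan circumlocution is what actually replaces it.
with_nan_isnan = fill_missing_isnan([1.0, nan, 3.0], 0.0)
```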
NaN is, in its domain, the equivalent of None (== Not a Value), which
also serves as an alternative to immediately raising an exception. But
like XMIS, None==None. Also, bool(None) is correctly False for something
that indicates absence.
> Python treats it as if it were a number:
As I said, so did the committee, and that was its mistake that we are
more or less stuck with.
> NaN does not have to be a float or a Decimal.
> Perhaps it should have its own class.
Like None
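Purely as an illustration of what "its own class, like None" might look like, here is a hypothetical sketch. NotANumberType and NotANumber are invented names, nothing like them exists in Python, and arithmetic propagation is deliberately left out.

```python
class NotANumberType:
    """Hypothetical None-like singleton; not part of Python."""

    _instance = None

    def __new__(cls):
        # Always hand back the one shared instance, as NoneType() does.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self):
        return "NotANumber"

    def __bool__(self):
        # An absence indicator is falsy, like None, 0 and ''.
        return False

    # __eq__ is inherited from object and is identity-based,
    # so the singleton compares equal to itself.

NotANumber = NotANumberType()

equals_itself = (NotANumber == NotANumber)
is_truthy = bool(NotANumber)
found_in_list = NotANumber in [1.0, NotANumber]
```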
> As pointed out by Meyer:
> NaN == NaN is False
> is no more logical than
> NaN != NaN is False
This is wrong if False/True are interpreted as probabilities 0 and 1.
> To summarise:
>
> NaN is required so that floating point operations on arrays and lists
> do not raise unwanted exceptions.
Like None.
> NaN is Not a Number (therefore should be neither a float nor a Decimal).
> Making it a new class would solve some of the problems discussed,
> but would create new problems instead.
Agreed, if we were starting fresh.
> Correct behaviour of collections is more important than IEEE conformance
> of NaN comparisons.
Also agreed.
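As a rough illustration of the collection behaviour at stake: CPython's containment and list-equality checks test identity before equality, so the result depends on whether the very same NaN object is involved. The variable names below are only illustrative.

```python
nan = float("nan")
other_nan = float("nan")   # a distinct NaN object

# Containment checks identity first, so the same object is found...
same_object_found = nan in [nan]

# ...but a different NaN object is not, since nan == other_nan is False.
distinct_object_found = other_nan in [nan]

# List equality uses the same identity shortcut element by element.
lists_equal_same_object = [nan] == [nan]
lists_equal_distinct = [nan] == [other_nan]

print(same_object_found, distinct_object_found,
      lists_equal_same_object, lists_equal_distinct)
```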
--
Terry Jan Reedy