The NaNny State

Avi Gross avigross at verizon.net
Mon Feb 18 18:24:48 EST 2019


[DISCLAIMER: I can read documentation and have. The following is more of a demo showing step by step what I find experimentally along with some discussion.]

Ben asked:

> Who says that the “correct spelling in python is all lower case "nan"”?

Fair enough. Except for reserved words in the language, all we have is hints of what the designers used. The reality is that whoever coded __str__ and/or __repr__ chose lower case, and so did the people who made math.nan and numpy.nan. Obviously we can override these decisions in our own derived, customized classes.
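
As a minimal sketch of that last point, a subclass of float can pick its own spelling. The class name LoudFloat is made up for illustration and is not part of any library:

```python
import math

class LoudFloat(float):
    """Hypothetical float subclass that spells its NaN as 'NaN'."""

    def __repr__(self):
        # Only change the spelling for NaN; defer to float otherwise.
        if math.isnan(self):
            return "NaN"
        return super().__repr__()

    # Reuse the same text for str() so both agree.
    __str__ = __repr__

loud = LoudFloat("nan")
plain = float("nan")
print(repr(loud), str(loud), repr(plain))   # NaN NaN nan
print(repr(LoudFloat(2.5)))                 # 2.5
```

The value is still an ordinary NaN as far as math.isnan is concerned; only the text it emits has changed.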

	>>> import math, numpy
	>>> nantucket, nanometer, nanoseconds = float("NAN"), math.nan, numpy.nan
	>>> nantucket, nanometer, nanoseconds
	(nan, nan, nan)
	>>> [ spelled.__str__() for spelled in [nantucket, nanometer, nanoseconds]]
	['nan', 'nan', 'nan']
	>>> [ spelled.__repr__() for spelled in [nantucket, nanometer, nanoseconds]]
	['nan', 'nan', 'nan']

I have a hard time thinking of anywhere else you would normally type "nan" or a variant in native python other than the float example and a similar one such as complex("NAN+5j").
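
A small sketch of how forgiving the input side is, using only the standard library. The list of spellings is just a sample I picked. For complex, the single-string form is assumed here, since complex() does not take a second argument when the first one is a string:

```python
import math

# Mixed-case and signed spellings that float() accepts for NaN.
spellings = ["nan", "NaN", "NAN", "nAn", "-nan", "+NaN", "  nan  "]
values = [float(s) for s in spellings]

# Every one parses to a NaN, and every one prints in lower case.
print(all(math.isnan(v) for v in values))   # True
print({repr(v) for v in values})            # {'nan'}

# complex() takes the whole number as one string when given a string.
z = complex("NAN+5j")
print(z)                                    # (nan+5j)
```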

	>>> math.isnan(nanometer)
	True
	>>> numpy.isnan(nanoseconds)
	True

So I decided to look and see what some python extensions (modules) that I am aware of choose to display. I assume many of them rely on the str/repr of the underlying "float" and I see numpy does:

	>>> nannite = numpy.array([1, float("NAN"), 2, math.nan, 3, numpy.nan])
	>>> nannite
	array([ 1., nan,  2., nan,  3., nan])

I mentioned earlier that the pandas module has an exception, of sorts. 

	>>> import pandas

	>>> nanism = pandas.DataFrame( {'FORWARD': nannite, 'BACKWARD': nannite[::-1]})
	>>> nanism
	   FORWARD  BACKWARD
	0      1.0       NaN
	1      NaN       3.0
	2      2.0       NaN
	3      NaN       2.0
	4      3.0       NaN
	5      NaN       1.0

So, as I said, at least one extension chose differently and uses NaN.

BUT if I explicitly request the single item in the second row and first column, it says 'nan', because the scalar's own repr is used rather than the DataFrame's display formatting:

	>>> nanism.at[1,'FORWARD']
	nan

I have no doubt that if you search all over the place, you will find additional spellings, say in a machine learning module such as sklearn, which is probably using pandas below it.

And, yes, I tried many other things I will not trouble you with. I repeat: I concede there is not necessarily any enforced spelling for 'nan' in the language.

So, anyone want to decide how to spell inf properly?

	>>> nanism.at[0,'FORWARD'] = math.inf
	>>> nanism.at[1,'BACKWARD'] = float("-InF")
	>>> nanism
	    FORWARD  BACKWARD
	0       inf       NaN
	1       NaN      -inf
	2  2.000000       NaN
	3       NaN  2.000000
	4  3.000000       NaN
	5       NaN  1.000000
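
The same experiment can be run for inf with plain floats, as a rough sketch. The spellings below are a sample of my own choosing, and the loop just prints each input next to what float reports back:

```python
import math

# Several spellings float() accepts for infinity, signed or not.
spellings = ["inf", "INF", "Infinity", "iNfInItY", "-InF", "+inf"]
for s in spellings:
    value = float(s)
    # Show the input spelling next to the text float emits for it.
    print(f"{s!r:>12} -> {value!r}")

# All of them compare equal to math.inf or its negation.
print(float("Infinity") == math.inf)    # True
print(float("-InF") == -math.inf)       # True
```

So the input accepts both "inf" and "infinity" in any mix of cases, while the output side of float sticks to lower case 'inf' and '-inf'.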

I think it is time for me to stop talking about what is not. 😉

-----Original Message-----
From: Python-list <python-list-bounces+avigross=verizon.net at python.org> On Behalf Of Ben Finney
Sent: Monday, February 18, 2019 5:08 PM
To: python-list at python.org
Subject: Re: The NaNny State

"Avi Gross" <avigross at verizon.net> writes:

> It is about the recent discussion about the concept and word "nan" as 
> used in python and elsewhere. As noted, the correct spelling in python 
> is all lower case as in "nan" with a minor exception that creating a 
> nan using
> float(string) allows any combination of cases such as string="nAN".

Who says that the “correct spelling in python is all lower case "nan"”?

The text representation of a Python ‘float’ NaN object is 'nan'. That is what the object emits as its text representation; it is not the same thing as "the correct spelling".

As you note, the ‘float’ type accepts several spellings as input to create a NaN object, all of them correct spelling.

Similarly, I can spell the number one thousand ‘1000.0’, ‘1.0e+3’,
‘1.000e+3’, ‘1000.00000’, and so on. Those are all correct (and, as it happens, they all result in equal ‘float’ values).

The resulting object will, when I interrogate it, represent itself *by
default* as ‘1000.0’; Python is not showing *the* correct spelling, just one possible correct spelling.
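
The one-thousand example above can be checked with a short sketch. The two format specifications at the end are only illustrations of other text forms the same object can be asked to produce:

```python
# The four input spellings quoted above.
spellings = ["1000.0", "1.0e+3", "1.000e+3", "1000.00000"]
values = [float(s) for s in spellings]

# All four inputs produce equal floats with the same default text.
print(all(v == 1000.0 for v in values))     # True
print({repr(v) for v in values})            # {'1000.0'}

# The default is only one choice; format() can produce others.
print(format(values[0], ".1e"))             # 1.0e+03
print(format(values[0], ",.2f"))            # 1,000.00
```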

-- 
 \     “I wrote a song, but I can't read music so I don't know what it |
  `\    is. Every once in a while I'll be listening to the radio and I |
_o__)        say, ‘I think I might have written that.’” —Steven Wright |
Ben Finney

--
https://mail.python.org/mailman/listinfo/python-list