None versus MISSING sentinel -- request for design feedback
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Fri Jul 15 06:46:53 EDT 2011
Chris Angelico wrote:
> On Fri, Jul 15, 2011 at 3:28 PM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> My question is, should I accept None as the missing value, or a dedicated
>> singleton?
>>
>> In favour of None: it's already there, no extra code required. People may
>> expect it to work.
>>
>> Against None: it's too easy to add None to a data set by mistake,
>> because functions return None by default.
>
> I guess the question is: Why are the missing values there? If they're
> there because some function returned None because it didn't have a
> value to return, and therefore it's a missing value, then using None
> as "missing" would make a lot of sense. But if it's a more explicit
> concept of "here's a table of values, and the user said that this one
> doesn't exist", it'd be better to have an explicit MISSING. (Which I
> assume would be exposed as yourmodule.MISSING or something.)

In general, you have missing values in statistics because somebody wouldn't
answer a question, and the Ethics Committee frowns on researchers torturing
their subjects to get information. They make you fill out forms.

Seriously, missing data is just missing. Unknown. Lost. Not available. Like:

Name     Age      Income    Years of schooling
==============================================
Bill     42       150,000   16
Susan    23       39,000    14
Karen    unknown  89,000    15
Bob      31       0         7
George   79       12,000    unknown
Sally    17       19,000    5
Fred     66       unknown   11

One might still like to calculate the average age as 43, using only the six
known ages.
--
Steven