None versus MISSING sentinel -- request for design feedback

Steven D'Aprano steve+comp.lang.python at pearwood.info
Fri Jul 15 06:46:53 EDT 2011


Chris Angelico wrote:

> On Fri, Jul 15, 2011 at 3:28 PM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> My question is, should I accept None as the missing value, or a dedicated
>> singleton?
>>
>> In favour of None: it's already there, no extra code required. People may
>> expect it to work.
>>
>> Against None: it's too easy to add None to a data set by mistake,
>> because functions return None by default.
> 
> I guess the question is: Why are the missing values there? If they're
> there because some function returned None because it didn't have a
> value to return, and therefore it's a missing value, then using None
> as "missing" would make a lot of sense. But if it's a more explicit
> concept of "here's a table of values, and the user said that this one
> doesn't exist", it'd be better to have an explicit MISSING. (Which I
> assume would be exposed as yourmodule.MISSING or something.)

In general, you have missing values in statistics because somebody wouldn't
answer a question, and the Ethics Committee frowns on researchers torturing
their subjects to get information. They make you fill out forms.

Seriously, missing data is just missing. Unknown. Lost. Not available. Like:

Name    Age     Income     Years of schooling
==============================================
Bill    42      150,000    16
Susan   23      39,000     14
Karen   unknown 89,000     15
Bob     31      0          7
George  79      12,000     unknown
Sally   17      19,000     5
Fred    66      unknown    11

One might still like to calculate the average age as 43, using only the six
ages that are known.



-- 
Steven



