None versus MISSING sentinel -- request for design feedback

OKB (not okblacke) brenNOSPAMbarn at NObrenSPAMbarn.net
Fri Jul 15 13:40:58 EDT 2011


Steven D'Aprano wrote:

> Rob Williscroft wrote:
>> MISSING = MissingObject()
>> def mean( sequence, missing = MISSING ):
> 
> So you think the right API is to allow the caller to specify what
> counts as a missing value at runtime? Are you aware of any other
> statistics packages that do that?

    	R does it, not in the stats functions themselves but in, for instance,
read.table.  When reading data from an external file, you can specify a 
set of values that will be converted to NA in the resulting data frame.

    	I think it's worth considering this approach, namely separating the 
input of the data into your system from the calculations on that 
data.  You haven't said exactly how people are going to be using your 
API, but your example of "where missing data comes from" showed something 
like a table of data from a survey.  If this is the case, and users are 
going to be importing sets of data from external files, it makes a lot 
of sense to let them specify "convert these particular values to MISSING 
when importing".
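
A minimal sketch of that import-time conversion, loosely modelled on the na.strings argument of R's read.table.  The names read_table, MissingObject and na_strings are made up for illustration and aren't part of any existing Python API:

```python
import csv
import io


class MissingObject(object):
    """Hypothetical sentinel type; one shared instance marks missing data."""
    def __repr__(self):
        return "MISSING"


MISSING = MissingObject()


def read_table(fileobj, na_strings=("NA",)):
    """Read CSV rows, replacing any field listed in na_strings with MISSING."""
    rows = []
    for record in csv.reader(fileobj):
        rows.append([MISSING if field in na_strings else field
                     for field in record])
    return rows


# Assumed survey export in which "-99" and "" both mean "no answer".
survey = io.StringIO("age,income\n34,52000\n-99,48000\n41,\n")
table = read_table(survey, na_strings=("-99", ""))
```

The point of the design is that the caller decides what counts as missing once, at the boundary, and the calculation code only ever has to recognize MISSING.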

    	Either way, my answer to your original question would be: if you 
want to err on the side of caution, use your own MISSING value and just 
provide a simple function that will MISSING-ize specified values:

def cleanUp(data, missing=None):
    # Replace every value listed in `missing` with the MISSING sentinel.
    if missing is None:
        missing = []
    return [d if d not in missing else MISSING for d in data]

(Yet another use of None here! :-)

    	Then if people find their functions are returning None (or any 
other value, such as an empty string) to mean a "genuine" missing value, 
they can just wrap the call in this cleanUp function.  The reverse is 
harder to do: if you use None as your missing-value sentinel, you 
irrevocably lose the ability to tell it apart from other uses of None.
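
To make that concrete, here is a rough sketch of wrapping a data source that uses None for "no answer".  fetch_scores and this version of mean are hypothetical stand-ins, not anyone's actual API:

```python
class MissingObject(object):
    """Hypothetical sentinel type; one shared instance marks missing data."""
    def __repr__(self):
        return "MISSING"


MISSING = MissingObject()


def cleanUp(data, missing=None):
    # Replace every value listed in `missing` with the MISSING sentinel.
    if missing is None:
        missing = []
    return [d if d not in missing else MISSING for d in data]


def mean(sequence):
    # Average the values, skipping anything marked MISSING.
    # Assumes at least one non-missing value is present.
    values = [x for x in sequence if x is not MISSING]
    return sum(values) / float(len(values))


def fetch_scores():
    # Hypothetical source that reports "no answer" as None.
    return [4, None, 5, 3, None]


scores = cleanUp(fetch_scores(), missing=[None])
average = mean(scores)
```

Only the caller who knows that this particular source means "missing" by None asks for the conversion; a None that means something else elsewhere is left alone.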

-- 
--OKB (not okblacke)
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is
no path, and leave a trail."
	--author unknown


