[Python-ideas] Pre-PEP: adding a statistics module to Python

Steven D'Aprano steve at pearwood.info
Mon Aug 5 17:12:14 CEST 2013


On 03/08/13 04:53, Michele Lacchia wrote:

> As for Steven's implementation I think it's very accurate. I have one
> question though. Why is there a class for 'median' with various methods and
> not one for 'stdev' and 'variance' with maybe two methods, 'population' and
> 'sample'?

Those familiar with calculator statistics will expect separate functions for sample variance and population variance, and the same for standard deviation. This is a de facto standard in nearly everything I've looked at (although numpy is a conspicuous exception), so I chose to follow the same convention.
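
A minimal sketch of that convention, for illustration only. The names pvariance, variance, pstdev and stdev are assumptions chosen for this example, and the naive two-pass arithmetic is not meant to reflect the accuracy of the proposed implementation:

```python
import math


def _sum_sq_dev(data):
    """Return (n, sum of squared deviations from the mean)."""
    values = list(data)
    n = len(values)
    if n == 0:
        raise ValueError("at least one data point is required")
    mean = sum(values) / n
    return n, sum((x - mean) ** 2 for x in values)


def pvariance(data):
    """Population variance: divide by n."""
    n, ss = _sum_sq_dev(data)
    return ss / n


def variance(data):
    """Sample variance: divide by n - 1."""
    n, ss = _sum_sq_dev(data)
    if n < 2:
        raise ValueError("sample variance requires at least two data points")
    return ss / (n - 1)


def pstdev(data):
    """Population standard deviation."""
    return math.sqrt(pvariance(data))


def stdev(data):
    """Sample standard deviation."""
    return math.sqrt(variance(data))
```

Each calculator key maps onto one top-level function. numpy, by contrast, exposes a single var function and selects between the two with its ddof argument.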

On the other hand, median is less commonly found on calculators, and I did not want to overload the beginner with too many top-level median functions. So I decided to bless the version taught in secondary schools as the default (even though it is probably the least useful) and to provide the others as methods.

A previous version of this module had a single median function that took an optional argument to select between different kinds of median:

median(data, scheme='grouped')

I have come to the conclusion that having separate methods on median not only simplifies the implementation, but it also reads better:

median.grouped(data)
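
One way such an interface could be arranged is sketched below. This is an assumption about the general shape, not the module's actual code: the class name _Median, the interval parameter, and the textbook grouped-median interpolation used here are all illustrative choices.

```python
class _Median:
    """Callable object: median(data) is the default school-book median,
    while alternative definitions hang off it as methods."""

    def __call__(self, data):
        values = self._sorted(data)
        n = len(values)
        mid = n // 2
        if n % 2 == 1:
            return values[mid]
        # Even number of points: average the two middle values.
        return (values[mid - 1] + values[mid]) / 2

    def grouped(self, data, interval=1):
        """Median of grouped data, interpolating within the median class.

        Each value is treated as the midpoint of a class of width
        `interval` (a hypothetical parameter for this sketch).
        """
        values = self._sorted(data)
        n = len(values)
        x = values[n // 2]
        lower = x - interval / 2   # lower limit of the median class
        below = values.index(x)    # number of values below that class
        freq = values.count(x)     # number of values in that class
        return lower + interval * (n / 2 - below) / freq

    @staticmethod
    def _sorted(data):
        values = sorted(data)
        if not values:
            raise ValueError("median requires at least one data point")
        return values


# A single shared instance plays the role of the top-level function.
median = _Median()
```

With this arrangement both median(data) and median.grouped(data) work, and only one name appears at the top level of the module.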


-- 
Steven

