[Python-ideas] Adding quantile to the statistics module

Steven D'Aprano steve at pearwood.info
Mon Mar 19 07:36:09 EDT 2018


On Fri, Mar 16, 2018 at 01:36:31PM +0900, Stephen J. Turnbull wrote:
> Steven D'Aprano writes:
> 
>  > Indeed. I've been considering quantiles and quartiles for a long time, 
>  > and I've found at least ten different definitions for quantiles and 
>  > sixteen for quartiles.

> I'd like to see your list written up.

On checking my notes, there is considerable overlap in the numbers above 
(some calculation methods are equivalent to others) but overall I find 
a total of 16 distinct methods in use. Some of those are only suitable 
for generating quartiles.

This should not be considered an exhaustive list; I may have missed 
some. Additions and corrections are welcome :-)

My major sources are Hyndman & Fan:

https://www.amherst.edu/media/view/129116/original/Sample+Quantiles.pdf

and Langford:

https://ww2.amstat.org/publications/jse/v14n3/langford.html

Langford concentrates on methods of calculating quartiles, while Hyndman 
& Fan consider more general quantile methods. Obviously if you have a 
general quantile method, you can use it to calculate quartiles.
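
To illustrate that last point, here is a minimal sketch in Python. The 
names `quantile` and `quartiles` are hypothetical, chosen only for this 
example. The interpolation rule (linear interpolation at the fractional 
position (n-1)*p) is assumed to correspond to what H&F call method 7; 
treat that correspondence as an assumption rather than a confirmed fact.

```python
import math


def quantile(data, p):
    """Return the p-quantile of data, for 0 <= p <= 1.

    Hypothetical helper: linearly interpolates between the two order
    statistics surrounding the 0-based position (n - 1) * p.
    """
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    xs = sorted(data)
    if not xs:
        raise ValueError("quantile requires at least one data point")
    h = (len(xs) - 1) * p              # fractional 0-based position
    lo = math.floor(h)
    hi = min(lo + 1, len(xs) - 1)      # stay in range when p == 1
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])


def quartiles(data):
    """Quartiles are just the general quantile at p = 1/4, 1/2, 3/4."""
    return tuple(quantile(data, p) for p in (0.25, 0.5, 0.75))


print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))   # (2.75, 4.5, 6.25)
```

The reverse does not hold: a quartile-only rule does not say what to do 
for an arbitrary p.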

I have compiled a summary in the following table. Each row lists the 
(usually numeric) labels or parameters that one source or software 
package uses to specify a calculation method. Entries in the same 
column denote the same calculation method, regardless of the label.

For example, what Hyndman & Fan call method 1, Langford calls method 15, 
and the SAS software uses a parameter of 3. The Excel QUARTILE function 
is equivalent to what H&F call method 7 and what Langford calls 12.

You will need to use a monospaced font for the columns to line up.


H&F                   1  2  3  4  5  6  7  8  9
Langford              15 4  14 13 10 11 12       1  2  5  6  9
Excel                                   Q
Excel 2010+                          QE QI
JMP                                  X
Maple           1  2           3  4  5  6  7  8
Mathematica                    AQ MQ
Minitab                              X
R                     1  2  3  4  5  6  7  8  9
S                                       X
SAS                   3  5  2  1     4
SPSS                                 X
TI calc                                             X


Notes:
    X   Only calculation method used by the software.
    Q   Excel QUARTILE function (pre 2010)
    QE  Excel QUARTILE.EXC function
    QI  Excel QUARTILE and QUARTILE.INC functions
    AQ  Mathematica AsymmetricQuartiles function
    MQ  Mathematica Quartiles function

    Langford's 3 and 7 (not shown) are the same as his 1;
    his 8 (not shown) is the same as his 2.
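
For anyone who wants to translate labels programmatically, part of the 
table can be written as a small lookup. This is only a sketch: the 
dictionary and function names are hypothetical, and only the H&F, 
Langford and SAS rows are transcribed (for the columns that have an H&F 
number).

```python
# Labels transcribed from the table above, keyed by H&F method number.
LANGFORD_BY_HF = {1: 15, 2: 4, 3: 14, 4: 13, 5: 10, 6: 11, 7: 12}
SAS_BY_HF = {1: 3, 2: 5, 3: 2, 4: 1, 6: 4}


def hf_from_langford(label):
    """Return the H&F number for a Langford label, or None if the
    label has no H&F equivalent in the table."""
    for hf, langford in LANGFORD_BY_HF.items():
        if langford == label:
            return hf
    return None


# H&F method 1 is Langford's 15 and SAS parameter 3.
print(LANGFORD_BY_HF[1], SAS_BY_HF[1])       # 15 3
# Langford's method 4 is H&F's method 2, which SAS calls 5.
print(SAS_BY_HF[hf_from_langford(4)])        # 5
```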

Hyndman & Fan recommend method 8 as the best method for general 
quantiles.
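
As a rough sketch of how method 8 differs from method 7, the function 
below uses a two-parameter interpolation family. Both the name 
`quantile_hf` and the `alpha`/`beta` parameterisation are assumptions 
introduced for this example, not something taken from the table: on 
that assumption, alpha = beta = 1/3 gives method 8 and alpha = beta = 1 
gives method 7.

```python
import math


def quantile_hf(data, p, alpha=1/3, beta=1/3):
    """Interpolated p-quantile; hypothetical helper for illustration.

    The 1-based fractional position is
        h = n*p + alpha + p*(1 - alpha - beta)
    clamped to the range [1, n].
    """
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("quantile_hf requires at least one data point")
    h = n * p + alpha + p * (1 - alpha - beta)
    h = min(max(h, 1), n)              # clamp to the available data
    lo = math.floor(h)
    hi = min(lo + 1, n)
    # Convert the 1-based positions to 0-based list indexes.
    return xs[lo - 1] + (h - lo) * (xs[hi - 1] - xs[lo - 1])


data = [1, 2, 3, 4, 5, 6, 7, 8]
q1_method8 = quantile_hf(data, 0.25)                  # alpha = beta = 1/3
q1_method7 = quantile_hf(data, 0.25, alpha=1, beta=1)
print(q1_method8, q1_method7)
```

On this sample the two lower quartiles differ (roughly 2.42 against 
2.75), which is the kind of disagreement the table is trying to pin 
down.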

Langford (who has certainly read H&F) recommends his method 4, which is 
H&F's method 2, as the standard quartile. That is the same as the 
default used by SAS.

For what it's worth, the method taught in Australian high schools for 
calculating quartiles and interquartile range is Langford's method 2. 
That's the method that Texas Instruments calculators use.
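
For readers who have not met it, here is a minimal sketch of a 
median-of-halves rule. It assumes the variant in which the overall 
median is excluded from both halves when the number of data points is 
odd; that this is exactly Langford's method 2 is an assumption on my 
part, and the function name is hypothetical.

```python
from statistics import median


def quartiles_median_of_halves(data):
    """Return (Q1, Q2, Q3) using medians of the lower and upper halves.

    Assumption: for an odd number of points, the middle value belongs
    to neither half.
    """
    xs = sorted(data)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data points")
    half = n // 2
    lower = xs[:half]                  # smallest n//2 values
    upper = xs[n - half:]              # largest n//2 values
    return median(lower), median(xs), median(upper)


print(quartiles_median_of_halves([1, 2, 3, 4, 5]))   # (1.5, 3, 4.5)
```

The interquartile range is then simply Q3 - Q1.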


I haven't personally confirmed all of the software equivalences; in 
particular, I'm a bit dubious about the Maple methods. If anyone has 
access to Maple and doesn't mind running a few sample calculations for 
me, please contact me off-list.



-- 
Steve

