[SciPy-User] scipy.stats one-sided two-sided less, greater, signed ?

Sun Jun 12 21:50:53 EDT 2011

On Sun, Jun 12, 2011 at 7:52 PM,  <josef.pktd at gmail.com> wrote:
> On Sun, Jun 12, 2011 at 8:30 PM, Bruce Southey <bsouthey at gmail.com> wrote:
>> On Sun, Jun 12, 2011 at 8:56 AM,  <josef.pktd at gmail.com> wrote:
>>> On Sun, Jun 12, 2011 at 9:36 AM, Bruce Southey <bsouthey at gmail.com> wrote:
>>>> On Sun, Jun 12, 2011 at 5:20 AM, Ralf Gommers
>>>> <ralf.gommers at googlemail.com> wrote:
>>>>>
>>>>>
>>>>> On Wed, Jun 8, 2011 at 12:56 PM, <josef.pktd at gmail.com> wrote:
>>>>>>
>>>>>> On Tue, Jun 7, 2011 at 10:37 PM, Bruce Southey <bsouthey at gmail.com> wrote:
>>>>>> > On Tue, Jun 7, 2011 at 4:40 PM, Ralf Gommers
>>>>>> > <ralf.gommers at googlemail.com> wrote:
>>>>>> >>
>>>>>> >>
>>>>>> >> On Mon, Jun 6, 2011 at 9:34 PM, <josef.pktd at gmail.com> wrote:
>>>>>> >>>
>>>>>> >>> On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey <bsouthey at gmail.com>
>>>>>> >>> wrote:
>>>>>> >>> > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote:
>>>>>> >>> >> What should be the policy on one-sided versus two-sided?
>>>>>> >>> > Yes :-)
>>>>>> >>> >
>>>>>> >>> >> The main reason right now for looking at this is
>>>>>> >>> >> http://projects.scipy.org/scipy/ticket/1394 which specifies a
>>>>>> >>> >> "one-sided" alternative and provides both lower and upper tail.
>>>>>> >>> > That refers to the Fisher's test rather than the more 'traditional'
>>>>>> >>> > one-sided tests. Each value of the Fisher's test has special
>>>>>> >>> > meanings
>>>>>> >>> > about the value or probability of the 'first cell' under the null
>>>>>> >>> > hypothesis.  So it is necessary to provide those three values.
>>>>>> >>> >
>>>>>> >>> >> I would prefer that we follow the alternative patterns similar to R
>>>>>> >>> >>
>>>>>> >>> >> currently only kstest has    alternative : 'two_sided' (default),
>>>>>> >>> >> 'less' or 'greater'
>>>>>> >>> >> but this should be added to other tests where it makes sense
>>>>>> >>> > I think that these Kolmogorov-Smirnov  tests are not the traditional
>>>>>> >>> > meaning either. It is a little mind-boggling to try to think about
>>>>>> >>> > cdfs!
>>>>>> >>> >
>>>>>> >>> >> R fisher.exact
>>>>>> >>> >> """alternative        indicates the alternative hypothesis and must
>>>>>> >>> >> be
>>>>>> >>> >> one
>>>>>> >>> >> of "two.sided", "greater" or "less". You can specify just the
>>>>>> >>> >> initial
>>>>>> >>> >> letter. Only used in the 2 by 2 case."""
>>>>>> >>> >>
>>>>>> >>> >> mannwhitneyu reports a one-sided test without actually specifying
>>>>>> >>> >> which alternative is used  (I thought I remembered other cases like
>>>>>> >>> >> this but don't find any right now)
>>>>>> >>> >>
>>>>>> >>> >> related:
>>>>>> >>> >> in many cases in the two-sided tests the test statistic has a sign
>>>>>> >>> >> that indicates in which tail the test-statistic falls.
>>>>>> >>> >> This is useful in ttests for example, because the one-sided tests
>>>>>> >>> >> can
>>>>>> >>> >> be backed out from the two-sided tests. (With symmetric
>>>>>> >>> >> distributions
>>>>>> >>> >> one-sided p-value is just half of the two-sided pvalue)
>>>>>> >>> >>
>>>>>> >>> >> In the discussion of https://github.com/scipy/scipy/pull/8  I
>>>>>> >>> >> argued
>>>>>> >>> >> that this might mislead users to interpret a two-sided result as a
>>>>>> >>> >> one-sided result. However, I doubt now that this is a strong
>>>>>> >>> >> argument against reporting the signed test statistic.
>>>>>> >>> > (I do not follow pull requests so is there a relevant ticket?)
>>>>>> >>> >
>>>>>> >>> >> After going through scipy.stats.stats, it looks like we always
>>>>>> >>> >> report
>>>>>> >>> >> the signed test statistic.
>>>>>> >>> >>
>>>>>> >>> >> The test statistic in ks_2samp is in all cases defined as a max
>>>>>> >>> >> value
>>>>>> >>> >> and doesn't have a sign in R either, so adding a sign there would
>>>>>> >>> >> break with the standard definition.
>>>>>> >>> >> one-sided option for ks_2samp would just require to find the
>>>>>> >>> >> distribution of the test statistics D+, D-
>>>>>> >>> >>
>>>>>> >>> >> ---
>>>>>> >>> >>
>>>>>> >>> >> So my proposal for the general pattern (with exceptions for special
>>>>>> >>> >> reasons) would be
>>>>>> >>> >>
>>>>>> >>> >> * add/offer alternative : 'two_sided' (default), 'less' or
>>>>>> >>> >> 'greater'
>>>>>> >>> >> http://projects.scipy.org/scipy/ticket/1394  for now,
>>>>>> >>> >> and adjustments of existing tests in the future (adding the option
>>>>>> >>> >> can
>>>>>> >>> >> be mostly done in a backwards compatible way and for symmetric
>>>>>> >>> >> distributions like ttest it's just a convenience)
>>>>>> >>> >> mannwhitneyu seems to be the only "weird" one
>>>>>> >>
>>>>>> >> This would actually make the fisher_exact implementation more
>>>>>> >> consistent,
>>>>>> >> since only one p-value is returned in all cases. I just don't like the
>>>>>> >> R
>>>>>> >> naming much; alternative="greater" does not convey to me that this is a
>>>>>> >> one-sided test using the upper tail. How about:
>>>>>> >>     test : {"two-tailed", "lower-tail", "upper-tail"}
>>>>>> >> with two-tailed the default?
>>>>>>
>>>>>> I think matlab uses (in general) larger and smaller, the advantage of
>>>>>> less/smaller and greater/larger is that it directly refers to the
>>>>>> alternative hypothesis, while the meaning in terms of tails is not
>>>>>> always clear (in kstest and I guess some others the test statistic is
>>>>>> just reversed and uses the same tail in both cases).
>>>>>>
>>>>>> So greater/smaller is mostly "future proof" across tests, while
>>>>>> reference to the tail can only be used where this is an unambiguous
>>>>>> statement. But see below.
>>>>>>
>>>>> I think I understand your terminology a bit better now, and consistency
>>>>> across all tests is important. So I've updated the Fisher's exact patch to
>>>>> use alternative={'two-sided', 'less', 'greater'} and sent a pull request:
>>>>> https://github.com/scipy/scipy/pull/32
>>>>>
>>>>> Cheers,
>>>>> Ralf
>>>>>
>>>>>>
>>>>>>
>>>>>> >>
>>>>>> >> Ralf
>>>>>> >>
>>>>>> >>
>>>>>> >>>
>>>>>> >>> >>
>>>>>> >>> >> * report signed test statistic for two-sided alternative (when a
>>>>>> >>> >> signed test statistic exists):  which is the status quo in
>>>>>> >>> >> stats.stats, but I didn't know that this is actually pretty
>>>>>> >>> >> consistent
>>>>>> >>> >> across tests.
>>>>>> >>> >>
>>>>>> >>> >> Opinions ?
>>>>>> >>> >>
>>>>>> >>> >> Josef
>>>>>> >>> >> _______________________________________________
>>>>>> >>> >> SciPy-User mailing list
>>>>>> >>> >> SciPy-User at scipy.org
>>>>>> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user
>>>>>> >>> > I think that there is some valid misunderstanding here (as I was in
>>>>>> >>> > the
>>>>>> >>> > same situation) regarding what is meant here. My understanding is
>>>>>> >>> > that
>>>>>> >>> > under a one-sided hypothesis, all the values of the null hypothesis
>>>>>> >>> > only
>>>>>> >>> > exist in one tail of the test distribution. In contrast the values
>>>>>> >>> > of
>>>>>> >>> > null distribution exist in both tails with a two-sided hypothesis.
>>>>>> >>> > Yet
>>>>>> >>> > that interpretation does not have the same meaning as the tails in
>>>>>> >>> > the
>>>>>> >>> > Fisher or Kolmogorov-Smirnov tests.
>>>>>> >>>
>>>>>> >>> The tests have a clear Null Hypothesis (equality) and Alternative
>>>>>> >>> Hypothesis (not equal or directional, less or greater).
>>>>>> >>> So the "alternative" should be clearly specified in the function
>>>>>> >>> argument, as in R.
>>>>>> >>>
>>>>>> >>> Whether this corresponds to left and right tails of the distribution
>>>>>> >>> is an "implementation detail" which holds for ttests but not for
>>>>>> >>> kstest/ks_2samp.
>>>>>> >>>
>>>>>> >>> kstest/ks2sample   H0: cdf1 == cdf2  and H1:  cdf1 != cdf2 or H1:
>>>>>> >>> cdf1 < cdf2 or H1:  cdf1 > cdf2
>>>>>> >>> (looks similar to comparing two survival curves in Kaplan-Meier ?)
>>>>>> >>>
>>>>>> >>> fisher_exact (2 by 2)  H0: odds-ratio == 1 and H1: odds-ratio != 1 or
>>>>>> >>> H1: odds-ratio < 1 or H1: odds-ratio > 1
>>>>>> >>>
>>>>>> >>> I know the kolmogorov-smirnov tests, but for fisher exact and
>>>>>> >>> contingency tables I rely on R
>>>>>> >>>
>>>>>> >>> from R-help:
>>>>>> >>> For 2 by 2 tables, the null of conditional independence is equivalent
>>>>>> >>> to the hypothesis that the odds ratio equals one. <...> The
>>>>>> >>> alternative for a one-sided test is based on the odds ratio, so
>>>>>> >>> alternative = "greater" is a test of the odds ratio being bigger than
>>>>>> >>> or.
>>>>>> >>> Two-sided tests are based on the probabilities of the tables, and take
>>>>>> >>> as ‘more extreme’ all tables with probabilities less than or equal to
>>>>>> >>> that of the observed table, the p-value being the sum of such
>>>>>> >>> probabilities.
>>>>>> >>>
>>>>>> >>> Josef
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> >
>>>>>> >>> > I never paid much attention to the frequency-based tests, but it
>>>>>> >>> > does not surprise me if there are no one-sided tests. Most are
>>>>>> >>> > rank-based, so it is rather hard to do in a simple manner - actually
>>>>>> >>> > I am not even sure how to use a permutation test.
>>>>>> >>> >
>>>>>> >>> > Bruce
>>>>>> >>> >
>>>>>> >>> >
>>>>>> >>> >
>>>>>> >
>>>>>> > But that is NOT the correct interpretation here!
>>>>>> > I tried to explain to you that this is not the usual idea of
>>>>>> > one-sided vs two-sided tests.
>>>>>> > For example:
>>>>>> > http://www.msu.edu/~fuw/teaching/Fu_ch10_2_categorical.ppt
>>>>>> > "The test holds the marginal totals fixed and computes the
>>>>>> > hypergeometric probability that n11 is at least as large as the
>>>>>> > observed value"
>>>>>>
>>>>>> this still sounds like a less/greater test to me
>>>>>>
>>>>>>
>>>>>> > "The output consists of three p-values:
>>>>>> > Left: Use this when the alternative to independence is that there is
>>>>>> > negative association between the variables.  That is, the observations
>>>>>> > tend to lie in lower left and upper right.
>>>>>> > Right: Use this when the alternative to independence is that there is
>>>>>> > positive association between the variables. That is, the observations
>>>>>> > tend to lie in upper left and lower right.
>>>>>> > 2-Tail: Use this when there is no prior alternative.
>>>>>> > "
>>>>>> > There is also the book "Categorical data analysis: using the SAS
>>>>>> > system  By Maura E. Stokes, Charles S. Davis, Gary G. Koch" that came
>>>>>> > up via Google that also refers to the n11 cell.
>>>>>> >
>>>>>> > http://www.langsrud.com/fisher.htm
>>>>>>
>>>>>> I was trying to read the Agresti paper referenced there but it has too
>>>>>> much detail to get through in 15 minutes :)
>>>>>>
>>>>>> > "The output consists of three p-values:
>>>>>> >
>>>>>> >    Left: Use this when the alternative to independence is that there
>>>>>> > is negative association between the variables.
>>>>>> >    That is, the observations tend to lie in lower left and upper right.
>>>>>> >    Right: Use this when the alternative to independence is that there
>>>>>> > is positive association between the variables.
>>>>>> >    That is, the observations tend to lie in upper left and lower right.
>>>>>> >    2-Tail: Use this when there is no prior alternative.
>>>>>> >
>>>>>> > NOTE: Decide to use Left, Right or 2-Tail before collecting (or
>>>>>> > looking at) the data."
>>>>>> >
>>>>>> > But you will get a different p-value if you switch rows and columns
>>>>>> > because of the dependence on the n11 cell. If you do that then the
>>>>>> > p-values switch between left and right sides as these now refer to
>>>>>> > different hypotheses regarding that first cell.
>>>>>>
>>>>>> Switching rows and columns doesn't change the p-value in R.
>>>>>> Reversing the columns reverses the definitions of less and greater.
>>>>>>
>>>>>> The problem with 2 by 2 contingency tables with given marginals, i.e.
>>>>>> row and column totals, is that we only have one free entry. Any test
>>>>>> on one entry, e.g. element 0,0, pins down all the other ones and
>>>>>> (many) tests then become equivalent.
>>>>>>
>>>>>>
>>>>>> http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_a0000000658.htm
>>>>>> some math got lost
>>>>>> """
>>>>>> For <2 by 2> tables, one-sided p-values for Fisher’s exact test are
>>>>>> defined in terms of the frequency of the cell in the first row and
>>>>>> first column of the table, the (1,1) cell. Denoting the observed (1,1)
>>>>>> cell frequency by n11, the left-sided p-value for Fisher’s exact test is
>>>>>> the probability that the (1,1) cell frequency is less than or equal to
>>>>>> n11. For the left-sided p-value, the set includes those tables with a
>>>>>> (1,1) cell frequency less than or equal to n11. A small left-sided
>>>>>> p-value supports the alternative hypothesis that the probability of an
>>>>>> observation being in the first cell is actually less than expected
>>>>>> under the null hypothesis of independent row and column variables.
>>>>>>
>>>>>> Similarly, for a right-sided alternative hypothesis, the set consists
>>>>>> of the tables where the frequency of the (1,1) cell is greater than or
>>>>>> equal to that in the observed table. A small right-sided p-value
>>>>>> supports the alternative that the probability of the first cell is
>>>>>> actually greater than that expected under the null hypothesis.
>>>>>>
>>>>>> Because the (1,1) cell frequency completely determines the table when
>>>>>> the marginal row and column sums are fixed, these one-sided
>>>>>> alternatives can be stated equivalently in terms of other cell
>>>>>> probabilities or ratios of cell probabilities. The left-sided
>>>>>> alternative is equivalent to an odds ratio less than 1, where the odds
>>>>>> ratio equals (n11 n22)/(n12 n21). Additionally, the left-sided
>>>>>> alternative is equivalent to the column 1 risk for row 1 being less
>>>>>> than the column 1 risk for row 2. Similarly, the right-sided
>>>>>> alternative is equivalent to the column 1 risk for row 1 being greater
>>>>>> than the column 1 risk for row 2. See Agresti (2007) for details.
>>>>>> """
>>>>>>
>>>>>> I'm not a user of Fisher's exact test (and I have a hard time keeping
>>>>>> the different statements straight), so if left/right or lower/upper
>>>>>> makes more sense to users, then I don't complain.
>>>>>>
>>>>>> To me they are all just independence tests with possible one-sided
>>>>>> alternatives that one distribution dominates the other. (with the same
>>>>>> pattern as ks_2samp or ttest_2samp)
>>>>>>
>>>>>> Josef
>>>>>>
>>>>>> >
>>>>>> >
>>>>>> > Bruce
>>>>>
>>>> This is just wrong and plain ignorant! Please read the references and
>>>> stats books about what the tails actually mean!
>>>>
>>>> You really need all three tests because they have different meanings,
>>>> and you do not know in advance which one you need.
>>>
>>> Sorry, but I'm perfectly happy to follow R and SAS in this.
>>>
>>> Josef
>>>
>>>>
>>>> Bruce
>>>
>> So am I which is NOT what is happening here!
>
> Why do you think that?
Because everything given above, including the SAS documentation that YOU
provided, includes all three tests.

> I quoted all the relevant descriptions from the R and SAS help, and I
> checked the following and similar for the cases that are in the
> changeset for the tests:
>
>> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='g')
>
>        Fisher's Exact Test for Count Data
>
> data:  t(matrix(c(190, 800, 200, 900), nrow = 2))
> p-value = 0.296
> alternative hypothesis: true odds ratio is greater than 1
> 95 percent confidence interval:
>  0.8828407       Inf
> sample estimates:
> odds ratio
>  1.068698
>
>> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='l')
>
>        Fisher's Exact Test for Count Data
>
> data:  t(matrix(c(190, 800, 200, 900), nrow = 2))
> p-value = 0.7416
> alternative hypothesis: true odds ratio is less than 1
> 95 percent confidence interval:
>  0.000000 1.293552
> sample estimates:
> odds ratio
>  1.068698
>
>> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='t')
>
>        Fisher's Exact Test for Count Data
>
> data:  t(matrix(c(190, 800, 200, 900), nrow = 2))
> p-value = 0.5741
> alternative hypothesis: true odds ratio is not equal to 1
> 95 percent confidence interval:
>  0.8520463 1.3401490
> sample estimates:
> odds ratio
>  1.068698
>
> All the p-values agree for the alternatives two-sided, less, and
> greater; only the odds ratio is defined differently, as explained pretty
> well in the docstring.
>
> Josef
>
>
>>
>> Bruce

Yes, but you said to follow BOTH R and SAS - that means providing all three:

The FREQ Procedure

Table of Exposure by Response

Exposure     Response

Frequency|       0|       1|  Total
---------+--------+--------+
       0 |    190 |    800 |    990
---------+--------+--------+
       1 |    200 |    900 |   1100
---------+--------+--------+
Total         390     1700     2090

Statistics for Table of Exposure by Response

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      0.3503    0.5540
Likelihood Ratio Chi-Square    1      0.3500    0.5541
Continuity Adj. Chi-Square     1      0.2869    0.5922
Mantel-Haenszel Chi-Square     1      0.3501    0.5541
Phi Coefficient                       0.0129
Contingency Coefficient               0.0129
Cramer's V                            0.0129

     Pearson Chi-Square Test
----------------------------------
Chi-Square                  0.3503
DF                               1
Asymptotic Pr >  ChiSq      0.5540
Exact      Pr >= ChiSq      0.5741

       Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)       190
Left-sided Pr <= F          0.7416
Right-sided Pr >= F         0.2960

Table Probability (P)       0.0376
Two-sided Pr <= P           0.5741

Sample Size = 2090

Thus providing all three is the correct answer.
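
As a rough illustration (not taken from SAS, R, or scipy), the three
p-values in the SAS output above can be sketched directly from the
hypergeometric distribution of the (1,1) cell with all margins held fixed.
The function name fisher_exact_2x2 and the 1e-7 relative tolerance used for
ties in the two-sided sum are assumptions made only for this sketch:

```python
from math import comb


def fisher_exact_2x2(table):
    """Return (left, right, two_sided) p-values for a 2x2 table of counts.

    Hypothetical helper for illustration; not the scipy implementation.
    """
    (a, b), (c, d) = table
    row1, col1, n = a + b, a + c, a + b + c + d

    # With the margins fixed, the (1,1) cell x is hypergeometric and can
    # only take values in [lo, hi].
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    denom = comb(n, row1)
    prob = {
        x: comb(col1, x) * comb(n - col1, row1 - x) / denom
        for x in range(lo, hi + 1)
    }
    p_obs = prob[a]

    # Left-sided: Pr(X <= observed); right-sided: Pr(X >= observed).
    left = sum(p for x, p in prob.items() if x <= a)
    right = sum(p for x, p in prob.items() if x >= a)
    # Two-sided: sum over all tables no more probable than the observed one.
    # The relative tolerance (an assumption) guards against float ties.
    two_sided = sum(p for p in prob.values() if p <= p_obs * (1 + 1e-7))
    return left, right, two_sided


left, right, two_sided = fisher_exact_2x2([[190, 800], [200, 900]])
print(f"left={left:.4f} right={right:.4f} two-sided={two_sided:.4f}")
```

For the table above this should come out close to the Left-sided,
Right-sided and Two-sided rows of the SAS output (0.7416, 0.2960, 0.5741).
Note that left + right exceeds 1 by exactly the probability of the observed
table, since that table is counted in both tails.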

Bruce