[SciPy-User] scipy.stats one-sided two-sided less, greater, signed ?

Bruce Southey bsouthey at gmail.com
Sun Jun 12 09:36:10 EDT 2011


On Sun, Jun 12, 2011 at 5:20 AM, Ralf Gommers
<ralf.gommers at googlemail.com> wrote:
>
>
> On Wed, Jun 8, 2011 at 12:56 PM, <josef.pktd at gmail.com> wrote:
>>
>> On Tue, Jun 7, 2011 at 10:37 PM, Bruce Southey <bsouthey at gmail.com> wrote:
>> > On Tue, Jun 7, 2011 at 4:40 PM, Ralf Gommers
>> > <ralf.gommers at googlemail.com> wrote:
>> >>
>> >>
>> >> On Mon, Jun 6, 2011 at 9:34 PM, <josef.pktd at gmail.com> wrote:
>> >>>
>> >>> On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey <bsouthey at gmail.com>
>> >>> wrote:
>> >>> > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote:
>> >>> >> What should be the policy on one-sided versus two-sided?
>> >>> > Yes :-)
>> >>> >
>> >>> >> The main reason right now for looking at this is
>> >>> >> http://projects.scipy.org/scipy/ticket/1394 which specifies a
>> >>> >> "one-sided" alternative and provides both lower and upper tail.
>> >>> > That refers to Fisher's test rather than to the more 'traditional'
>> >>> > one-sided tests. Each p-value of Fisher's test has a specific meaning
>> >>> > in terms of the value or probability of the 'first cell' under the null
>> >>> > hypothesis, so it is necessary to provide all three values.
>> >>> >
>> >>> >> I would prefer that we follow an alternative pattern similar to R's.
>> >>> >>
>> >>> >> Currently only kstest has  alternative : 'two_sided' (default),
>> >>> >> 'less' or 'greater',
>> >>> >> but this should be added to other tests where it makes sense.
>> >>> > I think that these Kolmogorov-Smirnov tests do not have the traditional
>> >>> > meaning either. It is a little mind-boggling to try to think about
>> >>> > cdfs!
>> >>> >
>> >>> >> R fisher.exact
>> >>> >> """alternative        indicates the alternative hypothesis and must
>> >>> >> be
>> >>> >> one
>> >>> >> of "two.sided", "greater" or "less". You can specify just the
>> >>> >> initial
>> >>> >> letter. Only used in the 2 by 2 case."""
>> >>> >>
>> >>> >> mannwhitneyu reports a one-sided test without actually specifying
>> >>> >> which alternative is used (I thought I remembered other cases like
>> >>> >> this but can't find any right now)
>> >>> >>
>> >>> >> related:
>> >>> >> in many of the two-sided tests the test statistic has a sign
>> >>> >> that indicates in which tail the test statistic falls.
>> >>> >> This is useful in ttests, for example, because the one-sided tests can
>> >>> >> be backed out from the two-sided tests. (With symmetric distributions
>> >>> >> the one-sided p-value is just half of the two-sided p-value.)
>> >>> >>
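>> >>> >> As a minimal sketch of what backing out a one-sided p-value looks
>> >>> >> like for the two-sample ttest (the data below are made up and only
>> >>> >> illustrative):
>> >>> >>
>> >>> >>     import numpy as np
>> >>> >>     from scipy import stats
>> >>> >>
>> >>> >>     x = np.random.normal(0.0, 1.0, size=50)
>> >>> >>     y = np.random.normal(0.5, 1.0, size=50)
>> >>> >>     t, p_two_sided = stats.ttest_ind(x, y)
>> >>> >>
>> >>> >>     # one-sided alternative mean(x) < mean(y): halve the two-sided
>> >>> >>     # p-value if t is negative, otherwise use 1 - p/2
>> >>> >>     p_less = p_two_sided / 2.0 if t < 0 else 1.0 - p_two_sided / 2.0
>> >>> >>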
>> >>> >> In the discussion of https://github.com/scipy/scipy/pull/8  I argued
>> >>> >> that this might mislead users into interpreting a two-sided result as a
>> >>> >> one-sided result. However, I now doubt that this is a strong argument
>> >>> >> against reporting the signed test statistic.
>> >>> > (I do not follow pull requests so is there a relevant ticket?)
>> >>> >
>> >>> >> After going through scipy.stats.stats, it looks like we always
>> >>> >> report
>> >>> >> the signed test statistic.
>> >>> >>
>> >>> >> The test statistic in ks_2samp is in all cases defined as a max
>> >>> >> value
>> >>> >> and doesn't have a sign in R either, so adding a sign there would
>> >>> >> break with the standard definition.
>> >>> >> A one-sided option for ks_2samp would just require finding the
>> >>> >> distribution of the test statistics D+ and D-.
>> >>> >>
>> >>> >> ---
>> >>> >>
>> >>> >> So my proposal for the general pattern (with exceptions for special
>> >>> >> reasons) would be
>> >>> >>
>> >>> >> * add/offer alternative : 'two_sided' (default), 'less' or 'greater'
>> >>> >> for http://projects.scipy.org/scipy/ticket/1394 now, with adjustments
>> >>> >> of existing tests in the future (adding the option can mostly be done
>> >>> >> in a backwards-compatible way, and for symmetric distributions like
>> >>> >> the ttest it is just a convenience);
>> >>> >> mannwhitneyu seems to be the only "weird" one
>> >>
>> >> This would actually make the fisher_exact implementation more
>> >> consistent,
>> >> since only one p-value is returned in all cases. I just don't like the
>> >> R
>> >> naming much; alternative="greater" does not convey to me that this is a
>> >> one-sided test using the upper tail. How about:
>> >>     test : {"two-tailed", "lower-tail", "upper-tail"}
>> >> with two-tailed the default?
>>
>> I think matlab uses (in general) larger and smaller. The advantage of
>> less/smaller and greater/larger is that it refers directly to the
>> alternative hypothesis, while the meaning in terms of tails is not
>> always clear (in kstest, and I guess some others, the test statistic is
>> just reversed and the same tail is used in both cases).
>>
>> So greater/smaller is mostly "future proof" across tests, while a
>> reference to the tail can only be used where that is an unambiguous
>> statement. But see below.
>>
> I think I understand your terminology a bit better now, and consistency
> across all tests is important. So I've updated the Fisher's exact patch to
> use alternative={'two-sided', 'less', 'greater'} and sent a pull request:
> https://github.com/scipy/scipy/pull/32
>
> Cheers,
> Ralf
>
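> As a minimal usage sketch of that interface, assuming the keyword from the
> pull request above (the 2 by 2 table is made up purely for illustration):
>
>     from scipy import stats
>
>     table = [[8, 2],
>              [1, 5]]
>     oddsratio, p_two_sided = stats.fisher_exact(table, alternative='two-sided')
>     odds, p_less = stats.fisher_exact(table, alternative='less')      # H1: odds ratio < 1
>     odds, p_greater = stats.fisher_exact(table, alternative='greater')  # H1: odds ratio > 1
>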
>>
>>
>> >>
>> >> Ralf
>> >>
>> >>
>> >>>
>> >>> >>
>> >>> >> * report the signed test statistic for the two-sided alternative (when a
>> >>> >> signed test statistic exists): this is the status quo in
>> >>> >> stats.stats, but I didn't know that it is actually pretty consistent
>> >>> >> across tests.
>> >>> >>
>> >>> >> Opinions ?
>> >>> >>
>> >>> >> Josef
>> >>> > I think there is some understandable misunderstanding here (I was in
>> >>> > the same situation) regarding what is meant. My understanding is that
>> >>> > under a one-sided hypothesis, all the values of the null hypothesis
>> >>> > exist only in one tail of the test distribution. In contrast, the values
>> >>> > of the null distribution exist in both tails with a two-sided hypothesis.
>> >>> > Yet that interpretation does not have the same meaning as the tails in
>> >>> > the Fisher or Kolmogorov-Smirnov tests.
>> >>>
>> >>> The tests have a clear Null Hypothesis (equality) and Alternative
>> >>> Hypothesis (not equal or directional, less or greater).
>> >>> So the "alternative" should be clearly specified in the function
>> >>> argument, as in R.
>> >>>
>> >>> Whether this corresponds to left and right tails of the distribution
>> >>> is an "implementation detail" which holds for ttests but not for
>> >>> kstest/ks_2samp.
>> >>>
>> >>> kstest/ks_2samp:  H0: cdf1 == cdf2  and  H1: cdf1 != cdf2,
>> >>> or H1: cdf1 < cdf2, or H1: cdf1 > cdf2
>> >>> (looks similar to comparing two survival curves in Kaplan-Meier?)
>> >>>
>> >>> fisher_exact (2 by 2)  H0: odds-ratio == 1 and H1: odds-ratio != 1 or
>> >>> H1: odds-ratio < 1 or H1: odds-ratio > 1
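>> >>>
>> >>> As a small illustration, the existing alternative keyword in kstest maps
>> >>> onto the cdf hypotheses above roughly like this (the sample is made up,
>> >>> and the spelling 'two_sided' follows the default named earlier in the
>> >>> thread):
>> >>>
>> >>>     import numpy as np
>> >>>     from scipy import stats
>> >>>
>> >>>     x = np.random.normal(0.2, 1.0, size=200)
>> >>>
>> >>>     # H1: cdf of x != standard normal cdf
>> >>>     d, p_two = stats.kstest(x, 'norm', alternative='two_sided')
>> >>>     # H1: cdf of x < standard normal cdf
>> >>>     d_less, p_less = stats.kstest(x, 'norm', alternative='less')
>> >>>     # H1: cdf of x > standard normal cdf
>> >>>     d_greater, p_greater = stats.kstest(x, 'norm', alternative='greater')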
>> >>>
>> >>> I know the Kolmogorov-Smirnov tests, but for Fisher's exact test and
>> >>> contingency tables I rely on R.
>> >>>
>> >>> from R-help:
>> >>> For 2 by 2 tables, the null of conditional independence is equivalent
>> >>> to the hypothesis that the odds ratio equals one. <...> The
>> >>> alternative for a one-sided test is based on the odds ratio, so
>> >>> alternative = "greater" is a test of the odds ratio being bigger than
>> >>> or.
>> >>> Two-sided tests are based on the probabilities of the tables, and take
>> >>> as ‘more extreme’ all tables with probabilities less than or equal to
>> >>> that of the observed table, the p-value being the sum of such
>> >>> probabilities.
>> >>>
>> >>> Josef
>> >>>
>> >>>
>> >>> >
>> >>> > I never paid much attention to the frequency-based tests, but it does
>> >>> > not surprise me if there are no one-sided tests. Most are rank-based, so
>> >>> > it is rather hard to do in a simple manner - actually I am not even sure
>> >>> > how to use a permutation test.
>> >>> >
>> >>> > Bruce
>> >>> >
>> >>
>> >
>> > But that is NOT the correct interpretation here!
>> > I tried to explain to you that this is not the usual idea of
>> > one-sided vs. two-sided tests.
>> > For example:
>> > http://www.msu.edu/~fuw/teaching/Fu_ch10_2_categorical.ppt
>> > "The test holds the marginal totals fixed and computes the
>> > hypergeometric probability that n11 is at least as large as the
>> > observed value"
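>> >
>> > A minimal sketch of that computation using scipy.stats.hypergeom (the
>> > table here is made up; the right-sided p-value is the probability that
>> > the (1,1) count is at least as large as observed):
>> >
>> >     from scipy import stats
>> >
>> >     a, b, c, d = 8, 2, 1, 5                  # 2 by 2 table [[a, b], [c, d]]
>> >     M, n, N = a + b + c + d, a + b, a + c    # total, row-1 total, col-1 total
>> >     # P(first cell >= a) under the hypergeometric null with fixed margins
>> >     p_right = stats.hypergeom.sf(a - 1, M, n, N)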
>>
>> this still sounds like a less/greater test to me
>>
>>
>> > "The output consists of three p-values:
>> > Left: Use this when the alternative to independence is that there is
>> > negative association between the variables.  That is, the observations
>> > tend to lie in lower left and upper right.
>> > Right: Use this when the alternative to independence is that there is
>> > positive association between the variables. That is, the observations
>> > tend to lie in upper left and lower right.
>> > 2-Tail: Use this when there is no prior alternative.
>> > "
>> > There is also the book "Categorical Data Analysis Using the SAS
>> > System" by Maura E. Stokes, Charles S. Davis, and Gary G. Koch, which came
>> > up via Google and also refers to the n11 cell.
>> >
>> > http://www.langsrud.com/fisher.htm
>>
>> I was trying to read the Agresti paper referenced there but it has too
>> much detail to get through in 15 minutes :)
>>
>> > "The output consists of three p-values:
>> >
>> >    Left: Use this when the alternative to independence is that there
>> > is negative association between the variables.
>> >    That is, the observations tend to lie in lower left and upper right.
>> >    Right: Use this when the alternative to independence is that there
>> > is positive association between the variables.
>> >    That is, the observations tend to lie in upper left and lower right.
>> >    2-Tail: Use this when there is no prior alternative.
>> >
>> > NOTE: Decide to use Left, Right or 2-Tail before collecting (or
>> > looking at) the data."
>> >
>> > But you will get a different p-value if you switch rows and columns
>> > because of the dependence on the n11 cell. If you do that then the
>> > p-values switch between left and right sides as these now refer to
>> > different hypotheses regarding that first cell.
>>
>> Switching rows and columns doesn't change the p-value in R.
>> Reversing the columns changes the definition of less and greater, i.e.
>> it reverses them.
>>
>> The problem with 2 by 2 contingency tables with given marginals, i.e.
>> row and column totals, is that we only have one free entry. Any test
>> on one entry, e.g. element (0, 0), pins down all the others, and
>> (many) tests then become equivalent.
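>>
>> A small numerical illustration of that point (the counts are made up):
>>
>>     import numpy as np
>>
>>     table = np.array([[3, 7],
>>                       [6, 4]])
>>     row_totals = table.sum(axis=1)   # array([10, 10])
>>     col_totals = table.sum(axis=0)   # array([ 9, 11])
>>
>>     # with the margins fixed, the (0, 0) entry determines the whole table
>>     n00 = table[0, 0]
>>     n01 = row_totals[0] - n00        # 7
>>     n10 = col_totals[0] - n00        # 6
>>     n11 = row_totals[1] - n10        # 4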
>>
>>
>> http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_a0000000658.htm
>> """
>> For 2 by 2 tables, one-sided p-values for Fisher's exact test are
>> defined in terms of the frequency of the cell in the first row and
>> first column of the table, the (1,1) cell. Denoting the observed (1,1)
>> cell frequency by n11, the left-sided p-value for Fisher's exact test
>> is the probability that the (1,1) cell frequency is less than or equal
>> to n11. For the left-sided p-value, the set includes those tables with
>> a (1,1) cell frequency less than or equal to n11. A small left-sided
>> p-value supports the alternative hypothesis that the probability of an
>> observation being in the first cell is actually less than expected
>> under the null hypothesis of independent row and column variables.
>>
>> Similarly, for a right-sided alternative hypothesis, the set consists
>> of tables where the frequency of the (1,1) cell is greater than or
>> equal to that in the observed table. A small right-sided p-value
>> supports the alternative that the probability of the first cell is
>> actually greater than that expected under the null hypothesis.
>>
>> Because the (1,1) cell frequency completely determines the table when
>> the marginal row and column sums are fixed, these one-sided
>> alternatives can be stated equivalently in terms of other cell
>> probabilities or ratios of cell probabilities. The left-sided
>> alternative is equivalent to an odds ratio less than 1, where the odds
>> ratio equals (n11*n22)/(n12*n21). Additionally, the left-sided
>> alternative is equivalent to the column 1 risk for row 1 being less
>> than the column 1 risk for row 2. Similarly, the right-sided
>> alternative is equivalent to the column 1 risk for row 1 being greater
>> than the column 1 risk for row 2. See Agresti (2007) for details.
>> """
>>
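>> To make the odds-ratio equivalence in that passage concrete (counts made
>> up, same notation as in the quote):
>>
>>     n11, n12, n21, n22 = 3, 7, 6, 4
>>     odds_ratio = (n11 * n22) / float(n12 * n21)   # 12/42, less than 1
>>     # a left-sided (small (1,1) cell) alternative corresponds to an odds
>>     # ratio below 1, a right-sided alternative to an odds ratio above 1
>>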
>> I'm not a user of Fisher's exact test (and I have a hard time keeping
>> the different statements straight), so if left/right or lower/upper
>> makes more sense to users, then I won't complain.
>>
>> To me they are all just independence tests with possible one-sided
>> alternatives, namely that one distribution dominates the other (with
>> the same pattern as ks_2samp or ttest_2samp).
>>
>> Josef
>>
>> >
>> >
>> > Bruce
>
>
This is just wrong and plain ignorant! Please read the references and
stats books about what the tails actually mean!

You really need all three tests because they have different meanings,
and you do not know in advance which one you need.

Bruce


