[SciPy-Dev] (M)ANOVA, and deprecating stats.f_value* functions

josef.pktd at gmail.com josef.pktd at gmail.com
Thu Jun 18 14:10:49 EDT 2015


On Thu, Jun 18, 2015 at 1:34 PM, Eric Larson <larson.eric.d at gmail.com>
wrote:

> I agree that it makes sense to move statistical testing code to
> statsmodels. From what I understand, the space of functions is probably too
> large for scipy to reasonably take on, and such functions seem likely to
> get more attention from the statsmodels folks.
>


To clarify the current situation a bit, or make it more explicit:

It's house cleaning time in scipy.stats. The main question is whether
to drop some functions that have accumulated in the past but have
essentially lost their purpose within scipy.stats.

So there are essentially two options:

1) deprecate and delete those functions, or
2) expand on them so they become useful again.

The general opinion (or at least Ralf's and mine, and nobody else has
complained) is that new functionality that is not closely related to the
good stuff in scipy.stats should go to statsmodels.

However, there are currently no plans to move the "good stuff" in
scipy.stats to statsmodels.
scipy.stats has a set of good library functions that will remain in scipy
and continue to be improved and enhanced.

Also, scipy.stats has more code reviewers than statsmodels (and the main
code reviewer of statsmodels gets too easily distracted by weird things :).


Josef

>
> Eric
>
>
> On Tue, Jun 16, 2015 at 5:03 PM, Abraham Escalante <aeklant at gmail.com>
> wrote:
>
>> You can find the corresponding PR here: gh-4968
>> <https://github.com/scipy/scipy/pull/4968>
>>
>> Cheers,
>> Abraham.
>>
>> 2015-06-14 15:50 GMT-05:00 Ralf Gommers <ralf.gommers at gmail.com>:
>>
>>> Hi all,
>>>
>>> In scipy.stats there are three functions that calculate various
>>> F-statistics for inputs obtained from univariate or multivariate ANOVA.
>>> These are f_value, f_value_multivariate and f_value_wilks_lambda:
>>> https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py#L4603-L4683
>>>
>>> The problem with those is that they're not very useful standalone.
>>> f_value implements a statistic that's also calculated and given as a return
>>> by f_oneway (which does one-way ANOVA). The other two functions are related
>>> to multivariate ANOVA, for which scipy.stats doesn't provide any
>>> functionality.
>>>
>>> At the moment Statsmodels provides a lot more ANOVA functionality than
>>> scipy.stats does, and I agree with Josef [1, 2] that adding new
>>> functionality in this area to Statsmodels would fit better than adding it
>>> to Scipy. There's also a recent proposal [3] for M-way repeated ANOVA to be
>>> added to scipy.stats. That could be added to Statsmodels instead (my
>>> preference). If we do want to add it to Scipy, we need to have a clear list
>>> of what else is needed to create a coherent set of functions in this area.
>>>
>>> Thoughts?
>>>
>>> Ralf
>>>
>>> [1] https://github.com/scipy/scipy/issues/650
>>> [2] https://github.com/scipy/scipy/issues/660
>>> [3] https://github.com/scipy/scipy/issues/4913
>>>
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>>
>>>
>>
>