[SciPy-Dev] Van der Waerden test in Scipy?

josef.pktd at gmail.com josef.pktd at gmail.com
Sat Jun 4 08:07:13 EDT 2011


On Sat, Jun 4, 2011 at 6:57 AM, Annalisa Minelli <annalisa.py at gmail.com> wrote:
> Hi Josef, hi all,
> thanks for answering and thanks for the questions/hints..
>
> 2011/6/1 <josef.pktd at gmail.com>
>>
>> On Wed, Jun 1, 2011 at 3:56 AM, Annalisa Minelli <annalisa.py at gmail.com>
>> wrote:
>> > Dear Scipy developers,
>> > I'm Annalisa Minelli, a PhD student from Perugia (Italy), and I'm using
>> > Scipy library because I'm writing some code for my PhD thesis.
>> > Since I needed the van der Waerden chi-square p-value [0], which is
>> > implemented in R but not in Scipy (if I'm not wrong) [1], I implemented
>> > the test in Python [2] to break any R dependency for my module.
>> >
>> > So if you think it can be useful, maybe it could be integrated into the
>> > Scipy library?
>> > Could you give me any advice on how to proceed?
>>
>> Thanks Annalisa,
>>
>> I think it would be useful to have this test (and similar tests based
>> on transformation to the normal distribution).
>>
>> I downloaded your vdw.py. I haven't looked at the details yet, but
>> here are 3 main issues:
>>
>> your vdw is GPL which is incompatible with scipy's BSD license, for
>> inclusion in scipy it would need to be a contribution licensed as BSD
>
> ok, I suppose I can change the license. I initially made this choice because
> I need this function in a GRASS GIS module - r.broscoe [0] - and the project
> uses GPL, but for sure I'd like to include the function in the Scipy library
> (which is used by GRASS).
>
>>
>> your function is not using numpy, and it seems to me that many of the
>> list operations can be vectorized with numpy
>
> ok, I'll try this way if you think it's better; I initially thought that
> fewer dependencies would be best, but if you think I can considerably
> improve the code in this way, I will do it.

I agree that keeping dependencies down is a worthwhile goal, but in
your case it doesn't make a difference: you already use and require
scipy, and numpy is a required dependency of scipy.

Once it's relicensed, I can have a closer look. The loop over groups
might still be necessary, unless numpy.bincount can handle all the
group operations.
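
To illustrate the bincount idea, here is a minimal sketch under my own
assumptions. It is not taken from vdw.py, and the function name
vdw_sketch and its arguments are made up for this example. It uses the
textbook form of the statistic (normal scores from pooled ranks, squared
group mean scores weighted by group size, chi-square with k-1 degrees of
freedom) and replaces the loop over groups with numpy.bincount:

```python
import numpy as np
from scipy import stats


def vdw_sketch(values, groups):
    """Hypothetical vectorized Van der Waerden statistic and p-value."""
    values = np.asarray(values, dtype=float)
    # Map arbitrary group labels (numbers or strings) to codes 0..k-1.
    labels, codes = np.unique(groups, return_inverse=True)
    n_total = values.size
    k = labels.size

    # Normal scores from the pooled ranks.
    ranks = stats.rankdata(values)
    scores = stats.norm.ppf(ranks / (n_total + 1.0))

    # Group sizes and group sums of scores, without a Python loop.
    n_j = np.bincount(codes)
    sum_j = np.bincount(codes, weights=scores)
    mean_j = sum_j / n_j

    # Variance of the scores and the test statistic.
    s2 = np.sum(scores ** 2) / (n_total - 1.0)
    stat = np.sum(n_j * mean_j ** 2) / s2
    pvalue = stats.chi2.sf(stat, k - 1)
    return stat, pvalue


# Example call with string group labels.
stat, pvalue = vdw_sketch([1, 2, 3, 4, 5, 6, 7, 8, 9], list("aaabbbccc"))
```

In this form bincount covers both the group counts and the group sums,
so no explicit loop is left. Whether that also holds for everything
vdw.py computes would need a look at the actual code.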

A background question on the test: Do you know if there is any special
treatment of ties in the test?
Ties sound a bit inconsistent with conversion to normal scores, but
maybe it doesn't matter in this case.
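
As a small illustration of the question, and only as an assumption about
one possible treatment (midranks; I have not checked what the R function
does), this shows what happens to the normal scores when tied values get
their average rank. The helper name normal_scores is made up here:

```python
import numpy as np
from scipy import stats


def normal_scores(values):
    """Hypothetical helper: normal scores using average ranks for ties."""
    values = np.asarray(values, dtype=float)
    # Tied observations share the mean of the ranks they would occupy.
    ranks = stats.rankdata(values, method="average")
    return stats.norm.ppf(ranks / (values.size + 1.0))


# The two observations equal to 3.0 share rank 3.5 and hence one score.
scores = normal_scores([3.0, 1.0, 3.0, 2.0, 5.0])
```

With midranks the tied observations get identical scores, but the set of
scores is no longer the fixed, symmetric set obtained from the ranks
1..N, which is the inconsistency I have in mind.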

Josef

>>
>> no tests: since you have the equivalent function in R, it would be
>> good to add some tests that have results from R to check that the
>> function produces the same results. ( for explanation and examples:
>> https://github.com/numpy/numpy/blob/master/doc/TESTS.rst.txt or look
>> in scipy.stats.tests )
>
> thanks for the hint; the pics I reported in my blog are from a run that
> takes as input the example data.frame from R for the equivalent waerden.test
> function (the data.frame is "sweetpotato"), with the only difference that
> my function uses numbers for groups instead of strings (maybe I can fix
> this..). I'll do more tests following the numpy test "standards" - thank
> you
>>
>> additional: docstring should follow the numpy standard, see examples
>> in current functions and
>>
>> https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt#docstring-standard
>
> ok, I'll fix these issues and come back to the list ;-)
>
> Cheers,
> Annalisa
>
>
> [0]: https://svn.osgeo.org/grass/grass-addons/vector/v.strahler/
>
>
>>
>>
>> How to proceed if these issues are addressed:
>> Since it is a new function, it would be easy to integrate it into
>> scipy from the python files, but if you are familiar with git and
>> github, the integration will eventually be easier if you fork scipy
>> and add the function to scipy.stats and make a pull request.
>> A trac ticket would also be useful to keep track of the contribution
>> (I don't know if github will substitute for this).
>> At the beginning, it might be easiest to discuss this on the mailing list.
>>
>> Cheers, and thanks for any contributions,
>>
>> Josef
>>
>> >
>> > Best Regards,
>> > Annalisa Minelli
>> >
>> >
>> > [0]: http://en.wikipedia.org/wiki/Van_der_Waerden_test
>> > [1]:
>> > http://rss.acs.unt.edu/Rdoc/library/agricolae/html/VanderWarden.html
>> > [2]:
>> > http://amirror4thesun.blogspot.com/2011/05/van-der-waerden-test.html
>> >
>> > _______________________________________________
>> > SciPy-Dev mailing list
>> > SciPy-Dev at scipy.org
>> > http://mail.scipy.org/mailman/listinfo/scipy-dev
>> >
>> >
>
>
>
>


