[SciPy-Dev] scipy.stats improvements

Ralf Gommers ralf.gommers at gmail.com
Fri Mar 6 11:52:33 EST 2015


Hi Abraham,


On Wed, Mar 4, 2015 at 8:08 PM, Abraham Escalante <aeklant at gmail.com> wrote:

> Hello,
>
> My name is Abraham Escalante. I would like to make a proposal for the
> "scipy.stats improvements" project for the Google Summer of Code. I am new
> to the Open Source community (although I do have experience with git and
> github) and this seems to me like a perfect place to start contributing.
>

Welcome!


> I forked the scipy/scipy project and I've been perusing some of the
> StatisticsCleanup issues since I would like to make my first contribution
> before I actually make my formal proposal (and I know it would be a great
> way for me to become acquainted with the code, guidelines, tests and the
> like).
>

That's definitely a good idea (and actually it's required).


> I have a few questions that I would like to trouble you with:
>
> 1) Most of the StatisticsCleanup open issues mention a "need for review"
> and also "StatisticsReview guidelines". *Could you refer me to the
> StatisticsReview guidelines?* (I have been looking but I have not been
> able to find it in the forked project nor the scipy documentation). *What
> does it mean to have an issue flagged as "review"?*
> see https://github.com/scipy/scipy/issues/693 for an example of what I
> mean.
>

Ah, this was a pre-GitHub wiki page that disappeared when Trac was
disabled. I can't find the original anymore, so I'll rewrite those
guidelines on the GitHub scipy wiki. Basically it comes down to checking
(and fixing/implementing if needed) the following:
- is the implementation correct?
  - needs checking against another implementation (R/Matlab) and/or a
    reliable reference
  - this includes handling of small or empty arrays, and array_like (list,
    tuple) inputs
- is the docstring complete?
  - at a minimum it should include a good summary line, Parameters and
    Returns sections, and the details needed to understand the algorithm
  - preferably also References and Examples sections
- is the test coverage OK?
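
To make the checklist above a bit more concrete, here is a rough sketch of
what such a review could look like. This is only an illustration, not code
from scipy: `standard_error` is a hypothetical pure-Python stand-in for the
function under review, `review_checks` is a made-up helper name, and the
standard library `statistics` module plays the role of the independent
reference that R or Matlab would normally provide. Raising `ValueError` for
too-small input is also just an assumption chosen for the sketch.

```python
import math
import statistics


def standard_error(values):
    """Compute the standard error of the mean of a sample.

    Hypothetical example function, used only to illustrate the checklist.

    Parameters
    ----------
    values : array_like
        One-dimensional sequence (list, tuple, ...) of numbers.

    Returns
    -------
    float
        Sample standard deviation (ddof=1) divided by sqrt(n).

    Raises
    ------
    ValueError
        If fewer than two values are given.

    Examples
    --------
    >>> standard_error([1.0, 3.0])
    1.0
    """
    data = [float(v) for v in values]
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / (n - 1)
    return math.sqrt(variance / n)


def review_checks():
    """Run the three kinds of checks from the list; return True if all pass."""
    sample = [2.1, 3.4, 1.9, 5.6, 4.4]

    # Correctness: compare against an independent implementation.
    reference = statistics.stdev(sample) / math.sqrt(len(sample))
    assert math.isclose(standard_error(sample), reference, rel_tol=1e-12)

    # array_like inputs: a tuple must give the same result as a list.
    assert standard_error(tuple(sample)) == standard_error(sample)

    # Small and empty inputs: behaviour must be defined, not accidental.
    for too_small in ([], [1.0]):
        try:
            standard_error(too_small)
        except ValueError:
            pass
        else:
            raise AssertionError("expected ValueError for small input")
    return True


if __name__ == "__main__":
    print(review_checks())
```

In a real review the same three kinds of checks would go into the scipy
test suite, and the reference values would come from R/Matlab or a
published source rather than from another Python function.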


For some functions that have StatisticsReview issues it's a matter of
checking and making a few tweaks; for others it may be a complete rewrite
(see https://github.com/scipy/scipy/pull/4563 for a recent example).


> 2) I am currently going through the code (using the StatisticsCleanup
> issues as a guide) and starting to read the SciPy statistics tutorial. *Do
> you have any suggested reading* to get more familiarised with SciPy (the
> statistics part in particular), Numpy or to brush up on my statistics
> knowledge? (pretty much anything to get me up the learning curve would be
> useful).
>

The tutorial you started on is good. For a broad intro to numpy/scipy,
http://scipy-lectures.github.io/ is also quite a good tutorial. Regarding
books on statistics, there's an almost infinite choice, so I'm not going to
try to make a recommendation. Maybe the real statisticians on this list
will give you their favorites :)

When starting to work on scipy, reading the developer guidelines at
http://docs.scipy.org/doc/numpy-dev/dev/ is also a good idea.

Cheers,
Ralf



> Thanks in advance,
> Abraham Escalante.
>