[SciPy-Dev] scipy.stats

josef.pktd at gmail.com josef.pktd at gmail.com
Tue Jun 1 04:12:06 EDT 2010


On Tue, Jun 1, 2010 at 12:54 AM, Travis Oliphant <oliphant at enthought.com> wrote:
>
> On May 31, 2010, at 9:16 AM, josef.pktd at gmail.com wrote:
>
>> Since Travis seems to want to take back control of scipy.stats, I am
>> considering my role as unofficial maintainer as ended.
>
> Obviously I've offended you.   That has never been my intent.   I apologize if my enthusiasm for getting some changes that I wanted to see into SciPy stepped on an area you felt ownership of.     I do not mind if people add changes to code that I've written and I assume that others feel the same.   That has always been the development mode of SciPy.   We clearly have different development styles.    I think we can find a way to work together.   I think the move to github will help.
>
> I did not understand that you felt such ownership of scipy.stats.  I have certainly appreciated your input.
>
> I do like a more "free-wheeling" style to code development than one that is bogged down with "rules" and "procedures".     This clearly is not your style.   For me, it comes down to time to spend.   I love working on SciPy and NumPy.    I don't have a lot of time to do it.   When I see quick changes I can make that add value I like to be able to do it.   I think we both want the same thing while we may disagree about the best way to get there.
> In my mind, discussion doesn't end when a check-in is made --- it just begins.   You should never interpret my checking something in as the final word.   We clearly have a different view of "trunk".
>
> I certainly don't want my approach to open source development to offend others or chase them away.  If I check in something you don't like, then tell me and let's talk about it.    If you need to vent and call me names, a private email to me or others can go a long way.
>
> What do we need to do to keep you around?   Is there specifically something you didn't like about my recent check-ins?
>
> In this case, the features added were not terribly extensive.   The current unit tests helped ferret out major problems.  Yes, I could write more tests and documentation, and you have been a model of writing tests and documentation.   I have been particularly impressed by the amount of quality documentation you have written.
>
> While you seem to dismiss the episode as problematic, I actually think curve_fit was a good example of how something very positive can emerge quickly when people are open and willing to work together.
>
> While formal, strict test-driven development is easy to point to for salvation -- it does have its costs.   I've always used informal test-driven development.   Just because I don't *always* add formal unit tests for every piece of code written does not mean the code that is currently in SciPy is un-tested and useless.   Such an approach leaves me open to criticism, which I acknowledge.  But, I think there have been far too many dismissive comments about the state of the code.
>
> I would argue that the problem with scipy.stats does not lie mainly in distributions.py or the lack of test-driven-development --- but in the lack of certain easy to use features.    Quality code comes out of people who care --- not out of procedure.
>
> I think you are someone who cares and your code reflects that.    We would all benefit from your staying part of the main development.

(not answering inline to keep thoughts together)

I think the main disagreements are about the quality control of the
trunk and whether scipy development is a community effort or not.

I think most of us write code in spurts as we find time and some idea
bites us, and I have written a lot of code. However, this is *not*
trunk code, this is sandbox code.

As Skipper described, in statsmodels almost all development occurs in
the sandbox and in branches, and it is only included in the "official"
core of statsmodels after it has been verified and tests have been
added. Sandbox code is everything from a first draft version to almost
finished code.
And one of Skipper's tasks in his GSoC is to clean out the sandbox.
Once it is in trunk (core), any further refactoring follows very strict rules.

*Every* new function or method needs tests before going into trunk or
right after. And I hope the test coverage of scipy goes towards that
goal. This also applies to trivial functions, because they might be
victims of some later refactoring. I have seen a lot of stranded
non-functional code in scipy.stats, stats.models and in other parts of
scipy.

Review before or after commit
I think (non-minor) changes, especially new functions, methods and
classes, need to be offered to the mailing list for comments and review
before being committed. (Plus, to make it feasible, we have an implied:
"If nobody voices disagreement, then I will commit".) The git mirror
has been working for a long time, and most development in scipy seems
to follow this policy.

curve_fit is a good example: Travis committed the changes without
mentioning them on the mailing list. I saw the commit and commented
that the statistics of the new function were incorrect, and we changed
it over several rounds until it was verified. I don't think it has any
tests yet.

Specific to stats: I want a reference for any function where the
explanation cannot be found with a Wikipedia search with one of the
terms in the docstring. One or a few weeks ago, scipy.stats gained a
new function; my question on the mailing list about what it is supposed
to be didn't receive any reply (besides the problem that the function
had the same name as an existing function).

Dumping new code into scipy trunk, without any review and tests,
hoping that someone else looks for the problems, is not an approach
that I find acceptable. And personally, I now refuse to be "dumped
on". And I will *not* spend my time in the next three days writing
missing tests and verifying code that has been committed to trunk this
weekend.

Asking me if I have commit rights shows at least some disconnect from
the development of scipy in the last three years, since I have been
pretty (too) noisy about it on the mailing lists.

Josef

>
> Sincere regards,
>
> -Travis
>
>
>


