[SciPy-Dev] scipy.stats

Mon May 31 12:06:51 EDT 2010

On Mon, May 31, 2010 at 11:50 AM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Mon, May 31, 2010 at 9:32 AM, Skipper Seabold <jsseabold at gmail.com>
> wrote:
>>
>> On Mon, May 31, 2010 at 10:38 AM, Charles R Harris
>> <charlesr.harris at gmail.com> wrote:
>> >
>> >
>> > On Mon, May 31, 2010 at 8:23 AM, Charles R Harris
>> > <charlesr.harris at gmail.com> wrote:
>> >>
>> >>
>> >> On Mon, May 31, 2010 at 8:16 AM, <josef.pktd at gmail.com> wrote:
>> >>>
>> >>> Since Travis seems to want to take back control of scipy.stats, I
>> >>> consider my role as unofficial maintainer ended.
>> >>>
>> >>> I would have appreciated his help almost 3 years ago, when I started
>> >>> to learn numpy, scipy, and started to submit patches for
>> >>> scipy.stats.distributions.
>> >>>
>> >>> But by now, after almost three years, I have pretty strong opinions
>> >>> about statistics in Python. I'm a bit tired of cleaning up the mess of
>> >>> others (and want to clean up my own mess), and there are obviously big
>> >>> philosophical differences about the development process between me and
>> >>> Travis (no discussion, no review, no tests).
>> >>> http://projects.scipy.org/scipy/log/trunk/scipy/stats/tests
>> >>>
>> >>> Watching the scipy changelog and checking any function that Travis
>> >>> quietly commits is no fun (see mailing list for the introduction of
>> >>> curve_fit or ask Stefan).
>> >>>
>> >>> I said early on that I would like to trust the results that
>> >>> scipy.stats produces (although I don't find the mailing list thread
>> >>> any more).
>> >>>
>> >>> I expected scipy to move in a stable direction, as Python does: a
>> >>> kitchen sink for scientific programming that might be slow-moving
>> >>> but has high standards, and not a sandbox.
>> >>>
>> >>> Details are at
>> >>> http://mail.scipy.org/pipermail/scipy-dev/2010-April/014058.html
>> >>>
>> >>> After my initial scipy.stats.distributions cleanup, test coverage was
>> >>> at 91%; I have no idea where it is after this weekend.
>> >>>
>> >>> This is more about the process than the content. distributions was
>> >>> Travis's baby (although unfinished), and most of his changes are very
>> >>> good, but I don't want to look for the 5-10% (?) typos anymore.
>> >>>
>> >>
>> >> Ah Josef, there are easier ways to lodge complaints than resignation ;)
>> >> I agree that it was rude of Travis to make those changes without running
>> >> them through the list, and he does tend to toss stuff in that others have
>> >> to clean up, the same with c-code. But maybe we can manage to get him
>> >> housebroken without all moving out.
>> >>
>> >
>> > I think a policy of mandatory review will solve these sorts of problems,
>> > and that is probably a good argument for moving to github, where review is
>> > much easier. On stats, we probably need an additional policy of rigorous
>> > testing to make sure that things are working right; the stat tests are more
>> > difficult by their very nature. I think Travis is amenable to such
>> > processes, but we do need to start a discussion. If you do feel strongly
>> > about the recent changes, maybe they can be reverted and added back in
>> > after review.
>> >
>>
>> I am perhaps wading out of my depth here, but I agree with the
>> concerns and having the proposed dialogue, as I think having Josef's
>> input on the direction of scipy.stats is important.
>>
>> This does dovetail with the move to DVCS/github and having a review
>> and discussion policy in place before things go into trunk.  I don't
>> recall there being a time frame set up for the move (?) though there
>> was little dissent in actually making the move.  Perhaps we could
>> start hashing out concrete plans for review and a renewed policy for
>> testing standards so that the discussions can focus more on design and
>> as little energy as possible is spent uncovering precision errors,
>> typos, and niggling bugs.  Does it make sense to do this before the
>> move maybe as part of the docs marathon?  Of course there were also
>> those in favor of shoot first, sort it out as we go along because this
>> is a problem that has been solved before.
>>
>> Re: testing, the things that go into stats must be as test driven as
>> possible given that there are plenty of choices of where to turn to do
>> statistics work.  The econometricians that I have talked to who
>> develop in R tell me Python is a "dark horse" for choice of language
>> and having undiscovered precision errors etc., to say nothing about
>> actual design, does not help our case.
>>
>
> With this in mind, perhaps it would be best to revert the changes so that
> there is a clean starting point; we can keep them somewhere else for
> review.  The discussion of process can then take place without dealing with
> the specifics of the recent commits.

Or someone writes the tests for them and fixes any problems; then
I don't think it's a problem to keep them.

Josef

>
> Chuck