[SciPy-dev] Scipy workflow (and not tools).

josef.pktd at gmail.com
Thu Feb 26 10:18:49 EST 2009


On Thu, Feb 26, 2009 at 3:38 AM, Robin <robince at gmail.com> wrote:

>
> As another long time lurker I would also support everything Neil said.
>
> I also wanted to add that what stops me from recommending scipy
> more widely to my colleagues is not that there is not enough code in
> it - it is that it is not stable enough to rely on for their work.
> That is perhaps a bit harsh, but I am sure that the first time one of
> my colleagues lost half a day because of a scipy bug (as I have done
> quite a few times) they would go back to MATLAB.
>
> So I would agree with Stefan and the others that the priority is not
> getting more code in per se, but improving the quality and frequency
> of releases to get a platform whose stability compares with MATLAB
> before adding more stuff.
>

I think we are not seeing enough trac tickets about missing tests,
with the tests themselves included as patches.

For a user who is familiar with a part of scipy, it would be relatively
easy to provide a test. This would reduce the chance that parts get
broken by accident, and it would signal during any refactoring that the
interface should be changed only with a proper deprecation warning. In
this way users could contribute to scipy and make it more stable for
their own work.
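
As a rough sketch of how small such a patch could be (the particular
function and value here are just an illustration, not taken from an
actual ticket), a nose-style regression test might look like:

    from numpy.testing import assert_almost_equal
    from scipy import stats

    def test_norm_cdf_at_zero():
        # the standard normal cdf is exactly 0.5 at 0; a regression
        # test like this pins down behavior that a user relies on
        assert_almost_equal(stats.norm.cdf(0.0), 0.5, decimal=12)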

Similarly, when I was working my way through parts of scipy, I found
that examples (or tests that can be used as examples) are often missing.
This makes it difficult to figure out the exact format of the call
parameters and the limitations of the functions.

Example: signal.ltisys has no tests and no examples, only a good general
description. For someone not familiar with the matlab signal toolbox, it
is not clear what the exact requirements for the matrices of the state
space representation are. But for users of this module, it might be much
easier to come up with examples and tests than for me, since I have to
work through the exceptions that are raised and the source code.
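
To make the point concrete, here is the kind of minimal example I have
in mind (a sketch only; the matrices are made up, but the shapes follow
the usual convention x' = A x + B u, y = C x + D u):

    import numpy as np
    from scipy import signal

    # state space shapes: A is (n, n), B is (n, m), C is (p, n),
    # D is (p, m); here n=2 states, m=1 input, p=1 output
    A = np.array([[0.0, 1.0],
                  [-1.0, -0.5]])
    B = np.array([[0.0],
                  [1.0]])
    C = np.array([[1.0, 0.0]])
    D = np.array([[0.0]])

    sys = signal.lti(A, B, C, D)
    t, y = signal.step(sys)   # step response of the system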


I also agree with Neil. This is exactly the situation I was in half a
year ago. Before getting commit access, I had several local copies of
files and finally a bzr branch to keep track of my changes. A more
systematic workflow for this would be a big improvement. But for
rewriting relatively confined parts (which is most of stats, but may not
apply to other parts), I still prefer to work with stand-alone scripts
(under my own local version control) and integrate them into scipy when
they are ready.

The review of my changes by Per Brodtkorb was very helpful. However, my
main quality control was to increase the test coverage for
stats.distributions from around 50% to above 90%, with statistical tests
that made sure the numbers are at least approximately correct (up to
statistical noise and numerical precision).
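
To give an idea of what such a statistical test looks like (a sketch
along the lines of that coverage work, not a copy of the actual test
suite), a Kolmogorov-Smirnov test can check that the random variates of
a distribution agree with its cdf:

    import numpy as np
    from scipy import stats

    def test_gamma_rvs_matches_cdf():
        # draw a fixed-seed sample and compare it to the claimed cdf;
        # a loose p-value threshold only catches gross errors, so the
        # check is correct up to statistical noise
        np.random.seed(1234)
        sample = stats.gamma.rvs(2.5, size=1000)
        D, pval = stats.kstest(sample, 'gamma', args=(2.5,))
        assert pval > 0.01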

Since I was also relatively new to numpy, I might not have coded
everything in the most efficient way, but at least I felt relatively
sure that each change I made passed the basic (statistical) tests.

And I'm still reluctant to apply any bug fixes without full verification
and testing. This slows down the bug fixing and enhancements, but it
lowers the chance that we introduce new bugs. High test coverage would
also make it easier to apply new patches or enhancements, since we would
not have to wait for the next round of bug reports to verify that
everything still works. I think that once scipy has reasonable test
coverage, the development and release process will go quite a bit
faster.

Using nose is a huge improvement in the testing workflow. And I hope
that we will see lots of trac tickets with patches for missing tests.
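
For instance, the tests for a subpackage can be run directly from an
interpreter session (assuming a scipy built with the nose-based test
framework):

    >>> import scipy.stats
    >>> scipy.stats.test()   # runs the stats test suite through nose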

Josef


