[SciPy-dev] the state of scipy unit tests

Sun Nov 23 22:55:43 EST 2008

On Sun, Nov 23, 2008 at 9:56 PM, Nathan Bell <wnbell at gmail.com> wrote:
> In the past, I would always run 'nosetests scipy' before committing
> changes to SVN.  Due to the current state of the unit tests, I don't
> anymore, and I suspect I'm not alone.
>
> Here are the main offenders on my system:
>
> scipy.stats.
> I appreciate the fact that rigorous testing on this module takes time,
> but 4 minutes on a 2.4GHz Core 2 system is unreasonable.  IMO 20
> seconds is a reasonable upper bound.  Essential tests that don't meet
> this time constraint should be filtered out of the default test suite.
>

I agree that it is pretty painful. I usually just run nosetests at the
module or package level, e.g. 'nosetests scipy.stats', before
committing, and specific test files while correcting individual
functions. For my distribution tests, I use additional tests that are
renamed in svn so that nose doesn't pick them up.

Is it possible to use an exclude option with nose that excludes, for
example, all tests in scipy.stats or specific test files?

My problem, which I already raised once on the mailing list, is that I
am now testing essentially all methods of close to 100 distributions,
some of which require a lot of numerical integration and optimization.
I wrote the tests pretty fast, for bug hunting and to get one thorough
round of testing during the next beta release. But for everyday usage
they are too much.

I haven't done any profiling to see which distributions are the worst
offenders, and since there are so many distributions and all tests
are generators, it is difficult to special-case individual slow
methods and distributions. Another problem is the tests based on
random numbers: if the sample size and the power of the statistical
tests are too small (as was the case in scipy until a few months ago),
then they don't catch many bugs; if the statistical tests are to have
some power, then they require larger samples and more calculation.
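
To put rough numbers on that trade-off, here is a minimal sketch. It
assumes a plain two-sided z-test for a shift in the mean, which is not
what the scipy.stats tests actually use, and the helper name
z_test_power and the shift of 0.1 standard deviations are made up for
illustration only.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(shift, n, sigma=1.0, alpha=0.05):
    """Approximate power of a two-sided z-test to detect a mean shift.

    Hypothetical helper for illustration; assumes known sigma and
    normally distributed sample means.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Standardized distance between the true and the hypothesized mean.
    noncentrality = shift * sqrt(n) / sigma
    # Probability that the test statistic lands in either rejection tail.
    upper = 1.0 - std_normal.cdf(z_crit - noncentrality)
    lower = std_normal.cdf(-z_crit - noncentrality)
    return upper + lower


# Power to detect a small bias grows only slowly with the sample size.
for n in (20, 200, 2000):
    print(n, round(z_test_power(shift=0.1, n=n), 3))
```

Under these assumptions a small sample has little chance of flagging
a small error in a distribution method, while a sample large enough to
give real power costs correspondingly more run time.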

My initial attempts to use decorators were not very successful, since
nose doesn't allow decorating test generators. One option would be to
label most of my test functions as slow, but I haven't tried this
yet. In the old test system, it was possible to assign levels to the
tests. I don't know if or how it is possible to label my tests so that
a few basic ones are run at a low level and the other ones only at
higher levels.
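
What I have in mind could look roughly like the sketch below. It is
plain Python, not nose's own API: the decorator 'level', the helpers
'select_cases' and 'run_cases', and the check functions are all
hypothetical names. The idea is to tag the check functions that the
generator yields, instead of decorating the generator itself, and to
filter on that tag before running.

```python
def level(n):
    """Hypothetical decorator: tag a test callable with an integer level."""
    def decorate(func):
        func.level = n
        return func
    return decorate


@level(1)
def check_cdf_basic(dist_name):
    # Stand-in for a cheap sanity check.
    assert isinstance(dist_name, str)


@level(5)
def check_moments_by_integration(dist_name):
    # Stand-in for an expensive numerical-integration check.
    assert isinstance(dist_name, str)


def generate_cases(dist_names):
    """Test generator: yield (check, argument) tuples, one per case."""
    for name in dist_names:
        yield check_cdf_basic, name
        yield check_moments_by_integration, name


def select_cases(cases, max_level):
    """Keep only the cases whose check is tagged at or below max_level."""
    for case in cases:
        check = case[0]
        # Untagged checks default to level 1 so they always run.
        if getattr(check, "level", 1) <= max_level:
            yield case


def run_cases(cases):
    """Run each selected case and return how many were executed."""
    count = 0
    for check, *args in cases:
        check(*args)
        count += 1
    return count


names = ["norm", "gamma", "beta"]
print(run_cases(select_cases(generate_cases(names), max_level=1)))
print(run_cases(select_cases(generate_cases(names), max_level=10)))
```

Whether nose can be made to apply this kind of filter to yielded
tests is exactly what I don't know yet.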

Triaging my tests will be quite a bit of work, but the short-term
solution is to find a way to exclude most of them for everyday use
while keeping them available for beta testing.

BTW, is there a way to profile the tests themselves (the tests yielded
by a generator, not the test function)?
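
Failing a built-in way, a crude fallback would be to drive the
generator by hand and time each yielded case. The sketch below assumes
the generator yields (check, arg, ...) tuples; the function
'time_generated_cases' and the two check functions are hypothetical
stand-ins, not existing scipy or nose code.

```python
import time


def time_generated_cases(generator_func):
    """Run every case yielded by a test generator and time each one."""
    timings = []
    for case in generator_func():
        check, args = case[0], case[1:]
        start = time.perf_counter()
        try:
            check(*args)
            outcome = "ok"
        except AssertionError:
            outcome = "fail"
        elapsed = time.perf_counter() - start
        label = "%s%r" % (check.__name__, args)
        timings.append((elapsed, label, outcome))
    # Slowest cases first.
    timings.sort(reverse=True)
    return timings


# Hypothetical stand-ins for real distribution checks.
def check_fast(name):
    assert name


def check_slow(name):
    time.sleep(0.05)  # pretend this is an expensive integration
    assert name


def generate_distribution_cases():
    for name in ("norm", "gamma"):
        yield check_fast, name
        yield check_slow, name


for elapsed, label, outcome in time_generated_cases(generate_distribution_cases):
    print("%8.4f s  %-30s %s" % (elapsed, label, outcome))
```

This only gives wall-clock time per yielded case, not a real profile,
but it would be enough to see which distributions dominate the run.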

Josef