Probabilistic unit tests?

Thu Jan 10 22:34:38 EST 2013

On Thu, 10 Jan 2013 17:59:05 -0800, Nick Mellor wrote:

> Hi,
> 
> I've got a unit test that will usually succeed but sometimes fails. An
> occasional failure is expected and fine. It's failing all the time I
> want to test for.

Well, that's not really a task for unit testing. Unit tests, like most 
tests, are well suited to deterministic checks, but not really to 
probabilistic ones. As far as I know, there aren't any good frameworks 
for probabilistic testing, so you're stuck with inventing your own 
(possibly on top of unittest).

> What I want to test is "on average, there are the same number of males
> and females in a sample, give or take 2%."
> 
> Here's the unit test code:
> import unittest
> from collections import Counter
> 
> sex_count = Counter()
> for contact in range(self.binary_check_sample_size):
>     p = get_record_as_dict()
>     sex_count[p['Sex']] += 1
> self.assertAlmostEqual(sex_count['male'],
>                        sex_count['female'],
>                        delta=sample_size * 2.0 / 100.0)

That's a cheap and easy way to almost get what you want, or at least what 
I think you should want.

Rather than a "Succeed/Fail" boolean test result, I think it is worth 
producing a float between 0 and 1 inclusive, where 0 is "definitely 
failed" and 1 is "definitely passed", and intermediate values reflect 
some sort of fuzzy logic score. In your case, you might look at the ratio 
of males to females. If the ratio is exactly 1, the fuzzy score would be 
1.0 ("definitely passed"), otherwise as the ratio gets further away from 
1, the score would approach 0.0:

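# Score is the smaller count over the larger, so it always lies in [0, 1].
# float() avoids integer division under Python 2; this assumes both
# counts are non-zero.
if males <= females:
    score = float(males) / females
else:
    score = float(females) / males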

should do it.

Finally, your probabilistic-test framework could then either report the 
score itself, or decide on a cut-off value below which you turn it into a 
unittest failure.
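
As a rough sketch of that idea (Python 3 syntax; the names balance_score 
and passes, and the cut-off of 0.9, are my own placeholders rather than 
anything from an existing framework):

```python
def balance_score(males, females):
    """Return a fuzzy score in [0, 1] for how evenly a sample is split.

    1.0 means exactly equal counts; 0.0 means one group is empty.
    """
    larger = max(males, females)
    if larger == 0:
        # No records at all: there is nothing to score.
        raise ValueError("no records counted")
    return min(males, females) / larger


def passes(males, females, cut_off=0.9):
    """Turn the fuzzy score into a pass/fail result.

    The default cut_off is an arbitrary, illustrative value.
    """
    return balance_score(males, females) >= cut_off
```

Inside a unittest method you could then call self.fail() whenever 
passes(...) returns False, and log the raw score either way.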

That's still not quite right though. To be accurate, you're getting into 
the realm of hypothesis testing and conditional probabilities:

- if these random samples of males and females came from a population of 
equal numbers of each, what is the probability I could have got the 
result I did?

- should I reject the hypothesis that the samples came from a population 
with equal numbers of males and females?

Talk to a statistician about how to do this.
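
For what it's worth, one standard way to answer the first question is an 
exact two-sided binomial test. The sketch below is only an illustration, 
not a substitute for that conversation: it assumes every record is 
independently male or female with probability 0.5 under the null 
hypothesis, it needs Python 3.8+ for math.comb, and the function name is 
made up.

```python
from math import comb


def two_sided_binomial_p(males, females):
    """Exact two-sided p-value for H0: P(male) == P(female) == 0.5.

    Returns the probability, under H0, of a split at least as lopsided
    as the one observed.
    """
    n = males + females
    if n == 0:
        raise ValueError("empty sample")
    k = min(males, females)
    # Number of outcomes (out of 2**n equally likely ones) in which the
    # smaller group has at most k members on one particular side.
    tail = sum(comb(n, i) for i in range(k + 1))
    # The distribution is symmetric at p = 0.5, so double the one-sided
    # tail; cap at 1.0 because an even split counts the middle twice.
    return min(1.0, 2 * tail / 2 ** n)
```

You would reject the "equal numbers" hypothesis when the p-value falls 
below some small threshold you pick in advance. With a threshold of, say, 
0.001, a correct generator should trip the test about once in a thousand 
runs.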

> My question is: how would you run an identical test 5 times and pass the
> group *as a whole* if only one or two iterations passed the test?
> Something like:
> 
>     for n in range(5):
>         # self.assertAlmostEqual(...)
>         # if test passed: break
>     else:
>         self.fail()
> 
> (except that would create 5+1 tests as written!)

Simple -- don't use assertAlmostEqual, or any other of the unittest 
assertSomething methods. Write your own function to decide whether or not 
something passed, then count how many times it passed:

count = 0
for n in range(5):
    count += self.run_some_test()  # returns 0 or 1, or a fuzzy score
if count < some_cut_off:
    self.fail()
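
Fleshed out a little, that might look something like the following. This 
is a sketch under assumptions: get_record_as_dict here is a stand-in I 
made up so the example runs on its own, and the sample size, number of 
trials and cut-off are arbitrary.

```python
import random
import unittest

_rng = random.Random(12345)  # seeded only so the sketch is repeatable


def get_record_as_dict():
    # Hypothetical stand-in for the real record generator under test.
    return {'Sex': _rng.choice(['male', 'female'])}


class TestSexBalance(unittest.TestCase):
    sample_size = 1000  # records drawn per trial (arbitrary)
    trials = 5          # how many times the check is repeated
    min_passes = 2      # cut-off: fewer passes than this is a failure

    def run_some_test(self):
        """Draw one sample; return 1 if it is balanced within 2%, else 0."""
        males = sum(get_record_as_dict()['Sex'] == 'male'
                    for _ in range(self.sample_size))
        females = self.sample_size - males
        delta = self.sample_size * 2.0 / 100.0
        return int(abs(males - females) <= delta)

    def test_balanced_overall(self):
        # One unittest test that wraps all the trials, so the runner
        # reports a single result for the group as a whole.
        count = sum(self.run_some_test() for _ in range(self.trials))
        if count < self.min_passes:
            self.fail("only %d of %d trials passed"
                      % (count, self.trials))
```

Because the trials are plain method calls rather than assertions, the 
runner sees exactly one test, and a single unlucky sample can't fail it 
on its own.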

-- 
Steven