Ordering dependent test failures

Peter Otten __peter__ at web.de
Sun Sep 20 04:58:53 EDT 2015


paul.anton.letnes at gmail.com wrote:

> Fascinated by the concept of ordering-dependent test failures [0], I've
> run the Python test suite [1] with 256 different random seeds (it took a
> little more than 12 hours). The results vary a lot - for instance, the
> number of tests reported as OK varies, the number of skips varies, etc.
> Since I'm not sure how to report or interpret them, I'll just post a
> summary below.
> 
> The test suite was run on Arch Linux [2] with gcc 5.2.0, with the source
> code taken from a clone of the Python repo yesterday [3].
> 
> What could I do with all this in order to make more sense of it, and could
> it be of any help whatsoever to Python development? I'll gladly make the
> full log files available to whomever is interested, in whatever format is
> convenient. In the meantime I'll run more random seeds, because why not.
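
For reference, a single randomized run like the ones described above can be 
reproduced with regrtest's own options; the invocation below is only a sketch 
and assumes the -r/--randseed options of the current regrtest:

    # minimal sketch, assuming regrtest's -r and --randseed options
    # are available in the checkout under test
    ./python -m test -r --randseed 12345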

Given that the most likely interference is between just two tests, instead of 
the shotgun approach you could run the unit tests in such a way that for any 
pair of tests a, b there is at least one run with a before b and at least one 
run with a after b. Assuming that a test c does not "destroy" the interference 
when run between a and b, for three tests this can be achieved with

1 abc covers ab, ac, bc
2 cba covers ba, ca, cb
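
The same forward-plus-reversed trick works for any number of tests, not just 
three. A minimal sketch (the test names are placeholders) that checks the 
coverage claim:

    # Running a list of tests once in the given order and once reversed
    # covers every ordered pair of distinct tests.
    from itertools import permutations

    def ordered_pairs(run):
        """All pairs (x, y) such that x runs somewhere before y."""
        return {(x, y) for i, x in enumerate(run) for y in run[i + 1:]}

    tests = ["a", "b", "c"]
    covered = ordered_pairs(tests) | ordered_pairs(tests[::-1])
    assert covered == set(permutations(tests, 2))  # ab, ac, bc, ba, ca, cb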

If b fails in run 1, you'll assume a to be the culprit.
If c fails in run 1, you'll follow up with the two runs below to identify the 
predecessor causing the failure:

1.1 ac
1.2 bc
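
One way to script such follow-up runs is to build two-test suites by hand and 
let unittest execute them in exactly the given order; the test case classes 
here are only stand-ins for the real tests:

    # minimal sketch: run test case classes in a fixed order
    import unittest

    class TestA(unittest.TestCase):              # stand-in for test a
        def test_a(self):
            self.assertTrue(True)

    class TestC(unittest.TestCase):              # stand-in for test c
        def test_c(self):
            self.assertTrue(True)

    def run_in_order(*cases):
        """Run the given TestCase classes in exactly the given order."""
        loader = unittest.TestLoader()
        suite = unittest.TestSuite(
            loader.loadTestsFromTestCase(c) for c in cases)
        return unittest.TextTestRunner(verbosity=0).run(suite)

    run_in_order(TestA, TestC)    # run 1.1: a before c
    # run_in_order(TestB, TestC)  # run 1.2: b before c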

Of course there are some complications, e.g.

- Neither 1.1 nor 1.2 may fail, or both may fail
- The typical test suite comprises more than three tests (a bisection sketch
  for that case follows after this list)
- Should the sequence aa be covered, i.e. a test case interfering with
  itself?
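
For the "more than three tests" complication one pragmatic refinement is to 
bisect the predecessors of the failing test: rerun it after the first half of 
its predecessors, then after the second half, and recurse into whichever half 
still triggers the failure. A rough sketch, assuming a fails_after(prefix, 
victim) helper that runs the given prefix followed by the victim and reports 
whether the victim failed:

    # minimal sketch: narrow a list of predecessors down to one culprit,
    # assuming a single interfering test and a fails_after() helper
    def find_culprit(predecessors, victim, fails_after):
        while len(predecessors) > 1:
            half = len(predecessors) // 2
            first, second = predecessors[:half], predecessors[half:]
            # keep whichever half still reproduces the failure
            predecessors = first if fails_after(first, victim) else second
        return predecessors[0] if predecessors else None

If neither half reproduces the failure on its own, more than one predecessor 
is involved and the simple halving has to be refined, much as in delta 
debugging.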





