[SciPy-User] runstest and distribution of run lengths

Fri Dec 24 17:19:21 EST 2010

Does anyone know the distribution of run lengths in a sequence of
Bernoulli trials?

I thought I could implement a runs test as a quick exercise, but I got
(kind of) stuck.

I implemented the Wald-Wolfowitz runs test (plus one and two sample
versions) according to Wikipedia
http://en.wikipedia.org/wiki/Wald%E2%80%93Wolfowitz_runs_test
and the SAS manual. This test only looks at the total number of runs,
and the SAS manual has both the exact distribution for small samples
and the normal approximation for large samples. So, this went OK.
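
For reference, a minimal sketch of the large-sample version of that test,
using the mean and variance given on the Wikipedia page. The function name
runs_test_normal and its return layout are made up for illustration; this
is not the code mentioned above, and the exact small-sample distribution is
not covered.

```python
import math
from itertools import groupby


def runs_test_normal(x):
    """Wald-Wolfowitz runs test, normal approximation (illustrative sketch).

    x is a sequence containing exactly two distinct values.
    Returns (runs, mean, var, z, pvalue) with a two-sided p-value.
    """
    x = list(x)
    n = len(x)
    values = set(x)
    if len(values) != 2:
        raise ValueError("x must contain exactly two distinct values")

    first = x[0]
    n1 = sum(1 for v in x if v == first)  # count of one category
    n2 = n - n1                           # count of the other category

    # Number of runs = number of maximal blocks of identical values.
    runs = sum(1 for _ in groupby(x))

    # Moments of the number of runs under the null of random ordering.
    mean = 2.0 * n1 * n2 / n + 1.0
    var = (mean - 1.0) * (mean - 2.0) / (n - 1.0)
    if var <= 0:
        raise ValueError("sample too small for the normal approximation")

    z = (runs - mean) / math.sqrt(var)
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    pvalue = math.erfc(abs(z) / math.sqrt(2.0))
    return runs, mean, var, z, pvalue
```

Calling runs_test_normal([0, 1] * 10) gives 20 runs against an expected 11,
so a strictly alternating sequence is flagged as non-random, as it should be.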

But the runs test in the NIST manual and in Dataplot has the entire
distribution of run lengths:
http://www.itl.nist.gov/div898/handbook/eda/section3/eda35d.htm
They mention a book, Bradley (1968), that I don't have, but they don't
give the formulas or the distribution behind the expected values and
standard deviations that they use.
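
The observed side of such a table is easy to get; a minimal sketch follows.
The name run_length_counts is made up for illustration, and it assumes a
"run" means a maximal block of identical values. It deliberately does not
compute expected values or standard deviations, since those formulas are
exactly what is missing here.

```python
from collections import Counter
from itertools import groupby


def run_length_counts(x):
    """Tabulate observed run lengths (illustrative sketch).

    Returns a dict mapping run length -> number of runs of that length,
    where a run is a maximal block of identical consecutive values.
    """
    counts = Counter()
    for _, group in groupby(x):
        length = sum(1 for _ in group)  # length of this run
        counts[length] += 1
    return dict(sorted(counts.items()))
```

For example, run_length_counts([1, 1, 0, 0, 0, 1, 0, 0]) sees runs of
length 2, 3, 1 and 2, and returns {1: 1, 2: 2, 3: 1}.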

Does anyone have an idea or know of a more easily accessible reference?

Josef