coverage.py: "Statement coverage is the weakest measure of code coverage"

Sun Oct 28 22:20:13 EDT 2007

On Oct 28, 11:56 pm, Ben Finney <b... at benfinney.id.au> wrote:
> Howdy all,
>
> Ned Batchelder has been maintaining the nice simple tool 'coverage.py'
> <URL:http://nedbatchelder.com/code/modules/coverage.html> for
> measuring unit test coverage.
>
> On the same site, Ned includes documentation
> <URL:http://nedbatchelder.com/code/modules/rees-coverage.html> by the
> previous author, Gareth Rees, who says in the "Limitations" section:
>
>     Statement coverage is the weakest measure of code coverage. It
>     can't tell you when an if statement is missing an else clause
>     ("branch coverage"); when a condition is only tested in one
>     direction ("condition coverage"); when a loop is always taken and
>     never skipped ("loop coverage"); and so on. See [Kaner 2000-10-17]
>     <URL:http://www.kaner.com/pnsqc.html> for a summary of test
>     coverage measures.
>
> So, measuring "coverage of executed statements" reports complete
> coverage incorrectly for an inline branch like 'foo if bar else baz',
> or a 'while' statement, or a 'lambda' statement. The coverage is
> reported complete if these statements are executed at all, but no
> check is done for the 'else' clause, or the "no iterations" case, or
> the actual code inside the lambda expression.
>
> What approach could we take to improve 'coverage.py' such that it
> *can* instrument and report on all branches within the written code
> module, including those hidden inside multi-part statements?

I once wrote a coverage tool (maybe I can factor it out of my tool
suite some time) that works by transforming the source. Currently it
generates measurement code for statement coverage only, and I'm not
sure it has more capabilities than coverage.py: I was primarily
interested in the code generation and monitoring process, so I didn't
compare.

Given its nature, it could rewrite the code before measuring it. So a
statement:

if a and b:
    BLOCK

can be transformed into

if a:
    if b:
        BLOCK

Also

if a or b:
    BLOCK

might be transformed into

if a:
    BLOCK
elif b:
    BLOCK
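
Both rewrites can be sketched with the standard ast module. This is
only an illustration under assumptions: it needs Python 3.9+ for
ast.unparse, the names SplitBoolOps and split_conditions are made up
here and are not taken from the tool described in this post or from
coverage.py, and an "and" test that has an else clause is left
untouched because splitting it would duplicate the else block.

```python
import ast
import copy


class SplitBoolOps(ast.NodeTransformer):
    """Give each operand of an `and`/`or` test its own `if` statement."""

    def visit_If(self, node):
        self.generic_visit(node)  # rewrite nested statements first
        return self._split(node)

    def _split(self, node):
        test = node.test
        if not isinstance(test, ast.BoolOp):
            return node
        first, rest = test.values[0], test.values[1:]
        # Remaining operands: a single expression or a smaller BoolOp.
        remainder = rest[0] if len(rest) == 1 else ast.BoolOp(op=test.op, values=rest)
        if isinstance(test.op, ast.And):
            if node.orelse:
                # Assumption of this sketch: skip `and` tests with an else block.
                return node
            inner = self._split(ast.If(test=remainder, body=node.body, orelse=[]))
            outer = ast.If(test=first, body=[inner], orelse=[])
        else:  # ast.Or: BLOCK is duplicated, as in the if/elif form above
            inner = self._split(
                ast.If(test=remainder, body=copy.deepcopy(node.body), orelse=node.orelse)
            )
            outer = ast.If(test=first, body=node.body, orelse=[inner])
        return self._split(outer)  # `first` may itself be a BoolOp


def split_conditions(source):
    """Return `source` with boolean `if` tests split into separate statements."""
    tree = SplitBoolOps().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+


print(split_conditions("if a or b:\n    run()\n"))
```

Evaluation order and short-circuiting are preserved, so the rewritten
code should behave like the original while giving each operand a line
of its own for a statement-coverage tool to count.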

So boolean predicates are turned into statements, and statement
coverage is then enough to cover them. This is also close to how
bytecode works: "and" / "or" predicates are expressed using jumps. I'm
not sure about expressions yet, since I cared about traces, not about
expression execution.
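
The remark about bytecode can be checked with the standard dis module.
A small sketch, assuming CPython; the function name guarded is made up
for illustration, and the exact opcode names differ between
interpreter versions, so no particular output is claimed here.

```python
import dis


def guarded(a, b):
    if a and b:
        return "block"
    return "skipped"


# Collect the jump instructions the compiler emitted for the `and` test.
jumps = [ins.opname for ins in dis.get_instructions(guarded) if "JUMP" in ins.opname]
print(jumps)
```

On the CPython versions I am aware of, each operand of the "and" gets
its own conditional jump, which mirrors the nested-if rewrite.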

The underlying monitoring technology needs to be advanced. I used a
similar approach for an even more interesting purpose: feeding runtime
type information back into a clone of the initial parse tree, which
could then be unparsed into type-annotated source code after program
execution. But that's another issue.

The basic idea behind all of these monitors is as follows: implement
an identity function with a side effect. I'm not sure how this
monitoring code interacts with rather deep reflection (stack trace
inspection etc.).
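
A minimal sketch of such an identity function, under assumptions: the
names probe, observed and sign are hypothetical, and recording the
truth value of each probed expression is one possible side effect
chosen for this example, not necessarily what the tool described above
records.

```python
from collections import defaultdict

# probe id -> set of truth values seen at that point in the program
observed = defaultdict(set)


def probe(probe_id, value):
    """Identity on `value`; the side effect is recording its truthiness."""
    observed[probe_id].add(bool(value))  # note: bool() may call __bool__/__len__
    return value


def sign(n):
    # The inline branch from the quoted question, with its condition wrapped.
    return "negative" if probe("sign:n<0", n < 0) else "non-negative"


sign(5)
# Conditions that were only ever tested in one direction so far.
one_sided = [pid for pid, seen in observed.items() if len(seen) < 2]
print(one_sided)
```

Because probe returns its argument unchanged, it can wrap a condition
inside an expression such as 'foo if bar else baz' without changing
the result, which is what lets it reach branches that line-based
measurement cannot see.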

Kay