code coverage

Sun Jul 11 02:47:39 EDT 1999

Hello,

  After the discussion a couple weeks back about code coverage
utilities for Python, I decided to look into them because I want
to ensure that our regression tests really do execute all (or at
least most) of the code in our packages.

  The two mentioned in the thread were:
 trace.py - http://www.musi-cal.com/~skip/python/trace.py
      by Skip Montanaro

 pycover - ftp://ftp.python.org/pub/python/contrib/All/pycover-0.1.tar.gz
      by Andrew Csillag

  Neither of these fits my needs, so I ended up heavily modifying
trace.py.  For those interested, the new version is at
   ftp://starship.python.net/pub/crew/dalke/trace.py

I would appreciate people taking a look at it and testing it out.
It's in a pretty complete state, but needs review and suggestions
for improvements.

  The major features I added are:

     a more extensive command-line interface using getopt and
reporting a full --help message

     support for packages.  The original trace.py used only the
basename of the module's __file__ when generating a coverage log.
Thus, if there were two submodules with the same name, one would
overwrite the other.  Now the results are saved to the file named
__name__ + '.covered'

     the ability to exclude a module from coverage based on its
__file__ and/or on its __name__ (so you can exclude all the system
modules using sys.prefix).  pycover, for example, has built-in
exclusion of /usr/local, which isn't always appropriate for our
installs.

     better reporting of missed lines.  A big problem I had with
both of the coverage programs was that they didn't do a very good
job of telling me which executable lines were not covered.  For
example, one of my test scripts is:

=====
import daylight.Smiles

def main():
        mol = daylight.Smiles.smilin("COON")
        if len(mol.atoms) > 9:
                print "Hello" #pragma: NO COVER
                print "there"
        else:
                print "Small"
        print mol.cansmiles()

if __name__ == "__main__":
        main()
=====

When run through pycover, the un-executed lines start with a '!':

======
> import daylight.Smiles

> def main():
>       mol = daylight.Smiles.smilin("COON")
>       if len(mol.atoms) > 9:
!               print "Hello" #pragma: NO COVER
!               print "there"
!       else:
>               print "Small"
>       print mol.cansmiles()

> if __name__ == "__main__":
>       main()
======

As you see, the "else:" is marked as unexecuted, and it always will
be even if both branches of the if statement are taken, because else
generates no SET_LINENO instruction.  The basic problem is in
determining whether a line should have been executed or not, and the
programs do it by looking for non-empty, non-comment lines
("""detection of multiline quotes is also a problem
for these two programs""").
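
To make the problem concrete, here is a minimal sketch of that kind of
text-based heuristic.  It is my own illustration, not the actual code
from pycover or trace.py; the function name naive_executable_lines and
the SAMPLE text are made up for the example.

```python
def naive_executable_lines(source):
    """Return the numbers of lines that are neither blank nor pure comments.

    This looks only at the text, so it cannot tell a statement from an
    "else:" or from the inside of a multi-line string.
    """
    result = []
    for number, line in enumerate(source.splitlines(), 1):
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            result.append(number)
    return result


# Hypothetical sample: line 3 is a bare "else:", lines 4-5 are a string.
SAMPLE = '''\
if x > 9:
    print("big")
else:
    """this text is
    not a statement"""
    print("small")
# done
'''

print(naive_executable_lines(SAMPLE))   # -> [1, 2, 3, 4, 5, 6]
```

Only the trailing comment is filtered out.  The "else:" on line 3 and
the second line of the string on line 5 are both counted as executable,
so a report built on this list can never show them as covered.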

Now, nearly every executable line, when compiled to bytecode for the
PVM (the Python virtual machine), produces a SET_LINENO instruction
carrying its line number.  So I wrote some code (based on dis.py)
to disassemble a module's code objects, look for the SET_LINENO
instructions, and use those numbers to tell which executable lines
haven't been covered.

This trick doesn't work for everything, but is better than nothing.
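
The idea can be sketched roughly as follows.  This is not the code in
trace.py: later Python releases dropped SET_LINENO in favour of a
line-number table stored on each code object, so the sketch assumes
such an interpreter and reads that table through dis.findlinestarts.
The names executable_lines and uncovered, and the SAMPLE text, are
hypothetical.

```python
import dis
import types


def executable_lines(code):
    """Collect line numbers that have bytecode in `code` and in every
    code object nested inside it (functions, classes, lambdas)."""
    lines = set()
    for _offset, lineno in dis.findlinestarts(code):
        if lineno:   # skip None or 0 placeholders some versions emit
            lines.add(lineno)
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            lines |= executable_lines(const)
    return lines


def uncovered(source, filename, executed):
    """Return the sorted executable lines missing from `executed`."""
    code = compile(source, filename, "exec")
    return sorted(executable_lines(code) - set(executed))


# Hypothetical module text, similar in shape to the test script above.
SAMPLE = '''\
def main(n):
    if n > 9:
        print("Hello")
    else:
        print("Small")
    print("done")
'''

# Pretend a trace of main(3) recorded these line numbers.
print(uncovered(SAMPLE, "<sample>", [1, 2, 5, 6]))
```

Because the set of candidate lines comes from the compiler rather than
from the text, a bare "else:", a comment, or the inside of a multi-line
string is reported only if an instruction was actually emitted for it.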

So, if you want, please take a gander at the code, at
   ftp://starship.python.net/pub/crew/dalke/trace.py

						Andrew
						dalke at bioreason.com