Code correctness, and testing strategies

Sat May 24 08:34:36 EDT 2008

Hi list.

What strategies do you use to ensure correctness of new code?

Specifically, if you've just written 100 new lines of Python code, then:

1) How do you test the new code?
2) How do you ensure that the code will work correctly in the future?

Short version:

For (1) I thoroughly (manually) test code as I write it, before
checking in to version control.

For (2) I code defensively.

Long version:

For (2), I have a lot of error checks, similar to contracts (post &
pre-conditions, invariants). I've read about Python libs which help
formalize this[1][2], but I don't see a great advantage over using
regular ifs and asserts (and a few disadvantages, like additional
complexity). Simple ifs are good enough for Python built-in libs :-)

[1] PEP 316: http://www.python.org/dev/peps/pep-0316/
[2] An implementation:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/436834

An aside: What is the correct situation in which to use assert
statements in Python? I'd like to use them for enforcing 'contracts'
because they're quick to type, but from the docs:

"Assert statements are a convenient way to insert debugging assertions
into a program:"

So to me it sounds like 'assert' statements are only useful while
debugging, and not when an app is live, where you would also
(especially!) want it to enforce contracts. Also, asserts can be
removed with -O, and you only ever get AssertionError, where
ValueError and the like might be more appropriate.

As for point 1 (how do you test the new code?):

I like the idea of automated unit tests. However, in practice I find
they take a long time to write and test, especially if you want to
have good coverage (not just lines, but also possible logic branches).

So instead, I prefer to thoroughly test new code manually, and only
then check in to version control. I feel that if you are disciplined,
then unit tests are mainly useful for:

1) Maintenance of legacy code
2) More than 1 person working on a project

One recent personal example:

My workstation is a Debian Unstable box. I like to upgrade regularly
and try out new library & app versions. Usually this doesn't cause
major problems. One exception is sqlalchemy. It's API seems to change
every few months, causing warnings and breakage in code which used the
old API. This happened regularly enough that for one project I spent a
day adding unit tests for the ORM-using code, and getting the unit
tests up to 100% coverage. These tests should allow me to quickly
catch and fix all sqlalchemy API breakages in my app in the future.
The breakages also make me want to stop using ORM entirely, but it
would take longer to switch to SQL-only code than to keep the unit
tests up to date :-)

My 'test code thoroughly before checkin' methodology is as follows:

1) Add "raise 'UNTESTED'" lines to the top of every function
2) Run the script
3) Look where the script terminated
4) Add print lines just before the exception to check the variable values
5) Re-run and check that the values have expected values.
6) Remove the print and 'raise "UNTESTED"' lines
7) Add liberal 'raise "UNTESTED"' lines to the body of the function.
8.1) For short funcs, before every line (if it seems necessary)
8.2) For longer funcs, before and after each logic entry/exit point
(blocks, exits, returns, throws, etc):

eg, before:

if A():
    B()
    C()
    D()
E()

after:

raise 'UNTESTED'
if A():
    raise 'UNTESTED'
    B()
    C()
    D()
    raise 'UNTESTED'
raise 'UNTESTED'
E()

8.2.1) Later I add "raise 'UNTESTED'" lines before each line in the
blocks also, if it seems necessary.

9) Repeat steps 2 to 8 until the script stops throwing exceptions
10) Check for 'raise "UNTESTED"' lines still in the script
11) Cause those sections of code to be run also (sometimes I need to
temporarily set vars to impossible values inside the script, since the
logic will never run otherwise)

And here is one of my biggest problem with unit tests. How do you unit
test code which almost never runs? The only easy way I can think of is
for the code to have 'if <some almost impossible condition> or <busy
running test case XYZ> lines'. I know I'm meant to make 'fake' testing
classes which return erroneous values, and then pass these objects to
the code being tested. But this can take a long time and even then
isn't guaranteed to reach all your error-handling code.

The above methodology works well for me. It goes fairly quickly, and
is much faster than writing and testing elaborate unit tests.

So finally, my main questions:

1) Are there any obvious problems with my 'correctness' strategies?

2) Should I (regardless of time it takes initially) still be adding
unit tests for everything? I'd like to hear what XP/agile programming
advocates have to say on the subject.

3) Are there easy and fast ways to do write and test (complete) unit tests?

4) Any other comments?

Thanks for your time.

David.