Code correctness, and testing strategies

Ben Finney bignose+hates-spam at benfinney.id.au
Sat May 24 21:28:30 EDT 2008


David <wizzardx at gmail.com> writes:

> Is it considered to be cheating if you make a test case which always
> fails with a "TODO: Make a proper test case" message?

I consider it so.

What I often do, though, is write a TODO comment in the unit test
suite:

    # TODO: test_frobnitz_produces_widget_for_znogplatz_input(self):

At which point I've got:

* A unit test suite that has the same number of tests.

* A comment that my editor will highlight, and that many tools are
  tuned to look for and flag for my attention (the "TODO:"
  convention).

* A description of what the test should be testing, as the name of the
  function that will implement the test. (This also forces me to think
  of exactly what it is I want to assert about the code's behaviour,
  by writing a test case name that embodies that assertion.)

* The beginnings of the test case function itself, simply by removing
  the "# TODO: " part when I get around to that test case.

Then I get back to the work I was doing when that idea struck me,
knowing that it's recorded so I don't have to focus on it now.
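
For example, when I do get around to it, removing the "# TODO: " part
leaves a function name ready to grow a body (a minimal sketch; the
frobnitz, znogplatz and Widget names are just the placeholders from
the comment above, and the assertion is hypothetical):

    import unittest

    class FrobnitzTest(unittest.TestCase):

        def test_frobnitz_produces_widget_for_znogplatz_input(self):
            """ Frobnitz should make a widget from znogplatz input. """
            result = frobnitz(znogplatz_input)
            self.assertTrue(isinstance(result, Widget))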

> For example: sanity tests. Functions can have tests for situations
> that can never occur, or are very hard to reproduce. How do you unit
> test for those?

That depends, of course, on how hard they are to reproduce. But one
common technique is isolation via test doubles
<URL:http://xunitpatterns.com/Test%20Double.html>.

If you want to see how a specific unit of code behaves under certain
conditions, such as a disk failure, you're not interested in testing
the disk subsystem itself, only your code's behaviour. So you isolate
your code by having the unit test (often in a fixture) provide a "test
double" for the disk subsystem, and rig that double so that it will
predictably provide exactly the hard-to-get event that you want your
code to respond to correctly.
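
A minimal sketch of that, using the standard library's unittest.mock
(the save_report function and its storage parameter are made up for
the example):

    import unittest
    from unittest import mock

    def save_report(storage, data):
        """ Save data, reporting False if the disk subsystem fails. """
        try:
            storage.write(data)
        except IOError:
            return False
        return True

    class SaveReportTest(unittest.TestCase):

        def test_reports_failure_on_disk_error(self):
            # Rig the double to provide the hard-to-get event.
            fake_storage = mock.Mock()
            fake_storage.write.side_effect = IOError("disk failure")
            self.assertFalse(save_report(fake_storage, b"data"))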

> A few examples off the top of my head:
> 
> * Code which checks for hardware defects (pentium floating point,
> memory or disk errors, etc).
> 
> * Code that checks that a file is less than 1 TB large (but you only
> have 320 GB harddrives in your testing environment).
> 
> * Code which checks if the machine was rebooted over a year ago.

All of these can, with the right design, be tested by providing a test
double in place of the subsystem that the code under test will
exercise, and making that test double provide exactly the response you
want to trigger the behaviour in your code.
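
The 1 TB check, for instance, never needs a 1 TB file; the double
simply reports one (a sketch, with a hypothetical file_is_too_large
function):

    import os
    import unittest
    from unittest import mock

    ONE_TB = 10 ** 12

    def file_is_too_large(path):
        """ Return True if the file at path is 1 TB or larger. """
        return os.stat(path).st_size >= ONE_TB

    class FileSizeTest(unittest.TestCase):

        def test_detects_oversized_file(self):
            fake_stat = mock.Mock(st_size=ONE_TB + 1)
            with mock.patch("os.stat", return_value=fake_stat):
                self.assertTrue(file_is_too_large("/any/path"))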

> Also, there are places where mock objects can't be used that easily.
> 
> eg 1: A complicated function, which needs to check the consistency of
> its local variables at various points.

That's a clear sign the function is too complicated. Break out the
"stages" in the function to separate functions, and unit test those so
you know they're behaving properly. Then, if you need to, you can
easily provide test doubles for those functions; or you may find you
don't need to once you know they're fully covered by tests.
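
A sketch of that decomposition (the parse/validate stage names are
invented for the example):

    def parse_record(raw_record):
        """ Parse a raw "key = value" record into a dict. """
        key, _, value = raw_record.partition("=")
        return {key.strip(): value.strip()}

    def validate_fields(fields):
        """ Raise ValueError if any field is empty. """
        if not all(fields.values()):
            raise ValueError("empty field")

    def process_record(raw_record):
        # Each stage is a separate function with its own unit tests;
        # process_record itself can now be tested with doubles
        # standing in for the stages, or trusted as a thin wrapper.
        fields = parse_record(raw_record)
        validate_fields(fields)
        return fields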

> It *is* possible to unit test those consistency checks, but you may
> have to do a lot of re-organization to enable unit testing.

This results in more modular code with looser coupling (fewer
interdependencies between components). This is a good thing.

> In other cases it might not be appropriate to unit test, because it
> makes your tests brittle (as mentioned by another poster).
> 
> eg: You call function MyFunc with argument X, and expect to get result Y.
> 
> MyFunc calls __private_func1, and __private_func2.

If you've got name-mangled functions, that's already a bad code smell.
Not necessarily wrong, but that's the way to bet.

> You can check in your unit test that MyFunc returns result Y, but you
> shouldn't check __private_func1 and __private_func2 directly, even if
> they really should be tested (maybe they sometimes have unwanted side
> effects unrelated to MyFunc's return value).

Don't write functions with unwanted side effects. Or, if those side
effects are actually caused by other subsystems beyond your control,
test those functions with test doubles in place of the external
subsystems.
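
Sketched with a hypothetical my_func and mailer subsystem: assert on
the side effect at the subsystem boundary, through a double, rather
than reaching into the name-mangled helpers:

    import unittest
    from unittest import mock

    def my_func(x, mailer):
        """ Hypothetical: compute a result and notify the mailer. """
        result = x * 2
        mailer.send("computed %r" % result)
        return result

    class MyFuncTest(unittest.TestCase):

        def test_notifies_once_per_call(self):
            fake_mailer = mock.Mock()
            self.assertEqual(my_func(2, fake_mailer), 4)
            fake_mailer.send.assert_called_once_with("computed 4")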

> eg: Resource usage.
> 
> How do you unit test how much memory, cpu, temporary disk space, etc
> a function uses?

This is beyond the scope of unit tests. Test resource usage by
profiling the entire application instead.
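
The standard library covers that at the whole-application level (a
sketch; myapp.main stands in for your application's entry point):

    import cProfile
    import tracemalloc

    from myapp import main    # hypothetical entry point

    tracemalloc.start()
    cProfile.run("main()")    # CPU profile of the whole run
    _, peak = tracemalloc.get_traced_memory()
    print("peak memory use: %d bytes" % peak)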

> eg: Platforms for which unit tests are hard to setup/run.
> 
>  - embedded programming. You would need to load your test harness into
> the device, and watch LED patterns or feedback over serial. Assuming
> it has enough memory and resources :-)
>  - mobile devices (probably the same issues as above)

Yes, it's unfortunate that some systems don't yet have good tools to
enable unit testing. Fix that, or pay others to fix it, because it's
in your best interests (and your customers' interests) to have unit
testing be as easy as feasible on all platforms that you target.

> eg: race conditions in multithreaded code: You can't unit test
> effectively for these.

Another good reason to avoid multithreaded code where possible. I find
"it's really hard to deterministically figure out what's going on" to
be reason enough on its own: if it's that hard to know the current
state of the system, I have no good way to report to my customer how
close I am to finishing the implementation, and that makes it a poor
choice.

Others may have successfully achieved unit tests for conditions that
only arise in multithreaded code, but that's outside my experience.

-- 
 \     "How many people here have telekenetic powers? Raise my hand."  |
  `\                                                    -- Emo Philips |
_o__)                                                                  |
Ben Finney


