Is code duplication allowed in this instance?

Fri Jul 3 09:11:00 EDT 2009

On Fri, 03 Jul 2009 03:46:32 -0700, Klone wrote:

> Hi all. I believe in programming there is a common consensus to avoid
> code duplication, I suppose such terms like 'DRY' are meant to back this
> idea. Anyways, I'm working on a little project and I'm using TDD (still
> trying to get a hang of the process) and am trying to test the
> functionality within a method. However, it so happens that to verify
> the output from the method I have to employ the same algorithm within
> the method to do the verification, since there is no way I can determine
> the output beforehand.
> 
> So in this scenario is it OK to duplicate the algorithm to be tested
> within the test codes or refactor the method such that it can be used
> within test codes to verify itself(??).

Neither -- that's essentially a pointless test. The only way to 
*correctly* test a function is to compare the result of that function to 
a result obtained *independently*. If you test a function against itself, 
of course it will always pass:

def plus_one(x):
    """Return x plus 1."""
    return x-1  # Oops, a bug.

# Test it is correct:
assert plus_one(5) == plus_one(5)
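
By contrast, a check against an answer worked out independently (here, 
simply knowing that 5 + 1 is 6) does expose the bug. A minimal sketch, 
reusing the deliberately buggy function:

```python
def plus_one(x):
    """Return x plus 1."""
    return x - 1  # Same deliberate bug as above.

# Comparing the function with itself passes even though the code is wrong.
self_check_passes = plus_one(5) == plus_one(5)

# Comparing with an independently known answer exposes the bug.
independent_check_passes = plus_one(5) == 6

print(self_check_passes)         # True
print(independent_check_passes)  # False
```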

The only general advice I can give is:

(1) Think very hard about finding an alternative algorithm to calculate 
the same result. There usually will be one.
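
To illustrate with something less trivial than plus_one: suppose the 
production code sums the integers 1..n with a closed-form formula. Adding 
the numbers up one at a time is a genuinely different algorithm, so 
agreement between the two actually tells you something. Both function 
names are made up for this sketch:

```python
def sum_to_n(n):
    """Production version (hypothetical): closed form n(n+1)/2."""
    return n * (n + 1) // 2

def sum_to_n_by_counting(n):
    """Independent check: add up 1..n one number at a time."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

# The two algorithms share no logic, so agreement is meaningful.
for n in range(200):
    assert sum_to_n(n) == sum_to_n_by_counting(n)
```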

(2) If there's not, at least come up with an alternative implementation. 
It doesn't need to be particularly efficient, because it will only be 
called for testing. A rather silly example:

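def plus_one_testing(x):
    """Return x plus 1 using a different algorithm for testing."""
    if isinstance(x, int):
        # Start at 1 and walk towards the answer one unit at a time,
        # stepping downwards if x is negative.
        step = 1 if x >= 0 else -1
        temp = 1
        for i in range(abs(x)):
            temp += step
        return temp
    else:
        # Split off the fractional part and recurse on the integer part.
        floating_part = x - int(x)
        return floating_part + plus_one_testing(int(x))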

(The only problem is, if a test fails, you may not be sure whether it's 
because your test function is wrong or your production function.)

(3) Often you can check a few results by hand. Even if it takes you 
fifteen minutes, at least that gives you one good test. If need be, get a 
colleague to check your results.
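
Hand-checked values can go straight into the test as a small table. A 
sketch, assuming a hypothetical mean() function under test; the expected 
values were worked out by hand, not by running the code:

```python
def mean(values):
    """Function under test (hypothetical): arithmetic mean of a list."""
    return float(sum(values)) / len(values)

# (input, expected) pairs computed by hand.
hand_checked = [
    ([1, 2, 3, 4], 2.5),  # 10 / 4
    ([10], 10.0),         # a single value is its own mean
    ([-2, 2], 0.0),       # 0 / 2
]

for values, expected in hand_checked:
    assert mean(values) == expected
```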

(4) Test failure modes. It might be really hard to calculate func(arg) 
independently for all possible arguments, but if you know that func(obj) 
should fail, at least you can test that. E.g. it's hard to test whether 
or not you've downloaded the contents of a URL correctly without actually 
downloading it, but you know that http://example.invalid/ should fail 
because the .invalid top-level domain is reserved and never resolves.
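
The same idea works without touching the network. A sketch, assuming a 
hypothetical parse_port() function: working out the right answer for 
every possible string is tedious, but it is easy to name inputs that must 
be rejected.

```python
def parse_port(text):
    """Function under test (hypothetical): parse a TCP port number."""
    port = int(text)  # raises ValueError for non-numeric text
    if not 0 < port < 65536:
        raise ValueError("port out of range: %r" % port)
    return port

def raises_value_error(func, arg):
    """Return True if func(arg) raises ValueError, else False."""
    try:
        func(arg)
    except ValueError:
        return True
    return False

# Inputs known to be invalid must fail.
assert raises_value_error(parse_port, "not a number")
assert raises_value_error(parse_port, "0")
assert raises_value_error(parse_port, "70000")
# And at least one input known to be valid must not.
assert not raises_value_error(parse_port, "8080")
```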

(5) Test the consequences of your function rather than the exact results. 
E.g. if it's too difficult to calculate plus_one(x) independently:

assert plus_one(x) > x  # except for x = inf or -inf
assert plus_one(-plus_one(x)) == -x  # -(x+1)+1 = -x
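
Checks like these pay off most when run over many inputs, since no 
expected values are needed. A rough sketch, restricted to integers so 
that the arithmetic is exact; the plus_one here is only a stand-in for 
the real function, and the seed and range are arbitrary choices:

```python
import random

def plus_one(x):
    """Stand-in for the function under test."""
    return x + 1

random.seed(2009)  # fixed seed so any failure can be reproduced
for _ in range(1000):
    x = random.randint(-10**6, 10**6)
    assert plus_one(x) > x
    assert plus_one(-plus_one(x)) == -x  # -(x+1)+1 = -x
```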

(6) While complete test coverage is the ideal you aspire to, even *one* 
test is better than no tests. But the tests have to be good tests to be 
useful.

Hope this helps.

-- 
Steven