When to use assert

Fri Oct 24 03:49:47 EDT 2014

Dan Stromberg wrote:

> I like to use assertions and "if cond: raise ValueError('foo')" a lot.
> 
> I think Eiffel may be the poster-child for a language with
> pre-conditions, post-conditions and assertions.

Yes. I don't think Eiffel is the only language with explicit support for
testing invariants, but it's probably the best known one.

https://archive.eiffel.com/doc/online/eiffel50/intro/language/invitation-07.html

Others include Ada2012, Mercury, D, Perl6, Cobra, Clojure, and others. Cobra
is especially interesting as the syntax is heavily influenced by Python's.

http://cobra-language.com/trac/cobra/wiki/Contracts

> I think you're in good company - a lot of developers don't use assertions
> much.

"You" in this context is Chris Angelico.

> I like assertions, because they tend to stop bugs pretty quickly.  If
> you have 3 functions, one calling another calling another, assertions
> in each can keep you from having to backtrack among them when
> debugging, instead going directly to the problem's source.

Yes, the purpose of assertions is to help errors be discovered as close to
the cause as possible, rather than somewhere much later on. Consider:

addresses = [get_address(name) for name in database]
# ... much later on ...
for i, address in enumerate(addresses):
    if some_condition():
        addresses[i] = modify(address)
# ... much later on ...
for address in list_of_addresses:
    process_non_empty_address(address)

where you can't easily modify or check the get_address() and modify()
functions. If you have a failure in process_non_empty_address, due to a
violation of the "address must not be empty" invariant, which function is
to blame?

You could wrap them and find out that way:

from somewhere import get_address as _get_addr

def get_address(*args, **kwargs):
    result = _get_addr(*args, **kwargs)
    if not result:
        raise RuntimeError("bug in get_address")
    return result

and under some circumstances that's a good strategy, but often you just want
to determine which function is violating the constraint, fix that one, and
leave the other unmodified.

addresses = [get_address(name) for name in database]
assert all(address for address in addresses)
# ... much later on ...
for i, address in enumerate(addresses):
    if some_condition():
        addresses[i] = modify(address)
        assert addresses[i]

will either identify the culprit, or at least prove that neither
get_address() nor modify() are to blame. Because you're using an assertion,
it's easy to leave the asserts in place forever, and disable them by
passing -O to the Python interpreter in production.

[Aside: it would be nice if Python did it the other way around, and require
a --debugging switch to turn assertions on. Oh well.]

>> This is the job of a test suite.
> 
> Test suites are great, and I can't really question your reliance on
> them.  I love having lots of automated tests.  But for the reason I
> described above, I still like having lots of assertions.

Assertions and test suites are complementary, not in opposition, like belt
and braces. Assertions insure that the code branch will be tested if it is
ever exercised, something test suites can't in general promise. Here's a
toy example:

def some_function(value):
    import random
    random.seed(value)
    if random.random() == 0.25000375:
        assert some_condition
    else:
        pass

Try writing a unit test that guarantees to test the some_condition
branch :-)

[Actually, it's not that hard, if you're willing to monkey-patch the random
module. But you may have reasons for wanting to avoid such drastic
measures.]

I don't know what value will cause some_function() to take the
some_condition branch, but I know that if it ever takes that branch, the
assert will guard it, regardless of whether or not I've written a unit test
to cover that situation.

>> You don't pepper your code with
>> assertions to the effect that "I just pushed something onto my queue,
>> it should now have this item in it"; you create a test case for it,
>> and verify your function there. In the rest of the code, you trust
>> that your test suite passes, and don't waste time with assertions.
> 
> I wouldn't test that a value was added to a queue immediately after
> adding it.  That's excessive, and may even require an abstraction
> violation.
> 
> But if, for example, I have a string with 3 known-good values, I'll
> if/elif/elif/else, and make the else raise an AssertionError.  The
> assertion should never fire, but if the code changes, it could, and if
> there's a typo somewhere, it could then too.

I like this style:

assert x in (a, b, c)
if x == a:
    do_this()
elif x == b:
    do_that()
else:
    assert x == c
    do_something_else()

Why do I prefer that? To defend against future code changes. Defensive
programming can defend not only against bugs in the current code, but also
bugs in *future* code. Consider this common case:

# x is either a or b.
if x == a:
    do_this()
elif x == b:
    do_that()
else:  # x must be c.
    do_something_else()

Comments lie, and people don't read them. The comment at the top of the
suite is already out of date. I'm sure every experienced programmer has
seen situations like that -- or even written code like that. What happens
when the requirements change and x can also take the value d?

With an eye to defensive programming, the first improvement is:

# x is a, b or c.
if x == a:
    do_this()
elif x == b:
    do_that()
elif x == c:
    do_something_else()
else:
    # This cannot possibly happen, but just in case it does...
    raise RuntimeError('an unexpected error occurred')

except that still relies on the next maintainer reading the comment. It's
vulnerable to somebody "fixing" the code by removing the dead code and the
redundant test for x == c. Besides, it's embarrassing to see "an unexpected
error occurred" messages in production, and you know you will.

A better solution is to use assertions as checked comments. If the
requirements change and you forget to update this part of the code, the
assertions will fail:

assert x in (a, b, c)
if x == a:
    do_this()
elif x == b:
    do_that()
else:
    assert x == c
    do_something_else()

>> Or is that insufficiently paranoid?
> 
> With good tests, you're probably fine.

Is it possible to be too paranoid when it comes to tests?

:-)

-- 
Steven