Test-driven development of random algorithms

Mon Nov 13 22:36:51 EST 2006

Steven D'Aprano wrote:
> I'm working on some functions that, essentially, return randomly generated
> strings. Here's a basic example:
> 
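> import random
> 
> def rstr():
>     """Return a random string based on a pseudo
>     normally-distributed random number.
>     """
>     x = 0.0
>     # The sum of 12 uniform [0, 1) draws is roughly normal with mean 6.
>     for i in range(12):
>         x += random.random()
>     # Shift by the mean so results fall in range(-6, 6).
>     return str(int(x) - 6)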
> 
> I want to do test-driven development. What should I do? Generally, any
> test I do of the form
> 
> assert rstr() == '1'
> 
> will fail more often than not (about 85% of the time, by my estimate). An
> easy workaround would be to do this:
> 
> assert rstr() in [str(n) for n in range(-6, 6)]
> 
> but (1) that doesn't scale very well (what if rstr() could return one of
> a billion different strings?) and (2) there could be bugs which only show
> up probabilistically, e.g. if I've got the algorithm wrong, rstr() might
> return '6' once in a while.
> 
> Does anyone have generic advice for the testing and development of this
> sort of function?

"Design for Testability". In library code, never call the module-level
functions in the random module; they all draw from one shared, hidden
generator. Instead, always take a random.Random instance as an argument. When
testing, you can seed your own Random instance, and all of your numbers will
be the same for every test run.
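
A minimal sketch of that design, applied to the rstr() from the question. The
parameter name prng, the seed value, and the test function names are my own
choices for illustration, and I have taken the offset as -6 so that results
land in range(-6, 6), which is what the question's assertion expects.

```python
import random

def rstr(prng):
    """Return a random string based on a pseudo normally-distributed number.

    prng is a random.Random instance supplied by the caller
    (hypothetical parameter name, not part of the original function).
    """
    x = 0.0
    for i in range(12):
        x += prng.random()
    # Offset assumed to be -6 so results fall in range(-6, 6).
    return str(int(x) - 6)

def test_rstr_is_reproducible():
    # Two generators seeded identically produce identical sequences,
    # so the output of rstr() is the same on every test run.
    a = random.Random(12345)
    b = random.Random(12345)
    assert [rstr(a) for _ in range(100)] == [rstr(b) for _ in range(100)]

def test_rstr_stays_in_range():
    # With a fixed seed this check is deterministic rather than probabilistic.
    prng = random.Random(12345)
    allowed = {str(n) for n in range(-6, 6)}
    for _ in range(1000):
        assert rstr(prng) in allowed
```

Production code would pass in an unseeded random.Random() (or one seeded from
wherever the application likes); only the tests need to pin the seed.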

This kind of design is A Good Thing(TM) outside of unit tests, too. Tests
aren't the only place where one might want to have full control over the
sequence of random numbers.
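
One illustration of that control outside of testing, as a rough sketch: two
components each get their own generator, so extra draws in one cannot shift
the sequence seen by the other. The draw() helper, the variable names, and
the seeds are all hypothetical.

```python
import random

def draw(n, prng):
    """Hypothetical helper: return n values from the supplied generator."""
    return [prng.random() for _ in range(n)]

# Each component owns an independent, explicitly seeded generator.
noise_rng = random.Random(1)
sample_rng = random.Random(2)

first = draw(3, sample_rng)
draw(1000, noise_rng)        # heavy use of the other stream in between
second = draw(3, sample_rng)

# Replaying the seed for sample_rng alone reproduces its sequence exactly,
# because the draws from noise_rng never touched its state.
replay = random.Random(2)
assert draw(3, replay) == first
assert draw(3, replay) == second
```

Had both components called random.random() directly, the 1000 intervening
draws would have changed what the second batch of samples looked like.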

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco