constructor classmethods

Steve D'Aprano steve+python at pearwood.info
Tue Nov 8 21:25:37 EST 2016


On Wed, 9 Nov 2016 10:01 am, teppo.pera at gmail.com wrote:

> Generally, with testing, it would be optimal to test outputs of the system
> for given inputs without caring how things are implemented.

I disagree with that statement.

You are talking about "black-box testing" -- the test code should treat the
code being tested as a completely opaque, black box where the inner
workings are invisible. Tests are only written against the interface.

The alternative is "white-box testing", where the test code knows exactly
what the implementation is, and can test against the implementation, not
just the interface.

White-box testing is more costly, because any change in implementation will
cause a lot of code churn in the tests. Change the implementation, and the
tests written against the old one have to go. Change it back, and those
tests need to be restored.

But that churn only applies to one person: the maintainer of the test. It
doesn't affect users of the code. It is certainly a cost, but it is
narrowly focused on the maintainer, not the users.

White-box testing gives superior test coverage, because the test author is
aware of the implementation. Let's suppose that I was writing some
black-box tests for Python's list.sort() method. Knowing nothing of the
implementation, I might think that there's only a handful of cases I need
to care about:

- an empty list []
- a single item list [1]
- an already sorted list [1, 2, 3, 4, 5]
- a list sorted in reverse order [5, 4, 3, 2, 1]
- a few examples of unsorted lists, e.g. [3, 5, 2, 4, 1]

And we're done! Black-box testing makes tests easy.

But in fact, that test suite is completely insufficient. In CPython, the
implementation of list.sort() effectively has two different code paths: a
binary insertion sort for short lists, and the full run-merging machinery
of Timsort for long lists. My black-box test suite utterly fails to test
the latter.

To be sufficient, I *must* test both insertion sort and Timsort, and I can
only guarantee to do that by testing against the implementation, not the
interface.
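
A white-box test along those lines might look like the sketch below. It
assumes CPython, where (as an implementation detail, not a language
guarantee) lists shorter than 64 items are handled by the insertion sort
path. The sizes 10 and 1000 and the helper names are picked purely for
illustration.

```python
import random
from collections import Counter

SHORT = 10     # assumed to be well below CPython's small-list cutoff
LONG = 1000    # assumed to be well above it, so runs get merged


def is_sorted(seq):
    """Return True if every item is <= the item after it."""
    return all(a <= b for a, b in zip(seq, seq[1:]))


def sort_is_correct(data):
    """Sort a copy of data; check it is ordered and has the same items."""
    result = list(data)
    result.sort()
    return is_sorted(result) and Counter(result) == Counter(data)


rng = random.Random(2016)  # fixed seed so any failure is reproducible

cases = {
    "empty": [],
    "short random": [rng.randrange(100) for _ in range(SHORT)],
    "long random": [rng.randrange(100) for _ in range(LONG)],
    "long ascending": list(range(LONG)),
    "long descending": list(range(LONG, 0, -1)),
    # two long ascending runs back to back, so there is something to merge
    "long two runs": list(range(LONG // 2)) + list(range(LONG // 2)),
}

results = {name: sort_is_correct(data) for name, data in cases.items()}
failures = [name for name, ok in results.items() if not ok]
print(failures or "all cases passed")
```

The "long" cases are the ones a black-box tester has no particular reason
to write.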

Black-box testing is better than nothing, but white-box testing is much more
effective.


> That way, any 
> changes in implementation won't affect test results. Very trivial example
> would be something like this:
> 
> def do_something_important(input1, input2, input3):
>     return  # something done with input1, input2, input3
> 
> Implementation of do_something_important can be one liner, or it can
> contain multiple classes, yet the result of the function is what matters.

Certainly. And if you test do_something_important against EVERY possible
combination of inputs, then you don't need to care about the
implementation, since you've tested every single possible case.

But how often do you do that?


[...]
> Then comes the next step, doing the actual DI. One solution is:
> 
> class Example:
>     def __init__(self, queue=None):
>         self._queue = queue or Queue()

That's buggy. If I pass a queue which is falsey, you replace it with your
own default queue instead of the one I gave you. That's wrong.

If I use queue.Queue, that's not a problem, because empty queues are still
truthy. But if I use a different queue implementation, then your code
breaks.

The lesson here is: when you want to test for None, TEST FOR NONE. Don't
use "or" when you mean "if obj is None".
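
To see the difference, here is a small sketch that uses collections.deque
as a stand-in for "a different queue implementation" (an empty deque is
falsey). The class names BuggyExample and FixedExample are made up for the
comparison.

```python
from collections import deque
from queue import Queue


class BuggyExample:
    def __init__(self, queue=None):
        # "or" replaces ANY falsey argument, not just None
        self._queue = queue or Queue()


class FixedExample:
    def __init__(self, queue=None):
        # only fall back to the default when nothing was passed
        if queue is None:
            queue = Queue()
        self._queue = queue


my_queue = deque()  # empty, therefore falsey

buggy = BuggyExample(my_queue)
fixed = FixedExample(my_queue)

buggy_kept_mine = buggy._queue is my_queue  # False: silently swapped out
fixed_kept_mine = fixed._queue is my_queue  # True: caller's queue is used
print(buggy_kept_mine, fixed_kept_mine)
```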


> Fine approach, but technically __init__ has two execution branches and
> someone staring blindly at coverage might require covering those too.

And here we see the advantage of white-box testing. We do need tests for
both cases: to ensure that the given queue is always used, and that a new
Queue is only used when no queue was given at all. A pure black-box tester
might not imagine the need for these two cases, as he is not thinking about
the implementation.
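
As a rough sketch, tests for the two branches could be written with
unittest like this. The Example class here is a version of the quoted one
that tests for None explicitly, and the test names are hypothetical.

```python
import unittest
from queue import Queue


class Example:
    def __init__(self, queue=None):
        if queue is None:
            queue = Queue()
        self._queue = queue


class ExampleInitTests(unittest.TestCase):
    def test_given_queue_is_used(self):
        q = Queue()
        self.assertIs(Example(q)._queue, q)

    def test_falsey_queue_is_still_used(self):
        q = []  # stands in for any queue-like object that is falsey
        self.assertIs(Example(q)._queue, q)

    def test_default_queue_when_none_given(self):
        self.assertIsInstance(Example()._queue, Queue)


# Run the suite explicitly so the outcome is available as a value.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleInitTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

These tests peek at the private _queue attribute, which is exactly the kind
of implementation knowledge white-box testing relies on.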



> Then we can use class method too.
> 
> class Example:
>     def __init__(self, queue):
>         self._queue = queue
> 
>     @classmethod
>     def create(cls):
>         q = Queue()
>         # populate_with_defaults
>         # Maybe get something from db too for queue...
>         return cls(q)
> 
> As said, the create method is for convenience. It can (and should) take
> the minimum set of arguments needed from the user (no need for 15 even if
> __init__ would require them) to create the object.

Why would __init__ require fifteen arguments if the user can pass one
argument and have the other fourteen filled in by default?

The question here is, *why* is the create() method a required part of your
API? There's no advantage to such a change of spelling. The Python style is
to spell instance creation:

    instance = MyClass(args)

not 

    instance = MyClass.create(args)

Of course you can design your classes using any API you like:

    instance = MyClass.Make_Builder().build().create_factory().create(args)

if you insist. But if all create() does is fill in some default values for
you, then it is redundant. The __init__ method can just as easily fill in
the default values. All you are doing is changing the spelling:

    MyClass(arg)  # insert ".create" before the left parenthesis

and that's just being different for the sake of being different:

    list.sort()  # insert ".sorting_factory" before the left parenthesis

You're not actually adding any new functionality or new abilities, or making
testing easier. You're just increasing the amount of typing needed.
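
For instance, a sketch of an __init__ that fills in its own defaults might
look like this. The name and retries parameters are invented for the
example and stand in for the "other fourteen" arguments.

```python
from queue import Queue


class Example:
    def __init__(self, queue=None, name="example", retries=3):
        # Fill in defaults here; callers override only what they need.
        if queue is None:
            queue = Queue()
        self._queue = queue
        self.name = name
        self.retries = retries


default = Example()                         # everything defaulted
custom = Example(retries=5)                 # one override, rest defaulted
injected = Example(queue=Queue(maxsize=1))  # dependency injected, e.g. in a test
```

The caller gets the convenience of create() and the test gets its injection
point, all through the ordinary MyClass(args) spelling.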


> It creates the fully 
> functioning Example object with default dependencies. Do notice that tests
> I wrote earlier would still work. Create can contain slow executing code,
> if needed, but it won't slow down testing the Example class itself.

Of course it will. You are testing the create() method, aren't you? If you're
not testing it, how do you know it is working? If you are testing it, then
it will be just as slow in the tests as it will be slow for the poor users
who have to call it.

I think "slow code" is a red herring. If the code is slow, it's slow whether
you put it in the __init__ or create method.

For that matter, testing is a red herring too. It doesn't matter whether
your public API is:

    Example(queue)
    Example()

or:

    Example(queue)
    Example.create()

you still have to test both cases, regardless of whether you are doing
white-box or black-box testing. These are two separate APIs:

    create an instance from a specified queue
    create an instance without a specified queue

so regardless of whether you look inside the implementation or not, you
still have to test both cases.


> Finally, if you want to be tricky and write own decorator for object
> construction, Python would allow you to do that.
> 
> # generates __init__ that populates instance with queue given as arg
> @spec('queue')
> class Example:
>     @classmethod
>     def create(cls):
>         return cls(Queue())

I don't think that this decorator example makes any sense. At least I cannot
understand it. Why on earth would you write a decorator to inject an
__init__ method into a given class? Unless you have many classes with
identical __init__ methods, what's the point?




-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.



