Dispatch table of methods with various return value types

Fri Nov 20 00:08:28 EST 2020

On 19/11/2020 02:13, Loris Bennett wrote:
> dn <PythonList at DancesWithMice.info> writes:
> 
> Firstly, thanks for taking the time to write such a detailed reply.

Bitte!

>>>>> I have a method for manipulating the membership of groups such as:
>>>>>
>>>>>        def execute(self, operation, users, group):
>>>>>            """
>>>>>            Perform the given operation on the users with respect to the
>>>>>            group
>>>>>            """
>>>>>
>>>>>            action = {
>>>>>                'get': self.get,
>>>>>                'add': self.add,
>>>>>                'delete': self.delete,
>>>>>            }
>>>>>
>>>>>            return action.get(operation)(users, group)
>>>>>
>>>>> The 'get' action would return, say, a dict of user attributes, whereas
>>>>> the 'add/delete' actions would return, say, nothing, and all actions
>>>>> could raise an exception if something goes wrong.
>>>>>
>>>>> The method which calls 'execute' has to print something to the terminal,
>>>>> such as the attributes in the case of 'get' and 'OK' in the cases of
>>>>> 'add/delete' (assuming no exception occurred).
>>>>>
>>>>> Is there a canonical way of dealing with a method which returns different
>>>>> types of data, or should I just make all actions return the same data
>>>>> structure so that I can generate a generic response?
>>>>
>>>>
>>>> Is the problem caused by coding the first step before thinking of the overall
>>>> task? Try diagramming or pseudo-coding the complete solution (with multiple
>>>> approaches), ie the operations AND the printing and exception-handling.
>>>
>>> You could have a point, although I do have a reasonable idea of what the
>>> task is and coming from a Perl background, Python always feels a bit
>>> like pseudocode anyway (which is one of the things I like about Python).
>>
>> +1 the ease of Python, but can this be seductive?
>>
>> Per the comment about Perl/Python experience, the operative part is the
>> "thinking", not the tool - as revealed in responses below...
>>
>> Sometimes we design one 'solution' to a problem, and forget (or 'brainwash'
>> ourselves into thinking) that there might be 'another way'.
>>
>> It may/not apply in this case, but adjusting from a diagram-first methodology,
>> to the habit of 'jumping straight into code' exhibited by many colleagues,
>> before readjusting back to (hopefully) a better balance; I felt that
>> coding-first often caused me to 'paint myself into a corner' with some
>> 'solutions', by being too close to the code and not 'stepping back' to take a
>> wider view of the design - but enough about me...
>>
>>
>>>> Might it be more appropriate to complete not only the get but also its
>>>> reporting, as a unit. Similarly the add and whatever happens after that; and the
>>>> delete, likewise.
>>>
>>> Currently I am already obtaining the result and doing the reporting in
>>> one method, but that makes it difficult to write tests, since it
>>> violates the idea that one method should, in general, just do one thing.
>>> That separation would seem appropriate here, since testing whether a
>>> data set is correctly retrieved from a database seems to be
>>> significantly different to testing whether the reporting of an
>>> action is correctly laid out and free of typos.
>>
>> SRP = design thinking! +1
> 
> I knew the idea, but I didn't know the TLA for it ;-)

Yes, there are plenty of those!

You may be interested in reading about "Clean Code", instigated (IIRC) 
by "Uncle Bob" (Robert Martin). NB Python may/not be used for 
book-examples. Just the other day I came across "Clean Code in Python", 
Mariano Anaya, PacktPub, 2018. I have yet to read it, but the contents 
page seemed to 'tick all the boxes'. The book is two years old, and IIRC 
he presented at EuroPython a few years before that (YouTube videos 
on-line - in case you prefer that medium, or want to gain a flavor 
before spending money...). All of these TLAs, and others comprising the 
"SOLID Principles" appear in the ToC, along with plenty of others, eg 
YAGNI and EAFP; plus some specific to Python, eg MRO.

>> TDD = early testing! +1
>>
>> Agreed: The tasks are definitely separate. The first is data-related. The second
>> is about presentation.
>>
>> In keeping with the SRP philosophy, keep the split of execution-flow into the
>> three (or more) functional-tasks by data-process, but turn each of those tasks
>> into two steps/routines. (once the reporting routine following "add" has been
>> coded, and it comes time to implement "delete", it may become possible to repeat
>> the pattern, and thus 're-use' the second-half...)
>>
>> Putting it more formally: as the second-half is effectively 'chosen' at the same
>> time as the first, is the reporting-routine "dependent" upon the data-processor?
>>
>> 	function get( self, ... )
>> 		self.get_data()
>> 		self.present_data()
>>
>> 	function add( self, ... )
>> 		self.add_data()
>> 		self.report_success_fail()
>>
>> 	...
>>
>> Thus, the functional task can be tested independently of any reporting follow-up
>> (for example in "get"); whilst maintaining/multiplying SRP...
> 
> The above approach appeals to me a lot.  Slight downsides are that
> such 'metafunctions' are by necessity non-SRP functions and that, as there
> would be no point writing tests for such functions, some tools which try
> to determine test coverage might moan.

First comes (Python) 'duty': the word "meta", perhaps more in the 
context of "meta-classes" has particular meaning in Python, that may not 
align with expectations generated by understanding the term "meta" in 
other contexts!

Second, we return to earlier comments about up-front design. Consider 
"Stepwise Decomposition" 
(https://en.wikipedia.org/wiki/Top-down_and_bottom-up_design) and how 
solving a 'large problem' is likened to peeling an onion, ie one 'layer' 
at a time. Thus there is a sub-problem, eg report on xyz; this divides 
into smaller problems: (i) from where do I find the data on xyz, and 
(ii) how do I present this.

If you code top-down, then there may be three subroutines (functions in 
Python): one for the sub-problem and one for each of its two smaller 
problems. Further, only the two "smaller" routines appear to be doing 
any 'work'. However, 
remember that the function names both document the solution and 
reproduce the specification. Thus three well-chosen names will add value 
and ease understanding for you/us, six months later...

If you code bottom-up and have TDD-built the two "smaller" functions, 
then adding the 'higher' function as an 'umbrella' will tie them 
together - for the reasons/results mentioned above.
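
A minimal sketch of that 'umbrella' arrangement, assuming an in-memory 
dict stands in for the real data source. The names GroupAdmin, get_data 
and present_data, and the sample data, are hypothetical and not taken 
from your code:

```python
class GroupAdmin:
    """Hypothetical example: a plain dict stands in for the real back-end."""

    def __init__(self, members):
        # members maps group name -> {user name: attributes dict}
        self.members = members

    def get_data(self, users, group):
        """Data step: look up the requested users and return a dict."""
        return {user: self.members[group][user] for user in users}

    def present_data(self, data):
        """Presentation step: build the report text, no look-ups here."""
        return "\n".join(
            f"{user}: {attrs}" for user, attrs in sorted(data.items())
        )

    def get(self, users, group):
        """Umbrella: ties the two smaller steps together."""
        return self.present_data(self.get_data(users, group))


admin = GroupAdmin({"staff": {"alice": {"uid": 1001}, "bob": {"uid": 1002}}})
report = admin.get(["alice"], "staff")
print(report)
```

Each of the two "smaller" methods can be unit-tested on its own, and the 
umbrella's name documents what the pair achieves together.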

There are different types of testing. Unit testing is straightforward 
with pytest or similar. This takes care of tests such as 'did "get" 
retrieve the correct data?' and 'after "delete" does this data still 
exist?'. 
These are likely tests of functions at the lowest and/or lower levels of 
the implementation - hence the name.

When it comes to reporting, life becomes more complicated. Whereas 
pytest will capture and allow testing of stdout, when we move to Qt, 
GTK, or some other UI tool-kit, we need to find a compatible testing 
tool. If presentation is HTML, then web-page testing is accomplished 
with the likes of Selenium.
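
For terminal output, a rough sketch of what such a test might look like. 
Under pytest the capsys fixture (capsys.readouterr()) does the capturing 
for you; the version below uses only the standard library so that it 
runs without pytest installed. The function report_success and its 
message format are hypothetical:

```python
import contextlib
import io


def report_success(operation, group):
    """Hypothetical reporting routine: prints a one-line confirmation."""
    print(f"OK: {operation} on group {group}")


def captured_output(func, *args):
    """Run func(*args) and return whatever it wrote to standard output."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args)
    return buffer.getvalue()


def test_report_success():
    # Checks layout and wording only; no data source is involved.
    text = captured_output(report_success, "add", "staff")
    assert text == "OK: add on group staff\n"


test_report_success()
```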

If we are talking UX (User Experience) testing, then the 
information-presented is almost-irrelevant. If you have a user on the 
dev.team (see also Agile teams), then (s)he will perform such 'testing' 
manually (and give rapid feedback). Thus, no tool required, as such.

NB If you are concerned about the actual information being presented, 
surely that will have already been tested as accurate by the unit test 
mentioned earlier?

Regarding the comment about "moan[ing]" tools. Who's in-charge here? 
When it is helping you it is a "tool". What is something that is getting 
in your way, causing you frustration, or otherwise interfering with your 
happiness and productivity?

Pointy-headed managers [a reference to the Dilbert cartoons] have often 
tried to create/impose 'rules' on developers. One of my favorites is: 
"there will be at least one comment for every ten lines of code". Do you 
need to strain-the-brain to know what happens?

     # this code has a comment
     ...

     # add one to x
     x += 1

I'm afraid the idea of 100% code-coverage is a nirvana that is probably 
not worth seeking. See also @Ned's comments (about his own coverage.py 
tool) 
https://nedbatchelder.com/blog/200710/flaws_in_coverage_measurement.html

The car's speedo might tell you that it can motor-along at some 
incredible speed, but using the information sensibly might attract less 
attention from the Highway Patrol!

>>>> Otherwise the code must first decide which action-handler, and later,
>>>> which result-handler - but aren't they effectively the same decision?
>>>> Thus, is the reporting integral to the get (even if they are in
>>>> separate routines)?
>>>
>>> I think you are right here.  Perhaps I should just ditch the dispatch
>>> table.  Maybe that only really makes sense if the methods being
>>> dispatched are indeed more similar.  I don't anticipate having
>>> more than half a dozen actions, if that, so an if-elif-else chain
>>> wouldn't be too clunky.
>>
>> An if...elif...else 'ladder' is logically-easy to read, but with many choices it
>> may become logistically-complicated - even too long to display at-once on a
>> single screen.
>>
>> Whereas, the table is a more complex solution (see 'Zen of Python') that only
>> becomes 'simple' with practice.
>>
>> So, now we must balance the 'level(s)' of the team likely to maintain the
>> program(me) against the evaluation of simple~complex. Someone with a ComSc
>> background will have no trouble coping with the table - and once Python's
>> concepts of dictionaries and functions as 'first-class objects' are understood,
>> will take to it like the proverbial "duck to water". Whereas, someone else may
>> end-up scratching his/her head trying to cope with 'all the above'.
> 
> The team?  L'équipe, c'est moi :-) Having said that I do try to program
> not only with my fictitious replacement in mind, should I be hit by the
> proverbial bus, but also my future self, and so tend to err on the side
> of 'simple'.

+1 "simple"
+1 "ego-less programming"

German, English, and now French?

>> Given that Python does not (yet) have a switch/case construct, does the table
>> idea assume a greater importance? Could it be (reasonably) expected that
>> Pythonistas will understand such more readily?
>>
>>
>> IMHO the table is easier to maintain - particularly 'six months later', but
>> likely 'appears' as a 'natural effect' of re-factoring*, once I've implemented
>> the beginnings of an if-ladder and 'found' some of those common follow-up
>> functions.
>> * although, like you, I may well 'see' it at the design-stage, particularly if
>> there are a number (more) cases to implement!
>>
>> Is functional "similar"[ity] (as above) the most-appropriate metric? What about
>> the number of decision-points in the code? (ie please re-consider "effectively
>> the same decision")
>>
>> 	# which data-function to execute?
>> 	if action == get
>> 		do get_data
>> 	elif action == add
>> 		do add_data
>> 	elif ...
>>
>> 	...
>>
>> 	# now 'the work' has been done, what is the follow-through?
>> 	if action == get
>> 		do present_data
>> 	elif action == add
>> 		report success/fail
>> 	...
> 
> In my current case there is a one-to-one relationship between
> the 'work' and the 'follow-through', so this approach doesn't seem that
> appealing to me.  However I have other cases in which the data to be
> displayed comes from multiple sources where the structure above might
> be a good fit.

I hope to have addressed this above.

To help (I hope), consider if, in the proverbial six-months time, you 
are asked to add logging to each of these various actions. Now, there 
are three tasks: 'work', 'follow-through', and 'logging'; to be done for 
each of the n-action choices.

Would an 'umbrella function' which acts as both the single destination 
for an action-choice, and as a 'distributor' for the various specific 
tasks that must be executed, start to appear more viable?
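
As a sketch only (the logger name, function names and message formats 
are all hypothetical), the 'add' umbrella might then grow a third line 
without the caller noticing any change:

```python
import logging

logger = logging.getLogger("group_admin")  # hypothetical logger name


def add_data(members, users, group):
    """Work: add the users to the group (members is a dict of sets)."""
    members.setdefault(group, set()).update(users)


def report_success(operation, users, group):
    """Follow-through: build the message shown to the user."""
    names = ", ".join(sorted(users))
    return f"OK: {operation} {names} ({group})"


def add(members, users, group):
    """Umbrella: the single destination for the 'add' choice."""
    add_data(members, users, group)                              # work
    message = report_success("add", users, group)                # follow-through
    logger.info("add users=%s group=%s", sorted(users), group)   # logging
    return message


members = {}
print(add(members, ["bob", "alice"], "staff"))
```

The decision 'which action?' is still taken once, by whoever calls 
add(); the three tasks stay separate and separately testable.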

> Having said that, I do prefer the idea of having a single jumping off
> point, be it a dispatch table or a single if-then-else ladder, which
> reflects the actions which the user can take and where the unpleasant
> details of, say, how the data are gathered are deferred to a lower level
> of the code.

+1

>> Back to the comment about maintainability - is there a risk that an extension
>> requested in six months' time will tempt the coding of a new "do" function AND
>> induce failure to notice that there must be a corresponding additional function
>> in the second 'ladder'?
>>
>> This becomes worse if we re-factor to re-use/share some of the follow-throughs,
>> eg
>>
>> 	...
>> 	elif action in [ add, delete, update]
>> 		report success/fail
>> 	...
>>
>> because, at first glance, the second 'ladder' appears to be quite dissimilar -
>> is a different length, doesn't have the condition-clause symmetry of the first,
>> etc! So, our fictional maintainer can ignore the second, correct???
>>
>> Consider SRP again, and add DRY: should the "despatch" decision be made once, or
>> twice, or... ?
> 
> With my non-fictional-maintainer-cum-six-month-older-self hat on I think
> you have made a good case for the dispatch table, which is my latent
> preference anyway, especially in connection with the 'do/display'
> metafunctions and the fact that in my current case DRY implies that the
> dispatch decision should only be made once.

+1 Definitely!
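
One possible shape for that, offered as a sketch rather than as 'the' 
answer: the table is built once, each entry is an umbrella method, and 
every umbrella returns text, so the caller never has to care which 
action ran. The class name and the placeholder method bodies are 
hypothetical. Note also that the original action.get(operation)(...) 
would raise a rather unhelpful TypeError for an unknown operation, 
because dict.get() returns None when the key is missing:

```python
class GroupAdmin:
    """Hypothetical sketch; the method bodies are placeholders only."""

    def __init__(self):
        self.groups = {}
        # The one place where an operation name is mapped to code.
        self.actions = {
            "get": self.get,
            "add": self.add,
            "delete": self.delete,
        }

    def get(self, users, group):
        found = sorted(self.groups.get(group, set()) & set(users))
        names = ", ".join(found) or "(none)"
        return f"{group}: {names}"

    def add(self, users, group):
        self.groups.setdefault(group, set()).update(users)
        return "OK"

    def delete(self, users, group):
        self.groups.get(group, set()).difference_update(users)
        return "OK"

    def execute(self, operation, users, group):
        """Make the dispatch decision exactly once."""
        try:
            action = self.actions[operation]
        except KeyError:
            raise ValueError(f"unknown operation: {operation!r}") from None
        return action(users, group)


admin = GroupAdmin()
print(admin.execute("add", ["alice"], "staff"))          # OK
print(admin.execute("get", ["alice", "bob"], "staff"))   # staff: alice
```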

See also @Wulfraed's point about OOP (Object-Oriented Programming)! If 
we were talking about people, then I'd expect to find a class Person, or 
similar, in the code. That means that "get" and "delete" might refer to 
database transactions. Hence, they should be part of the Person class, 
rather than functions 'in the wild'. Thus, where we have used the term 
"function" (or even "subroutine"), we should have said "method". A class 
is a (very good) way to 'collect' related functionality and keep it 
'together'!

Another aspect, following-on from the UI comments (above): if you are 
using a framework, the presentation code will largely fit within that 
framework's objects, and is therefore logically separate from the code 
manipulating the source-object. Another consideration (maybe) for how to 
structure and relate the routines...

As a general rule, I keep print() out of functions which 'calculate' - 
even, out of the Person class. This facilitates re-use, where the next 
use may want to present the results differently, or merely to use the 
calculation as 'input' and not present 'it' at all!
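
A small illustration of that rule, with a made-up Person class and 
made-up attribute names (nothing here is taken from your code):

```python
class Person:
    """Hypothetical class, used only to illustrate the separation."""

    def __init__(self, name, groups):
        self.name = name
        self.groups = list(groups)

    def group_count(self):
        """'Calculate' only: returns a value and prints nothing."""
        return len(self.groups)


def show_group_count(person):
    """Presentation lives outside the class, so it can be swapped freely."""
    print(f"{person.name} belongs to {person.group_count()} group(s)")


alice = Person("alice", ["staff", "admins"])
show_group_count(alice)                   # one caller presents the result
needs_review = alice.group_count() > 1    # another just uses it as input
```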

> Thanks again for the input!

It is good that you are reviewing your code and considering alternate 
approaches! Many others 'here' will have benefited from considering your 
points/questions...

You may like to pass on some information about the Free University:
- is Python the primary teaching language
- is Python widely used within various schools/departments
- is instruction in English, or...
- what does "Free" mean
- is it also $free
- is it open to (non-German) foreigners

Tschüss!
-- 
Regards =dn