[Tutor] How (not!) lengthy should functions be?

Thu Apr 16 23:03:48 CEST 2015

On 16/04/15 17:47, boB Stepp wrote:

> things too far? For instance I have a collection of functions that do
> simple units conversions such as:
>
> def percent2Gy(dose_percent, target_dose_cGy):
>     """
>     Convert a dose given as a percent of target dose into Gy (Gray).
>     """
>     dose_Gy = cGy2Gy((dose_percent / 100.0) * target_dose_cGy)
>     return dose_Gy

I tend to draw the line at one-liners like the above because by
introducing the function you are increasing the complexity of the
program. However, very often this kind of thing is justified if:
1) it aids unit testing
2) it makes the code significantly more readable
3) it performs error detection or validation of parameters.
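
As a rough sketch of point 3, here is how the same function might
earn its keep once it validates its arguments. The body of cGy2Gy
was not shown in the post, so the version here (1 Gy = 100 cGy) is
an assumption, and the specific checks are purely illustrative.

```python
def cGy2Gy(dose_cGy):
    """Convert centigray to gray (assumed helper: 1 Gy = 100 cGy)."""
    return dose_cGy / 100.0


def percent2Gy(dose_percent, target_dose_cGy):
    """Convert a dose given as a percent of target dose into Gy (Gray)."""
    # Hypothetical validation rules, not taken from the original code.
    if dose_percent < 0:
        raise ValueError("dose_percent must not be negative")
    if target_dose_cGy <= 0:
        raise ValueError("target_dose_cGy must be positive")
    return cGy2Gy((dose_percent / 100.0) * target_dose_cGy)
```

With the checks in one place, a unit test can exercise both the
conversion and the error paths without touching the callers.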

Note that in this case you could just create an alias:

percent2Gy = cGy2Gy

although you would then need to do the math when you call it
rather than inside the function.
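
A minimal sketch of what the alias approach looks like in use. The
body of cGy2Gy is again assumed (1 Gy = 100 cGy), and the numbers
are made up for illustration.

```python
def cGy2Gy(dose_cGy):
    """Convert centigray to gray (assumed helper: 1 Gy = 100 cGy)."""
    return dose_cGy / 100.0


# The alias is just a second name bound to the same function object.
percent2Gy = cGy2Gy

# The percentage arithmetic now has to be repeated at every call site.
dose_Gy = percent2Gy((50 / 100.0) * 6000)
```

The trade-off is visible in the last line: no extra function to
maintain, but every caller has to get the arithmetic right.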

> My current understanding of function length best practice is that: 1)
> Each function should have preferably ONE clearly defined purpose.

By far the most important criterion. It trounces all other arguments.

> 2) I have seen varying recommendations as to number of lines of code

Most of these come from the days when we worked on dumb terminals with
24 line screens. Actual measurements have shown that function length
(within reason!) is not a major factor in comprehension or reliability.
In COBOL or C it is not unreasonable to have functions over 50 lines
long, sometimes over a hundred. But in Python that would be very
unusual. So I'd treat the advice to limit length to about 20 lines of
executable code as still valid; if you exceed it, treat that as a red
flag to check that the function really needs to be that long.

But splitting a function just to keep the line count down is a
terrible idea and more likely to introduce bugs than fix them.
The single purpose rule is the guide.

> equally well to methods.

In practice methods are usually a little bit shorter on average.
That's mainly because the data tends to live in object attributes
and be pre-formed, whereas a standalone function often spends
significant time reformatting its input into the best shape for the
job. Also the object attributes were hopefully validated on creation
so each method can trust them, again saving lines.
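
To illustrate the difference, here is a small sketch. The
Prescription class and both percent_to_Gy names are hypothetical,
invented for this example, and the validation rule is an assumption.

```python
class Prescription:
    """Hypothetical class: the target dose is validated once, on creation."""

    def __init__(self, target_dose_cGy):
        target_dose_cGy = float(target_dose_cGy)
        if target_dose_cGy <= 0:
            raise ValueError("target_dose_cGy must be positive")
        self.target_dose_cGy = target_dose_cGy

    def percent_to_Gy(self, dose_percent):
        # The attribute is already a validated float, so no checks here.
        return (dose_percent / 100.0) * self.target_dose_cGy / 100.0


def percent_to_Gy(dose_percent, target_dose_cGy):
    # The standalone function must coerce and check its input on every call.
    target_dose_cGy = float(target_dose_cGy)
    if target_dose_cGy <= 0:
        raise ValueError("target_dose_cGy must be positive")
    return (dose_percent / 100.0) * target_dose_cGy / 100.0
```

Both give the same answer; the method is shorter only because
__init__ has already done the reshaping and checking.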

> Am I on-track or am I getting carried away?

Probably on track.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos