[Tutor] pylint(too-many-nested-blocks)

dn PyTutor at DancesWithMice.info
Sun Nov 28 03:45:50 EST 2021


On 28/11/2021 16.51, Phil wrote:
> "https://pycodequ.al/docs/pylint-messages/r1702-too-many-nested-blocks.html"
> describes this error as:
> 
> "Used when a function or method has too many nested blocks. This makes
> the code less understandable and maintainable."
> 
> The following is one of several functions that fits this description and
> I'm wondering how I might reduce the number of for-loops. If the number
> of for-loops cannot be reduced then I suppose the function should be
> broken up into smaller functions. The problem is, as I see it, little of
> the code is reusable in other functions that also have too many nested
> blocks.
> 
>     def pointingPairRow(self):
>         """
>         If a candidate is present in only two cells of a box, then it must be the
>         solution for one of these two cells. If these two cells belong to the same row,
>         then this candidate can not be the solution in any other cell of the same row.
>         """
>         box_start = [(0, 0), (0, 3), (0, 6),
>                     (3, 0), (3, 3), (3, 6),
>                     (6, 0), (6, 3), (6, 6)
>                     ]
> 
>         for x, y in box_start:
>             pairs = Counter()
> 
>             for number in range(1, 10):
>                 number = str(number)
> 
>                 for row in range(x, x + 3):
>                     for col in range(y, y + 3):
>                         if len(self.solution[row][col]) > 1 and number in self.solution[row][col]:
>                             pairs[number] += 1
> 
>             for item, count in pairs.items():
>                 if count == 2:  # candidate
>                     for row in range(x, x + 3):
>                         icount = 0
>                         col_list = []
> 
>                         for col in range(y, y + 3):
>                             if item in self.solution[row][col]:
>                                 icount += 1
>                                 col_list.append(col)
> 
>                             if icount == 2:
>                                 for c in range(self.num_cols):
>                                     if len(self.solution[row][c]) > 1 and item in self.solution[row][c] and c not in col_list:
>                                         self.solution[row][c] -= set(item)
> 
> Also, the 3 for-loops before "if count == 2" don't need to be repeated
> once the candidate has been removed from the row with
> "self.solution[row][c] -= set(item)". Setting a boolean switch here to
> prevent the for-loops from continuing does the job but it adds yet
> another "if" statement.
> 
> The function works and so it's logically correct but it's messy and
> inefficient.


Does Python itself see this as an error and stop working, or is it only
the code-checker s/w?

If Python, then I've never reached that point, which makes you better
than me - or does it?


Were we conducting a Code Review, this function would definitely attract
criticism. You have already recognised the complexity, but seem somewhat
satisfied - on the grounds that the whole works (which may be as good
a measure as any other - depending upon application!).

The two (easily agreed) criticisms are "messy" and "inefficient".

Here are two more ideas: "cyclomatic complexity" and "readability".

Some say that code should be written with the view that it is more
important that it can be read by humans than by computers. This seems
contra-indicated when the process of programming is to 'instruct the
computer'. However, consider the life-time of a program[me]. Also, that
computers 'read' code with a very narrow and dogmatic point-of-view,
compared with us-peoples.

We also talk about your future-self, meaning you in (a notional)
six-months' time. In other words, someone who is reading this for the
first time ever, or you, after sufficient time has passed that a lot of
your 'thinking' that went into the construction of this code has
disappeared from even the back of your mind.

Question: is it/will it be easy to read?

Here's where things become rather subjective. You (no criticism) seem to
have a mathematical view of programming, as revealed by your choice of
example-problems and love of abbreviated identifiers. That said, a
docstring describes the function - so gold-star for that - even if
"candidate" and "solution" are beyond my ken.

Perhaps this is the point to invite some thought into comparing the 'how'
and the 'why' of a piece of code - two important questions in the mind of
a reader/reviewer!

I have been criticised (but exhibit little remorse) for breaking things
down into "smaller" units of code than others feel necessary. Maybe
you'd agree with that. Like I said "subjective"! Hasn't "stepwise
decomposition" been mentioned before?
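
By way of illustration only, here is a sketch of what a first step of
decomposition might look like for the counting half of pointingPairRow().
It assumes (reading the posted code) that the solution is a 9x9 grid whose
cells are sets of one-character strings, and that a cell holding more than
one candidate is still unsolved. The names box_cells(), count_candidates()
and paired_candidates() are my inventions, not anything in your program:

```python
from collections import Counter

BOX_SIZE = 3  # assumption: standard 3x3 boxes


def box_cells(top, left):
    """Yield the (row, col) coordinates of the box whose corner is (top, left)."""
    for row in range(top, top + BOX_SIZE):
        for col in range(left, left + BOX_SIZE):
            yield row, col


def count_candidates(solution, top, left):
    """Count, per candidate, the unsolved cells of one box that contain it."""
    counts = Counter()
    for row, col in box_cells(top, left):
        cell = solution[row][col]
        if len(cell) > 1:  # more than one candidate left: unsolved
            counts.update(cell)
    return counts


def paired_candidates(solution, top, left):
    """Return the candidates found in exactly two unsolved cells of the box."""
    counts = count_candidates(solution, top, left)
    return sorted(item for item, count in counts.items() if count == 2)
```

None of these has more than three levels of nesting, and each can be
handed a small hand-made grid and checked without the rest of the class.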


Now is a good time to pick up the testing tool's feedback: "understandable
and maintainable". It's not merely a matter of
"readability"/"understandability", but also maintainability - and that
other objective of the tool: "testability".

How can you prove to me that this code actually does do 'what it says on
the tin [/can]'? Let's say that there is a suite of testing code, and
one of those indicates that there is an error somewhere 'around' the
innermost/most deeply nested (and last) for-loop. How can you (or I, or
A.N.Other) proceed to correct ("maintain") this code? How much larger
will the problem seem, if it is not revealed by the test-suite, but is
'discovered' by some (not-so) innocent user in the proverbial
six-months' time?

How easy will it be to see whether this notional problem (I'm not saying
there is one!) is 'inside' that for-loop, or is somewhere in the code
which manipulates the data which is subsequently 'fed' into the loop?

Another consideration: if the preceding steps (prior to that inner
for-loop) are 'expensive', particularly in terms of time, how many
test-cycles, checks, and detection-runs can you afford? If it were
'cheaper'/faster to test, how many now?


I like the old saw: that if "debugging" is taking the bugs out of code,
then programming must be the process of putting them in! Accordingly,
there is a lesson many find hard to learn/admit to themselves - it is
better to program[me] 'defensively' (in the expectation of errors), than
to code arrogantly (overly-impressed with one's own abilities,
assumptions of perfection, feelings of invincibility, ...).
- or maybe that's an admission that I'm just not very good at it?

If the for-loop is detached from the rest of the code, can you give it a
name? If-then, you can also test it in isolation from the rest of the
functionality. This morning I needed to add the ability to remove a
single item (pop) from a multi-dimensional data-structure, and further
functionality to remove an entire 'row' (sound familiar?). I could have
'tested' these 'within' the application which uses the data-structure.
Why not? That's where the code is actually being used - and if it works
there, it must 'work'! Right? When the two test harnesses were
constructed and the code seemed to be working, one of them failed at the
last test-step. It had 'passed' before! It failed because of a 'fix',
installed to cover another previously-failing test-step. Oops!
(yes, big surprise, I make programming errors - but don't bother to
circle this date in the calendar, you'll quickly run out of ink, and
obliterate your calendar should you monitor me that closely!)
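
To make that concrete for your case, here is a sketch of the innermost
for-loop lifted out and given a name. The name
remove_candidate_from_row(), its parameters and the returned count are
all my assumptions; the test on each cell is copied from the posted code
(cells taken to be sets of one-character strings):

```python
def remove_candidate_from_row(solution, row, item, keep_cols):
    """Remove candidate `item` from the unsolved cells of `row`,
    leaving alone the cells whose column is in `keep_cols`.

    Returns the number of cells changed, so that a caller (or a test)
    can tell whether anything happened.
    """
    changed = 0
    for col, cell in enumerate(solution[row]):
        if col in keep_cols:
            continue
        if len(cell) > 1 and item in cell:
            cell.discard(item)
            changed += 1
    return changed


# The test needs only a tiny hand-made grid, not the whole application.
grid = [[{"1", "2"}, {"1", "3"}, {"4"}, {"1", "5"}, {"1"}]]
assert remove_candidate_from_row(grid, 0, "1", keep_cols=[0, 1]) == 1
assert grid == [[{"1", "2"}, {"1", "3"}, {"4"}, {"5"}, {"1"}]]
```

The returned count might also stand in for the boolean switch you
mentioned, but that is a design choice for you to weigh.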

The reason why the small(er) code-units were worth what may have seemed
like 'more effort than necessary', was because it was so much quicker
and easier to test the code as I wrote it*, it was easy to find/be shown
errors as they were made, and it was easy to see that a 'fix' in one
place, caused an issue under another set of circumstances. All while the
actual code was fresh in my mind, and not when I would be more
interested in the application doing what I expect of it.

Incidentally, should a 'six-month' embarrassment appear, the test-suite
will still be available - and will continue to ensure any future 'fixes'
don't cause similar regression-errors!

* in fact, the tests were written before the code - but considering
all else I'm throwing at you, "Test-Driven Development" may be one
(slightly less-topical) reference too many.

There has been plenty of research into the cost of errors. It all really
boils down to the idea that the 'later' an error is discovered, the more
expensive 'the fix' will be!

There are many 'guides' claiming to know how long is the 'right' length
for a function/procedure/block of code. However, the number of lines of
code in a function is no guide to its complexity. A list-comprehension
might fit into one line of code, but it is more complex than its two,
three, or even four line 'basic' equivalent. A better measure is "McCabe
Complexity" which is like establishing a 'reading level' for code
(instead of books for children of different ages/language abilities).
The wider term is "Cyclomatic Complexity".
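
For the list-comprehension comparison, a small sketch (the data is made
up for the purpose). The one-liner holds the same loop and the same test
as the four-line version, so counting lines under-states it:

```python
cells = [{"1", "2"}, {"3"}, {"2", "4", "5"}, {"6"}]

# One line, but a loop and a condition are packed into it.
unsolved = [index for index, cell in enumerate(cells) if len(cell) > 1]

# The 'basic' equivalent: more lines, the same loop, the same condition.
unsolved_basic = []
for index, cell in enumerate(cells):
    if len(cell) > 1:
        unsolved_basic.append(index)

assert unsolved == unsolved_basic == [0, 2]
```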

Cyclomatic Complexity counts the decision-making constructs there are
in a unit of code, eg if statements, loops, etc. The more there are, the
more logical 'paths' exist between 'start' and 'end'. This is the
concern of your static-test tool. The more paths, the harder it is to
test that each possibility has been tested/checked/proven, and the more
combinations of events and conditions that must (all?) be considered.
Here we can mention the KISS principle (not a principle of programming
per-se). Nuff said?
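
If a number helps, the counting can be roughed out with the standard
library's ast module. This is only an approximation of the idea (one,
plus one for each decision point); the function name rough_complexity()
and the choice of which constructs to count are my assumptions, and the
real tools are more careful:

```python
import ast

# Constructs treated as decision points in this rough count (an assumption).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp,
                ast.ExceptHandler, ast.comprehension)


def rough_complexity(source):
    """Return 1 + the number of decision points found in `source`."""
    score = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' adds two further paths
            score += len(node.values) - 1
    return score


FLAT = """
def total(row):
    return sum(row)
"""

NESTED = """
def count_big(grid, limit):
    found = 0
    for row in grid:
        for value in row:
            if value > limit and value % 2 == 0:
                found += 1
    return found
"""

print(rough_complexity(FLAT))    # prints 1
print(rough_complexity(NESTED))  # prints 5
```

Two loops, one if and one "and" take the second function from 1 to 5,
even though it is only seven lines long.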


Earlier the idea of naming a chunk of code was mentioned. "Chunking" is
a term from psychology: a chunk is a unit of thought. When coding a routine,
that unit might be a single line-of-code (it will vary with experience
and expertise). When a bunch of lines can be wrapped up into a single
unit - and that unit named, now we have a much larger 'chunk', and
reduced psychological/mental effort to remember what it does and where
it 'fits in'!

A related thought: If an author cannot describe what a section of code
is doing, there's a problem no amount of Python will fix!

If that name includes a conjunction such as "and", eg "choose which row
and wipe its data", then this indicates (at least to this author) that
the function is trying to do (two) too much. This assessment comes from
a programming ideal known as SRP - the "Single Responsibility
Principle". The idea that a piece of code has one objective also means
that it becomes much easier to locate where in a mass of code a problem
is likely to have arisen. It also makes the block very easy to test -
either it does ("what it says on ...") or it fails!
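
Taking that "choose which row and wipe its data" name as the example,
a sketch of the split might be as follows. Both function names, and the
rule used for choosing (the row with the most non-empty cells), are
invented purely for illustration:

```python
def choose_row(grid):
    """Return the index of the row with the most non-empty cells."""
    def filled(index):
        return sum(1 for cell in grid[index] if cell)
    return max(range(len(grid)), key=filled)


def wipe_row(grid, row):
    """Empty every cell of one row, in place."""
    for cell in grid[row]:
        cell.clear()


grid = [[{"1"}, set()], [{"2"}, {"3"}]]
row = choose_row(grid)
assert row == 1                       # choosing is tested on its own...
wipe_row(grid, row)
assert grid == [[{"1"}, set()], [set(), set()]]   # ...and so is wiping
```

If the wiping goes wrong, there is now only one short function in which
to look.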

There are also ComSc concepts of "dependent" and "independent" linkages.
If code-unit "b" can't run unless unit "a" has previously executed,
these two are not independent. Certainly "b" is not "reusable" as and by
itself! High dependence is one enemy of "testability" and thus probably
also maintainability, etc, etc.

Think about list.append(). It does require that there is a list, and
some data to add - but they are inputs, not dependencies. It will add
what is 'input' to what is there already. There is no requirement that
the list be anything more than an empty list. Similarly, there is no
problem if the list already contains many, many items. The 'new stuff'
is added to the existing 'end' of the list. There is no requirement that
we tell the list how long it is - and thus 'where' to add the new stuff.
So, we can use list.append() without first using len(list). Similarly,
if a unit of code performs some action on one or more of the rows of
your data, does it need to know of calculations or arrangements
previously performed on those rows? Probably not - and if it
currently does, the code can likely be rearranged to obviate such
dependence.
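
A sketch of both halves of that thought. The first part is plain
list.append(); the function solved_values() is a made-up example of a
code-unit which needs only its input (a row of cells, assumed to be sets
of one-character strings) and nothing about what ran before it:

```python
items = []
items.append("first")    # works on an empty list
items.append("second")   # works on a non-empty list, no len() required
assert items == ["first", "second"]


def solved_values(row):
    """Return the values of the solved cells (exactly one candidate) in `row`."""
    return {next(iter(cell)) for cell in row if len(cell) == 1}


# Any row will do, at any time: the row is an input, not a dependency.
assert solved_values([{"1"}, {"2", "3"}, {"4"}]) == {"1", "4"}
assert solved_values([]) == set()
```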

Now, before dreams of nirvana over-take us, the cyclomatic complexity
and/or a replacement concept still exists in the way one must network or
'stitch-together' all of the smaller code-modules. However, (back to
"chunking") if one has a function called "search_row_for_palindromes()",
the thinking required to fit that into a series of similarly well-named
routines takes place at a much 'higher level' of problem-solving, than
the thought-processes involved in writing an if-statement inside a
for-loop which calls a palindrome-detection function!
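
A sketch of those two levels of thinking. Only the name
search_row_for_palindromes() comes from the paragraph above; the bodies,
is_palindrome() and search_table_for_palindromes() are assumptions made
for the sake of the example:

```python
def is_palindrome(text):
    """True if `text` reads the same forwards and backwards."""
    return text == text[::-1]


def search_row_for_palindromes(row):
    """The 'low level': an if-test inside a loop over one row."""
    return [item for item in row if is_palindrome(item)]


def search_table_for_palindromes(table):
    """The 'higher level': stitching together well-named routines."""
    return [search_row_for_palindromes(row) for row in table]


table = [["level", "python"], ["tutor", "noon", "abc"]]
assert search_table_for_palindromes(table) == [["level"], ["noon"]]
```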

Which all goes to show: there's no silver bullet!


(I'll leave you to research the topics-mentioned that may interest you,
as you see fit...)
-- 
Regards,
=dn

