Web Frameworks Excessive Complexity
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Tue Nov 20 20:43:57 EST 2012
On Tue, 20 Nov 2012 20:07:54 +0000, Robert Kern wrote:
> The source of bugs is not excessive complexity in a method, just
> excessive lines of code.
Taken literally, that cannot possibly be the case.
def method(self, a, b, c):
    do_this(a)
    do_that(b)
    do_something_else(c)

def method(self, a, b, c):
    do_this(a); do_that(b); do_something_else(c)
It *simply isn't credible* that version 1 is statistically likely to have
twice as many bugs as version 2. Over-reliance on LOC is easily gamed,
especially in semicolon languages.
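
As a rough illustration of the point, here is a minimal sketch that measures both versions above. It assumes a simplified stand-in for McCabe's metric (one plus the number of branch points found by the standard ast module); the names count_loc, approx_cc, VERSION_1 and VERSION_2 are made up for this example and are not from any metrics tool.

```python
import ast

# The two versions of the method from the text, as source strings.
VERSION_1 = """\
def method(self, a, b, c):
    do_this(a)
    do_that(b)
    do_something_else(c)
"""

VERSION_2 = """\
def method(self, a, b, c):
    do_this(a); do_that(b); do_something_else(c)
"""

# Node types treated as decision points. This is an assumption: a
# simplified approximation of McCabe's metric, not a full implementation.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def count_loc(source):
    """Count non-blank physical lines."""
    return sum(1 for line in source.splitlines() if line.strip())


def approx_cc(source):
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths, one per operator.
            complexity += len(node.values) - 1
    return complexity


for label, src in (("version 1", VERSION_1), ("version 2", VERSION_2)):
    print(label, "LOC:", count_loc(src), "approx CC:", approx_cc(src))
```

Under these assumptions the line count halves from version 1 to version 2 while the approximate complexity is the same for both, since neither version contains a branch.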
Besides, I think you have the cause and effect backwards. I would rather
say:
The source of bugs is not lines of code in a method, but excessive
complexity. It merely happens that counting complexity is hard, counting
lines of code is easy, and the two are strongly correlated, so why count
complexity when you can just count lines of code?
Keep in mind that something like 70-80% of published scientific papers
are never replicated, or cannot be replicated. Just because one paper
concludes that LOC alone is a better metric than CC doesn't necessarily
make it so. But even if we assume that the paper is valid, it is
important to understand just what it says, and not extrapolate too far.
The paper makes various assumptions, takes statistical samples, and uses
models. (Which of course *any* such study must.) I'm not able to comment
on whether those models and assumptions are valid, but assuming that they
are, the conclusion of the paper is no stronger than the models and
assumptions. We should not really conclude that "CC has no more
predictive power than LOC". The right conclusion is that one specific
model of cyclomatic complexity, McCabe's CC, has no more predictive power
than LOC for projects written in C, C++ and Java.
How does that apply to Python code? Well, it's certainly suggestive, but
it isn't definitive.
It's also important to note that the authors point out that in their
samples of code, they found very high variance and large numbers of
outliers:
[quote]
Modules where LOC does not predict CC (or vice-versa) may indicate an
overly-complex module with a high density of decision points or an overly-
simple module that may need to be refactored.
[end quote]
So *even by the terms of this paper*, it isn't true that CC has no
predictive value over LOC -- if the CC is radically high or low for the
LOC, that is valuable to know.
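
One way to act on that, sketched under loud assumptions: compute the CC-per-line density of each module and flag those far from the typical value. The module names and figures below are invented purely for illustration, and the factor-of-two threshold is arbitrary; neither comes from the paper.

```python
from statistics import median

# Hypothetical (module name, LOC, CC) figures, made up for illustration.
MODULES = [
    ("parser", 200, 40),
    ("utils", 150, 30),
    ("config", 100, 21),
    ("dispatch", 120, 70),
    ("constants", 300, 5),
]


def flag_outliers(modules, factor=2.0):
    """Flag modules whose CC/LOC density is far from the median density.

    'factor' is an arbitrary threshold chosen for this sketch.
    """
    densities = {name: cc / loc for name, loc, cc in modules}
    typical = median(densities.values())
    flagged = {}
    for name, density in densities.items():
        if density > typical * factor:
            flagged[name] = "dense"   # many decision points per line
        elif density < typical / factor:
            flagged[name] = "sparse"  # very few decision points per line
    return flagged


print(flag_outliers(MODULES))
```

With these made-up numbers, only the two modules whose CC is badly out of line with their LOC get flagged; the rest, where LOC predicts CC reasonably well, pass through silently.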
> LoC is much simpler, easier to understand, and
> easier to correct than CC.
Well, sure, but do you really think Perl one-liners are the paragon of
bug-free code we ought to be aiming for? *wink*
--
Steven