how to avoid spaghetti in Python?

Chris Angelico rosuav at gmail.com
Tue Jan 21 15:58:19 EST 2014


On Wed, Jan 22, 2014 at 7:38 AM, CM <cmpython at gmail.com> wrote:
> 1) One of my main "spaghetti problems" is something I don't know what to ever call.  Basically it is that I sometimes have a "chain" of functions or objects that get called in some order and these functions may live in different modules and the flow of information may jump around quite a bit, sometimes through 4-5 different parts of the code, and along the way variables get modified (and those variables might be child objects of the whole class, or they may just be objects that exist only within functions' namespaces, or both).  This is hard to debug and maintain.
>

Rule of thumb: Every function should be able to be summarized in a
single line. This isn't Python-specific, but in the case of Python,
it's part of the recommendations for docstrings [1]. When one function
calls another, which calls another, and so on, it's not a problem as
long as each one can be adequately described on its own:

def is_positive(item):
    """Ascertain whether the item is deemed positive.

    Per business requirement XYZ123, items are
    deemed positive at 90% certainty, even though
    they are deemed negative at only 75%.
    """
    return item.certainty >= 0.9 and item.state > 0

def count_positive(lst):
    """Return the number of deemed-positive items in lst."""
    return sum(1 for item in lst if is_positive(item))

Each of these functions has a clear identity. (Okay, they're a little
trivial for the sake of the example, but you can see how this would
work.) Each one makes sense on its own, and it's obvious that one
should be deferring to the other. If business requirement XYZ123 ever
changes, count_positive's behaviour should change, ergo it calls on
is_positive to make the decision.
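
To make that concrete, here's a minimal sketch of the two functions in
use. The Item class and the sample values are purely hypothetical
(nothing in your post tells me what your objects look like); the point
is only that a change to XYZ123 would touch is_positive and nothing
else.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical item: a certainty in [0, 1] and a signed state."""
    certainty: float
    state: int


def is_positive(item):
    """Ascertain whether the item is deemed positive."""
    return item.certainty >= 0.9 and item.state > 0


def count_positive(lst):
    """Return the number of deemed-positive items in lst."""
    return sum(1 for item in lst if is_positive(item))


# Made-up sample data: only the first and last items meet both conditions.
items = [Item(0.95, 1), Item(0.80, 1), Item(0.99, -1), Item(0.90, 2)]
positive_count = count_positive(items)
print(positive_count)
```

Neither function touches items, so the caller can follow the whole
chain just by reading two one-line summaries.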

Rule of thumb: Anything that changes state should make sense. Neither
of the above functions has any right to *modify* lst or item (except
for stats, maybe - "time since last queried" could be reset). You
mention "variables getting modified", and then go on to use some
rather non-Pythonic terminology; I'm not entirely sure what you mean
there, so I'll ignore it and just say something that may or may not
have any relevance to your case: the function's one-line summary
should normally make it clear whether state is to be changed or not. A
function that queries something shouldn't usually change that state.
Reading from a stream is the obvious exception, and there's a bit of a
grey area elsewhere: retrieving the first element of a list shouldn't
change the list, whereas retrieving the top element of a
stack/queue/heap possibly should, but then you'd call it "pop" to be
clear.
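
As a rough illustration of letting the name and summary carry that
information, here's a toy stack. The Stack class and its method names
are my own invention for the example, not anything from your code:

```python
class Stack:
    """Hypothetical minimal stack, only here to contrast query and mutation."""

    def __init__(self, items=()):
        self._items = list(items)

    def peek(self):
        """Return the top item without removing it."""
        return self._items[-1]

    def pop(self):
        """Remove and return the top item."""
        return self._items.pop()

    def __len__(self):
        """Return the number of items on the stack."""
        return len(self._items)


stack = Stack([1, 2, 3])
top = stack.peek()            # query: the stack is left alone
size_after_peek = len(stack)
popped = stack.pop()          # state change, and the name says so
size_after_pop = len(stack)
```

Someone reading a call site can tell from "peek" versus "pop" alone
whether the stack is different afterwards.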

Tip: Adding one-line descriptions to all your functions is a great way
to figure out (or force yourself to figure out) what your code's
doing. Having someone *else* add one-line descriptions to all your
functions is an awesome way to figure out where your function names
are unclear :) I had someone go through one of my open source projects
doing exactly that, and it was quite enlightening to see which of his
docstrings were majorly incorrect. Several of them ended up triggering
renames or code revamps to make something more intuitive.
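
If you want a quick way to see which functions still lack a summary,
something along these lines might do as a starting point. It's only a
sketch: the summaries helper and the two sample functions are
hypothetical, and it relies on nothing beyond the standard inspect
module.

```python
import inspect


def summaries(namespace):
    """Map each function name in namespace to its docstring's first line."""
    result = {}
    for name, obj in namespace.items():
        if inspect.isfunction(obj):
            doc = inspect.getdoc(obj)
            # None marks a function that has no docstring at all.
            result[name] = doc.splitlines()[0] if doc else None
    return result


def documented():
    """Return a constant, just to have a docstring."""
    return 42


def undocumented():
    return 0


report = summaries({"documented": documented, "undocumented": undocumented})
for name, line in sorted(report.items()):
    print(f"{name}: {line or '*** no summary ***'}")
```

For a real module you could pass vars(the_module) instead of the
hand-built dict.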

ChrisA

[1] See PEP 257, http://www.python.org/dev/peps/pep-0257/


