Python paradigms

Martijn Faassen m.faassen at vet.uu.nl
Tue Apr 11 09:24:12 EDT 2000


Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:
> In article <8ctkm5$bko$2 at newshost.accu.uu.nl>,
> Martijn Faassen <m.faassen at vet.uu.nl> wrote:
>>Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:
>>[if your function gets longer than a screenfull, it's time to start
>>thinking about splitting it up]
>>
>>> It really is quite amusing to see these same old myths coming back
>>> every few decades :-)
>>
>>You think splitting a function up is a myth? I find it helps 
>>me and my code. It becomes more readable and writeable. I'm genuinely
>>surprised by your assertion this is an 'old myth'.

> The myth is that splitting functions up necessarily improves clarity.
> Sometimes it does; sometimes it doesn't.

It depends on how you do the splitting, I'd still say. I think the
cases are rare that one long complicated function is more readable
than two or three or more *well-chosen* shorter ones.
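To make that concrete, here's a minimal sketch (all names hypothetical) of the kind of split I mean: one long function becomes a short driver plus well-named steps:

```python
# Hypothetical sketch: a long function split into named steps.

def process_order(order):
    validate_order(order)
    total = compute_total(order)
    record_order(order, total)

def validate_order(order):
    # Reject orders with no line items.
    if not order.get("items"):
        raise ValueError("order has no items")

def compute_total(order):
    # Sum price * quantity over all line items.
    total = 0
    for item in order["items"]:
        total = total + item["price"] * item["quantity"]
    return total

def record_order(order, total):
    print("order %s: total %.2f" % (order["id"], total))
```

The driver now reads like an outline of what the function does.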

>>> That particular dogma was very popular back in the late 1960s and
>>> early 1970s, when practical programmers pointed out that it was a
>>> virtual impossibility for a normal person to keep track of 50
>>> trivial functions in their head, but it was quite easy to look
>>> at 60 lines of sequential code.  That didn't get listened to
>>> then, either!
>>
>>Of course if you go about randomly chopping up your function into
>>multiple ones, that's not going to improve matters. The idea is
>>to conceptually split your function into multiple smaller ones
>>that make more sense, are more easy to adapt, are coherent by 
>>themselves, and make the original big function more readable, as
>>by writing all the smaller functions you've hopefully also given
>>them names. It's called refactoring, of course. :)

> The dogma is that there is a maximum size of function that should
> ever be written,

Putting a strict limit on the number of lines is of course silly.

> and that complex functions necessarily split up
> cleanly into smaller sections.

I think they do; perhaps not multiple functions per se, but multiple
classes, definitely. Particularly in a language like Python!

> Sometimes they do; sometimes they
> don't.  I have seen codes where the median size of functions was
> below 5 lines, and you had to keep the specification of over 50 in
> your head just to read the simplest function!

Didn't the functions have readable names then? Of course it's silly
to take all this to ridiculous extremes, but if the naming was
done well, you have a lot of contextual information when reading
a function that you wouldn't have otherwise.

>>My guidelines are:
>>
>>If your function is long and complicated, too many concepts
>>and abstractions are mingling in the same place. Do a better job
>>of abstracting stuff, and you'll end up with shorter, clearer
>>functions.

> Sometimes.  I have very often done that and backed off, because the
> complex function was clearer.  Another good rule is, if auxiliary
> functions need to access more arguments and artificial global
> variables than they have lines of code, consider whether they would
> be better written inline.

If you run into that, use records or classes. OO is pretty good at 
this; you can still bundle a set of methods into an object. The
object can do complicated things, but its parts are simple.
I don't recall cases where this was impossible, but then again my experience
may be too limited.
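A sketch of what I mean (names hypothetical): helpers that would each need a pile of arguments or globals become methods sharing instance state instead:

```python
# Hypothetical sketch: shared arguments move into instance state,
# so each method stays small and takes no extra parameters.

class ReportBuilder:
    def __init__(self, title, rows, separator=", "):
        self.title = title
        self.rows = rows
        self.separator = separator

    def header(self):
        return self.title.upper()

    def body(self):
        # One formatted line per row of data.
        return [self.separator.join(map(str, row)) for row in self.rows]

    def build(self):
        return "\n".join([self.header()] + self.body())
```

The object as a whole may do something complicated, but each method is simple.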

>>If your function is long and repetitive, you're not being a lazy
>>enough programmer. Write some smaller functions that generate the
>>stuff, read it from a file or database, whatever.

> If the repetitiveness is not quite regular enough to make that easy,
> you are making unnecessary work for yourself.  Sometimes that is the
> right solution; sometimes it isn't.

The repetitiveness is always regular enough to abstract into
a bunch of classes, a dictionary of lists, or whatnot. If there
is no regularity at all, repetitiveness isn't the problem.
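For instance (field names and limits invented for illustration), a repetitive run of length checks collapses into a table plus one loop:

```python
# Hypothetical sketch: a repetitive if/elif chain of length checks
# replaced by a data table driving a single loop.

FIELD_LIMITS = {
    "name": 30,
    "street": 40,
    "city": 25,
}

def check_lengths(record):
    errors = []
    for field, limit in FIELD_LIMITS.items():
        if len(record.get(field, "")) > limit:
            errors.append("%s exceeds %d characters" % (field, limit))
    return errors
```

Adding a field is then a one-line change to the table, not another copy of the test.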

>>Are you saying these guidelines are based on a myth?

> No - the myth is that they invariably improve things.  Life is not
> that simple.

No, they don't improve anything if you don't think well before you
refactor. But I do believe it's always possible to split up longer
complicated functions into smaller ones (or objects), given enough
thought, and that this makes the code more clear.

>>> My current affliction is an accounting file format, and the
>>> function to check the input (you DO check your input, don't you?)
>>> needs to perform 100 tests in series.  Yes, I am inventing
>>> trivial functions, but it would be clearer to put more of the
>>> tests inline.
>>
>>Why not split this up into a bunch of validation objects or 
>>something, and then check the input that way? And aren't at least
>>some of these tests similar enough to abstract into a function
>>(such as tests for various maximum string lengths). Of course
>>the accounting file format may be so baroque your function just
>>has to be baroque, but that sort of thing *ought* to be rare. :)

> That is, of course, what I am doing.  But designing in enough
> commonality takes a lot longer (and is more complex) than simply
> repeating the very simple tests.  After all, it gains you nothing
> by calling an auxiliary that then has a massive switch statement
> with special code for each of its calls - and some people do write
> code like that!

> In this particular case, there are 50-60 entries, which need
> 40-50 separate test conditions.  By thinking, I can common up
> quite a lot of that code, but there is some which is not amenable
> to that.

Yes, but even that code could probably be split out in some way
that makes it clear something special and uncommon is happening,
right? And that would likely be clearer than having it thrown
into one big function, I think? More work, yes, perhaps. At least
more up-front work.
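Something along these lines, say (everything here is a made-up illustration, not your actual format): the common tests become small check objects run in series, and the one-off rule gets its own visibly separate class:

```python
# Hypothetical sketch: a series of input tests as small check
# objects; an uncommon, format-specific rule stays clearly separate.

class MaxLength:
    def __init__(self, field, limit):
        self.field = field
        self.limit = limit

    def check(self, record):
        if len(record.get(self.field, "")) > self.limit:
            return "%s too long" % self.field

class BalancedEntries:
    # A one-off rule: amounts in a record must sum to zero.
    def check(self, record):
        if sum(record.get("amounts", [])) != 0:
            return "entries do not balance"

CHECKS = [MaxLength("account", 20), BalancedEntries()]

def validate(record):
    # Collect the messages from every check that fails.
    return [msg for msg in (c.check(record) for c in CHECKS) if msg]
```

The series of 100 tests then reads as a list, and the baroque cases announce themselves.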

> If you remember, I asked whether there were any equivalent to a
> couple of very common paradigms, that have been used to clarify
> code for over 30 years.

They obfuscate code at least as often as they clarify it, I'd say.

> The answer appears to be "no", so I have
> to use somewhat less satisfactory ones.  This is not a major
> problem - been there, done that, in dozens of languages.

In some cases, perhaps you'll have to use less satisfactory ones. I think
in most cases the alternatives are *more* satisfactory, however.

> What isn't acceptable, however, is putting up with people saying
> that it is heretical to admit that such requirements exist!  Fads
> are one thing, but dogma is harmful.

Fads, dogma, but what about wisdom, though? Python's idea is that ?: and
assignment-in-expression both tend to obfuscate the code. It'd be nice
to have something replacing the assignment-in-expression in the example
in your first post, but I disagree that your ?: construction makes the
code easier to read than the various alternatives. I can grok C's
(?:), but its main advantage is that it makes code more writeable, not
more readable, in my opinion. Likewise, any replacement for the
convenience that assignment-in-expression allows (in 'if' statements
and 'while' loops) will *not* be assignment-in-expression itself, but
something more readable and less general.
 
In general, the pattern in Python is that it avoids several handy but
easily convoluted constructs; for instance, we have no C-style
for(;;) loop either, even though it's clearly the more general
construct. Instead, in Python you tend to work with more specific
idioms such as its own 'for' loop, 'range', and lists. This appears
to make the code more readable.
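Compare (a trivial made-up example): where C writes `for (i = 0; i < n; i++)`, Python iterates over the sequence directly, or over range(n) when the index itself is needed:

```python
# Iterating over the elements directly -- no loop counter needed.
words = ["spam", "eggs", "ham"]
lengths = []
for word in words:
    lengths.append(len(word))

# When the index is needed, range() supplies it.
indexed = []
for i in range(len(words)):
    indexed.append((i, words[i]))
```

The specific idiom says what is being looped over, instead of how the counter moves.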

People have been showing you some examples of special case idioms that
in part take over the functionality of (?:). I think the idea is that
such special case idioms help in making the code more readable, and catch
90% or more of the cases.
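For the record, two such idioms people use in place of C's `cond ? a : b` (with the caveat on the first one noted in the comment):

```python
x = 5

# The and/or trick: relies on short-circuiting, but only works
# when the "true" value can never be false/empty/zero.
sign = (x >= 0) and "non-negative" or "negative"

# A dictionary lookup: safe regardless of the values involved.
parity = {0: "even", 1: "odd"}[x % 2]
```

Neither is as general as (?:), which is rather the point: they cover the common cases readably.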

Nothing 'heretical' about it; just trade-offs. Python has its idioms
for writing Python, and the strategy does have a pay-off: Python code
tends to be fairly readable.

Regards,

Martijn
-- 
History of the 20th Century: WW1, WW2, WW3?
No, WWW -- Could we be going in the right direction?


