What makes code "readable"? (was Re: Python vs. Perl, which is better to learn?)

Mon May 6 11:22:08 EDT 2002

Mark McEahern wrote:
> When I was looking at that "readable" Perl
> code which George pointed us to (some comments
> probably forthcoming from me in a reply to his
> post, but in any event thanks George! :)  I started
> to ponder that question.  
> 
> "What makes code readable?"
> 
> (Actually it was "why do I still not find this highly
> readable?")
> 
> One thing that occurred to me was that the Perl code had a
> very high number of "transitions" between punctuation and
> text.
> 
> Effectively every line, and sometimes literally
> a dozen times within a line, text and symbols are
> mixed.  Not just the odd parenthesis or period, but
> great streams of that infamous Perl "line noise".
> 
> I think a high "symbol-set-transition rate" (please
> offer a better term) leads to low readability.
> 
> Punctuation itself also inherently lowers
> readability, I believe, which is a reason I find
> assembly easier to read than Perl, though clearly
> less productive.

Oh I dont know I think its just possible though sometimes
difficult for me to articulate if articulate is the right
word which it might be but I digress lets get back to the
point I think punctuation actually does lend a lot of 
readability in both programs and prose Indeed I always find
it pretty annoying when people leave punctuation out of
their email posts and I suspect you know what I mean by
now This is particularly relevant when there are many twists
and turns to be followed as in this passage with parenthetical
clauses and in highly nested code

;-)

Seriously, though, I think that punctuation aids readability.

The real issue with Perl is probably the degree to which
punctuation characters are not used *as* punctuation, but
rather as *words* in their own right.  Python, of course, does
use punctuation too, but it is more like the traditional
uses: '.' separates methods and attributes from their object,
':' introduces a block of code -- though the most potent form
of punctuation in Python is the use of spaces and tabs
which create a strong visual grouping of the code.  You
don't really have to think of these symbols individually --
they are the "negative space" around the tokens you are
thinking about (though they do affect your understanding
of the relationships between tokens).
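
To make that concrete, here is a small sketch. The class and function
names (Counter, count_words) are invented for this illustration and are
not from any real program; the point is only how little the '.' and ':'
demand of the reader compared with the indentation.

```python
class Counter:
    """Hypothetical example class, invented for illustration."""

    def __init__(self):
        self.counts = {}

    def add(self, word):
        # ':' opens each block; the indentation alone shows what is inside it
        if word in self.counts:
            self.counts[word] += 1
        else:
            self.counts[word] = 1


def count_words(text):
    counter = Counter()
    for word in text.split():
        counter.add(word)  # '.' joins the method to the object it acts on
    return counter.counts
```

Squint at it and the shape of the indentation tells you most of the
structure before you have read a single token.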

For me (very visually oriented), this is really useful. If
I want to find something in my code, I often don't really
read it -- I remember what the code "looks like" which 
more or less boils down to the indenting pattern.

This is precisely why I'm so picky about indenting and
bracket placement in C.  With Python, the code pretty much
"looks" the same, regardless of who writes it, and that's
very helpful.

Some people are really into long variable names for
clarity,  but I think there's actually a tradeoff.  Sometimes
a short symbol greatly increases readability. If I'm
going to use a variable a lot (especially in an equation
or set of them), it's often convenient to use short
symbols like:

c # the speed of light
i # index counter

etc.  Instead of "lightspeed" or "count" or something.  It
also reduces the number of "lihgtspeed" errors I have to
fix. :-D

In fact, if I use "c" for the speed of light, I'll be understood
by physics-oriented programmers whatever language they speak,
whereas the word "lightspeed" is specific to English.

Long names are mostly useful when the definition creeps
far away from the usage, or when it's a concept that isn't
very familiar, or has no common abbreviation.

Short names are particularly good in mathematically-intense
blocks, where long names would interfere with the flow
of the equations, or obscure the structure that you're
trying to describe.
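
A rough sketch of the tradeoff, using the Lorentz factor as a stand-in
equation. Both function names (gamma, lorentz_factor_long) are made up
for this example; the two functions compute the same thing and differ
only in naming.

```python
import math

c = 299792458.0  # the speed of light, in m/s


def gamma(v):
    # Short names: the line below looks like the textbook formula
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def lorentz_factor_long(velocity, lightspeed=299792458.0):
    # Long names: same computation, but the structure is harder to see
    velocity_over_lightspeed = velocity / lightspeed
    return 1.0 / math.sqrt(
        1.0 - velocity_over_lightspeed * velocity_over_lightspeed
    )
```

With the short names you can check the code against the formula at a
glance; with the long ones you have to read it.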

Another useful criterion (probably a bigger deal for more
auditory-oriented folks) is "can you pronounce it?".

Seriously, if you can read the code aloud (especially in
an unambiguous way) this may be very helpful.  The
punctuation characters tend to have no fixed pronunciation,
and even if you could pronounce it, it would often
be essentially gibberish.  If you are thinking of it
as a mathematical equation, that *might* be okay (though
I think this is a factor in mathematics, too), but most
algorithms lend themselves to being described as a process,
as if you were reading a cookbook to someone. If you
can do this aloud, then you can also read it in your
head with less cognitive dissonance (i.e. not switching
between verbal and non-verbal thought all the time).

In reading the "line noise" bits of some programs I 
find myself thinking along the lines of "... then we
compare _hmm_ with _hmm_ and then _hmm_ _hmm_ _hmm_ ..."
where each "_hmm_" is me losing my verbal focus, and
substituting some unpronounceable concept. Since it's
not associated with a word, it's harder to keep track
of in your head, so it's probably easier to forget
what you're doing.  Most of the "words" in a Python
program, on the other hand, are pronounceable words,
which means you can stick a verbal tag on them, and
keep track of it better. Even if it's "pee-open" and
"eff-stat" and other non-English words, you still
tend to develop a fixed pronunciation for them.
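
As a small illustration of the "read it aloud" test, here is a sketch
with a function name (longest_line) invented for the purpose. Every
token in it has an obvious spoken form.

```python
def longest_line(lines):
    # Read aloud: "for line in lines: if the length of line is greater
    # than the length of longest, then longest is line."
    longest = ""
    for line in lines:
        if len(line) > len(longest):
            longest = line
    return longest
```

Nothing in there forces a "_hmm_": even "len" gets said as "length"
without a second thought.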

Just IMHO, of course,

Terry

-- 
------------------------------------------------------
Terry Hancock
hancock at anansispaceworks.com       
Anansi Spaceworks                 
http://www.anansispaceworks.com 
------------------------------------------------------