Long names are doom ?

Andrew Dalke dalke at acm.org
Fri May 25 21:42:05 EDT 2001


00001111 <00001111 at aol.com> wrote:
>  Anybody use variables/names longer than 31 character
>and finds it really useful ?

I've a couple 32+ character names.  One is
  write_sigfile_from_molecule_list

(This was found with: egrep '[^a-zA-Z_][a-zA-Z_]{32,}' *.py)
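A rough Python version of that search, as a sketch only.  It
uses the standard tokenize module rather than a regular
expression, so it also catches names at the start of a line
and names containing digits, which the egrep pattern can miss.
The function name find_long_names and the sample source are
made up for illustration.

```python
import io
import keyword
import tokenize


def find_long_names(source, min_length=32):
    """Return sorted identifiers in `source` with at least min_length characters."""
    long_names = set()
    # generate_tokens wants a readline callable that yields str lines.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if (tok.type == tokenize.NAME
                and not keyword.iskeyword(tok.string)
                and len(tok.string) >= min_length):
            long_names.add(tok.string)
    return sorted(long_names)


# Hypothetical sample input; a real run would read each *.py file.
sample = (
    "def write_sigfile_from_molecule_list(molecule_list):\n"
    "    return len(molecule_list)\n"
)
print(find_long_names(sample))
```

Unlike the egrep, this ignores long runs of letters inside
strings and comments, since only NAME tokens are counted.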

There are a lot of ways to access molecular information
(from a list, from a database, from a filename) and a
lot of formats to which it can be written, so finding
a shorter name (like "write_sigfile" or "write_molecule_list")
is not really possible.

There is also an internal coding convention to use "_list"
instead of "s" and to spell out "molecule" instead of "mol";
otherwise I would use "write_sigfile_from_mols".

The words in this expression are also very common, which
means that while it is long, muscle memory helps in typing
because each individual word will frequently be used in any
program which uses "write_sigfile_from_molecule_list".

>- "never seen a well written, legible program
>  that uses any identifiers longer than 18-20 characters..".

Some programs are auto generated and not designed for
human readability.  They may use an encoding scheme to
describe certain variables.

>- "long variables names are *hard* to read.  And, you have to
>  read though all the characters of every instance of them...".

Only if there are multiple words with the same shape.  For
example:
  write_sigfile_from_molecule_list

is likely to be confused with
  write_sigfile_tron_molecule_list

but not with
  write_sigfile_from_database

If the word shapes are sufficiently different, then the
fact that the code works (or that the IDE understands it)
is sufficient to know it isn't misspelled.

Compare this to code I've worked with (FORTRAN and elsewhere)
which does things like:
  char *l;
  char **ll;
  char *l1, *l2;

so short words have identical problems - and with less ability
to use the word shape as a clue.
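One crude way to put a number on the "from" vs. "tron" point
is a character-level similarity ratio from the standard
difflib module.  This is only a stand-in for word shape, which
is really a visual property, and the helper name similarity
is an assumption made for this sketch.

```python
from difflib import SequenceMatcher


def similarity(name_a, name_b):
    """Return a 0..1 character-level similarity ratio between two names."""
    return SequenceMatcher(None, name_a, name_b).ratio()


reference = "write_sigfile_from_molecule_list"
candidates = [
    "write_sigfile_tron_molecule_list",  # same shape, easy to confuse
    "write_sigfile_from_database",       # different shape, easy to tell apart
]
for candidate in candidates:
    print("%-34s %.2f" % (candidate, similarity(reference, candidate)))
```

The "tron" variant scores closer to the original than the
"database" variant does, which matches the intuition that
near-identical names are the ones that get misread.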

>- "it degrades the legibility of a program to use identifiers that
>  can't be easily remembered...."

Is "write_sigfile_from_molecule_list" harder to memorize than,
say, "wsffml"?

But in truth, choosing a short, easily remembered name is best,
if possible.  The example I gave is used perhaps once every
dozen programs and is not worth the effort to find a good name,
since people will forget it and go to the index and search
for "write sigfile".

Go get a copy of "Code Complete" by Steve McConnell.  Chapter
9 (pp 185 - 213) is about "The Power of Data Names."  It
includes the following sections and gives reasons pro and
con for different styles.

  9.1 Considerations in Choosing Good Names
  9.2 Naming Specific Types of Data
  9.3 The Power of Naming Conventions
  9.4 Informal Naming Conventions
  9.5 The Hungarian Naming Convention
  9.6 Creating Short Names That Are Readable
  9.7 Kinds of Names to Avoid

For that matter, read the whole book.  He goes into readable
detail on many of the practical aspects of writing code.

>As a result, despite 90% of computer languages have long, very
>long or 'infinite' identifiers, fortran folks seems plan to stay
>with their 6...aargh ...sorry this was just not far ago... 31 character
>limit intil year 3000.

That's a minimum implementation requirement, correct?  A vendor
could provide a compiler which allows variables with 84325 characters
in them, as I understand it.  If you've the need and have money, pay
one of the vendors for such a version.  If you have time, work
with g77.  If neither, do you really have a need?

                    Andrew
                    dalke at acm.org





