The Importance of Terminology's Quality

Robert Maas, jaycx2.3.calrobert at spamgourmet.com.remove
Mon Sep 1 04:45:51 EDT 2008


> From: r... at rpw3.org (Rob Warnock)
> In the LGP-30, they used hex addresses, sort of[1], but the
> opcodes (all 16 of them) had single-letter mnemonics chosen so that
> the low 4 bits of the character codes *were* the correct nibble for
> the opcode!  ;-}

That's a fascinating design constraint! It would be an interesting
puzzle to find the most efficient design whereby:
- The single-character mnemonics are as easy to memorize as possible;
- The instructions produce as efficient code as possible;
- The mnemonics really do accurately express what the instruction does;
- The total number of instructions needed is minimized, maybe fewer than 16;
- The low-order-four-bits rule holds, of course;
- The avoid-ambiguous-sound criterion described below is met as well.

By the way, do you remember exactly all the 16 opcodes, or have a
Web reference available?

> [Or you could type in the actual hex digits, since the low 4 bits
>  of *their* character codes were also their corresponding binary
>  nibble values... "but that would have been wrong".]

More so because some of the sounds would be ambiguous, which I
recognized when I first started to use the IBM hexadecimal
standard. See below.

> The LGP-30 character code was defined before the industry had
> yet standardized on a common "hex" [sic, "hexadecimal", base 16
> not base 6!!!] character set,

That is, before IBM decided to use hexadecimal in the abend core dumps
from their System/360 and, by their 800-pound-gorilla status, got
everyone else to use that ABCDEF system.

> they used "0123456789fgjkqw".

That doesn't make sense. The low-order four bits of those letters
aren't consecutive ascending values from 9+1 to 9+6. Did you make a
typo, or did you explain something wrong?

(map 'list #'char-code "0123456789fgjkqw")
=> (48 49 50 51 52 53 54 55 56 57 102 103 106 107 113 119)
(loop for n in * collect (rem n 16))
=> (0 1 2 3 4 5 6 7 8 9 6 7 10 11 1 7)
Now if you used this sequence of letters instead:
(map 'list #'char-code "0123456789jklmno")
=> (48 49 50 51 52 53 54 55 56 57 106 107 108 109 110 111)
(loop for n in * collect (rem n 16))
=> (0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
Unfortunately o looks like 0, and l looks like 1.
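
For readers of this list, here is a rough Python equivalent of the
Lisp check above. It assumes, as the Lisp does, that the characters
are taken as ASCII codes; the helper name low_nibbles is made up
for this sketch.

```python
def low_nibbles(chars):
    """Return the low four bits of each character's ASCII code."""
    return [ord(c) & 0x0F for c in chars]

# The sequence quoted from the LGP-30 description.
print(low_nibbles("0123456789fgjkqw"))
# The alternative sequence whose low nibbles run 0 through 15.
print(low_nibbles("0123456789jklmno"))
```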

Anyway, what I hated about IBM's hexadecimal notation was:
- A0 would be pronounced "Ay-tee", which sounds too much like "Eighty".
- 1A would be pronounced "Ay-teen", which sounds too much like "Eighteen".
- On the line printers where we got our abend listings, the capital
   D and digit 0 (which didn't have any diagonal slash) looked almost
   identical when the ribbon was going bad, as it always was.
- Likewise B and 8 looked nearly identical.
- Likewise E and F often looked nearly identical if the lower part of
   the E was hitting a bad part of the ribbon.

Now for single-character mnemonics for four bits of instruction
opcode, to avoid any two characters that look too similar:
   0 1 2 3 4 5 6 7 8 9
     A B C D E F G H I J K L M N O
   P Q R S T U V W X Y Z
We obviously have no choice for KLMNO (nothing else has low nibbles
11 through 15), and since O is forced, look-alikes rule out 0, D and
Q, so we have to use P instead of 0, and our choices look like:
     1 2 3 4 5 6 7 8 9
     A B C   E F G H I J K L M N O
   P   R S T U V W X Y Z
If we have an opcode that sets a register to 1, or clears
register#1, we might use mnemonic "1" for that instruction, but
otherwise we must avoid using digits, use letters only.
     1
     A B C   E F G H I J K L M N O
   P   R S T U V W X Y Z
We can't use both U and V, and we can't use both E and F, and we
can't use both 1 and I, but we're stuck using both M and N, sigh.
At this point I can't decide which branches of the search to
discard and which to fix, so I'll stop analysing this puzzle.
(With lower-case characters different combinations would be mutually
 exclusive, such as l and 1, but I was doing upper case here.)
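
The first step of that search can be sketched in Python. This is only
a sketch under assumptions: ASCII codes, upper case plus digits, and
a look-alike list limited to the pairs named above. The names
candidates_by_nibble, prune_forced and LOOKALIKES are invented here.

```python
import string
from collections import defaultdict

def candidates_by_nibble(chars):
    """Group characters by the low four bits of their ASCII code."""
    groups = defaultdict(list)
    for c in chars:
        groups[ord(c) & 0x0F].append(c)
    return dict(groups)

# Look-alike pairs taken from the discussion above (an assumption
# about what counts as confusable, not a complete list).
LOOKALIKES = [("O", "0"), ("O", "D"), ("O", "Q"),
              ("U", "V"), ("E", "F"), ("1", "I")]

def prune_forced(groups, lookalikes):
    """Whenever a nibble has only one candidate, that character is
    forced, so drop its look-alikes from the other nibbles."""
    groups = {n: list(cs) for n, cs in groups.items()}
    changed = True
    while changed:
        changed = False
        forced = {cs[0] for cs in groups.values() if len(cs) == 1}
        for a, b in lookalikes:
            for keep, drop in ((a, b), (b, a)):
                if keep not in forced:
                    continue
                for cs in groups.values():
                    # Never remove the last candidate for a nibble.
                    if drop in cs and len(cs) > 1:
                        cs.remove(drop)
                        changed = True
    return groups

groups = candidates_by_nibble(string.digits + string.ascii_uppercase)
pruned = prune_forced(groups, LOOKALIKES)
for nibble in sorted(pruned):
    print(nibble, pruned[nibble])
```

This only does the forced-choice pruning (K L M N O, then P); picking
between the remaining pairs such as U/V or E/F is the part of the
search left undecided above.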

If punctuation is allowed, then + - * / = < > would make dandy
mnemonic opcodes for the obvious instructions.
If characters that look like arrows can be used for push and pop,
then we have V for push and ^ for pop.
If characters that look like arrows can be used for moving
left/right in RAM, or shifting bits in a register, then we have <
for left and > for right.
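
If the low-order-four-bits rule were also applied to these punctuation
characters (an assumption, again using ASCII codes), some of them
would land on the same nibble. A small Python sketch, with the
made-up helper name nibble_collisions, shows which ones:

```python
from collections import defaultdict

def nibble_collisions(chars):
    """Return {nibble: [chars]} for every low nibble shared by two
    or more of the given characters (ASCII codes assumed)."""
    by_nibble = defaultdict(list)
    for c in chars:
        by_nibble[ord(c) & 0x0F].append(c)
    return {n: cs for n, cs in by_nibble.items() if len(cs) > 1}

# The punctuation and arrow-like characters suggested above.
print(nibble_collisions("+-*/=<>V^"))
```

Under that assumption - and = share a nibble, as do > and ^, so not
all of them could be opcodes in the same instruction set.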

I wonder if solving this puzzle will yield yet another esoteric
programming language?


