ASCII and Unicode [was Re: Managing Google Groups headaches]

Chris Angelico rosuav at gmail.com
Fri Dec 6 18:42:13 EST 2013


On Sat, Dec 7, 2013 at 6:00 AM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
>     - character 33 was permitted to be either the exclamation
>       mark ! or the logical OR symbol |
>
>     - consequently character 124 (vertical bar) was always
>       displayed as a broken bar ¦, which explains why even today
>       many keyboards show it that way
>
>     - character 35 was permitted to be either the number sign # or
>       the pound sign £
>
>     - character 94 could be either a caret ^ or a logical NOT ¬

Yeah, good fun stuff. I first met several of these ambiguities in the
OS/2 REXX documentation, which detailed the language's operators by
specifying their byte values as well as their characters - for
instance, this quote from the docs (yeah, I still have it all here):

"""
Note:   Depending upon your Personal System keyboard and the code page
you are using, you may not have the solid vertical bar to select. For
this reason, REXX also recognizes the use of the split vertical bar as
a logical OR symbol. Some keyboards may have both characters. If so,
they are not interchangeable; only the character that is equal to the
ASCII value of 124 works as the logical OR. This type of mismatch can
also cause the character on your screen to be different from the
character on your keyboard.
"""
(The front material on the docs says "(C) Copyright IBM Corp. 1987,
1994. All Rights Reserved.")

It says "ASCII value" where on this list we would be more likely to
say "byte value", and I'd prefer "represented by" over "equal to",
but nonetheless it clearly distinguishes characters from bytes. The
language spec is written in terms of characters, but ultimately the
interpreter looks at bytes, so when there's a mismatch, it's byte 124
that's the one defined as logical OR.
Oh, and note the copyright date. The byte/char distinction isn't new.
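
In Python 3 terms, the same distinction can be sketched roughly like
this. The is_logical_or() helper is made up for illustration (it is
not anything REXX or Python provides), and Latin-1 is just one
encoding picked to show the broken bar landing on a different byte:

```python
# Hypothetical stand-in for an interpreter that defines its logical OR
# operator by byte value rather than by how the glyph looks.
def is_logical_or(byte_value: int) -> bool:
    return byte_value == 124

solid_bar = "|"         # U+007C VERTICAL LINE
broken_bar = "\u00a6"   # U+00A6 BROKEN BAR

# Characters become bytes only once an encoding is chosen.
solid_bytes = solid_bar.encode("ascii")       # b'|'
broken_bytes = broken_bar.encode("latin-1")   # b'\xa6'

print(solid_bytes[0], is_logical_or(solid_bytes[0]))    # 124 True
print(broken_bytes[0], is_logical_or(broken_bytes[0]))  # 166 False
```

The two characters may look alike on a keycap, but only the one that
ends up as byte 124 passes the check.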

ChrisA


