PEP 3131: Supporting Non-ASCII Identifiers

Alex Martelli aleax at mac.com
Sun May 13 18:35:15 EDT 2007


Bruno Desthuilliers <bdesth.quelquechose at free.quelquepart.fr> wrote:

> > Disallowing this does *not* guarantee in any way that
> > identifiers are understandable for English native speakers.
> 
> I'm not an English native speaker. And there's more than a subtle 
> distinction between "not garantying" and "encouraging".

I agree with Bruno and the many others who have expressed disapproval
for this idea -- and I am not an English native speaker, either (and
neither, it seems to me, are many others who dislike this PEP).  The
mild pleasure of using accented letters in code "addressed strictly to
Italian-speaking audiences and never intended to be of any use to
anybody not speaking Italian" (should I ever desire to write such code)
pales in comparison with the disadvantages, many of which have already
been analyzed or at least mentioned.

Homoglyphic characters _introduced by accident_ should not be discounted
as a risk, as, it seems to me, was done early in this thread after the
issue had been mentioned.  In the past, it has happened to me to
erroneously introduce such homoglyphs in a document I was preparing with
a word processor, by a slight error in the use of the system- provided
way for inserting characters not present on the keyboard; I found out
when later I went looking for the name I _thought_ I had input (but I
was looking for it spelled with the "right" glyph, not the one I had
actually used which looked just the same) and just could not find it.

On that occasion, suspecting I had mistyped in some way or other, I
patiently tried looking for "pieces" of the word in question, eventually
locating it with just a mild amount of aggravation when I finally tried
a piece without the offending character.  But when something similar
happens to somebody using a sufficiently fancy text editor to input
source in a programming language allowing arbitrary Unicode letters in
identifiers, the damage (the sheer waste of developer time) can be much
more substantial -- there will be two separate identifiers around, both
looking exactly like each other but actually distinct, and unbounded
amount of programmer time can be spent chasing after this extremely
elusive and tricky bug -- why doesn't a rebinding appear to "take", etc.
With some copy-and-paste during development and attempts at debugging,
several copies of each distinct version of the identifier can be spread
around the code, further hampering attempts at understanding.


Alex



More information about the Python-list mailing list