PEP 3131: Supporting Non-ASCII Identifiers

Sun May 13 19:45:56 EDT 2007

On Sun, 13 May 2007 10:52:12 -0700, Paul Rubin wrote:

> "Martin v. Löwis" <martin at v.loewis.de> writes:
>> This is a commonly-raised objection, but I don't understand why people
>> see it as a problem. The phishing issue surely won't apply, as you
>> normally don't "click" on identifiers, but rather type them. In a
>> phishing case, it is normally difficult to type the fake character
>> (because the phishing relies on you mistaking the character for another
>> one, so you would type the wrong identifier).
> 
> It certainly does apply, if you're maintaining a program and someone
> submits a patch.  In that case you neither click nor type the
> character.  You'd normally just make sure the patched program passes
> the existing test suite, and examine the patch on the screen to make
> sure it looks reasonable.  The phishing possibilities are obvious.

Not to me, I'm afraid. Can you explain how it works? A phisher might be
able to fool a casual reader, but how does he fool the compiler into
executing the wrong code?

As for project maintainers, surely a patch using some unexpected script or
encoding would fail the "looks reasonable" test? That could even be
automated -- if the patch uses an unexpected "# -*- coding: blah -*-" line,
or includes characters outside of a pre-defined range, ring alarm bells.
("Why is somebody patching my Turkish module in Korean?")
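
A rough sketch of what such an automated check might look like, assuming
the patch is a unified diff and the project policy is "ASCII only, utf-8
coding line". The function name, the policy constants and the sample patch
are all made up for illustration; a real project would pick its own
allowed range.

```python
import re

# Hypothetical project policy (assumptions, not from any real tool):
# plain ASCII is always fine, plus any characters listed here.
ALLOWED_EXTRA = frozenset()
EXPECTED_CODING = "utf-8"

# Matches the encoding name in a "# -*- coding: name -*-" style comment.
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")


def suspicious_lines(patch_text, allowed_extra=ALLOWED_EXTRA,
                     expected_coding=EXPECTED_CODING):
    """Return (patch line number, reason) pairs for lines a patch adds."""
    findings = []
    for lineno, line in enumerate(patch_text.splitlines(), start=1):
        # Only look at added lines; skip the "+++ filename" header.
        # (Simplification: an added line that itself begins with "++"
        # would be skipped too.)
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]

        # Alarm bell 1: a coding declaration other than the expected one.
        match = CODING_RE.search(added)
        if added.lstrip().startswith("#") and match:
            if match.group(1).lower() != expected_coding:
                findings.append(
                    (lineno, "unexpected coding declaration: " + match.group(1)))

        # Alarm bell 2: characters outside the pre-defined range.
        bad = sorted({ch for ch in added
                      if ord(ch) > 127 and ch not in allowed_extra})
        if bad:
            names = ", ".join("U+%04X" % ord(ch) for ch in bad)
            findings.append(
                (lineno, "characters outside allowed range: " + names))
    return findings


# Made-up patch: the second added line spells "total" with a Cyrillic
# letter (U+0430) in place of the Latin "a".
SAMPLE_PATCH = (
    "--- a/mod.py\n"
    "+++ b/mod.py\n"
    "@@ -1,2 +1,3 @@\n"
    "+# -*- coding: koi8-r -*-\n"
    " total = 0\n"
    "+tot\u0430l = 1\n"
)

if __name__ == "__main__":
    for lineno, reason in suspicious_lines(SAMPLE_PATCH):
        print("patch line %d: %s" % (lineno, reason))
```

This only rings the bell; a human still has to decide whether the flagged
characters are legitimate for the module in question.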

-- 
Steven