[Python-3000] PEP 3131: what are the risks?

Mon Jun 11 16:43:35 CEST 2007

On 6/10/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Still, what is the risk being estimated? Is it that somebody
> maliciously tries to provide patches that use look-alike
> characters? I honestly don't know what risks you see.

Here are the top three that I see; note that none of these concerns
says "Don't use non-ASCII ids".  They do all say "Don't use ids from a
script the user hasn't said to expect".

(1)  A malicious user is indeed one risk.  A small probability, but a
big enough loss that I want a warning when the door is unlocked.

(2)  Typos are another risk.  Even in mono-lingual environments, it is
possible to get a wrong letter.  If you're expecting ì, it is fine.
If you're not, then it shouldn't pass silently.

(3)  "Reados".  When doing maintenance later, if I wasn't expecting ì,
I may see it as a regular i, and code that way.  Now I have two
doppelganger/döppelganger variables (or inherited methods) serving the
same purpose, but using different memory locations.
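
To make (3) concrete, here is a minimal sketch of how two look-alike
names end up as separate bindings.  The names "limit" and "lìmit" are
made up for illustration, and the accented letter is written as an
escape so the difference is visible in any font.

```python
# Hypothetical names; "\u00ec" is ì, LATIN SMALL LETTER I WITH GRAVE.
plain = "limit"
accented = "l\u00ecmit"

namespace = {}
# Bind both names in one namespace, as if both appeared in one module.
exec(f"{plain} = 10\n{accented} = 20", namespace)

# Skip the "__builtins__" entry that exec adds to the namespace.
user_names = sorted(k for k in namespace if not k.startswith("__"))
print(user_names)                             # two entries, not one
print(namespace[plain], namespace[accented])  # each keeps its own value
```

Nothing here raises an error or a warning; the second assignment
simply creates a second variable.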

Ideally, the test cases will catch this.  In real life, even the
Python stdlib has plenty of modules with poor test coverage.  I can't
expect better of random code, particularly given that it has chosen to
ignore the style guide (and history) about sticking to ASCII for
distributed code.  (Learning to store your tests generally comes long
after picking up the basic style guidelines.)
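
Since tests can't be counted on, the kind of warning I have in mind
could be sketched roughly as below.  This is only an illustration, not
a proposed implementation: the function name and the expected_scripts
parameter are invented here, and taking the first word of a
character's Unicode name as its "script" is a rough heuristic, since
the unicodedata module does not expose the real Script property.

```python
import io
import tokenize
import unicodedata


def unexpected_identifiers(source, expected_scripts=()):
    """Yield (line, identifier, char) for each non-ASCII character in an
    identifier whose approximate script the user did not say to expect."""
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type != tokenize.NAME:
            continue
        for ch in tok.string:
            if ord(ch) < 128:
                continue  # ASCII is always expected
            # Heuristic: "LATIN SMALL LETTER I WITH GRAVE" -> "LATIN"
            script = unicodedata.name(ch, "UNKNOWN").split()[0]
            if script not in expected_scripts:
                yield tok.start[0], tok.string, ch


source = "size = 1\ns\u00ecze = 2\n"

# Nothing declared as expected: the accented name is reported.
for line, name, ch in unexpected_identifiers(source):
    print(f"line {line}: {name!r} contains unexpected {ch!r}")

# The user said to expect Latin letters beyond ASCII: nothing is reported.
print(list(unexpected_identifiers(source, expected_scripts=("LATIN",))))
```

The point of the default is that silence has to be asked for: ì passes
quietly only once someone has said they expect it.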

-jJ