PEP 3131: Supporting Non-ASCII Identifiers

Tue May 15 18:00:07 EDT 2007

On May 15, 10:18 am, René Fleschenberg <r... at korteklippe.de> wrote:
> Carsten Haese schrieb:
>
> > Allowing people to use identifiers in their native language would
> > definitely be an advantage for people from such cultures. That's the use
> > case for this PEP. It's easy for Euro-centric people to say "just suck
> > it up and use ASCII", but the same people would probably starve to death
> > if they were suddenly teleported from Somewhere In Europe to rural China
> > which is so unimaginably different from what they know that it might
> > just as well be a different planet. "Learn English and use ASCII" is not
> > generally feasible advice in such cultures.
>
> This is a very weak argument, IMHO. How do you want to use Python
> without learning at least enough English to grasp a somewhat decent
> understanding of the standard library? Let's face it: To do any "real"
> programming, you need to know at least some English today, and I don't
> see that changing anytime soon. And it is definitely not going to be
> changed by allowing non-ASCII identifiers.
snip

Another way of framing this discussion could be, "should
Python continue to maintain a barrier to its use by non-English
speakers if it is not necessary?"

Virtually every guide to programming style I have ever read stresses
the importance of variable naming.  For example, the Wikipedia article
"programming style" mentions variable naming right after layout
(indentation, etc) in importance:

  "Appropriate choices for variable names are seen as the keystone
for good style. Poorly-named variables make code harder to read
and understand"

Even when non-native speakers of English can understand
English words, their level and speed of comprehension is often far below
what they manage in their native language.  Denying them the ability to use
native-language identifiers puts them at a significant disadvantage
compared to English speakers with regard to reading (their own!) code.
And the justification for this is the hypothetical case that someone
who doesn't understand that language *might* *someday* have to
read it.  Besides the large number of programs that will never be
public (far larger, I would guess, than most of the worriers think), even
in public programs this is not necessarily a disaster.  A public
application written in "Chinese Python" might work perfectly and be
completely usable by me, even if it is difficult for me to understand.
And why should my difficulty count for more than a Chinese person's
difficulty in understanding my "English Python" application?

That Python keywords are English is unimportant -- they are a small
finite set that can be memorized.  Identifiers are a large unbounded
set that can't be.
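
To make the keyword/identifier distinction concrete, here is a minimal
sketch. It assumes an interpreter that implements this PEP (Python 3
does); the Japanese names are hypothetical, picked only for illustration.

```python
import keyword

# The reserved words are a short, fixed list that can be memorized.
print(len(keyword.kwlist))

# Identifiers are chosen by the programmer, so the set is unbounded.
# Hypothetical names: 面積 = "area", 幅 = "width", 高さ = "height".
def 面積(幅, 高さ):
    return 幅 * 高さ

print(面積(3, 4))  # 12
```

Only "def" and "return" are English here; everything the programmer
actually had to think up is in their own language.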

That the standard library code and documentation are in English
is irrelevant.  One shouldn't need to read the standard library code
to use it.  (That one sometimes has to is a Python flaw that should
be fixed -- not bandaided by requiring Python programmers to
know English).

There is no need to understand English to use the standard library.
Documentation has been, and (as Python becomes more popular) will
continue to be, translated into native languages.  Here is Python
standard library documentation in Japanese:

  http://www.python.jp/doc/release/lib/

While encouraging English in shared/public code is fine,
trying to enforce it by continuing to require ASCII-only identifiers
smacks to me of a "whites only country club" mentality.

Making Python more accessible to the world (the vast majority
of whom do not speak English) can only advance Python.