Multibyte Character Support for Python

Magnus Lyckå magnus at thinkware.se
Wed May 15 10:52:02 EDT 2002


Jacob Hallen wrote:

> I am Swedish and English is not my first language.


Me too. :)


> My view is that Python source code should be UTF-8, so that you can represent
> multilingual strings in a readable way. However, I still think that
> identifiers should be limited to ASCII. 


I agree with Alex that it's usually best to use English in
programming, for many reasons. One could argue, though, about
whether policy or a Python limitation should determine how you
write identifiers. I think it would be better to decide such
things through discussion and project policy than through
programming language limitations.

I do see two reasons to allow national identifiers:

1. The closer a piece of source code describes the actual problem
    it is meant to handle, the better it is. We all know the virtue
    of choosing good variable names etc.

    In customer-specific, large software development projects,
    there is often a lot of project-specific terminology, and big
    glossaries that are used to describe parts of the problem
    domain in detail. For instance I've worked in state authorities
    that have a lot of laws and regulations that control the things
    that the programs will do. In these situations, running the
    entire project in English is not an option. The Swedish law
    is still written in Swedish ;). Translating the whole project
    glossary for the sake of a programming language is not a good
    thing. It's enough to learn the terminology of the end users if
    you are a programmer. If you also have to learn a second set of
    synonyms that doesn't make sense to the end users, you will be
    bound to make more mistakes in the code. For Swedish it's not a
    huge problem. You write åäö like aao, and it looks a bit ugly,
    but most of the time you understand what is meant. In the worst
    case you write aa, ae, oe instead. In other languages, US ASCII
    might be much more limiting and problematic.

2. I think a real programmer must know English. It's as simple
    as that. Just as a good electronics engineer needs English,
    and doctors need some Latin. Without it you are isolated from
    your peers. But if we want to see Python as a tool for
    non-programmers and beginners, things are a bit different. For
    me it's just been annoying that Swedish Excel macros used "om"
    and "medan" instead of "if" and "while", but we have to realize
    that we can't place the same requirement on a beginner or a
    casual user, as we can on a pro. Again, I don't think it's a
    big issue in Sweden and similar countries, both because the
    alphabet is so similar, and because most people here know
    English, but for many people, lack of English knowledge will
    be the big hurdle in learning Python. To be able to REALLY
    utilize Python, they would need to know enough English to
    understand how the library modules are used, so allowing
    Unicode identifiers is just a small relief, but I can well
    imagine that it would matter for people like Steven's Japanese
    students...
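
To make the aao and aa/ae/oe spellings from point 1 concrete, here is
a minimal sketch in Python. The helper name to_ascii_name and its
digraphs flag are made up for illustration; they are not part of any
library. The sketch assumes precomposed characters and covers only
å, ä and ö.

```python
# Hypothetical helper (not from any library): spell Swedish names in US ASCII.
# Assumes precomposed characters; only å, ä and ö are handled.

# Scheme 1: drop the diacritics (åäö -> aao).
SIMPLE = str.maketrans("åäöÅÄÖ", "aaoAAO")

# Scheme 2: the "worst case" digraphs (aa, ae, oe).
DIGRAPH = str.maketrans({
    "å": "aa", "ä": "ae", "ö": "oe",
    "Å": "Aa", "Ä": "Ae", "Ö": "Oe",
})


def to_ascii_name(name, digraphs=False):
    """Return name with å, ä, ö replaced by an ASCII spelling."""
    table = DIGRAPH if digraphs else SIMPLE
    return name.translate(table)


print(to_ascii_name("förmögenhetsskatt"))                 # formogenhetsskatt
print(to_ascii_name("förmögenhetsskatt", digraphs=True))  # foermoegenhetsskatt

# The simple scheme can merge distinct words: "får" (sheep) and
# "far" (father) both come out as "far".
print(to_ascii_name("får") == to_ascii_name("far"))       # True
```

The digraph scheme keeps such words apart ("faar" versus "far"), at
the cost of longer and uglier names.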

The main issue, in my opinion, has nothing to do with Python, but
with operating systems, internet standards, editors and all the
other junk that assumes that anything except US ASCII is some kind
of obscure thing that we can disregard.

If our general computer environments were adapted to Unicode and
able to let us enter and display all symbols on all computers,
non-ASCII Python identifiers might be worth considering. Until
that is solved, allowing them would probably create more problems
than it solves.

There is a general problem of "cultural imperialism" to consider
here, and I think we all need to understand the cultural and
social problems involved in "forcing" people to use English,
whatever we personally think about it. It's certainly easier for
everybody to learn one common second language than for everyone to
learn all the other languages, and I don't think Esperanto or
Interlingua has a chance of beating English as a lingua franca. Still,
many people resist and try to fight anglification--on different
levels.


> Jacob Hallén


BTW, thanks for the effort to form Python Business Forum. I
hope it will make some impact. (You don't remember me, do you?
GothCon IV to ..., CD 88-90 etc... No, I didn't think so... ;^)

/Magnus Lyckå



