PEP 3131: Supporting Non-ASCII Identifiers

John Nagle nagle at animats.com
Sun May 13 13:30:11 EDT 2007


Martin v. Löwis wrote:
> PEP 1 specifies that PEP authors need to collect feedback from the
> community. As the author of PEP 3131, I'd like to encourage comments
> to the PEP included below, either here (comp.lang.python), or to
> python-3000 at python.org
> 
> In summary, this PEP proposes to allow non-ASCII letters as
> identifiers in Python. If the PEP is accepted, the following
> identifiers would also become valid as class, function, or
> variable names: Löffelstiel, changé, ошибка, or 売り場
> (hoping that the latter one means "counter").


> All identifiers are converted into the normal form NFC while parsing;
> comparison of identifiers is based on NFC.

     That may not be restrictive enough.  Normalizing only while parsing
still permits several different character sequences for the same
identifier in one source file, so search and replace operations on the
source text might not find every occurrence.  Identifiers should be
required to appear in source text in a single canonical form, probably
NFC, and any other form should be treated as a syntax error.
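
     A minimal sketch of the problem, using only the standard
unicodedata module.  The helper name is_nfc and the sample source
string are made up for illustration; this is not what the PEP or the
parser does, just a way to see why a plain text search can miss an
identifier.

```python
import unicodedata

def is_nfc(identifier: str) -> bool:
    """Return True if the identifier is already in normal form NFC."""
    return unicodedata.normalize("NFC", identifier) == identifier

# Two spellings of the same identifier from the PEP summary:
composed = "chang\u00e9"      # 'e with acute' as one code point (NFC)
decomposed = "change\u0301"   # plain 'e' followed by a combining acute accent

# Hypothetical source text that happens to use the decomposed spelling.
source = decomposed + " = 1"

# A plain text search for the NFC spelling does not find it...
found_by_search = composed in source

# ...although both spellings normalize to the same identifier.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed

print(found_by_search, same_after_nfc, is_nfc(composed), is_nfc(decomposed))
```

     Rejecting any identifier for which is_nfc() is false would be one
way to enforce a unique source representation.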

     I'd suggest restricting identifiers under the rules of UTS-39
(Unicode Security Mechanisms), profile 2, "Highly Restrictive".  This
limits mixing of scripts in a single identifier; you can't mix Hebrew
and ASCII letters, for example, which prevents problems with mixing
right-to-left and left-to-right scripts.  Domain names have similar
restrictions.
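
     The direction problem can be illustrated with a rough check.
This is not an implementation of UTS-39, which works on script
properties that the standard library does not expose; it only looks at
the bidirectional class of each character, and the function name
mixes_directions is an assumption for this sketch.

```python
import unicodedata

def mixes_directions(identifier: str) -> bool:
    """Return True if the identifier contains both strong left-to-right
    and strong right-to-left characters."""
    classes = {unicodedata.bidirectional(ch) for ch in identifier}
    has_ltr = "L" in classes                  # e.g. Latin, Cyrillic, CJK
    has_rtl = bool(classes & {"R", "AL"})     # e.g. Hebrew, Arabic
    return has_ltr and has_rtl

print(mixes_directions("abc"))            # Latin only
print(mixes_directions("\u05d0\u05d1"))   # Hebrew only (alef, bet)
print(mixes_directions("abc\u05d0"))      # Latin followed by Hebrew alef
```

     A real restriction profile would also reject mixtures that this
check accepts, such as Latin with Cyrillic, since both are
left-to-right.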

				John Nagle


