PEP 3131: Supporting Non-ASCII Identifiers

Thu May 17 07:08:42 EDT 2007

> A possible modification to the PEP would be to permit identifiers to
> also include \uxxxx and \Uxxxxxxxx escape sequences (as some other
> languages already do).

Several languages do that (e.g. C and C++), but I deliberately left
this out, as I cannot see how it would work in practice. Also,
it could be added later as a separate extension if there is an actual
need.

> I think this would remove several of the objections: such as being
> unable to tell at a glance whether someone is trying to spoof your
> variable names,

If you are willing to run a script on the patch you receive, you
can perform that check even without having support for the \u
syntax in the language - either you convert to the \u notation,
and then check manually (converting back if all is fine), or you
have an automated check (e.g. at commit time) that checks for
conformance to the style guide.
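
As a rough illustration of such an automated check, here is a minimal
sketch that lists every identifier containing non-ASCII characters. The
function name non_ascii_identifiers is made up for this example, and the
rule "identifiers must be pure ASCII" is only an assumed style guide;
a project could just as well allow a specific set of scripts.

```python
import io
import tokenize


def non_ascii_identifiers(source):
    """Return (line, column, name) for each identifier with non-ASCII characters.

    Hypothetical helper: the policy it checks (ASCII-only identifiers)
    is an assumed style-guide rule, not something the PEP mandates.
    """
    found = []
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        # NAME tokens cover identifiers and keywords; strings and
        # comments are separate token types and are not reported.
        if tok.type == tokenize.NAME and any(ord(ch) > 127 for ch in tok.string):
            found.append((tok.start[0], tok.start[1], tok.string))
    return found
```

A commit hook could run this over each changed file and reject the
commit if the returned list is not empty.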

> or being unable to do minor maintenance on code using
> character sets which your editor doesn't support: you just run the
> script which would be included with every copy of Python to restrict the
> character set of the source files to whatever character set you feel
> happy with. The script should also be able to convert unrepresentable
> characters in strings and comments (although that last operation
> wouldn't be guaranteed reversible). 

Again, if it's reversible, you don't need support for it in the
language. You convert to your editor's supported Unicode subset,
edit, then convert back.
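
To make the round trip concrete, here is a minimal sketch of such a
converter, done entirely outside the language. The names to_ascii and
from_ascii are invented for this example. It assumes the original file
contains no literal \uxxxx or \Uxxxxxxxx sequences of its own (for
instance inside string literals); if it does, the conversion would not
be reversible, so the sketch refuses to convert.

```python
import re

# Matches exactly the escapes that to_ascii() produces.
_ESCAPE = re.compile(r"\\u[0-9a-fA-F]{4}|\\U[0-9a-fA-F]{8}")


def to_ascii(text):
    """Replace each non-ASCII character by a \\uxxxx or \\Uxxxxxxxx escape."""
    # Assumption: pre-existing escapes would make the round trip
    # ambiguous, so reject such input instead of guessing.
    if _ESCAPE.search(text):
        raise ValueError("text already contains \\u escapes")
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append("\\u%04x" % cp)
        else:
            out.append("\\U%08x" % cp)
    return "".join(out)


def from_ascii(text):
    """Inverse of to_ascii(): turn the escapes back into characters."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(0)[2:], 16)), text)
```

Note that any \u escape you type yourself while editing the ASCII
version will also be turned into a character on the way back.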

However, I doubt that the case "my editor cannot display my source
code" is likely to occur: if your editor cannot display certain
characters, you likely have a ban on those characters anyway.

Regards,
Martin