[Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers

Stephen J. Turnbull stephen at xemacs.org
Sat May 3 20:34:32 CEST 2014


Steven D'Aprano writes:

 > If I name a variable "x2", what is the "one simple or obvious 
 > interpretation" that such an identifier presumably has? If standard, 
 > ASCII-only identifiers don't have a single interpretation, why should 
 > identifiers like σ² be held to that requirement?

Because subscripts and superscripts are syntactic constructs, and
naturally decompose into two identifiers in a specific relationship
(even if that relationship cannot be further specified without going
deep into some domain of discourse) -- and that is much of the
motivation for wanting to use them.  "x2" does not carry that load.
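
In fact the Unicode Character Database records exactly that
decomposition.  A quick check with the stdlib unicodedata module
(character names spelled out with \N{...} for clarity):

    >>> import unicodedata
    >>> # each decomposes to a plain digit plus a positioning tag:
    >>> # base character + relationship
    >>> unicodedata.decomposition('\N{SUPERSCRIPT TWO}')
    '<super> 0032'
    >>> unicodedata.decomposition('\N{SUBSCRIPT TWO}')
    '<sub> 0032'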

Note that Unicode itself considers them *compatibility* characters and
says:

    Superscripts and subscripts have been included in the Unicode
    Standard only to provide compatibility with existing character
    sets.  In general, the Unicode character encoding does not attempt
    to describe the positioning of a character above or below the
    baseline in typographical layout.

In other words, Unicode is reluctant to guarantee that x2, x², and x₂
are actually different identifiers!  It's considered bad practice to
treat them as the same, but not actually forbidden.
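
Concretely, NFKC compatibility normalization does fold the
distinction away, and PEP 3131 specifies that Python identifiers are
NFKC-normalized while parsing:

    >>> import unicodedata
    >>> # all three spellings normalize to the same string
    >>> [unicodedata.normalize('NFKC', s)
    ...  for s in ('x2', 'x\u00b2', 'x\u2082')]
    ['x2', 'x2', 'x2']

So unless the normalization rule were changed as well, admitting
these characters would make x2, x², and x₂ the *same* identifier.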

At least two technical reports (UTR #20 and UTR #25) discourage their
use except where they are letter-like (phonetic transcriptions use
several such letters, with meanings distinct from their compatibility
equivalents).
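
The letter-like exception is visible in the character properties:
the phonetic modifier letters have General_Category Lm (a letter
category, already admitted by PEP 3131), while the superscript and
subscript digits are No:

    >>> import unicodedata
    >>> unicodedata.category('\N{MODIFIER LETTER SMALL H}')  # IPA aspiration
    'Lm'
    >>> unicodedata.category('\N{SUPERSCRIPT TWO}')
    'No'

(Although even the modifier letters are compatibility characters:
NFKC folds 'xʰ' to 'xh', so they too collide in today's identifiers.)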

The more I look into this, the more I think it is really problematic.


