[Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers

Sat May 3 14:27:31 CEST 2014

Steven D'Aprano writes:
 > On Sat, May 03, 2014 at 06:38:21PM +1200, Greg Ewing wrote:
 > > Steven D'Aprano wrote:
 > > >Particularly for mathematically-focused code, I think it would be useful 
 > > >to be able to use identifiers like (say) σ² for variance,
 > > 
 > > Having σ² be a variable name could be confusing. To a
 > > mathematician, it's not a distinct variable, it's
 > > just σ ** 2.
 > 
 > Actually, not really. A better way of putting it is that the standard 
 > deviation is "just" the square root of σ². Variance comes first (it's 
 > defined from first principles), and then the standard deviation is 
 > defined by taking the square root.

Thank you for writing that better than I could have. :-)

 > But really, it doesn't matter which is derived from which. To a 
 > mathematician, x² is just as much a legitimate variable as x. One can 
 > say that f is a function of x² just as well as saying that it is a 
 > function of y, where y happens to equal x².

We part company here.  x² (in the usage "function of x²") is not a
variable, it's an expression.  I don't think I've ever seen the usage
"f(x²) = ..." in a *definition* of "f", with the single exception of
the use of "f(μ,σ²) = ..." in defining the distribution of a random
variable, and even then that's unusual (σ is almost always more
convenient, even for test statistics).  I'd consider that the
exception that proves the rule....  Especially in a case like
z(x,μ,σ²) = (x - μ)/σ!

To put it another way, I suspect you would get rather upset if I used
both x and x² in such a context and treated them as I would x and y.
Or, if in real analysis I ignored the fact that x² is necessarily
non-negative.  I could go on, but I think the point is clear:
*linguistically* these are expressions, not variables -- they are
constructed syntactically, and their semantics can be deduced from the
syntax.

Of course in mathematics you can treat them as variables (as
statisticians do with σ²), but that works because in mathematics no symbols
or syntax have fixed semantics, not π, not even 0.  If you can get a
version of Python that has "where ..." clauses in it that can define
semantics for sub- and superscript syntax past Guido, I'd be all for
this.  But I really don't think that's going to happen.<wink/>

 > Is it useful enough to make up for the (minor) issues that others
 > have already mentioned? I think so, but I will understand if others
 > disagree. I think that the ability to distinguish between x² and
 > x₂ can be important,

Which, I suspect, means these notations don't pass the "generalized
grit on Tim's monitor" test.

 > and both x2 and x_2 are poor substitutes.

In programming (as opposed to the chemistry of nuclear fusion), if you
need to distinguish x² from x₂, and x**2 and x[2] don't do the trick,
I suspect your notation has real readability problems no matter how
you arrange things spatially.  I guess that use cases where such usage
is in good taste are way too rare to justify this.
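
For reference, here is a rough sketch of how a current CPython 3 treats
the spellings discussed above.  It only uses str.isidentifier() and
unicodedata.normalize(); the names "candidates", "is_valid_name" and
"nfkc" are made up for this illustration.  PEP 3131 has the parser
normalize identifiers to NFKC, so the sketch also shows what x² and x₂
would fold to if the digits were ever admitted.  I'm assuming a
Unicode-aware terminal for the printed output.

```python
import unicodedata

# Spellings taken from the thread; the list name is hypothetical.
candidates = ["σ", "x2", "x_2", "x²", "x₂", "σ²"]


def is_valid_name(name: str) -> bool:
    """Report whether the string is lexically a valid identifier."""
    return name.isidentifier()


def nfkc(name: str) -> str:
    """Apply NFKC, the normalization PEP 3131 specifies for identifiers."""
    return unicodedata.normalize("NFKC", name)


for name in candidates:
    print(f"{name!r}: identifier={is_valid_name(name)}, NFKC={nfkc(name)!r}")
```

As far as I can tell, the superscript and subscript digits are rejected
today, and under NFKC both x² and x₂ would collapse to plain x2 anyway,
so allowing them in identifiers would not by itself keep the two apart.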