Invalid identifier claimed to be valid by docs (methinks)

Sun Sep 23 18:57:53 EDT 2012

On Sun, Sep 23, 2012 at 4:24 PM, Joshua Landau
<joshua.landau.ws at gmail.com> wrote:
> The docs describe identifiers to have this grammar:
>
> identifier   ::=  xid_start xid_continue*
> id_start     ::=  <all characters in general categories Lu, Ll, Lt, Lm, Lo,
> Nl, the underscore, and characters with the Other_ID_Start property>
> id_continue  ::=  <all characters in id_start, plus characters in the
> categories Mn, Mc, Nd, Pc and others with the Other_ID_Continue property>
> xid_start    ::=  <all characters in id_start whose NFKC normalization is in
> "id_start xid_continue*">
> xid_continue ::=  <all characters in id_continue whose NFKC normalization is
> in "id_continue*">
>
> So I would assume that
>     exec("a{} = None".format(char))
> would be valid if
>    unicodedata.normalize("NFKC", char)  == "1"
> as
>    exec("a1 = None")
> is valid.
>
> BUT "a¹ = None" is not valid*.
>
> *a<superscript 1>, accessible through <ALT-GR>+1 if your keyboard's set up
> to do that stuff.
>
> Thank you for your times.

Or if you don't have a keyboard for that, you can do the same thing via:

exec("x\u00b9 = None")  # U+00B9 is superscript 1

On the other hand, this does work:

exec("x\u2071 = None")  # U+2071 is superscript i

So it seems to be only an issue with superscript and subscript digits.
Looks like a compiler bug to me.
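
To narrow down which characters are affected, it may help to print each
character's general category and NFKC form next to what str.isidentifier()
says. This is only a rough sketch, assuming Python 3 and the standard
unicodedata module; the helper name describe() is made up for illustration
and is not part of any library.

```python
import unicodedata

def describe(char):
    """Return (general category, NFKC form, whether 'x' + char is an identifier)."""
    return (
        unicodedata.category(char),
        unicodedata.normalize("NFKC", char),
        ("x" + char).isidentifier(),
    )

# Plain digit 1, superscript 1, superscript i, subscript 1.
for char in ("1", "\u00b9", "\u2071", "\u2081"):
    print("U+%04X" % ord(char), describe(char))
```

Here the superscript and subscript digits report category No, which is not
one of the categories the quoted grammar lists for id_continue, even though
their NFKC form is "1". That might be worth comparing against the grammar
before filing anything.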