Invalid identifier claimed to be valid by docs (methinks)

Joshua Landau joshua.landau.ws at gmail.com
Mon Sep 24 05:53:43 EDT 2012


On 24 September 2012 03:42, Terry Reedy <tjreedy at udel.edu> wrote:

> On 9/23/2012 6:57 PM, Ian Kelly wrote:
>
>> On Sun, Sep 23, 2012 at 4:24 PM, Joshua Landau
>> <joshua.landau.ws at gmail.com> wrote:
>>
>>> The docs describe identifiers as having this grammar:
>>>
>>> identifier   ::=  xid_start xid_continue*
>>> id_start     ::=  <all characters in general categories Lu, Ll, Lt, Lm, Lo, Nl, the underscore, and characters with the Other_ID_Start property>
>>> id_continue  ::=  <all characters in id_start, plus characters in the categories Mn, Mc, Nd, Pc and others with the Other_ID_Continue property>
>>> xid_start    ::=  <all characters in id_start whose NFKC normalization is in "id_start xid_continue*">
>>>
>>
> xid_start is a subset of id_start
>
>
>>> xid_continue ::=  <all characters in id_continue whose NFKC normalization is in "id_continue*">
>>>
>>
> xid_continue is a subset of id_continue.
>
>
>>> So I would assume that
>>>      exec("a{} = None".format(char))
>>> would be valid if
>>>     unicodedata.normalize("NFKC", char)  == "1"
>>>
>>
> Read more carefully the definition of xid_continue. The un-normalized
> character must also be in id_continue.
>
Correct. Thank you for your time.
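
For the archives, here is a rough sketch of the check I should have run first. It only covers the general categories listed in the grammar above; the Other_ID_Start and Other_ID_Continue code points are left out, and the names ID_CONTINUE_CATEGORIES and describe are just mine for illustration.

```python
import unicodedata

# General categories the quoted grammar lists for id_continue.
# Assumption: Other_ID_Start / Other_ID_Continue code points are ignored here.
ID_CONTINUE_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl", "Mn", "Mc", "Nd", "Pc"}


def describe(char):
    """Return (category, NFKC form, whether the category is in id_continue)."""
    category = unicodedata.category(char)
    normalized = unicodedata.normalize("NFKC", char)
    return category, normalized, category in ID_CONTINUE_CATEGORIES


superscript_one = describe("\u00b9")  # superscript 1
superscript_i = describe("\u2071")    # superscript i

print(superscript_one)
print(superscript_i)
```

Superscript 1 normalizes to "1" but is itself in category No, so it fails the "in id_continue" half of the rule. Superscript i is in category Lm, so it passes.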


>>> as
>>>     exec("a1 = None")
>>> is valid.
>>>
>>> BUT "a¹ = None" is not valid*.
>>>
>>
> >>> ud.category("\u00b9")
> 'No'
>
> Category No is *not* in id_continue, and therefore not in xid_continue.
>
>
>> exec("x\u00b9 = None")  # U+00B9 is superscript 1
>>
>> On the other hand, this does work:
>>
>> exec("x\u2071 = None")  # U+2071 is superscript i
>>
>> So it seems to be only an issue with superscript and subscript digits.
>>   Looks like a compiler bug to me.
>>
>
> The problem, if there were one, would be in the tokenizer that finds
> identifiers. However,
>
>
> >>> exec("x\u00b9 = None")
> ...
>     x¹ = None
>       ^
> SyntaxError: invalid character in identifier
>
> this is correct.


Thank you both for helping. The bug is officially closed.
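
In case anyone finds this thread later, the two cases can also be compared without reading the grammar at all, using str.isidentifier() and compile(). This is only a quick sketch; the helper name compiles is made up for the example, and the exact SyntaxError message may differ between Python versions.

```python
def compiles(source):
    """Return True if the source compiles, False on a SyntaxError."""
    try:
        compile(source, "<test>", "exec")
    except SyntaxError:
        return False
    return True


results = {}
for name in ("x\u00b9", "x\u2071"):
    # ascii() keeps the escapes visible regardless of terminal encoding.
    results[name] = (name.isidentifier(), compiles(name + " = None"))
    print(ascii(name), results[name])

# Identifiers are NFKC-normalized while parsing, so the name that
# actually gets bound for superscript i is plain "xi".
namespace = {}
exec("x\u2071 = None", namespace)
bound_as_xi = "xi" in namespace
print(bound_as_xi)
```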