[issue21765] Idle: make 3.x HyperParser work with non-ascii identifiers.

Ezio Melotti report at bugs.python.org
Sat Jun 21 10:33:39 CEST 2014


Ezio Melotti added the comment:

> _ID_FIRST_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl",
>                         "Other_ID_Start"}
> _ID_CATEGORIES = _ID_FIRST_CATEGORIES | {"Mn", "Mc", "Nd", "Pc",
>                                          "Other_ID_Continue"}

Note that "Other_ID_Start" and "Other_ID_Continue" are properties, not categories, and unicodedata.category() never returns them, so adding them to these sets has no effect.  I don't think there's a way to check whether a char has one of those properties, but as I said in my previous message it's probably safe to ignore them (nothing will explode even in the unlikely case that those chars are used, right?).

> def is_id_char(char):
>    return char in _ASCII_ID_CHARS or (
>        ord(char) >= 128 and

What's the reason for checking if the ord is >= 128?

>        category(normalize(char)[0]) in _ID_CATEGORIES
>    )

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue21765>
_______________________________________

