[issue46520] `ast.unparse` produces syntactically illegal code for identifiers that look like reserved words

Wed Jan 26 06:56:35 EST 2022

Batuhan Taskaya <isidentical at gmail.com> added the comment:

Technically, this is a bug, since it breaks the only guarantee that ast.unparse makes:

> Unparse an ast.AST object and generate a string with code that would produce an equivalent ast.AST object if parsed back with ast.parse().

But I am not really sure if it should be handled at all, since we don't have access to the original form of the identifier in the AST due to the parser's normalization behavior.
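
To make the normalization concrete, here is a minimal sketch (Python 3.9+, where ast.unparse exists). It parses an assignment whose target is "def" spelled with mathematical double-struck letters, then checks what identifier ends up in the AST. The helper name `reparses` is only for illustration and is not part of the ast module.

```python
import ast
import keyword
import unicodedata

# "def" spelled with MATHEMATICAL DOUBLE-STRUCK SMALL D, E, F (U+1D555..U+1D557).
# Escapes are used so the example does not depend on the file's encoding.
source = "\U0001D555\U0001D556\U0001D557 = 1"

# The parser NFKC-normalizes identifiers, so the AST only stores the ASCII form.
tree = ast.parse(source)
name = tree.body[0].targets[0].id


def reparses(code: str) -> bool:
    """Illustrative helper: True if `code` is accepted by ast.parse()."""
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    return True


unparsed = ast.unparse(tree)

print(repr(name), keyword.iskeyword(name))  # identifier stored in the AST
print(repr(unparsed), reparses(unparsed))   # what unparse emits, and whether it parses
```

On the versions affected by this issue, the second line shows an unparsed string that ast.parse() rejects; the output may differ if a later release changes the behavior.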

If we only want to produce source that yields the same AST, we could exploit the fact that keywords are always plain ASCII and embed a map that converts ASCII 'a', 'b', 'c', ... to their most similar Unicode look-alikes (https://util.unicode.org/UnicodeJsps/confusables.jsp). But I feel this is a terrible idea, with no real gain (the use case is very limited) and a lot of potential for confusion.
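
For reference, the mapping idea could look roughly like the sketch below. It is not a proposed patch: `disguise_keyword` is a hypothetical helper, and the choice of mathematical bold letters is an assumption made here because they NFKC-normalize back to ASCII (a visually "confusable" character only works if it also has that property). Only keywords starting with a lowercase ASCII letter are handled; None, True and False are left alone.

```python
import ast
import keyword
import unicodedata

# Assumption for this sketch: MATHEMATICAL BOLD SMALL A..Z occupy a contiguous
# range starting at U+1D41A and normalize to ASCII 'a'..'z' under NFKC.
BOLD_SMALL_A = 0x1D41A


def disguise_keyword(name: str) -> str:
    """Hypothetical helper: respell a keyword so the tokenizer sees a plain
    identifier that the parser normalizes back to the same string."""
    if not keyword.iskeyword(name):
        return name
    first = name[0]
    if not ("a" <= first <= "z"):
        return name  # None, True, False: out of scope for this sketch
    look_alike = chr(BOLD_SMALL_A + ord(first) - ord("a"))
    return look_alike + name[1:]


spelled = disguise_keyword("def")
tree = ast.parse(f"{spelled} = 1")
print(tree.body[0].targets[0].id)  # identifier after the parser's normalization
```

Swapping only the first character is enough to stop the tokenizer from matching the keyword, but the emitted source is visually almost identical to the illegal one, which is exactly the confusion this comment is worried about.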

I think simply adding a warning about this to the documentation should be the definitive resolution, unless @pablogsal has another idea.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue46520>
_______________________________________