[issue42729] tokenize, ast: No direct way to parse tokens into AST, a gap in the language processing pipeline

Paul Sokolovsky report at bugs.python.org
Thu Dec 24 05:54:42 EST 2020


Paul Sokolovsky <pfalcon at users.sourceforge.net> added the comment:

> What prevents you from using ast.parse(tokenize.untokenize(token_stream))?

That's exactly the implementation in the patch now submitted against this issue. But that is a patch for CPython; the motive of the proposal here is to establish a standard API call for *Python*, which different implementations can implement however they like/can/need.
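
For reference, the untokenize-based approach from the question can be sketched roughly as below. This is only an illustration, not the submitted patch: the name parse_tokens and its parameters are hypothetical, and the only real calls used are tokenize.generate_tokens(), tokenize.untokenize() and ast.parse().

```python
import ast
import io
import tokenize


def parse_tokens(token_stream, filename="<tokens>", mode="exec"):
    """Hypothetical helper: build an AST from an iterable of tokens.

    It goes back through source text, which is what a CPython-only
    implementation can do today; another implementation could instead
    feed the tokens straight to its parser.
    """
    source = tokenize.untokenize(token_stream)
    return ast.parse(source, filename=filename, mode=mode)


# Tokenize a small snippet, then turn the token stream into an AST.
source = "x = 1 + 2\n"
tokens = tokenize.generate_tokens(io.StringIO(source).readline)
tree = parse_tokens(tokens)

# The result is an ordinary AST that compile() accepts.
namespace = {}
exec(compile(tree, "<tokens>", "exec"), namespace)
print(ast.dump(tree))
```

The round trip through text is the part a dedicated API call would let an implementation skip.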

> Also, tokens -> AST is not the only disconnected part in the underlying compiler.

We should address them, one by one.

> Stuff like AST -> Symbol Table 

Kind of, yes. Again, due to CPython's implementation history, we only have source -> symbol table (https://docs.python.org/3/library/symtable.html). It would be nice to address that (separately, of course).
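
To illustrate the source-only entry point: symtable.symtable() takes source text, so getting from an AST to a symbol table currently means going back to text first. The sketch below assumes Python 3.9+ for ast.unparse(); the sample function is made up for the example.

```python
import ast
import symtable

source = "def f(a):\n    b = a + 1\n    return b\n"

# The documented route: source text -> symbol table.
table = symtable.symtable(source, "<example>", "exec")
f_table = table.get_children()[0]
names = sorted(sym.get_name() for sym in f_table.get_symbols())

# Workaround for AST -> symbol table: unparse the tree back to text
# (ast.unparse() is assumed to be available, i.e. Python 3.9+).
tree = ast.parse(source)
table_from_ast = symtable.symtable(ast.unparse(tree), "<example>", "exec")
f_table_from_ast = table_from_ast.get_children()[0]
names_from_ast = sorted(sym.get_name() for sym in f_table_from_ast.get_symbols())

print(f_table.get_name(), names, names_from_ast)
```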

> AST -> Optimized AST

Yes. PEP 511 touched on that, but since it was rejected as a whole, the useful sub-ideas from it don't seem to have made further progress either (e.g. being able to disable some optimizations, and then maybe even exposing them as standalone passes).

> I'd also suggest moving the discussion to the Python-ideas, for a much greater audience.

That's what I usually do, but I have posted too much there recently. I wanted to submit a patch right away, but noticed that the standard commit message format is "bpo-XXXXX: ...", so I created a ticket here to reference in the commit.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue42729>
_______________________________________

