[Python-ideas] Hooking between lexer and parser

Nick Coghlan ncoghlan at gmail.com
Mon Jun 8 05:03:09 CEST 2015


On 8 June 2015 at 12:47, Neil Girdhar <mistersheik at gmail.com> wrote:
> You're right.  And as usual, Nick, your analysis is spot on.  My main
> concern is that the idealized way of parsing the language is not precluded
> by any change.  Does adding token manipulation promise forwards
> compatibility?  Will a Python 3.9 have to have the same kind of token
> manipulator exposed?  If not, then I'm +1 on token manipulation. :)

That may have been the heart of the confusion, as token manipulation
is *already* a public feature:
https://docs.python.org/3/library/tokenize.html

The tokenize module has been a public part of Python for longer than
I've been a Pythonista (first documented in 1.5.2 in 1999):
https://docs.python.org/release/1.5.2/lib/module-tokenize.html

As a result, token stream manipulation is already possible; you just
have to combine the tokens back into a byte stream before feeding them
to the compiler. Any future Python interpreter would be free to fall
back on implementing a token-based API that way, even if the CPython
code generator itself were to gain a native token stream interface.
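For instance, here's a minimal sketch of that round trip using the
public tokenize APIs (the identifier rename is purely illustrative):

    import io
    import tokenize

    source = b"answer = 6 * 7\nprint(answer)\n"

    def rename_identifier(tokens, old, new):
        # Yield (type, string) pairs, swapping NAME tokens that match
        # `old`. Emitting 2-tuples tells untokenize() to regenerate
        # the spacing itself rather than trust the original positions,
        # which would be wrong after a rename.
        for tok in tokens:
            if tok.type == tokenize.NAME and tok.string == old:
                yield (tok.type, new)
            else:
                yield (tok.type, tok.string)

    tokens = tokenize.tokenize(io.BytesIO(source).readline)
    new_source = tokenize.untokenize(
        rename_identifier(tokens, "answer", "result"))

    # untokenize() returns bytes here because the stream starts with
    # an ENCODING token, and compile() accepts a byte string directly.
    exec(compile(new_source, "<tokenized>", "exec"))

The spacing in the reconstructed source may differ from the original,
but it's guaranteed to tokenize back to the same token sequence, which
is all the compiler cares about.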

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
