[Python-ideas] Hooking between lexer and parser

Nick Coghlan ncoghlan at gmail.com
Mon Jun 8 07:01:06 CEST 2015


On 8 June 2015 at 14:23, Neil Girdhar <mistersheik at gmail.com> wrote:
> Yes, but in this case the near term "problem" was as far as I can tell just
> parsing floats as decimals, which is easily done with a somewhat noisy
> function call.  I don't see why it's important.

No, the problem to be solved is making it easier for people to "play"
with Python's syntax and try out different ideas in a format that can
be shared easily.

The more people who are able to tinker and play with something, and
share the results of their work, the more opportunities there are for
good ideas to emerge and be shared, eventually building up to a
coherent proposal for change.

The 3.4 dis module included several enhancements to make playing with
bytecode easier and more fun:
https://docs.python.org/3/whatsnew/3.4.html#dis
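
As a rough illustration of what those dis additions allow, here is a
minimal sketch using dis.get_instructions() and dis.Bytecode, both
added in 3.4. The add_one function is a made-up example, and the exact
opcode names it produces vary between CPython versions.

```python
import dis


def add_one(x):
    # Hypothetical function, used only as something to disassemble.
    return x + 1


# dis.get_instructions() yields one Instruction named tuple per bytecode
# operation, so the bytecode can be inspected as ordinary Python data.
opnames = [ins.opname for ins in dis.get_instructions(add_one)]

# dis.Bytecode wraps the same information in an iterable object; its
# dis() method returns the familiar listing as a string instead of
# printing it to stdout.
bytecode = dis.Bytecode(add_one)
listing = bytecode.dis()
```

Printing listing shows the human-readable disassembly; opnames is
easier to work with when writing tools or tests around bytecode.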

3.4 also added the source_to_code() hook in importlib to make it easy
to tweak the compilation pass without having to learn all the other
intricacies of the import system:
https://docs.python.org/3/whatsnew/3.4.html#importlib
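
A sketch of how that hook can be used, under some assumptions: the
FloatToDecimal transformer, the DecimalLoader class, the demo() helper
and the "prices" module are all hypothetical names invented for this
example, and it relies on ast.Constant (Python 3.8+). It overrides
source_to_code() on a SourceFileLoader subclass so that float literals
are compiled as decimal.Decimal values.

```python
import ast
import importlib.machinery
import importlib.util
import os
import tempfile
from decimal import Decimal


class FloatToDecimal(ast.NodeTransformer):
    """Rewrite each float literal into a decimal.Decimal("...") call."""

    def visit_Constant(self, node):
        if not isinstance(node.value, float):
            return node
        # Look up Decimal via __import__ so the rewritten module does not
        # need its own "from decimal import Decimal" line.
        decimal_type = ast.Attribute(
            value=ast.Call(
                func=ast.Name(id="__import__", ctx=ast.Load()),
                args=[ast.Constant(value="decimal")],
                keywords=[],
            ),
            attr="Decimal",
            ctx=ast.Load(),
        )
        # Caveat: repr() of the parsed float is used, not the original
        # source text, so a literal like 1.10 becomes Decimal("1.1").
        call = ast.Call(
            func=decimal_type,
            args=[ast.Constant(value=repr(node.value))],
            keywords=[],
        )
        return ast.copy_location(call, node)


class DecimalLoader(importlib.machinery.SourceFileLoader):
    """Source loader that only customises the compilation step."""

    def source_to_code(self, data, path="<string>"):
        tree = ast.parse(data, filename=path)
        tree = ast.fix_missing_locations(FloatToDecimal().visit(tree))
        return compile(tree, path, "exec", dont_inherit=True)


def demo():
    """Write a tiny module to a temp directory and load it via the hook."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "prices.py")
        with open(path, "w") as f:
            f.write("total = 0.1 + 0.2\n")
        loader = DecimalLoader("prices", path)
        spec = importlib.util.spec_from_file_location(
            "prices", path, loader=loader)
        module = importlib.util.module_from_spec(spec)
        loader.exec_module(module)
        return module.total
```

One caveat with this approach: SourceFileLoader only calls
source_to_code() when it has no valid cached bytecode for the file, so
a stale .pyc produced by the normal compiler would bypass the hook.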

MacroPy and Hylang are interesting examples of ways to manipulate the
AST in order to use the CPython VM without relying solely on the
native language syntax, while byteplay and Numba are examples of
manipulating things at the bytecode level.

> The way that CPython does parsing is more than just annoying.  It's a mess
> of repetition and tests that try to make sure that all of the phases are
> synchronized.  I don't think that CPython is the future of Python.  One day,
> someone will write a Python interpreter in Python that includes a clean
> one-pass parser.  I would prefer to make that as easy to realize as
> possible.  You might think it's far-fetched.  I don't think it is.

While the structure of CPython's code generation toolchain certainly
poses high incidental barriers to entry, those barriers are trivial
compared to the *inherent* barriers to entry involved in successfully
making the case for a change to the language. Recent examples are
introducing a matrix multiplication operator, and more clearly
separating coroutines from generators through the async/await
keywords (both landed for 3.5).

If someone successfully makes the case for a compelling change to the
language specification, then existing core developers are also ready,
willing and able to assist in actually making the change to CPython.

As a result, making that final step of *implementing* a syntactic
change in CPython easier involves changing something that *isn't the
bottleneck in the process*, so it would have no meaningful impact on
the broader Python community.

By contrast, making *more steps* of the existing compilation process
easier for pure Python programmers to play with, preferably in an
implementation-independent way, *does* impact two of the bottlenecks:
the implementation of syntactic ideas in executable form, and sharing
those experiments with others. Combining that level of syntactic play
with PyPy's ability to automatically generate JIT compilers offers an
extraordinarily powerful platform for experimentation, with the
standardisation process ensuring that accepted experiments also scale
down to significantly more constrained environments like MicroPython.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

