[Python-ideas] Hooking between lexer and parser

Neil Girdhar mistersheik at gmail.com
Sat Jun 6 20:27:14 CEST 2015


Right.

On Sat, Jun 6, 2015 at 1:52 PM, Ryan Gonzalez <rymg19 at gmail.com> wrote:

>
>
> On June 6, 2015 12:29:21 AM CDT, Neil Girdhar <mistersheik at gmail.com>
> wrote:
> >On Sat, Jun 6, 2015 at 1:00 AM, Nick Coghlan <ncoghlan at gmail.com>
> >wrote:
> >
> >> On 6 June 2015 at 12:21, Neil Girdhar <mistersheik at gmail.com> wrote:
> >> > I'm curious what other people will contribute to this discussion as I
> >> > think having no great parsing library is a huge hole in Python.  Having
> >> > one would definitely allow me to write better utilities using Python.
> >>
> >> The design of *Python's* grammar is deliberately restricted to being
> >> parsable with an LL(1) parser. There are a great many static analysis
> >> and syntax highlighting tools that are able to take advantage of that
> >> simplicity because they only care about the syntax, not the full
> >> semantics.
> >>
> >
> >Given the validation that happens, it's not actually LL(1) though.  It's
> >mostly LL(1) with some syntax errors that are raised for various illegal
> >constructs.
> >
> >Anyway, no one is suggesting changing the grammar.
> >
> >
> >> Anyone actually doing their *own* parsing of something else *in*
> >> Python would be better advised to reach for PLY
> >> (https://pypi.python.org/pypi/ply). PLY is the parser underlying
> >> https://pypi.python.org/pypi/pycparser, and hence the highly regarded
> >> CFFI library, https://pypi.python.org/pypi/cffi
> >>
> >> Other notable parsing alternatives folks may want to look at include
> >> https://pypi.python.org/pypi/lrparsing and
> >> http://pythonhosted.org/pyparsing/ (both of which allow you to use
> >> Python code to define your grammar, rather than having to learn a
> >> formal grammar notation).
> >>
> >>
> >I looked at ply and pyparsing, but it was impossible to simply parse LaTeX
> >because I couldn't tell the parser to consume the right number of
> >arguments given the name of the function.  When the parser sees a
> >function, it learns how many arguments that function needs.  When it sees
> >a function call \a{1}{2}{3}, if "\a" takes 2 arguments, then it should
> >only consume 1 and 2 as arguments, and leave 3 as a regular text token.
> >In other words, I should be able to tell the parser what to expect in
> >code that lives on the rule edges.
>
> Can't you just hack it into the lexer? When the backslash is detected, the
> lexer can treat the following identifier as a function, look up the number
> of required arguments, and push it onto some sort of stack. Whenever a left
> brace is encountered and another argument is needed by the top of the
> stack, it returns a special argument opener token.
>
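
To make the lexer-side idea concrete, here is a minimal sketch of it. It is
an illustration under stated assumptions, not code from ply or pyparsing: the
ARITY table, the token kinds, and the tokenize() name are all hypothetical,
escapes such as \{ are ignored, and any plain text after a command is assumed
to end that command's argument list.

```python
import re

# Hypothetical arity table: number of brace-delimited arguments per command.
ARITY = {"a": 2, "emph": 1}

# \name, a single brace, or a run of ordinary text (escapes are out of scope).
TOKEN_RE = re.compile(r"\\([A-Za-z]+)|([{}])|([^\\{}]+)")


def tokenize(text, arity=ARITY):
    """Yield (kind, value) pairs, marking braces that delimit command arguments."""
    owed = [0]     # owed[-1]: arguments still owed to the latest command at this depth
    closers = []   # kind ("ARG" or "GROUP") of each brace that is currently open
    for match in TOKEN_RE.finditer(text):
        name, brace, chunk = match.groups()
        if name is not None:
            owed[-1] = arity.get(name, 0)
            yield ("COMMAND", name)
        elif brace == "{":
            if owed[-1] > 0:
                owed[-1] -= 1          # this brace opens an owed argument
                kind = "ARG"
            else:
                kind = "GROUP"         # nothing owed: an ordinary group
            closers.append(kind)
            owed.append(0)             # fresh counter for the nested depth
            yield (kind + "_OPEN", "{")
        elif brace == "}":
            if not closers:
                raise ValueError("unbalanced '}'")
            owed.pop()
            yield (closers.pop() + "_CLOSE", "}")
        else:
            owed[-1] = 0               # assumption: text ends the argument list
            yield ("TEXT", chunk)
```

With "a" registered as taking two arguments, tokenizing \a{1}{2}{3} marks the
first two brace pairs as ARG_OPEN/ARG_CLOSE and the third as a plain
GROUP_OPEN/GROUP_CLOSE, so a context-free parser downstream never has to
count arguments itself.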

Your solution is right, but I would implement it in the parser, since I want
that kind of generic support for dynamic grammar rules to be available
everywhere.
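
As a rough illustration of what I mean by doing it in the parser, here is a
small hand-written recursive-descent sketch. It is not the API of any of the
libraries mentioned in this thread; the Parser class, its method names, and
the arity mapping are hypothetical. The lexer stays context-free, and the
rule for a command looks up at parse time how many groups to consume.

```python
import re

TOKEN_RE = re.compile(r"\\([A-Za-z]+)|([{}])|([^\\{}]+)")


def lex(text):
    """Context-free lexer: it knows nothing about command arities."""
    for match in TOKEN_RE.finditer(text):
        name, brace, chunk = match.groups()
        if name is not None:
            yield ("COMMAND", name)
        elif brace is not None:
            yield (brace, brace)
        else:
            yield ("TEXT", chunk)


class Parser:
    def __init__(self, text, arity):
        self.tokens = list(lex(text))
        self.pos = 0
        self.arity = arity  # consulted while parsing, so it can change over time

    def peek(self):
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else None

    def next(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def parse_items(self, until=None):
        """Parse items until the `until` token kind (None means end of input)."""
        items = []
        while self.peek() != until:
            if self.peek() is None or self.peek() == "}":
                raise ValueError("unbalanced braces")
            items.append(self.parse_item())
        return items

    def parse_group(self):
        self.next()                          # consume '{'
        body = self.parse_items(until="}")
        self.next()                          # consume '}'
        return body

    def parse_item(self):
        if self.peek() == "{":
            return ("group", self.parse_group())
        kind, value = self.next()
        if kind == "TEXT":
            return ("text", value)
        # COMMAND: the number of groups consumed is decided here, in the rule.
        wanted = self.arity.get(value, 0)
        args = []
        for _ in range(wanted):
            if self.peek() != "{":
                raise ValueError(f"\\{value} expects {wanted} arguments")
            args.append(self.parse_group())
        return ("call", value, args)
```

Parser(r"\a{1}{2}{3}", {"a": 2}).parse_items() yields one call node with two
arguments followed by an ordinary group holding 3. Passing {"a": 3} instead
makes the same rule consume all three, without touching the lexer.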


>
> >
> >The parsing tools you listed work really well until you need to do
> >something like (1) the validation step that happens in Python, or (2)
> >figuring out exactly where the syntax error is (line and column number), or
> >(3) ensuring that whitespace separates some tokens even when it's not
> >required to disambiguate different parse trees.  I got the impression that
> >they wanted to make these languages simple for the simple cases, but they
> >were made too simple and don't allow you to do everything in one simple
> >pass.
> >
> >Best,
> >
> >Neil
> >
> >
> >> Regards,
> >> Nick.
> >>
> >> --
> >> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> >>