How to write a language parser?

mbg1708 at planetmail.com
Fri Feb 22 20:25:32 EST 2013


On Friday, February 22, 2013 11:29:42 AM UTC-5, Timothy Madden wrote:
> Hello
>
> I am trying to write a DBGp client in Python, to be used for debugging
> mostly PHP scripts.
>
> Currently the XDebug module for PHP allows me to set breakpoints on any
> line, including blank ones and lines that are not considered executable,
> resulting in breakpoints that will never be hit, even if program flow
> control appears to pass through the lines.
>
> For that I would like to write a PHP parser, in order to detect the
> proper breakpoint line for statements spanning multiple lines.
>
> Is there an (open-source) way to do this in Python code? Most
> parsers I could see after a search are either too simple for a real
> programming language, or based on a Python module written in C. My debug
> client is a Vim plugin, and I would like to distribute it as script
> files only, if that is possible. The generator itself may well be a C
> module, as I only distribute the generated output.
>
> The best parser I could find is PLY, and I would like to know if it is
> good enough for the job. My attempt at a bison parser (C only) ended in
> about a hundred conflicts, most of which are difficult to understand,
> although I admit I do not know much about the subject yet.
>
> Are there other parsers you have used for complete languages?
>
> Thank you,
> Timothy Madden

Take a look at this whitepaper:
    http://www.cis.upenn.edu/~matuszek/General/recursive-descent-parsing.html

I needed a parser for a chunk of SQL syntax.  After trying PyBison and writing crude text analysis in Python, I found this very useful paper.  I used the advice in this paper to write my own recursive descent parser in pure Python.  The two steps were:

1.  Write a yacc grammar (without any actions).  Running it through yacc allowed me to find and get rid of the various shift and reduce conflicts in my grammar.

2.  Use the yacc grammar as a guide for the recursive descent parser.  Essentially I wrote one parser function in Python for each yacc production.

The process has the merit that yacc has already checked the grammar for conflicts before you start coding, so the eventual Python code is based on a good design.
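
To give a rough idea of what step 2 can look like, here is a minimal sketch. It is not my SQL parser: it assumes a toy arithmetic grammar (shown in the comment), and the names Parser, tokenize, expr, term and factor are made up for the illustration. One caveat worth knowing: a left-recursive yacc production such as "expr : expr '+' term" cannot be copied literally into a recursive descent function, because the function would call itself without consuming any input, so those productions become loops.

```python
import re

# Toy yacc-style grammar (no actions), assumed only for this illustration:
#   expr   : term | expr '+' term | expr '-' term ;
#   term   : factor | term '*' factor ;
#   factor : NUMBER | '(' expr ')' ;


def tokenize(text):
    """Turn text into (kind, value) pairs, ending with an EOF token."""
    tokens = []
    # A token is either a run of digits or any single non-space character.
    for tok in re.findall(r"\d+|\S", text):
        if tok.isdigit():
            tokens.append(("NUMBER", int(tok)))
        else:
            tokens.append((tok, tok))
    tokens.append(("EOF", None))
    return tokens


class Parser:
    """One method per grammar production; builds a nested-tuple tree."""

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0]

    def advance(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expect(self, kind):
        if self.peek() != kind:
            raise SyntaxError("expected %r, found %r" % (kind, self.peek()))
        return self.advance()

    def parse(self):
        tree = self.expr()
        self.expect("EOF")
        return tree

    def expr(self):
        # expr : term | expr '+' term | expr '-' term
        # The left recursion is rewritten as a loop (keeps left associativity).
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()[0]
            node = (op, node, self.term())
        return node

    def term(self):
        # term : factor | term '*' factor
        node = self.factor()
        while self.peek() == "*":
            self.advance()
            node = ("*", node, self.factor())
        return node

    def factor(self):
        # factor : NUMBER | '(' expr ')'
        if self.peek() == "NUMBER":
            return self.advance()[1]
        self.expect("(")
        node = self.expr()
        self.expect(")")
        return node
```

Calling Parser("1 + 2 * (3 - 4)").parse() returns a nested tuple for the expression, and malformed input such as "1 +" raises SyntaxError. For the breakpoint use case, the tokens would also need to carry line numbers so that each statement node can report the line it starts on; that part is left out here.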

Good luck.


