[Compiler-sig] Separating out the parser
Jeremy Hylton
jeremy@cnri.reston.va.us
Wed, 9 Feb 2000 12:56:02 -0500 (EST)
Congratulations! You've sent the first message on the compiler-sig.
I have been meaning to announce the sig on the main list and get
things started here, but I've been busy with other things.
>>>>> "IB" == Ian Bicking <bickiia@earlham.edu> writes:
IB> I'm interested in making Python run in MUQ <http://www.muq.org>.
IB> I was hoping to take the parser from Python and simply retarget
IB> the compiler. I know there are people who would like to do the
IB> same for Guile.
What is MUQ? I assume it is something that you want to generate code
for, but I'm not familiar with it. The Web page says something about
virtual worlds, so I'm a bit puzzled.
There was some discussion at SPAM8 on generating Scheme code from
Python. Personally, I think MzScheme is a better platform than
Guile.
IB> I've been reading through the source in Python-1.5.2/Parser, but
IB> I was hoping to get a little help about where the border between
IB> the compiler and the parser is. Just a few pointers as to the
IB> functions and datastructures that lie around that border, to
IB> give me a place to start from.
The border between the parser and compiler is one of the key issues
we need to sort out. The Python compiler (Python/compile.c) uses the
concrete parse tree generated by the parser. The parser is generated
from Grammar/Grammar using pgen. The compiler is a bit hard to read
because it references parse tree nodes using integer indexes.
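
To give a feel for what integer indexing into a concrete tree looks
like, here is a rough sketch. The symbol numbers and the tuple layout
are made up for illustration (the real values come from the generated
grammar tables), and compile.c itself is C, not Python.

```python
# Hypothetical symbol numbers, for illustration only.
ARITH_EXPR, TERM, NUMBER, PLUS = 1, 2, 3, 4

# Concrete tree for "1 + 2": each node is (symbol, child, child, ...),
# and each token is (symbol, text).
tree = (ARITH_EXPR,
        (TERM, (NUMBER, "1")),
        (PLUS, "+"),
        (TERM, (NUMBER, "2")))

def eval_term(node):
    # node[1] is the NUMBER token; node[1][1] is its text.
    return int(node[1][1])

def eval_arith(node):
    # Operands sit at indexes 1, 3, ...; operators at indexes 2, 4, ...
    total = eval_term(node[1])
    for i in range(2, len(node), 2):
        op = node[i][1]
        rhs = eval_term(node[i + 1])
        total = total + rhs if op == "+" else total - rhs
    return total

result = eval_arith(tree)
```

Every access depends on knowing which position holds what, which is
what makes this style hard to follow.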
I'm working on a Python compiler written in Python; it generates
essentially the same bytecode as compile.c, but it uses an abstract
syntax tree and it is a bit more readable. The AST I'm using is the
one from Python2C (P2C?). You can get it from the p2c CVS repository.
(I can't find a URL for it; perhaps Bill or Greg can point us at it.)
The AST is produced using the builtin Python parser; the transformer
module in p2c gets a parse tree using the parser module and converts
it into an AST. I think this is a good starting point for developing a
compiler for a different target.
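
As a toy sketch of that pipeline (this is not the p2c transformer; the
node classes, symbol numbers, and the to_scheme back end are all
hypothetical), a concrete tree can be converted into named AST nodes
and then handed to a back end for some other target:

```python
# Hypothetical symbol numbers and node classes, for illustration only.
ARITH_EXPR, TERM, NUMBER, PLUS, MINUS = 1, 2, 3, 4, 5

class Const:
    def __init__(self, value):
        self.value = value

class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right

class Sub:
    def __init__(self, left, right):
        self.left, self.right = left, right

def transform(node):
    """Convert a concrete (symbol, children...) tuple into AST nodes."""
    kind = node[0]
    if kind == NUMBER:
        return Const(int(node[1]))
    if kind == TERM:
        return transform(node[1])
    if kind == ARITH_EXPR:
        result = transform(node[1])
        # Fold "a + b - c" left to right into nested Add/Sub nodes.
        for i in range(2, len(node), 2):
            cls = Add if node[i][0] == PLUS else Sub
            result = cls(result, transform(node[i + 1]))
        return result
    raise ValueError("unexpected symbol %r" % (kind,))

def to_scheme(ast):
    """Toy back end: emit a Scheme expression instead of bytecode."""
    if isinstance(ast, Const):
        return str(ast.value)
    op = "+" if isinstance(ast, Add) else "-"
    return "(%s %s %s)" % (op, to_scheme(ast.left), to_scheme(ast.right))

tree = (ARITH_EXPR,
        (TERM, (NUMBER, "1")),
        (PLUS, "+"),
        (TERM, (NUMBER, "2")),
        (MINUS, "-"),
        (TERM, (NUMBER, "3")))
scheme_source = to_scheme(transform(tree))
```

Only transform needs to know the layout of the concrete tree; a new
target only has to replace to_scheme.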
I expect to propose some changes to the AST when I'm done with the
compiler. It looks like nodes like Add, Mul, and Sub should be combined
into a BinaryOp node, and UnaryAdd, UnarySub, etc. should become
UnaryOp. I'm going to wait until I've got a working version before I
make too much noise about these changes though. I also plan to write
a new parser for JPython, using ANTLR to directly produce the AST. I
imagine that will also improve my perspective on the right AST.
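
A rough sketch of the BinaryOp/UnaryOp idea, assuming the operator is
stored as a string on the node (the class layout and the evaluate
helper are hypothetical, not the current p2c AST):

```python
import operator

class Const:
    def __init__(self, value):
        self.value = value

class BinaryOp:
    """Stands in for separate Add, Mul, Sub, ... node classes."""
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

class UnaryOp:
    """Stands in for separate UnaryAdd, UnarySub, ... node classes."""
    def __init__(self, op, operand):
        self.op, self.operand = op, operand

# One table per arity replaces one visitor method per operator class.
BINARY = {"+": operator.add, "-": operator.sub, "*": operator.mul}
UNARY = {"+": operator.pos, "-": operator.neg}

def evaluate(node):
    if isinstance(node, Const):
        return node.value
    if isinstance(node, BinaryOp):
        return BINARY[node.op](evaluate(node.left), evaluate(node.right))
    if isinstance(node, UnaryOp):
        return UNARY[node.op](evaluate(node.operand))
    raise TypeError("unknown node %r" % (node,))

# -(2 + 3) * 4
expr = BinaryOp("*",
                UnaryOp("-", BinaryOp("+", Const(2), Const(3))),
                Const(4))
value = evaluate(expr)
```

A tree walker then needs one case per arity rather than one per
operator.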
I hope to have a snapshot of my code available from the Python CVS
server early next week.
Jeremy