[Compiler-sig] Separating out the parser

Jeremy Hylton jeremy@cnri.reston.va.us
Wed, 9 Feb 2000 12:56:02 -0500 (EST)


Congratulations!  You've sent the first message on the compiler-sig.
I have been meaning to announce the sig on the main list and get
things started here, but I've been busy with other things.

>>>>> "IB" == Ian Bicking <bickiia@earlham.edu> writes:

  IB> I'm interested in making Python run in MUQ <http://www.muq.org>.
  IB> I was hoping to take the parser from Python and simply retarget
  IB> the compiler.  I know there are people who would like to do the
  IB> same for Guile.

What is MUQ?  I assume it is something that you want to generate code
for, but I'm not familiar with it.  The Web page says something about
virtual worlds, so I'm a bit puzzled.

There was some discussion at SPAM8 on generating Scheme code from
Python.  Personally, I think MzScheme is a better platform than
Guile.

  IB> I've been reading through the source in Python-1.5.2/Parser, but
  IB> I was hoping to get a little help about where the border between
  IB> the compiler and the parser is.  Just a few pointers as to the
  IB> functions and datastructures that lie around that border, to
  IB> give me a place to start from.

The border between the parser and compiler is one of the key issues
we need to sort out.  The Python compiler (Python/compile.c) uses the
concrete parse tree generated by the parser.  The parser is generated
from Grammar/Grammar using pgen.  The compiler is a bit hard to read
because it references the parse tree nodes using integer indexes.
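
To illustrate what indexing by position looks like, here is a minimal
sketch in Python.  It is not taken from compile.c: the node-type
numbers, the tuple layout, and the function names are all assumptions
made up for the example.  It only shows why code that picks children
out of a concrete parse tree by integer index is hard to follow.

```python
# Hypothetical node-type numbers; the real values come from the generated
# grammar tables and are not reproduced here.
ARITH_EXPR = 300
TERM = 301
NUMBER = 2
PLUS = 14

# Toy concrete parse tree for "1 + 2".  Interior nodes are
# (type, child, child, ...); tokens are (type, text).
tree = (ARITH_EXPR,
        (TERM, (NUMBER, "1")),
        (PLUS, "+"),
        (TERM, (NUMBER, "2")))


def eval_term(node):
    # node[1] is the single NUMBER child; node[1][1] is its text.
    return int(node[1][1])


def eval_arith(node):
    """Evaluate a toy arith_expr by picking children out by position."""
    # node[0] is the type; operands sit at odd indexes, operators at
    # even indexes starting from 2.
    total = eval_term(node[1])
    for i in range(2, len(node), 2):
        op, operand = node[i], node[i + 1]
        if op[0] != PLUS:
            raise ValueError("unsupported operator: %r" % (op[1],))
        total += eval_term(operand)
    return total


result = eval_arith(tree)
```

Every access such as node[1][1] depends on the exact shape of the
grammar rule, so a reader has to keep the grammar open alongside the
code to know what each index means.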

I'm working on a Python compiler written in Python; it generates
essentially the same bytecode as compile.c, but it uses an abstract
syntax tree and it is a bit more readable.  The AST I'm using is the
one from Python2C (P2C?).  You can get it from the p2c CVS repository.
(I can't find a URL for it; perhaps Bill or Greg can point us at it.)
The AST is produced using the builtin Python parser;  the transformer
module in p2c gets a parse tree using the parser module and converts
it into an AST.  I think this is a good starting point for developing a
compiler for a different target.
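
As a rough sketch of that conversion step (not the p2c transformer
itself), the Python below turns a toy tuple-shaped parse tree into a
tree of named nodes.  The node-type numbers, the tuple layout, the
Const class, and the transform function are hypothetical and exist
only for illustration; Add is borrowed from the node names mentioned
in this message.

```python
# Hypothetical node-type numbers, standing in for the generated tables.
ARITH_EXPR, TERM, NUMBER, PLUS = 300, 301, 2, 14


class Const:
    """Hypothetical AST node for a literal number."""
    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return "Const(%r)" % (self.value,)


class Add:
    """AST node for addition with named operands."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def __repr__(self):
        return "Add(%r, %r)" % (self.left, self.right)


def transform(node):
    """Convert a toy concrete parse tree into an abstract syntax tree."""
    kind = node[0]
    if kind == NUMBER:
        return Const(int(node[1]))
    if kind == TERM:
        # Single-child wrapper node: it carries no meaning, so drop it.
        return transform(node[1])
    if kind == ARITH_EXPR:
        result = transform(node[1])
        for i in range(2, len(node), 2):
            if node[i][0] != PLUS:
                raise ValueError("unsupported operator: %r" % (node[i][1],))
            result = Add(result, transform(node[i + 1]))
        return result
    raise ValueError("unknown node type: %r" % (kind,))


tree = (ARITH_EXPR, (TERM, (NUMBER, "1")), (PLUS, "+"), (TERM, (NUMBER, "2")))
ast_root = transform(tree)
```

All of the positional indexing is confined to transform; a back end
for another target would only see nodes with named fields such as
left and right.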

I expect to propose some changes to the AST when I'm done with the
compiler.  It looks like nodes like Add, Mul, & Sub should be combined
into a BinaryOp node, and UnaryAdd, UnarySub, etc. should become
UnaryOp.  I'm going to wait until I've got a working version before I
make too much noise about these changes though.  I also plan to write
a new parser for JPython, using ANTLR to directly produce the AST.  I
imagine that will also improve my perspective on the right AST.
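
One possible shape for the combined node, sketched in Python under my
own assumptions: the field names (op, left, right), the Const class,
the OPCODES table, and the emit function are hypothetical, and the
output is a simplified list of (opcode name, argument) pairs rather
than real bytecode.

```python
class Const:
    """Hypothetical AST node for a literal value."""
    def __init__(self, value):
        self.value = value


class BinaryOp:
    """One node class for +, -, and *, told apart by the op field."""
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right


# Operator symbol -> opcode name, so a single code path covers what
# would otherwise be separate Add, Sub, and Mul cases.
OPCODES = {
    "+": "BINARY_ADD",
    "-": "BINARY_SUBTRACT",
    "*": "BINARY_MULTIPLY",
}


def emit(node, code):
    """Append stack-machine style instructions for node to code."""
    if isinstance(node, Const):
        code.append(("LOAD_CONST", node.value))
    elif isinstance(node, BinaryOp):
        # Operands first, then the operator.
        emit(node.left, code)
        emit(node.right, code)
        code.append((OPCODES[node.op], None))
    else:
        raise TypeError("unknown node: %r" % (node,))
    return code


# 1 + 2 * 3
expr = BinaryOp("+", Const(1), BinaryOp("*", Const(2), Const(3)))
code = emit(expr, [])
```

With separate node classes, each operator would need its own branch
in emit; with BinaryOp, adding an operator is one more table entry.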

I hope to have a snapshot of my code available from the Python CVS
server early next week.

Jeremy