[Compiler-sig] progress on new AST

Finn Bock bckfnn@worldonline.dk
Thu, 11 Apr 2002 12:38:13 GMT


[Jeremy]

>I'd recommend you read Dan Wang's DSL 97 paper:
>http://www.cs.princeton.edu/~danwang/Papers/dsl97/dsl97-abstract.html.
>It's an easy read.  It describes the ASDL syntax and shows small
>examples of an AST and code generated for C and Java.  

Thanks.

>
>  FB> Keep in mind that I'm a newbie at reading asdl, but how is it
>  FB> expressed that a 'Module' contain a list of 'stmts', while a
>  FB> FunctionDef only contain one 'name'?
>
>Your question pointed out an embarrassing bug in the python.asdl file
>:-).  If we take an example "constructor" (with fix applied):
>
>    stmt = ClassDef(identifier name, expr* bases, stmt* body)
>
>The lhs is the name of the type, the rhs is a constructor signature.
>The constructor takes three arguments.  The type is on the left, the
>name is on the right.  identifier is a builtin type.  expr and stmt
>are defined in python.asdl.  There are two type modifiers * and ?.
>The * means sequence of 0 or more.  The ? means optional.
>
>So a class has a single name, an arbitrary number of base class
>expressions, and an arbitrary number of stmts.  
>
>The bug is that Module, FunctionDef, and ClassDef were defined to
>contain a single statement.  I'm sure that's what confused you.
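
To make the notation above concrete, here is a minimal sketch in Python that splits one constructor signature into its fields and records the '*' and '?' modifiers. The names Field and parse_constructor are hypothetical and are not part of the sandbox code; the sketch only handles the simple one-line form shown above, not the full ASDL grammar.

```python
import re
from typing import List, NamedTuple, Tuple


class Field(NamedTuple):
    type_name: str  # e.g. "expr"
    name: str       # e.g. "bases"
    seq: bool       # True when the type carries '*' (sequence of 0 or more)
    opt: bool       # True when the type carries '?' (optional)


def parse_constructor(sig: str) -> Tuple[str, List[Field]]:
    """Split 'Name(type name, ...)' into (constructor name, fields)."""
    m = re.fullmatch(r"\s*(\w+)\s*\((.*)\)\s*", sig)
    if m is None:
        raise ValueError(f"not a constructor signature: {sig!r}")
    ctor, args = m.group(1), m.group(2)
    fields = []
    for arg in (a.strip() for a in args.split(",")):
        if not arg:
            continue  # constructor with no arguments
        fm = re.fullmatch(r"(\w+)([*?]?)\s+(\w+)", arg)
        if fm is None:
            raise ValueError(f"bad field: {arg!r}")
        type_name, mod, name = fm.groups()
        fields.append(Field(type_name, name, mod == "*", mod == "?"))
    return ctor, fields


ctor, fields = parse_constructor(
    "ClassDef(identifier name, expr* bases, stmt* body)")
for f in fields:
    kind = "sequence" if f.seq else "optional" if f.opt else "single"
    print(f"{ctor}.{f.name}: {kind} {f.type_name}")
```

With the fix applied, 'body' comes out as a sequence of stmt, while 'name' stays a single identifier.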

Indeed, I couldn't quite make it add up. I guess the same problem then
still exists for the remaining uses of 'stmt' in For, While, If,
TryExcept and TryFinally?

Will the optional 'else:' part of For, While and If be handled as a
zero-length list of stmts? Or maybe the optional '?' operator can be
used for an optional sequence?
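
As an illustration of the first alternative only, here is a minimal Python sketch in which a missing 'else:' is simply an empty list. The field name 'orelse' and the use of plain strings for expressions and statements are assumptions made for the example; nothing in python.asdl settles this yet.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class While:
    test: str                 # placeholder for an expr node
    body: List[str]           # placeholder for stmt* body
    # Hypothetical field: an absent 'else:' is an empty list.
    orelse: List[str] = field(default_factory=list)


def has_else(node: While) -> bool:
    """An empty list means the source had no 'else:' clause."""
    return len(node.orelse) > 0


plain = While("x < 10", ["x = x + 1"])
with_else = While("x < 10", ["x = x + 1"], ["print x"])
print(has_else(plain), has_else(with_else))
```

This encoding loses nothing here, because a Python suite must contain at least one statement, so an 'else:' with zero statements cannot occur in valid source.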


>  >> I've also written a simple C code generator that turns the ast
>  >> definition into C code that defines structs and constructor
>  >> functions.
>
>  FB> I'm playing around with generating java code and all the needed
>  FB> information seems to be available, but I can't quite make sense
>  FB> of the basic idea behind the datastructures we are generating
>  FB> from. What is a Sum and what is a Product in this sense?
>
>A Sum is a set of type constructors -- so stmt is a sum type.  A
>Product is like listcomp -- a single unnamed constructor.  For a sum
>type, a value can be any one of the constructors.  For a product,
>there is only one constructor.

Thanks, that helped.
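
For readers who find the distinction easier to see in code, here is a rough Python sketch of one way the two shapes could map onto classes: a sum becomes a base class with one subclass per constructor, and a product becomes a single class. ClassDef and Assert follow the signatures quoted in this thread (with strings standing in for expr nodes); the product type Position is made up for the example, since the fields of listcomp are not shown here.

```python
from dataclasses import dataclass
from typing import List, Optional


class Stmt:
    """Sum type: a value is exactly one of the subclasses below."""


@dataclass
class ClassDef(Stmt):
    name: str            # identifier name
    bases: List[str]     # expr* bases (strings stand in for expr nodes)
    body: List[Stmt]     # stmt* body


@dataclass
class Assert(Stmt):
    test: str                  # expr test
    msg: Optional[str] = None  # expr? msg


@dataclass
class Position:
    """Product type (hypothetical): a single unnamed constructor."""
    line: int
    column: int


def describe(node: Stmt) -> str:
    # Consumers of a sum type dispatch on which constructor was used.
    if isinstance(node, ClassDef):
        return f"class {node.name} with {len(node.body)} stmts"
    if isinstance(node, Assert):
        return f"assert {node.test}"
    raise TypeError(f"unknown stmt constructor: {type(node).__name__}")


print(describe(ClassDef("C", [], [Assert("x > 0")])))
```

A product needs no dispatch at all, since there is only one possible shape for its values.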

>The DSL paper represents a sum as a C union with a struct element for
>each constructor.  It is silent on products, but I've chosen to
>represent it as a single struct.


>Feel free to check in any Java-generating code in the sandbox.

Will do, eventually.

There are some restrictions on the Java code, typically naming, that I
have to deal with somehow, and I'm not sure what can be changed in the
.asdl and what must be handled in my generator.

- Would it be OK to rename the 'final' arg in TryFinally to e.g.
'finalbody'? 'final' is a Java reserved word.

- Would it be OK to change the name of 'String' and 'Number'? Java
classes with these names already exist in the java.lang package and it
is annoying to work with user classes with these names.


For example:

Index: python.asdl
===================================================================
RCS file: /cvsroot/python/python/nondist/sandbox/ast/python.asdl,v
retrieving revision 1.8
diff -u -r1.8 python.asdl
--- python.asdl 10 Apr 2002 23:03:32 -0000      1.8
+++ python.asdl 11 Apr 2002 12:34:31 -0000
@@ -24,7 +24,7 @@
              -- 'type' is a bad name
              | Raise(expr? type, expr? inst, expr? tback)
              | TryExcept(stmt body, except* handlers)
-             | TryFinally(stmt body, stmt final)
+             | TryFinally(stmt body, stmt finalbody)
              | Assert(expr test, expr? msg)

              -- may want to factor this differently perhaps excluding
@@ -59,8 +59,8 @@
                         expr? starargs, expr? kwargs)
             | Repr(expr value)
             | Lvalue(assign lvalue)
-            | Number(string n) -- string representation of a number
-            | String(string s) -- need to specify raw, unicode, etc?
+            | Num(string n) -- string representation of a number
+            | Str(string s) -- need to specify raw, unicode, etc?
             -- other literals? bools?

        -- the subset of expressions that are valid as the target of
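
The other option, handling the clashes in the generator instead of in the .asdl, could look roughly like the Python sketch below. The function names, the trailing-underscore convention and the 'Node' suffix are all hypothetical choices for illustration, and both word lists are partial.

```python
# Partial list of Java reserved words; a real generator would need all of them.
JAVA_RESERVED = frozenset({
    "abstract", "boolean", "break", "byte", "case", "catch", "char",
    "class", "const", "continue", "default", "do", "double", "else",
    "extends", "final", "finally", "float", "for", "goto", "if",
    "implements", "import", "instanceof", "int", "interface", "long",
    "native", "new", "package", "private", "protected", "public",
    "return", "short", "static", "super", "switch", "synchronized",
    "this", "throw", "throws", "transient", "try", "void", "volatile",
    "while",
})

# A few java.lang class names that generated classes should not shadow.
JAVA_LANG_CLASSES = frozenset({
    "String", "Number", "Object", "Integer", "Boolean", "Math",
})


def java_field_name(name: str) -> str:
    """Append '_' to an ASDL field name that is a Java reserved word."""
    return name + "_" if name in JAVA_RESERVED else name


def java_class_name(name: str, suffix: str = "Node") -> str:
    """Append a suffix to a constructor name that clashes with java.lang."""
    return name + suffix if name in JAVA_LANG_CLASSES else name


print(java_field_name("final"), java_class_name("String"))
```

The drawback of this route is that the Java names drift away from the names in python.asdl, which is the reason for asking about the rename above.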


regards,
finn