XML based programming language
Diez B. Roggisch
deets at nospam.web.de
Sun Mar 18 11:48:40 EDT 2007
stefaan wrote:
> Hello,
>
> I have recently had to deal with an XML-based
> programming language. (The programs are
> generated programmatically.)
>
> XML leads to a "two-level" parsing problem: first
> parse the xml into tokens, then parse the tokens
> to grasp their meaning (based on the semantics
> of the programming language).
>
> Basically, I used elementtree as a sophisticated "lexer" and wrote a
> recursive descent parser to perform the semantic analysis and
> interpretation.
> (It works great.)
>
> But I keep wondering: do parser generator tools
> exist that I could have used instead of writing
> the recursive descent parser manually ?
You haven't written a recursive descent parser. At least not in the
usual sense of the term.
A parser (recursive descent or otherwise) takes a string written in
the language it accepts and, in the field of programming languages,
usually returns an abstract syntax tree. That tree is what one then
works on: for code generation, interpretation, optimization.
What you wrote is usually called a reducer: the part that traverses the
tree, rewriting it and transforming it for interpretation and whatnot.
I've been working with tools that take an XML Schema or DTD and generate
typed objects from it, which can be deserialized from an
XML stream. The better of these tools generate visitors and/or matchers,
which basically are objects that traverse the generated object tree in
document order, via typed methods. Something like this (Java pseudocode):
    class Visitor {
        // Generic entry point: dispatch on the runtime type of the node.
        public void visit(Object o) {
            if (o instanceof Expr) {
                visit((Expr) o);
            } else if (o instanceof SubExpr) {
                visit((SubExpr) o);
            }
        }
        public void visit(Expr e) {
            for (SubExpr se : e.subExpressions) {
                visit(se);
            }
        }
        public void visit(SubExpr e) {
            // not doing anything
        }
    }
You can then subclass this visitor, for example to create an interpreter.
All of this is theoretically possible in Python, too. Using multimethods
one can create the dispatching, and so forth.
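
As a rough illustration of what type-based dispatch could look like in Python, here is a minimal sketch. It assumes a modern Python (3.8 or later) and uses functools.singledispatchmethod from the standard library as a stand-in for a multimethod package; it only dispatches on a single argument. The Expr and SubExpr classes and the SumVisitor are hypothetical, made up to mirror the Java pseudocode, not output of any real schema tool.

```python
from functools import singledispatchmethod


class SubExpr:
    """Hypothetical leaf node carrying a number."""

    def __init__(self, value):
        self.value = value


class Expr:
    """Hypothetical node holding a list of SubExpr objects."""

    def __init__(self, sub_expressions):
        self.sub_expressions = sub_expressions


class SumVisitor:
    """Adds up the values of all SubExpr nodes it visits."""

    def __init__(self):
        self.total = 0

    @singledispatchmethod
    def visit(self, node):
        # Fallback for node types without a registered handler.
        raise TypeError("no visit method for %s" % type(node).__name__)

    @visit.register(Expr)
    def _visit_expr(self, node):
        # Walk the children in order, dispatching on each child's type.
        for sub in node.sub_expressions:
            self.visit(sub)

    @visit.register(SubExpr)
    def _visit_sub_expr(self, node):
        self.total += node.value


visitor = SumVisitor()
visitor.visit(Expr([SubExpr(1), SubExpr(2), SubExpr(3)]))
print(visitor.total)
```

The dispatch happens on the class of the node passed to visit(), so each node type gets its own method without an explicit isinstance chain.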
I'm just not too convinced that it really is worth the effort. A simple
tag-name-based dispatching scheme, together with the really nice
ElementTree API, suffices in my eyes. Then you could do something like this:
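    class Visitor(object):
        def visit(self, node):
            descend = True
            # Look up a handler named after the tag; None if there is none.
            handler = getattr(self, "visit_%s" % node.tag, None)
            if handler is not None:
                # A handler returns a true value to descend into the children.
                descend = handler(node)
            if descend:
                for child in node:
                    self.visit(child)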
Then for an element "expr" you could define
    class Foo(Visitor):
        def visit_expr(self, node):
            ...
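
To make the idea a bit more concrete, here is a small self-contained sketch of such a subclass acting as a toy interpreter. The tag names (program, print, skip) and the Interpreter class are invented for illustration only; the Visitor is the tag-based scheme from above, written with a None default for getattr so that tags without a handler are simply descended into.

```python
import xml.etree.ElementTree as ET


class Visitor(object):
    def visit(self, node):
        descend = True
        # Look up a handler named after the tag; None if there is none.
        handler = getattr(self, "visit_%s" % node.tag, None)
        if handler is not None:
            # A handler returns a true value to descend into the children.
            descend = handler(node)
        if descend:
            for child in node:
                self.visit(child)


class Interpreter(Visitor):
    """Toy interpreter for a made-up language with <print> and <skip>."""

    def __init__(self):
        self.output = []

    def visit_print(self, node):
        # "Execute" the statement by recording its text.
        self.output.append(node.text)
        return False  # nothing useful below a <print>

    def visit_skip(self, node):
        # Returning False keeps the visitor out of this subtree.
        return False


SOURCE = (
    "<program>"
    "<print>hello</print>"
    "<skip><print>never</print></skip>"
    "<print>world</print>"
    "</program>"
)

interpreter = Interpreter()
interpreter.visit(ET.fromstring(SOURCE))
print(interpreter.output)  # ['hello', 'world']
```

There is no visit_program method, so the root element is just traversed; the print inside skip never runs because visit_skip declines to descend.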
HTH,
Diez