[Compiler-sig] Visitor pattern

Finn Bock bckfnn@worldonline.dk
Mon, 15 Apr 2002 14:01:34 GMT


[Jeremy]

>I've been meaning to get the compiler package's visitor documented,
>but I'm not yet sure where to start.  (That is, I'm not sure which
>things are novel or unusual and which are obvious.)
>
>I think there are three key features:
>
>    - Embed the name of the class being visited in the visitor
>      method.  This wouldn't be necessary in Java, I assume, because
>      each method would have the same name buts its argument list
>      would be different.

Java can do that to some extend but lets ignore that and just decide
that the visitor have method like visitAssign() and visitIf().

>
>    - The AST walker exploits knowledge of the AST structure to
>      provide a default visit() method that will work on any node
>      type.   

I think it is the walker idea that threw me off. When I hear 'visitor'
together with trees I immediately thinks the visitor pattern in the GOF
book, and there is no walker in that pattern. 

>    - Responsibility for implementing the pattern is divided between a
>      visitor and a walker.

And with no responsibility placed in the nodes (other then the
getChildren() method), right ? That is OK, but it is a bit of a stretch
to call it a visitor (IMHO).

>The visitor implements a visitXXX() method for each node of interest.
>The walker takes a visitor instance and the root of a node graph.  It
>applies the visitor methods to the node graph starting with the root.
>
>The second seems to apply regardless of language and can be quite
>convenient.  If you're writing a simple symbol table visitor, you only
>care about a few of the node types.  The If stmt, e.g., has no effect
>on the symbol table, only its children do.  The default method makes
>it possible to write a visitor that only has methods for the nodes it
>cares about.
>
>If we have a full specification of the AST (we do: python.asdl) and we
>assume that the spec includes the children of each node in a "natural"
>order, then we can generate a visitor method automatically.  By
>natural, I mean a pre-order traversal of the tree, which has always
>been what I've needed.

That is interesting, I can think of only one times where I would need a
autotraversel: When detecting fastlocals. When generating code I need to
control the traversal order myself.

>In the compiler package, each node type has a method getChildren()
>that returns a tuple containing the children in the natural order.  
>
>class If:
>
>    def getChildren(self):
>        return self.test, self.body, self.orelse

Yuck! For performance reason in jython, I would much rather call a
method on the node that does the traversing:

class If:
    def traverse(self, walker):
        walker.dispatch(self.test)
        for stmt in self.body:
            walker.dispatch(stmt)
        for stmt in self.orelse:
            walker.dispatch(stmt)

That is because calling a method can be cheaper than creating a tuple.

I suppose that I will do something like that and then hide the
'traverse' method from python.

>We should be able to define these as well, since the python.asdl
>presents the children in the natural order.
>
>If we have a visitor that isn't interested in If nodes, then it simply
>doesn't define a visitIf() method.  If the AST contains an If node,
>the default method of the walker handles it instead.
>
>The default method on the walker does the following:
>
>    for child in node.getChildren():
>        if child is not None:
>            self.visit(child)

With my 'If' class above the 'default' method becomes:

    child.traverse(this)

>The visit() method is defined by the walker.  It takes an arbitrary
>node as its argument, looks up the appropriate method on the
>visitor, and calls it.  The visitor can also use this method to
>delegate responsibility for method lookup to the walker.
>
>Does that help?  

Yes it did, thanks.

A question still remain: if I have a visitTryExcept() method, how would
I then cause or prevent the default traversion of the children? In the
example below I assume that a visitXXX function must deal with its
children itself either one-by-one or by calling 'default()'.

>I'm not sure how much of this maps naturally from Python to Java.

I'm guessing the main problem will be whether the concrete visitor class
must inherit from some base class. I like that requirement, you probably
don't.

The requirement of a visitor base class is easily met if the visitor and
walker is joined into one class. That way an example visitor to find
potential fastlocals would look like this:


class FastLocalVisitor(ASTVisitor):
    def __init__():
        self.infunc = False
        self.mode = 'GET'

    def visitFuncDef(self, node):
        self.infunc = True
        self.default(node)
        self.infunc = False

    def visitName(self, node):
        if self.infunc and self.mode == 'SET':
            print node.id, "is a potential fastlocal"

    def visitTryExcept(self, node):
        self.mode = 'SET'
        for e in node.handlers:
            self.dispatch(e.name)
        self.mode = 'GET'
        for e in node.handlers:
            self.dispatch(e.type)
            self.dispatch(e.body)
        self.dispatch(node.body)
        self.dispatch(node.orelse)

    def visitAssign(self, node):
        self.mode = 'SET'
        for t in node.targets:
            self.dispatch(t)
        self.mode = 'GET'
        self.dispatch(node.value)

    def visitFor(self, node):
        self.mode = 'SET'
        self.dispatch(node.target)
        self.mode = 'GET'
        self.dispatch(node.iter)
        self.dispatch(node.body)
        self.dispatch(node.value)

FastLocalVisitor().visit(tree)


So the ASTVisitor class is publicly defined as:

class ASTVisitor:
    def default(self, node):  # Should be called traverse IMO
        """Visit each of the children of node."""

    def dispatch(self, node): # Should be called visit IMO
        """Visit the node (or call self.default())."""

    def visitXXX(self, node):
        """Visit the node of type XXX. Defined in subclass"""

if we want to get fancy, we can consider these methods:

    def post_visitXXX(self, node):
        """Visit the node of type XXX after the children have 
        been visited. Defined in subclass"""

    def unhandled_node(self, node):
        """Called when no visitXXX method exists for the node. Defined
        in subclass"""

    def open_level(self, node):
        """Called just before the visitXXX is called. Defined
        in subclass"""

    def close_level(self, node):
        """Called just after the post_visitXXX is called. Defined
        in subclass"""



I would also like to avoid documenting the use of cls.__name__ in the
dispatching. The cls.__name__ for a java class should not be depended on
because it looks like this: 'org.python.parser.ast.If'. I can deal with
this in the implementation of ASTVisitor.dispatch() but I wouldn't want
other application to depending too much on the __name__ of AST nodes. It
would be better if we added a class attribute to the nodes that
contained the official name.

regards,
finn