[Python-Dev] Re: AST mining (was Re: Direction of PyChecker)

Jeremy Hylton jeremy@zope.com
Tue, 14 Aug 2001 11:50:40 -0400 (EDT)


>>>>> "NN" == Neal Norwitz <neal@metaslash.com> writes:

  >> I'd be very interested in pooling efforts to make this easier. I
  >> know almost nothing about ASTs now, but that could change in a
  >> hurry :-).

  NN> Pooling efforts would be good.  I also don't know anything about
  NN> the ASTs/compiler, but am willing to work on it.

I'm on the hook for AST/compiler documentation, which I plan to work
on this week.  I will probably have more time for it later in the week
than I will today or tomorrow.

In the absence of documentation, here's a trivial example program that
extracts some information about a class: the methods defined in its
body and the instance attributes those methods assign to.  It isn't
exhaustive.  It finds attributes assigned via "self.x" in a method body
(where "self" is whatever the method's first argument is called) and
method definitions in the class body.  It doesn't handle other obvious
cases, such as attributes defined at the class level.

Jeremy

from compiler import parseFile, walk, ast

class Class:

    def __init__(self, name, bases):
        self.name = name
        self.bases = bases
        self.methods = {}
        self.attributes = {}

    def addMethod(self, meth):
        self.methods[meth.name] = meth

    def addInstanceAttr(self, name):
        self.attributes[name] = name

    def getMethodNames(self):
        return self.methods.keys()

    def getAttrNames(self):
        return self.attributes.keys()

class Method:

    def __init__(self, name, args, defaults):
        self.name = name
        self.args = args
        self.defaults = defaults

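    def getSelf(self):
        # Name of the first argument, or None if the method takes none
        # (avoids an IndexError on "def f():" in a class body).
        if self.args:
            return self.args[0]
        return None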

class ClassExtractor:

    def __init__(self):
        # Per-instance list, so separate extractors don't share results.
        self.classes = []

    def visitClass(self, node, klass=None, meth=None):
        c = Class(node.name, node.bases)
        self.visit(node.code, c)
        self.classes.append(c)

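    def visitFunction(self, node, klass=None, meth=None):
        if klass is not None and meth is None:
            # A def directly in a class body: record it as a method and
            # visit its body with the class and method as context.
            m = Method(node.name, node.argnames, node.defaults)
            klass.addMethod(m)
            self.visit(node.code, klass, m)
        else:
            # Module-level or nested function: visit without context.
            self.visit(node.code)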

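    def visitAssAttr(self, node, klass=None, meth=None):
        if isinstance(node.expr, ast.Name) and meth is not None:
            # Assignment to NAME.attr inside a method: record it only if
            # NAME is the method's first argument (conventionally "self").
            if node.expr.name == meth.getSelf():
                klass.addInstanceAttr(node.attrname)
        else:
            self.visit(node.expr)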

def main(py_files):
    extractor = ClassExtractor()
    for py in py_files:
        # Named "tree" so it doesn't shadow the imported ast module.
        tree = parseFile(py)
        walk(tree, extractor)

    for klass in extractor.classes:
        print klass.name
        print klass.getMethodNames()
        print klass.getAttrNames()
        print

if __name__ == "__main__":
    import sys
    main(sys.argv[1:])
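
For anyone reading this on Python 3, where the compiler package no
longer exists, here is a rough sketch of the same idea using the
standard-library ast module and its NodeVisitor class.  It is not the
program above, only an approximation of it under a few assumptions:
the names ClassInfo, extract and SAMPLE are made up for illustration,
the visitor keeps its class/method context in instance attributes
rather than passing it as extra arguments, and it needs Python 3.8 or
later (for posonlyargs).

```python
import ast

# Hypothetical sample input, made up for this sketch.
SAMPLE = '''
class Point(Base):
    scale = 1              # class-level attribute: not collected

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def move(this, dx):    # first argument need not be called "self"
        this.x = this.x + dx
        this.moved = True
        other.z = 0        # not the instance: ignored
'''


class ClassInfo:
    """Method and instance-attribute names found for one class."""

    def __init__(self, name):
        self.name = name
        self.methods = []
        self.attributes = []


class ClassExtractor(ast.NodeVisitor):

    def __init__(self):
        self.classes = []
        self.klass = None        # ClassInfo being visited, if any
        self.self_name = None    # first-argument name of current method
        self.in_method = False

    def visit_ClassDef(self, node):
        info = ClassInfo(node.name)
        self.classes.append(info)
        saved = (self.klass, self.self_name, self.in_method)
        self.klass, self.self_name, self.in_method = info, None, False
        self.generic_visit(node)
        self.klass, self.self_name, self.in_method = saved

    def visit_FunctionDef(self, node):
        saved = (self.klass, self.self_name, self.in_method)
        if self.klass is not None and not self.in_method:
            # A def directly in a class body: treat it as a method.
            self.klass.methods.append(node.name)
            args = node.args.posonlyargs + node.args.args
            self.self_name = args[0].arg if args else None
            self.in_method = True
        else:
            # Module-level or nested function: drop the class context.
            self.klass, self.self_name, self.in_method = None, None, False
        self.generic_visit(node)
        self.klass, self.self_name, self.in_method = saved

    def visit_Attribute(self, node):
        # Record NAME.attr only when it is an assignment target and
        # NAME is the current method's first argument.
        if (self.klass is not None
                and self.self_name is not None
                and isinstance(node.ctx, ast.Store)
                and isinstance(node.value, ast.Name)
                and node.value.id == self.self_name
                and node.attr not in self.klass.attributes):
            self.klass.attributes.append(node.attr)
        self.generic_visit(node)


def extract(source):
    """Return a list of ClassInfo objects for the given source text."""
    extractor = ClassExtractor()
    extractor.visit(ast.parse(source))
    return extractor.classes


if __name__ == "__main__":
    for klass in extract(SAMPLE):
        print(klass.name, klass.methods, klass.attributes)
```

Like the original, this only looks at assignments through a method's
first argument and at defs directly in a class body; class-level
attributes are still ignored.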