AST Manipulation and Common Subexpression Elimination

Robert Kern kern at ugcs.caltech.edu
Sat Oct 18 03:09:51 EDT 2003


I'm in the middle of writing a scientific program with a number of very, very
long formulae. You can see a typical one below. What I would like to do is to
parse this code to yield an AST, identify identical subtrees, replace them in
the AST with a dummy variable to which I will assign the common subexpression.

    v21c = ((1-2*nu)*((2*(1-nu)ctB*ctB-nu)*log(Ry3bar) - \
        (2*(1-nu)*ctB*ctB+1-2*nu)*cB*log(Rz3bar)) - \
        (1-2*nu)/Ry3bar*(y1*ctB*(1-2*nu-a/Rbar) + nu*ybar3 - a + \ 
            y2*y2/Ry3bar*(nu+a/Rbar)) - \
        (1-2*nu)*zbar3*ctB/Rz3bar*(cB+a/Rbar) - \
        a*y1*(ybar3-a)*ctB/Rbar3 + \
        (ybar3-a)/Ry3bar*(-2*nu+((1-2*nu)*y1*ctB-a)/Rbar + \
            y2*y2/Rbar/Ry3bar*(2*nu+a/Rbar)+a*y2*y2/Rbar3) + \
        (ybar3-a)/Rz3bar*(cB*cB-((1-2*nu)*zbar1*ctB+a*cB)/Rbar +
            a*ybar3*zbar1*ctB/Rbar3 - \
            (y2*y2*cB*cB-a*zbar1*ctB/Rbar*(Rbar*cB+ybar3))/Rbar/Rz3bar))/(4*pi*(
1-nu))

For example, I would like to globally replace (1-2*nu) with, say, the variable
nu12=(1-2*nu). Some of these variables are already the evaluations of
subexpressions I found convenient when typing in these formulae (or the original
authors when deriving them).

Each of these variables is either a float or a Numeric array (or the log()
function). No variables changes value during the computation, nor are there
branches or loops, so the CSE algorithm doesn't have to be sophisticated enough
to account for these.

Before you start on about "premature optimization," note that I'm more concerned
with keeping the formulae readable, manageable, and debuggable than anything
else. Heck, with the size of the arrays I'm thinking about, doing lots of CSE
will probably kill my memory.

My questions are:

1. Can I do better than brute-force traversing the AST and trying to match each
subtree to the other subtrees (using something like the match() function given
in the parser module example)?

2. Are there good tools out there for doing AST transformations like this? Are
there any GUI tools for messing with the AST? I know that I will want to
interactively select which subexpressions are eliminated in this fashion.

3. Is there any example code around showing how to output Python code from the
AST? I can probably figure out, then, how to do the same for
Maxima/Yacas/Maple/Mathematica as I'm sure I'm going to want to slap these
formulae into a CAS and play around with them.

4. And as a corollary to that last thought: are there any automatic
differentiation tools for Python that operate directly on the AST and generate
new code?

Googling hasn't come up with much for me, but my brain is fried from typing in
these $%&@! formulae. Pointers to introductory compiler technique references
(online or dead tree) or other useful resources are welcome.

Many thanks.

-- 
Robert Kern
kern at ugcs.caltech.edu

"In the fields of Hell where the grass grows high
 Are the graves of dreams allowed to die."
                     -- Richard Harter




More information about the Python-list mailing list