[Python-Dev] Rattlesnake progress

Tim Peters tim.one@comcast.net
Mon, 18 Feb 2002 18:51:52 -0500


[Neil Schemenauer]
> I've been working on Skip's rattlesnake and have made some progress.

Cool!  I encourage this.  That and 2 dollars will buy you a cup of coffee.

> Right now I'm trying to hack the compiler package and am looking for a
> good reference on code generation.  I have the "New Dragon book" as well
> as "Essentials of Programming Lanuages" but neither one seem to be
> telling me want I want to know.  Any suggestions?

Write it in Python because you won't be happy with your first 2 or 3
attempts.

Compiler books have historically been very heavy on front end issues, in
large part because the theory of parsing is well understood.  What
relatively little you can find on back end issues tends to be sketchy and
severely limited by the author's personal experience.  In large part this is
because almost no interesting optimization problem can be solved in linear
time (whether it's optimal instruction selection, optimal register
assignment, optimal instruction ordering, ...), so real-life back ends are a
mountain of idiosyncratic heuristics.

Excellent advice that almost nobody follows <0.5 wink>:  choose a flexible
intermediate representation, then structure all your transformations as
independent passes, such that the output of every pass is acceptable as the
input to every pass.  Then keep each pass focused, as simple as possible
(for example, if a transformation may create regions of dead code, don't
dare try to clean it up in the same pass, or contort the logic even a little
bit to try to avoid creating dead code -- instead let it create all the dead
code it wants, and (re)invoke a "remove dead code" pass afterwards).
*Because* back ends are a mountain of idiosyncratic heuristics, this design
lets you add new ones, remove old ones, and reorder them with minimal pain.
One compiler I read about (but didn't have the pleasure of using) actually
allowed you to specify the sequence of back end transformations on the
cmdline, using a regular expression notation where, e.g.

   (hoist dead)+

meant "run the hoist pass followed by the dead code removal pass, one or
more times, until a fixed point is reached".

Since none of that told you want to know either <wink>, what do you want to
know?  Sounds like a legit topic for python-dev.