[vm-papers-discuss] Recent favorite: Stochastic Superoptimization

Thu Jun 20 17:26:20 CEST 2013

This is veering a bit off course, into general compilery stuff. Tell me
if it's not topical enough.

Brit Butler wrote:
> Recently, I was excited to see this paper. [0] I like that they
> frame optimizing IR as a search problem. [...] In general, I would
> be interested in seeing more cross-pollination between the AI and
> Compiler communities. How else are we supposed to get a sufficiently
> smart compiler? ;)

I'm going to preach for my parish and suggest that you look for
cross-pollination with the operations research [*] community ;) This is
where register allocation as graph colouring, as integer linear
programming or as binary quadratic programming come from. More
importantly, this is also where efforts to integrate phases (e.g.
instruction selection and scheduling) find their tools. OR practicioners
regularly tackle an issue that's fundamental in compiler development:
ordering phases.

A native code compiler must decide how high-level constructs are
rewritten to improve performance and lowered into assembly language, and
how to schedule instructions and allocate registers (plus other
decisions, e.g. whether to represent data in boxed or machine native
format). Solving for all these decisions simultaneously is daunting and
doesn't seem feasible. The usual solution in compilers is to make a few
decisions at a time, in sequence. OR can offer mathematically grounded
(and practical) strategies to decompose large and complicated problems
into tractable components; from the exact method (i.e. my) camp,
Lagrangian and [generalised] Benders decomposition come to mind. I don't
really follow metaheuristics, but Threshold ascent on graph (a
single-player adaptation of Monte Carlo tree search) seems like a nice
approach for planning problems.

Sadly, it's hard to apply mathematical programming techniques without a
formal model of the problem we're trying to solve.

[*]: My PhD thesis is that it's practical to solve (not to optimality,
but well enough and with a proof of solution quality) location problems
to guide the daily operations of country-wide distribution networks.

> I tend to use Steel Bank Common Lisp for my hobby hacking but it's
> 400kloc! More and more I find myself thinking about ways to simplify
> or reduce complexity in our software systems.

Same here. Hopefully, we can find simple model that yield equally as
good binaries as our current tools. Figure 3 in Schkufza et al's
Stochastic Superoptimisation is encouraging. Moreover, the Stabilizer
team's <http://plasma.cs.umass.edu/emery/stabilizer> findings certainly
indicate that what is done in O3 isn't generally useful compared to O2 
(or that we should work better at placing code and data in memory).

> I'm not even sure how to start comprehending/hacking on [SBCL]
> something that size and out of my domain.

Short answer: Anywhere ;)

Paul Khuong