[Python-Dev] Re: new bytecode results

Damien Morton newsgroups1@bitfurnace.com
Fri, 28 Feb 2003 12:00:09 -0500


> >>c) ordering cases in the switch statements by usage frequency
> >>    (using average opcode usage frequencies obtained by
> >>    instrumenting the interpreter)
> >
> > I might try a little simulated annealing to generate layouts with high
> > frequency opcodes near the front and co-occurring opcodes near each
> > other.
>
> I did that by hand, sort of :-) The problem is that the
> scoring phase takes rather long, so you better start with
> a good guess.

I'm wondering what a good scoring scheme would look like.

I tried a scoring scheme in which layouts were scored thusly:

/* weight each co-occurring opcode pair by its distance apart in the
   layout (lower score is better) */
for (i = 0; i < MAXOP; i++)
    for (j = 0; j < MAXOP; j++)
        score += pairfreq[layout[i]][layout[j]] * (i < j ? j - i : i - j);
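
By pairfreq[a][b] I mean, roughly, how often opcode a is immediately followed
by opcode b in an instrumented run. As a sketch (not the actual
instrumentation, just to pin down the idea), collecting it could look like
this, with record_opcode() being a made-up hook called once per dispatched
opcode:

/* Sketch only: pairfreq[a][b] counts how often opcode a is immediately
   followed by opcode b; record_opcode() is a hypothetical hook called
   once per dispatched opcode.  MAXOP is the opcode count, as above. */
static long pairfreq[MAXOP][MAXOP];
static int prev_op = -1;

static void
record_opcode(int op)
{
    if (prev_op >= 0)
        pairfreq[prev_op][op]++;
    prev_op = op;
}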

That first scheme works fine, but I'm thinking that a simpler scoring scheme
that looks only at the frequencies of adjacent ops might be sufficient, and
would certainly be faster.

/* reward layouts that place frequently co-occurring opcodes in adjacent
   slots (higher score is better) */
for (i = 1; i < MAXOP; i++)
    score += pairfreq[layout[i-1]][layout[i]];

The idea is that while caches favour locality of reference, a cache line is
finite in size and relatively small (16 or 64 bytes), so there aren't any
long-range effects. In other words, caches favour adjacency of reference
rather than locality of reference.
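
For what it's worth, here is a rough, untested sketch of the annealing loop I
have in mind, maximizing the adjacency score above. adjacency_score(),
anneal_layout(), the iteration count and the cooling constants are all made
up for illustration; layout[], pairfreq[][] and MAXOP are as in the snippets
above:

#include <stdlib.h>
#include <math.h>

/* Adjacency score from above: higher means frequently co-occurring
   opcodes end up next to each other in the layout. */
static long
adjacency_score(int *layout)
{
    long score = 0;
    int i;

    for (i = 1; i < MAXOP; i++)
        score += pairfreq[layout[i-1]][layout[i]];
    return score;
}

/* Minimal simulated-annealing sketch: swap two random slots, keep the
   swap if it helps, otherwise keep it with probability exp(delta/temp). */
static void
anneal_layout(int *layout)
{
    double temp = 1000.0;                /* made-up starting temperature */
    long cur = adjacency_score(layout);
    long trial;
    int iter, a, b, tmp;

    for (iter = 0; iter < 200000; iter++) {
        a = rand() % MAXOP;
        b = rand() % MAXOP;
        tmp = layout[a]; layout[a] = layout[b]; layout[b] = tmp;
        trial = adjacency_score(layout);
        if (trial >= cur ||
            exp((double)(trial - cur) / temp) > rand() / (double)RAND_MAX)
            cur = trial;                 /* accept, possibly uphill */
        else {
            tmp = layout[a]; layout[a] = layout[b]; layout[b] = tmp;  /* undo */
        }
        temp *= 0.99997;                 /* made-up cooling schedule */
    }
}

Since each step re-scores the whole layout, the adjacency scheme makes one
iteration O(MAXOP) rather than O(MAXOP^2) for the distance-weighted scheme,
which is another argument for it.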