[Python-Dev] PEP 203 Augmented Assignment
Guido van Rossum
guido@beopen.com
Fri, 28 Jul 2000 10:05:48 -0500
> Since we're at it, it's worth mentioning another conclusion we came across
> at the time: the cache effects in the main loop are significant -- it is
> more important to keep the main loop small enough that those effects
> are minimized.
Yes, that's what Tim keeps hammering on too.
> An experiment I did at the time which gave some delta-speedup:
> I folded most of the UNOP & BINOP opcodes since they differ only by the
> function they call and they duplicate most of the opcode body. Like this:
[...]
> This reduced the code size of ceval.c, which resulted in fewer cache effects
> and gave more speed, despite the additional jumps. It possibly results in
> fewer page faults too, although this is unlikely.
I expect this is wholly attributable to the reduced code size. Most
binary operators aren't used frequently enough to make a difference in
other ways. If you put the common code at the end of the code for
binary '+', that would optimize the most common operator.
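
To make the idea concrete, here is a minimal sketch of a folded binary-op
body in a toy stack machine. It is not the ceval.c code and not the elided
patch; the opcode names, the helper functions, and run() are all
hypothetical. Each binary opcode only selects a function pointer and jumps
to one shared body, and the shared body sits directly after the code for
'+', so the assumed most common operator reaches it without a jump.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine; not CPython's real ones. */
enum { OP_PUSH, OP_ADD, OP_SUB, OP_MUL, OP_HALT };

typedef long (*binop_fn)(long, long);

static long do_add(long a, long b) { return a + b; }
static long do_sub(long a, long b) { return a - b; }
static long do_mul(long a, long b) { return a * b; }

/* Run a program. OP_PUSH is followed by one operand word; OP_HALT
   returns the value on top of the stack. Unknown opcodes return -1. */
long run(const long *code)
{
    long stack[64];
    size_t sp = 0;
    binop_fn fn;

    for (;;) {
        switch (*code++) {
        case OP_PUSH:
            assert(sp < 64);
            stack[sp++] = *code++;
            break;
        case OP_SUB:
            fn = do_sub;
            goto binary_op;
        case OP_MUL:
            fn = do_mul;
            goto binary_op;
        case OP_ADD:
            fn = do_add;
            /* fall through - '+' reaches the shared body without a jump */
        binary_op:
            assert(sp >= 2);
            {
                /* Shared body: pop two operands, push the result. */
                long b = stack[--sp];
                long a = stack[--sp];
                stack[sp++] = fn(a, b);
            }
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            return -1;
        }
    }
}
```

The trade-off is the one discussed above: every operator except '+' pays an
extra jump and an indirect call, in exchange for a single copy of the
pop/push code.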
> Which makes me think that, if we want to do something about cache effects,
> it is probably not a bad idea to just "reorder" the bytecodes in the big
> switch by decreasing frequency (we have some stats about this -- I believe
> Skip and MAL have discussed the opcodes' frequency and the charts lie
> somewhere in the archives). I remember Marc-Andre had done something in
> this direction and reported some perf improvements too. Since reordering
> the opcodes doesn't really hurt, if I'm about to do something with the
> main loop, it'll be only this.
Go for it -- sounds good!
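
As a rough sketch of how the ordering could be derived (an assumption for
illustration, not the method Skip or MAL used): tally a trace of executed
opcodes and sort by decreasing count. NUM_OPCODES, struct op_count and
rank_opcodes() are hypothetical names. The result only suggests an order
for the case labels; whether source order changes the generated code
depends on the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_OPCODES 256   /* assumed size of the opcode space */

struct op_count {
    int opcode;
    unsigned long count;
};

/* qsort comparator: highest count first, ties broken by opcode number. */
static int by_decreasing_count(const void *pa, const void *pb)
{
    const struct op_count *a = pa;
    const struct op_count *b = pb;

    if (a->count != b->count)
        return a->count < b->count ? 1 : -1;
    return a->opcode - b->opcode;
}

/* Tally a trace of executed opcodes and fill `out`, which must have room
   for NUM_OPCODES entries, sorted from most to least frequent. */
void rank_opcodes(const unsigned char *trace, size_t n, struct op_count *out)
{
    size_t i;

    assert(out != NULL);
    assert(trace != NULL || n == 0);

    for (i = 0; i < NUM_OPCODES; i++) {
        out[i].opcode = (int)i;
        out[i].count = 0;
    }
    for (i = 0; i < n; i++)
        out[trace[i]].count++;

    qsort(out, NUM_OPCODES, sizeof out[0], by_decreasing_count);
}
```

The first entries of `out` are then the candidates to place at the top of
the switch.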
--Guido van Rossum (home page: http://www.pythonlabs.com/~guido/)