[Python-Dev] further optimising the micro-optimisations for cache locality (fwd)

Thomas Wouters thomas@xs4all.net
Fri, 28 Jul 2000 21:41:54 +0200


On Fri, Jul 28, 2000 at 09:24:36PM +0200, Vladimir Marangozov wrote:

> It would be interesting to collect some feedback on these ideas for
> popular combos. Who knows...

I like these ideas, though I think anything beyond 'further folded' requires
a separate switch for the uncommon operators and for those that do a tad more
than call a function with a certain number of arguments and push the result
on the stack. That means renumbering the ops into fast-ops and slow-ops, as
well as argument-ops and non-argument-ops. (I hope all non-argument ops fall
in the 'fast' category, or it might get tricky ;-P)
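
To make the renumbering concrete, here is a minimal sketch, assuming the ops
are laid out as fast non-argument, then fast argument, then slow argument.
The opcode names and numbers are made up for illustration and are not the
real values from opcode.h; FIRST_SLOW_OP, N_OPCODES, has_arg() and is_fast()
are hypothetical as well. Only the idea of a HAVE_ARGUMENT threshold is taken
from the existing interpreter.

```c
#include <assert.h>

/* Hypothetical renumbering: these are NOT the real opcode values. */
enum {
    /* fast ops without an argument */
    OP_POP_TOP = 0,
    OP_BINARY_ADD,
    OP_UNARY_NEGATIVE,

    HAVE_ARGUMENT,                   /* ops numbered >= this carry an argument */

    /* fast ops with an argument */
    OP_LOAD_FAST = HAVE_ARGUMENT,
    OP_LOAD_CONST,

    FIRST_SLOW_OP,                   /* ops numbered >= this use the second switch */

    /* slow ops, all of which carry an argument in this layout */
    OP_IMPORT_NAME = FIRST_SLOW_OP,
    OP_IMPORT_FROM,

    N_OPCODES
};

/* Each classification is a single comparison against a threshold. */
static int has_arg(int op)
{
    assert(op >= 0 && op < N_OPCODES);
    return op >= HAVE_ARGUMENT;
}

static int is_fast(int op)
{
    assert(op >= 0 && op < N_OPCODES);
    return op < FIRST_SLOW_OP;
}
```

Two thresholds can only split a linear numbering into three contiguous
groups, which is why a slow non-argument op would need either a fourth range
or an explicit lookup table.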

I'm also wondering whether they really speed things up. The added complexity
might lead the compiler to generate *less* efficient code. Then again, it
removes some of the burden from the compiler, too, so whether this has a
positive effect probably depends very heavily on the compiler.

> > // the function to call for this op
> > PyObject *(*op_func[])() = { ..., PyNumber_Add, PyNumber_Multiply, ... };
> > 
> > // the kind of op this func describes
> > unsigned char op_type[] = { ..., DO_BINOP1, DO_BINOP2, DO_UNOP, ... };
> > 
> > // these two tables will be cached because of the frequency they are
> > // accessed
> > // I've used a table of pointers and a table of bytes to reduce the
> > // memory required; because the tables are small, locality within and
> > // between the tables isn't important
> > // might be an idea to force the tables into contiguous memory somehow
> > // I could also have used a table of structs, but alignment would
> > // increase memory usage

I'm not so sure about any of these comments, given that we do jump to a
function right after accessing these tables. I suggest heavy testing, and I
can offer only two architectures myself (linux-i386 and solaris-sparc-gcc.)
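
For anyone who wants something to time, here is a rough, self-contained
sketch of the two-table dispatch quoted above. It is only an approximation
under stated assumptions: num_add(), num_multiply() and num_negative() are
hypothetical stand-ins working on plain longs instead of PyObject pointers,
and the opcode numbering, DO_BINOP/DO_UNOP values and eval_op() are invented
for the example.

```c
#include <assert.h>

/* Hypothetical stand-ins for PyNumber_Add and friends; the real
 * functions take and return PyObject pointers. */
static long num_add(long a, long b)      { return a + b; }
static long num_multiply(long a, long b) { return a * b; }
static long num_negative(long a)         { return -a; }

typedef long (*binop_fn)(long, long);
typedef long (*unop_fn)(long);
typedef void (*generic_fn)(void);   /* storage type; cast back before calling */

enum { DO_BINOP, DO_UNOP };                               /* kinds of op */
enum { OP_ADD, OP_MULTIPLY, OP_NEGATIVE, OP_COUNT };      /* made-up opcodes */

/* the function to call for each op */
static const generic_fn op_func[OP_COUNT] = {
    (generic_fn)num_add, (generic_fn)num_multiply, (generic_fn)num_negative
};

/* the kind of op each function implements, one byte per op */
static const unsigned char op_type[OP_COUNT] = {
    DO_BINOP, DO_BINOP, DO_UNOP
};

/* Look up the kind, cast the pointer back to its real type, and call it.
 * The right operand is ignored for unary ops. */
static long eval_op(int op, long left, long right)
{
    assert(op >= 0 && op < OP_COUNT);
    switch (op_type[op]) {
    case DO_BINOP:
        return ((binop_fn)op_func[op])(left, right);
    case DO_UNOP:
        return ((unop_fn)op_func[op])(left);
    }
    return 0;   /* not reached with the tables above */
}
```

Every op here costs two table loads followed by an indirect call, so any
benchmark of this against the plain switch should be run per compiler and
per architecture.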

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!