[Numpy-discussion] numexpr: optimizing pow

Wed Mar 8 20:50:03 EST 2006

I just checked in some changes that do aggressive optimization on the 
pow operator in numexpr. Now all integral and half integral powers 
between [-50 and 50] are computed using multiples and sqrt. (Empirically 
50 seemed to be the closest round number to the breakeven point.)

I mention this primarily because I think it's cool. But also, it's the 
kind of optimization that I don't think would be feasible in numpy 
itself short of defining a whole pile of special cases, either separate 
ufuncs or separate loops within a single ufunc, one for each case that 
needed optimizing. Otherwise the bookkeeping overhead would overwhelm 
the savings of replacing pow with multiplies.

Now all of the bookkeeping is done in Python, which makes it easy; and 
done once ahead of time and translated into bytecode, which makes it 
fast. The actual code that does the optimization is included below for 
those of you interested enough to care, but not interested enough to 
check it out of the sandbox. It could be made simpler, but I jump 
through some hoops to avoid unnecessary mulitplies. For instance, 
starting 'r' as 'OpNode('ones_like', [a])' would simplify things 
signifigantly, but at the cost of adding an extra multiply in most cases.

That brings up an interesting idea. If 'mul' were made smarter, so that 
it recognized OpNode('ones_like', [a]) and ConstantNode(1), then not 
only would that speed some 'mul' cases up, it would simplify the code 
for 'pow' as well. I'll have to look into that tomorrow.

Regards,

-tim

        if True: # Aggressive
            RANGE = 50 # Approximate break even point with pow(x,y)
            # Optimize all integral and half integral powers in [-RANGE, 
RANGE]
            # Note: for complex numbers RANGE would be larger.
            if (int(2*x) == 2*x) and (-RANGE <= abs(x) <= RANGE):
                n = int(abs(x))
                ishalfpower = int(abs(2*x)) % 2
                r = None
                p = a
                mask = 1
                while True:
                    if (n & mask):
                        if r is None:
                            r = p
                        else:
                            r = OpNode('mul', [r,p])
                    mask <<= 1
                    if mask > n:
                        break
                    p = OpNode('mul', [p,p])
                if ishalfpower:
                    sqrta = OpNode('sqrt', [a])
                    if r is None:
                        r = sqrta
                    else:        
                        r = OpNode('mul', [r, sqrta])
                if r is None:
                    r = OpNode('ones_like', [a])
                if x < 0:
                    r = OpNode('div', [ConstantNode(1), r])
                return r