[issue8692] Use divide-and-conquer for faster factorials

Alexander Belopolsky report at bugs.python.org
Wed May 12 22:34:33 CEST 2010


Alexander Belopolsky <belopolsky at users.sourceforge.net> added the comment:

Here is one more datapoint.

$ ./python.exe -m timeit -s "import factorial4 as fm;
fm.partial_product = fm.partial_product; f = fm.factorial " "f(10000)"
10 loops, best of 3: 66.1 msec per loop
[32794 refs]
$ ./python.exe -m timeit -s "import factorial4 as fm;
fm.partial_product = fm.partial_product3; f = fm.factorial "
"f(10000)"
10 loops, best of 3: 63 msec per loop
[32794 refs]
$ ./python.exe -m timeit -s "import factorial4 as fm;
fm.partial_product = fm.partial_product2; f = fm.factorial "
"f(10000)"
10 loops, best of 3: 43.3 msec per loop

partial_product3 multiplies adjacent numbers instead of first by last.
 I am not sure it reproduces the order of multiplication in the
recursive version exactly, but it does show that the order of
multiplication matters a lot.

I wonder if one could write an elegant recursive version that would
multiply first by last in partial_product.

----------
Added file: http://bugs.python.org/file17313/factorial4.py

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue8692>
_______________________________________
-------------- next part --------------
import functools
import operator

product = functools.partial(functools.reduce, operator.mul)

def naive_factorial(n):
    """Naive implementation of factorial: product([1, ..., n])

    >>> naive_factorial(4)
    24
    """
    return product(range(1, n+1), 1)

def factorial(n):
    """Implementation of Binary-Split Factorial algorithm

    See http://www.luschny.de/math/factorial/binarysplitfact.html

    >>> for n in range(20):
    ...     assert(factorial(n) == naive_factorial(n))
    >>> import math
    >>> assert(factorial(100) == math.factorial(100))
    """
    _, r = loop(n)
    return r << (n - count_bits(n))

def loop(n):
    p = r = 1
    for i in range(n.bit_length() - 2, -1, -1):
        m = n >> i
        if m > 2:
            p *= partial_product(((m >> 1) + 1) >> 1, (m - 1) >> 1)
            r *= p
    return p, r

def partial_product(j, i):
    if i == j:
        return j << 1 | 1
    if i == j + 1:
        return (j << 1 | 1) * (i << 1 | 1)
    l = i + j >> 1
    return partial_product(j, l) * partial_product(l + 1, i)

def partial_product1(j, i):
    return product((l << 1 | 1 for l in range(j, i + 1)), 1)

def partial_product2(j, i):
    a = [l << 1 | 1 for l in range(j, i + 1)]
    n = i - j + 1
    p = 1
    while n:
        n >>= 1
        for k in range(n):
            a[k] *= a[n-k]
    return a[0]

def partial_product3(j, i):
    a = [l << 1 | 1 for l in range(j, i + 1)]
    while 1:
        n = len(a)
        if n == 1:
            return a[0]
        a = [a[k<<1] * a[k<<1|1] for k in range(n>>1)] + a[(n >> 1)<<1:]


def count_bits(n):
    count = 0
    while n:
        n &= n - 1
        count += 1
    return count

if __name__ == '__main__':
    #print(partial_product(10, 20))
    import doctest
    doctest.testmod()


More information about the Python-bugs-list mailing list