[Python-ideas] Yet another sum function (fractions.sum)

Oscar Benjamin oscar.j.benjamin at gmail.com
Fri Aug 16 20:51:32 CEST 2013


The discussion around having a sum() function in the statistics module
reminded me that, despite there already being two sum functions in the
stdlib (sum() and math.fsum(), plus more in the third-party libraries
I use), I have previously had to roll my own. The reason is that the
stdlib does not currently contain a function that can sum Fractions in
linear time for many inputs, even though it is possible to implement
one that comes closer to linear performance in more cases. I propose
that there could be a sum() function or classmethod in the fractions
module for this purpose. I would just raise this on the tracker, but
seeing how many feathers were ruffled by statistics.sum I thought I'd
suggest it here first.

To demonstrate the problem I'll show how a quick and dirty mergesum()
can out-perform sum():

    $ cat tmpsum.py
    # tmpsum.py

    # Generate data
    from random import randint, seed
    from fractions import Fraction as F
    seed(123456789)  # Use the same numbers each time
    nums = [F(randint(-1000, 1000), randint(1, 1000)) for _ in range(100000)]

    # 1) mergesum() is more efficient than sum() with Fractions.
    # 2) It assumes associativity of addition since it reorders the sum.
    # 3) It performs the same number of __add__ operations as sum()
    # 4) A more complicated iterator version is possible.
    def mergesum(seq):
        while len(seq) > 1:
            new = [a + b for a, b in zip(seq[:-1:2], seq[1::2])]
            if len(seq) % 2:
                new.append(seq[-1])
            seq = new
        return seq[0]

    # Just a quick test
    assert mergesum(nums[:101]) == sum(nums[:101])

So now let's time sum() with these numbers:

    $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:10]' 'sum(nums)'
    1000 loops, best of 3: 206 usec per loop
    $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:100]' 'sum(nums)'
    100 loops, best of 3: 6.24 msec per loop
    $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:1000]' 'sum(nums)'
    10 loops, best of 3: 320 msec per loop
    $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:10000]' 'sum(nums)'
    10 loops, best of 3: 6.43 sec per loop

Each time we increase the size of the input by a factor of 10, the
computation time increases by a factor of about 30. This is caused by
the gcd() computation used to normalise the result after each addition
to the running total: as the numerator and denominator of the subtotal
grow, each gcd() call takes longer. The result is that the algorithm
overall costs somewhere between linear and quadratic time [from the
timings above maybe it's O(n**(3/2))].
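A quick way to see this growth directly is to track how many decimal
digits the running total's denominator accumulates as a left-to-right
sum proceeds; each normalisation step has to run gcd() on numbers of
roughly this size. This is a hypothetical mini-experiment of my own
(the seed, sizes and names are arbitrary), not part of the benchmark:

```python
from fractions import Fraction
from random import randint, seed

# Arbitrary reproducible data, same shape as the benchmark's.
seed(0)
nums = [Fraction(randint(-1000, 1000), randint(1, 1000)) for _ in range(2000)]

# Snapshot the digit count of the running total's denominator.
digit_counts = []
total = Fraction(0)
for i, x in enumerate(nums, start=1):
    total += x
    if i % 500 == 0:
        digit_counts.append((i, len(str(total.denominator))))

for i, digits in digit_counts:
    print(f"after {i:5d} terms: denominator has {digits} digits")
```

The denominator quickly climbs to hundreds of digits (it tends towards
the lcm of all the denominators seen so far), which is why each
subsequent addition gets slower.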

Now let's see how mergesum() performs:

    $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:10]' 'mergesum(nums)'
    10000 loops, best of 3: 186 usec per loop
    $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:100]' 'mergesum(nums)'
    100 loops, best of 3: 2.16 msec per loop
    $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:1000]' 'mergesum(nums)'
    10 loops, best of 3: 24.6 msec per loop
    $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:10000]' 'mergesum(nums)'
    10 loops, best of 3: 256 msec per loop
    $ python -m timeit -s 'from tmpsum import mergesum, nums; nums=nums[:100000]' 'mergesum(nums)'
    10 loops, best of 3: 2.59 sec per loop

(I didn't time sum() with a 100000 input to compare with that last run).

Notice that mergesum() out-performs sum() for all input sizes and that
the time scaling is much closer to linear, i.e. 10x the input takes
about 10x the time. It works by summing adjacent pairs of numbers,
halving the size of the list on each pass. This has the benefit that
the numerators and denominators in most of the additions are much
smaller, so their gcd()s are computed faster; only the final few
additions involve the really big numerators and denominators. The
performance of both summing functions is sensitive to the distribution
of the inputs, and in particular of the denominators, but in my own
usage a merge-sum algorithm has always been faster.
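The comments in mergesum() note that a more complicated iterator
version is possible. One hedged sketch of how that could look (the
name imergesum and its empty-input behaviour are my own choices, not
from the original post): keep a stack of partial sums where entry i
covers a run of 2**i consecutive items, so the additions happen in the
same pairwise pattern while the input is consumed lazily:

```python
from fractions import Fraction

def imergesum(iterable):
    # stack[i] holds the sum of a run of 2**i consecutive input items.
    stack = []
    count = 0
    for x in iterable:
        count += 1
        n = count
        # Merge once per trailing zero bit of the item count, exactly
        # mirroring the pairing that mergesum() does pass by pass.
        while n % 2 == 0:
            x = stack.pop() + x
            n //= 2
        stack.append(x)
    if not stack:
        raise ValueError("imergesum() needs at least one value")
    # Fold any leftover partial sums, smallest runs first.
    total = stack.pop()
    while stack:
        total = stack.pop() + total
    return total
```

This uses O(log n) extra storage instead of a copy of the whole list,
which matters when the input is a generator rather than a sequence.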

I have found this useful myself (with even larger lists of numbers)
when working with the fractions module, and I propose that something
similar could be added there.


Oscar

