sum for sequences?

Sun Mar 28 13:21:08 EDT 2010

On Mar 28, 9:56 am, "Alf P. Steinbach" <al... at start.no> wrote:
> * Steve Howell:
>> (talking about summing a list of lists)
>
>  From a more practical point of view, the sum efficiency could be improved by
> doing the first addition using '+' and the rest using '+=', without changing the
> behavior.

I like that approach, because you don't have to assume that the start
object is clonable, although perhaps you need to make that assumption
anyway for the case where the list is empty.

Here is code that might shed light on Alf's suggested approach:

    import timeit
    M = 10
    N = 8000

    def in_place(
        start = [],
        sublists = ([[None] * M]) * N
        ):
        # only macro-optimized
        i = 0
        for sublist in sublists:
            if i == 0:
               accum = start + sublist
               i += 1
            else:
                accum.extend(sublist)
        if i == 0:
            return 'whatever' # semantics here?
        return accum

    def with_intermediates(
        start = [],
        sublists = ([[None] * M]) * N
        ):
        accum = start
        for sublist in sublists:
            accum = accum + sublist
        return accum

    t1 = timeit.timeit('in_place()', 'from __main__ import in_place',
number=10)
    t2 = timeit.timeit('with_intermediates()', 'from __main__ import
with_intermediates', number=10)
    print M, N, t1, t2, int(t2 / t1)

For N=100, you get a significant speedup (x84):

 10 1000 0.00102496147156 0.0867249965668 84

More data:

 10 1000 0.00102496147156 0.0867249965668 84
 10 2000 0.00211381912231 0.172616004944 81
 10 4000 0.00491786003113 0.561800956726 114
 10 8000 0.0706541538239 19.9019489288 281
 10 16000 0.0161759853363 9.64171791077 596
 10 32000 0.0351569652557 43.98346591 1251