Is there a list comprehension for this?

Wed Nov 22 06:41:25 EST 2006

On Wed, 22 Nov 2006 02:28:04 -0800, John Machin wrote:

> 
> Steven D'Aprano wrote:
>> On Tue, 21 Nov 2006 21:00:02 -0800, John Machin wrote:
>>
>> > Steven D'Aprano wrote:
>> >
>> > [snip]
>> >
>> >> def running_sum(dw):
>> >>     """Return a list of the running sums of sequence dw"""
>> >>     rs = [0]*len(dw)
>> >>     for i in range(len(dw)):
>> >>         rs[i] = dw[i] + rs[i-1]
>> >
>> > Please explain to the newbies why there is no exception raised when
>> > rs[i-1] is executed for i == 0, and state whether you consider this is
>> > a Good Idea or a Filthy Trick or a Fortunate Accident.
>>
>> Because rs[0-1] is just rs[-1], which fetches the LAST item of the list. I
>> cunningly initialised the list to all zeroes, so adding zero to the first
>> term doesn't change the result value.
>>
>> It is a variation of the sentinel technique, where you add an extra value
>> to the beginning or end of a list so that you don't need to treat the
>> first or last item differently. In this specific case, I think it is a
>> combination of Good Idea and Fortunate Accident, but since the
>> meaning of list[-1] is both deliberate and well-documented, it is
>> certainly not a Filthy Trick.
>>
> 
> Fair enough. But it does make the concerned reader go back a couple of
> lines to see why it doesn't run amok.

Nobody said that every piece of code should be instantly comprehensible
just at a glance. Sometimes you do actually have to think about code to
understand it, even in Python :)

> Here's my attempt at a
> no-reader-backtracking solution:
> 
> def running_sum_2(dw):
>     rs = dw[:1]
>     for i in xrange(1, len(dw)):
>         rs.append(dw[i] + rs[-1])
>     return rs
> 
> Comments invited.

Even with xrange() instead of range, it is a little slower than my version
for largish input lists because you are repeatedly appending to a list.

>>> timeit.Timer("running_sum(L)", 
... "from __main__ import running_sum; L = range(500)").timeit(1000)
0.56354999542236328
>>> timeit.Timer("running_sum_2(L)", 
... "from __main__ import running_sum_2; L = range(500)").timeit(1000)
0.68534302711486816

Although the speed difference disappears (or even reverses) for small
enough lists. Either way, it isn't really a major speed difference --
Python's resizing of lists is very smart.

But why build a list of all the running sums when you probably only need
them one at a time? Here's a generator version that can take any iterable,
not just a sequence:

def running_sum_3(iterable):
    sum_ = 0
    for x in iterable:
        sum_ += x
        yield sum_

And it is considerably faster than either of the list-only versions:

>>> timeit.Timer("list(running_sum_3(L))", 
... "from __main__ import running_sum_3; L = range(500)").timeit(1000)
0.33915305137634277

-- 
Steve.