[Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

Tim Peters tim.peters at gmail.com
Mon Apr 9 22:30:42 EDT 2018


[Tim]
>> while we have N numbers, there are N+1 slice indices.  So
>> accumulate(xs) doesn't quite work.  It needs to also have a 0 inserted
>> as the first prefix sum (the empty prefix sum(xs[:0]).
>>
>> Which is exactly what a this_is_the_initial_value=0 argument would do
>> for us.

[Greg Ewing <greg.ewing at canterbury.ac.nz>]
> In this case, yes. But that still doesn't mean it makes
> sense to require the initial value to be passed *in* as
> part of the input sequence.
>
> Maybe the best idea is for the initial value to be a
> separate argument, but be returned as the first item in
> the list.

I'm not sure you've read all the messages in this thread, but that's
exactly what's being proposed.  That. e.g., a new optional argument:

    accumulate(xs, func, initial=S)

act like the current

     accumulate(chain([S], xs), func)

Note that in neither case is the original `xs` modified in any way,
and in both cases the first value generated is S.

Note too that the proposal is exactly the way Haskell's `scanl` works
(although `scanl` always requires specifying an initial value - while
the related `scanl1` doesn't allow specifying one).

And that's all been so since the thread's first message, in which
Raymond gave a proposed implementation:

        _sentinel = object()

        def accumulate(iterable, func=operator.add, start=_sentinel):
            it = iter(iterable)
            if start is _sentinel:
                try:
                    total = next(it)
                except StopIteration:
                    return
            else:
                total = start
            yield total
            for element in it:
                total = func(total, element)
                yield total

> I can think of another example where this would make
> sense. Suppose you have an initial bank balance and a
> list of transactions, and you want to produce a statement
> with a list of running balances.
>
> The initial balance and the list of transactions are
> coming from different places, so the most natural way
> to call it would be
>
>    result = accumulate(transactions, initial = initial_balance)
>
> If the initial value is returned as item 0, then the
> result has the following properties:
>
>    result[0] is the balance brought forward
>    result[-1] is the current balance
>
> and this remains true in the corner case where there are
> no transactions.

Indeed, something quite similar often applies when parallelizing
search loops of the form:

     for candidate in accumulate(chain([starting_value], cycle(deltas))):

For a sequence that eventually becomes periodic in the sequence of
deltas it cycles through, multiple processes can run independent
searches starting at carefully chosen different starting values "far"
apart.  In effect, they're each a "balance brought forward" pretending
that previous chunks have already been done.

Funny:  it's been weeks now since I wrote an accumulate() that
_didn't_ want to specify a starting value - LOL ;-)


More information about the Python-ideas mailing list