[Python-Dev] Re: accumulator display syntax

Alex Martelli aleaxit at yahoo.com
Sat Oct 25 05:32:04 EDT 2003


On Thursday 23 October 2003 07:43, Guido van Rossum wrote:
   ...
>   <result> = <initializer>
>   for <variable> in <iterable>:
>       <result> = <expression>
   ...
> Concluding, I think the reduce() pattern is doomed -- the template is
> too complex to capture in special syntax.

I concur, particularly because the assignment in the pattern sketched
above is too limiting.  You point out that forcing augmented assignment
would lose power (e.g., Horner's polynomials need bare assignment),
but the inability to use it would imply inefficiencies -- e.g.,

flatlist = []
for sublist in listoflists:
    flatlist += sublist

or

    flatlist.extend(sublist)

is better than forcing a "flatlist = flatlist + sublist" as the loop body.

Indeed, that's a spot where even 'sum' can be a performance trap;
consider the following z.py:

lol = [ [x] for x in range(1000) ]

def flat1(lol=lol):
    return sum(lol, [])

def flat2(lol=lol):
    result = []
    for sub in lol: result.extend(sub)
    return result

and the measurements:

[alex at lancelot pop]$ timeit.py -c -s'import z' 'z.flat1()'
100 loops, best of 3: 8.5e+03 usec per loop

[alex at lancelot pop]$ timeit.py -c -s'import z' 'z.flat2()'
1000 loops, best of 3: 940 usec per loop

sum looks cooler, but it can be an order of magnitude slower
than the humble loop of result.extend calls.  We could fix this
specific performance trap by specialcasing in sum those cases
where the result has a += method -- hmmm... would a patch for
this performance bug be accepted for 2.3.* ...?  (I understand and
approve that we're keen on avoiding adding functionality in any
2.3.*, but fixed-functionality performance enhancements should
be just as ok as fixes to functionality bugs, right?)

Anyway, there's a zillion other cases that sum doesn't cover
(well, unless we extend dict to have a += synonym for update,
which might be polymorphically useful:-), such as

totaldict = {}
for subdict in listofdicts:
    totaldict.update(subdict)

Indeed, given the number of "modify in place and return None"
methods of both built-in and user-coded types, I think the variant
of "accumulation pattern" which simply calls such a method
on each item of an iterator is about as prevalent as the variant
with assignment "result = ..." as the loop body.


Alex





More information about the Python-Dev mailing list