yet another noob question

Jason Nordwick jason at adapt.com
Tue Aug 15 06:20:25 EDT 2006


I use reduce to also do indexing, hashing with upsert semantics of lists of key-value pairs, transitioning through a state table, etc...

Somebody else pointed out to me how odd it is of Python to be ditching reduce when Guido van Rossum was hired by Google, and Google is literally built on map and reduce (http://en.wikipedia.org/wiki/Mapreduce).

Here is a code fragment I pulled from what I am currently working on (with a little modification to make things simple):

First build a tree:

>>> def byn(a,n): return [[a[i] for i in x] for x in zip(*[range(y,len(a),n) for y in range(n)])]
...
>>> def whilep(pred,f,x):
...     while pred(x): x = f(x)
...     return x
...
>>> t = whilep(lambda x:1<len(x), lambda x:byn(x,2), map(chr,range(97,97+16)))
>>>
>>> t
[[[[['a', 'b'], ['c', 'd']], [['e', 'f'], ['g', 'h']]], [[['i', 'j'], ['k', 'l']], [['m', 'n'], ['o', 'p']]]]]

Now if I had the list [0,1,0,0,1] and needed to find what letter it encodes:

>>> t[0][1][0][0][1]
'j'

But sometimes, I want to be able to index branches or the leaves are at different depths, so I essentially have a list of indexes to follow:

>>> def indexDepth(a,i): return reduce(lambda x,y: x[y], i, a)
...
>>> indexDepth(t,[0,1,0,0,1])
'j'


Also, I completely understand your timings, but you are not timing reduce against a hand coded loop, but instead the operator '+' against the function add, as the function symbol lookup and call seem to have a heavy price. Reduce was one of the nice ways to eliminate some of those lookups.

-j


starGaming at gmail.com wrote:
> Jason Nordwick wrote:
>> *mouth agape*
>>
>> Wow. That really sucks. I make extensive use of reduce. It seems to run more than twice as fast as a for loop.
>>
>>>>> t = Timer('bino.reduceadd(bino.bits)', 'import bino')
>>>>> s = Timer('bino.loopadd(bino.bits)', 'import bino')
>>>>> t.timeit(10)
>> 1.2373670396656564
>>>>> s.timeit(10)
>> 2.6450051612705039
>>>>> t.timeit(100)
>> 11.312374896809501
>>>>> s.timeit(100)
>> 25.817132345032689
>>
>> where
>>
>> bits = map(lambda x:randint(0,1), xrange(1000000))
>>
>> def reduceadd(v):
>> 	return reduce(add, v)
>>
>> def loopadd(v):
>> 	sum = 0
>> 	for x in v:
>> 		sum = add(sum, x)
>> 	return sum
>>
>>
>>
>> (Yes, I know there are better ways to sum up a list, but I just wanted to test the performance of the loop against reduce. In most of my reduce usages, the function passed to reduce is much more complex.)
> [...]
> 
> Assuming Guido van Rossum is right and reduce() is only useful for
> arithmetic expressions (could you show an example where you use it
> else?), you could take advantage of more "built-ins".
> To spoiler the result of this, the loop-variant is slightly (in this
> test: 2ms) faster.
> The results on my amd64 3k+ were:
> Reduce: 0.5686828074520.56633643382
> For-loop: 0.56633643382
> 
> #!/usr/bin/env python
> #
> #
> 
> from random import randint
> from timeit import Timer
> from operator import add
> 
> def bits():
>     return [randint(0,1) for x in xrange(1000)]
> # use def & list comprehension since lambda/map will be removed in
> Python 3.0, too
> 
> 
> print bits()[0:5]
> print bits()[0:5] # it's working.
> 
> def reducefunc(x):
>     return reduce(add,x)
> def loopfunc(x):
>     y = 0
>     for i in x: y += i
>     return y
> 
> x = bits()
> print reducefunc(x)
> print loopfunc(x) # both give proper output
> 
> print sum(Timer("reducefunc(bits())", "from __main__ import bits,
> reducefunc").repeat(20,100))/20
> print sum(Timer("loopfunc(bits())", "from __main__ import bits,
> loopfunc").repeat(20,100))/20
> 




More information about the Python-list mailing list