Optimize function similiar to dict.update() but adds common values

Peter Otten __peter__ at web.de
Thu Dec 15 12:06:09 EST 2005


Christopher Subich wrote:

> Peter Otten wrote:
>> 
>> def add_freqs3(freq1, freq2):
>>     total = dict(freq1)
>>     for key, value in freq2.iteritems():
>>         try:
>>             total[key] += value
>>         except KeyError:
>>             total[key] = value
>>     return total
>> 
> 
> Untested, but replacing the try/except pair with the following should be
> semantically equivalent and a bit faster:
> 
> total[key] = total.get(key,0) + value

That depends on the data. For an 80/20 matches to misses ratio it ranges
between the two approaches I gave:

$ python -m timeit -r1 -n1 -s"from bm import fin as f" "f()"
1 loops, best of 1: 70.5 msec per loop
$ python -m timeit -r1 -n1 -s"from bm import fget as f" "f()"
1 loops, best of 1: 91.1 msec per loop
$ python -m timeit -r1 -n1 -s"from bm import fex as f" "f()"
1 loops, best of 1: 142 msec per loop

The break even for fget/fex is at about 92% hit rate:

$ python -m timeit -r1 -n1 -s"from bm import fget as f; d =
dict.fromkeys(range(92000), 0)" "f(d)"
1 loops, best of 1: 88 msec per loop

$ python -m timeit -r1 -n1 -s"from bm import fex as f; d =
dict.fromkeys(range(92000), 0)" "f(d)"
1 loops, best of 1: 87.4 msec per loop

Here's the bm.py module so you can verify my reasoning:

freqs = dict.fromkeys(range(8*10**4), 0)
keys = range(10**5)

def fin(d=freqs, keys=keys):
    for key in keys:
        if key in d:
            d[key] += 42
        else:
            d[key] = 42
    return d

def fget(d=freqs, keys=keys):
    for key in keys:
        d[key] = d.get(key, 0) + 42
    return d

def fex(d=freqs, keys=keys):
    for key in keys:
        try:
            d[key] += 42
        except KeyError:
            d[key] = 42
    return d

if __name__ == "__main__":
    assert fin(dict(freqs)) == fex(dict(freqs)) == fget(dict(freqs))

Peter



More information about the Python-list mailing list