Optimize function similar to dict.update() but adds common values

Peter Otten __peter__ at web.de
Wed Dec 14 11:06:31 EST 2005


Gregory Piñero wrote:

> def add_freqs(freq1,freq2):
>     """Add two word freq dicts"""
>     newfreq={}
>     for key,value in freq1.items():
>         newfreq[key]=value+freq2.get(key,0)
>     for key,value in freq2.items():
>         newfreq[key]=value+freq1.get(key,0)
>     return newfreq

> Any ideas on doing this task a lot faster would be appreciated.

With items() you copy the whole dictionary into a list of tuples;
iteritems() just walks over the existing dictionary and creates one tuple
at a time.
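
To make the difference concrete, here is a small sketch. The helper
iter_pairs() is a made-up name, not part of any library; it picks
iteritems() where that method exists and otherwise falls back to items(),
which on newer Python versions is already lazy. The sample dictionary is
invented for illustration.

```python
def iter_pairs(d):
    """Yield (key, value) pairs without building an intermediate list.

    Hypothetical helper: uses iteritems() where the dict has it,
    otherwise items() (assumed to be lazy there already).
    """
    lazy = getattr(d, "iteritems", d.items)
    return iter(lazy())

freq = {"spam": 3, "eggs": 1, "ham": 2}

# The copying approach: every pair is materialized in a list up front.
copied = list(freq.items())

# The walking approach: pairs are handed out one at a time on demand.
walker = iter_pairs(freq)
first = next(walker)
```

For a three-word dictionary this makes no difference; the copy only starts
to hurt when the dictionaries are large and the function is called often.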

With "80% overlap", you are looking up and setting four out of five values
twice in your for-loops. 
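
A rough way to see the duplicated work, using two made-up five-word
dictionaries that share four keys (the sample data and variable names are
assumptions for illustration only):

```python
# Two toy frequency dicts; four of the five keys in each are shared.
freq1 = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
freq2 = {"b": 10, "c": 20, "d": 30, "e": 40, "f": 50}

# The two-loop version stores one value per loop iteration.
assignments = len(freq1) + len(freq2)

# Only this many stores are actually needed: one per distinct key.
distinct_keys = len(set(freq1) | set(freq2))

# Every shared key is computed and stored a second time.
redundant = assignments - distinct_keys
shared = len(set(freq1) & set(freq2))
```

The redundant stores equal the number of shared keys, so the more the two
dictionaries overlap, the more work the symmetric version repeats.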

Dump the symmetry and try one of these:

def add_freqs2(freq1, freq2):
    total = dict(freq1)
    for key, value in freq2.iteritems():
        if key in freq1:
            total[key] += value
        else:
            total[key] = value
    return total

def add_freqs3(freq1, freq2):
    total = dict(freq1)
    for key, value in freq2.iteritems():
        try:
            total[key] += value
        except KeyError:
            total[key] = value
    return total

My guess is that add_freqs3() will perform best.
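
That is only a guess, so it is worth measuring. Below is a minimal timing
sketch with the timeit module. The helpers _pairs(), make_dicts() and
compare(), the dictionary size and the 80% overlap figure are all
assumptions made up for this test, and _pairs() exists only so the sketch
also runs on interpreters without iteritems(). Timings depend heavily on the
Python version and on how many keys really overlap, so treat the numbers as
indicative at best.

```python
import timeit

def _pairs(d):
    # Hypothetical shim: iteritems() where available, else items().
    return getattr(d, "iteritems", d.items)()

def add_freqs2(freq1, freq2):
    # Membership test before every update.
    total = dict(freq1)
    for key, value in _pairs(freq2):
        if key in freq1:
            total[key] += value
        else:
            total[key] = value
    return total

def add_freqs3(freq1, freq2):
    # Optimistic update; the exception only fires for keys new to total.
    total = dict(freq1)
    for key, value in _pairs(freq2):
        try:
            total[key] += value
        except KeyError:
            total[key] = value
    return total

def make_dicts(size=10000, overlap=0.8):
    # Build two dicts of `size` keys each that share roughly `overlap` of them.
    shift = size - int(round(size * overlap))
    freq1 = dict(("w%d" % i, i % 7 + 1) for i in range(size))
    freq2 = dict(("w%d" % i, i % 5 + 1) for i in range(shift, size + shift))
    return freq1, freq2

def compare(repeat=3, number=20):
    # Return the best wall-clock time in seconds for each candidate.
    freq1, freq2 = make_dicts()
    results = {}
    for func in (add_freqs2, add_freqs3):
        timer = timeit.Timer(lambda: func(freq1, freq2))
        results[func.__name__] = min(timer.repeat(repeat=repeat, number=number))
    return results

if __name__ == "__main__":
    for name, seconds in sorted(compare().items()):
        print("%s: %.4f s" % (name, seconds))
```

With a low overlap the picture may well flip, since raising and catching
KeyError costs more than a failed membership test.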

Peter


