Patch: dict.merge

Wed Dec 28 03:38:24 EST 2005

Here's method 3:

# Python 2.3 (no generator expression)
a.update([(k,v) for k,v in b.iteritems() if k not in a])

# Python 2.4 (with generator expression)
a.update((k,v) for k,v in b.iteritems() if k not in a)
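As a quick sanity check, here is a minimal sketch showing that the three pure-Python methods give the same result: keys from b are added, and values already in a win on conflicts. The function names and sample dicts are made up for illustration. It uses items() rather than iteritems() so that it also runs on later Python versions, and method 2 returns a new dict instead of updating a in place.

```python
def method1(a, b):
    # Explicit loop: copy only the keys that a does not have yet.
    for k in b:
        if k not in a:
            a[k] = b[k]
    return a

def method2(a, b):
    # Copy b, then let a's entries overwrite the copy.
    temp = dict(b)
    temp.update(a)
    return temp

def method3(a, b):
    # Feed update() only the pairs whose key is missing from a.
    a.update((k, v) for k, v in b.items() if k not in a)
    return a

# Hypothetical sample data: key 2 exists in both dicts.
a = {1: 'a1', 2: 'a2'}
b = {2: 'b2', 3: 'b3'}
expected = {1: 'a1', 2: 'a2', 3: 'b3'}

# Each method gets its own copy of a, so the runs do not affect each other.
results = [method1(dict(a), b), method2(dict(a), b), method3(dict(a), b)]
assert all(r == expected for r in results)
```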

Method 3 is a bit cleaner, but it is still less efficient than using
what's already in the PyDict_Merge C API. It is even slower than
methods 1 and 2! Here is the benchmark I used:

import timeit

init = '''a = dict((i,i) for i in xrange(1000) if i%2==0)
b = dict((i,i+1) for i in xrange(1000))'''

t = timeit.Timer('''for k in b:\n\tif k not in a:\n\t\ta[k] = b[k]''',init)
print 'Method 1 : %.3f'%t.timeit(10000)

t = timeit.Timer('''temp = dict(b); temp.update(a); a = temp''',init)
print 'Method 2 : %.3f'%t.timeit(10000)

t = timeit.Timer('''a.update((k,v) for k,v in b.iteritems() if k not in a)''',init)
print 'Method 3 : %.3f'%t.timeit(10000)

t = timeit.Timer('''a.merge(b)''',init)
print 'Using dict.merge() : %.3f'%t.timeit(10000)

Here are the results (in seconds, for 10000 runs each):

Method 1 : 5.315
Method 2 : 3.855
Method 3 : 7.815
Using dict.merge() : 1.425

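So using a generator expression here is a bad idea, and the new
dict.merge() method gives an appreciable performance boost: about
3.73x faster than method 1 here, and about 2.7x faster than method 2,
the fastest of the pure-Python variants.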
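
For reference, the speedup ratios can be recomputed from the timings reported above with a few lines. This is only arithmetic on the figures in this message. The variable names are made up for illustration.

```python
# Timings reported above, in seconds for 10000 runs.
timings = {'Method 1': 5.315, 'Method 2': 3.855, 'Method 3': 7.815}
merge_time = 1.425  # dict.merge()

# Ratio of each pure-Python method to dict.merge().
speedups = dict((name, t / merge_time) for name, t in timings.items())

for name in sorted(speedups):
    print('%s / dict.merge() = %.2f' % (name, speedups[name]))
# Method 1 / dict.merge() = 3.73
# Method 2 / dict.merge() = 2.71
# Method 3 / dict.merge() = 5.48
```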

Regards,
Nicolas