unintuitive dict timings

Alex Martelli aleaxit at yahoo.com
Sun Oct 12 18:17:55 EDT 2003


Bob van der Poel wrote:
   ...
> Before running, my thoughts were that get2() would have been the fastest
> (due to the fact there were less statements) and ifelse() would have
> been the slowest. Surprisingly (to me) ifelse() is the fastest() and
> get2() is the slowest.

Not necessarily -- pulling out the generation of keys and values to ensure
every function is tested on the same ones:

kc=1000
keys_and_values = [
    (random.randrange(kc), random.randrange(999))
    for t in xrange(100000)
    ]
for p in (get2, get1, ifelse):
  tm=time.clock()
  d={}
  print "Function: %s Keycount %s" % (p, kc)
  for key, val in keys_and_values:
    p(key,val)

  print "Time", time.clock()-tm

the numbers are a bit too close to call given the precision of measurement:

[alex at lancelot pop]$ python ti.py
Function: <function get2 at 0x402dced4> Keycount 1000
Time 0.43
Function: <function get1 at 0x402dce9c> Keycount 1000
Time 0.44
Function: <function ifelse at 0x402dce64> Keycount 1000
Time 0.41


> Which begs the question...is there a faster way? BTW, interesting (again
> to me) that the most readable is also the fastest.

def sede(key, val):
    d.setdefault(key, []).append(val)

appears to be very marginally fastest, but again it's a close call:

[alex at lancelot pop]$ python ti.py
Function: <function sede at 0x402dcf44> Keycount 1000
Time 0.37
Function: <function get2 at 0x402dcf0c> Keycount 1000
Time 0.45
Function: <function get1 at 0x402dced4> Keycount 1000
Time 0.44
Function: <function ifelse at 0x402dce9c> Keycount 1000
Time 0.4

I think you're probably best-advised to stay with what you deem most
readable: a perhaps-10%-win can hardly justify losing readability.  As
for me, I find setdefault quite readable despite the ugly name.  I sure
wish we DID have a way to tell a method to be lazy wrt one of its
arguments, tho;-).


Alex






More information about the Python-list mailing list