Optimisation Hints (dict processing and strings)
Kent Johnson
kent37 at tds.net
Wed Mar 30 07:18:51 EST 2005
OPQ wrote:
>>>for (2):
>>>for k in hash.keys()[:]: # Note : Their may be a lot of keys here
>>> if len(hash[k])<2:
>>> del hash[k]
>>
>
>>- use the dict.iter* methods to prevent building a list in memory. You
>>shouldn't use these values directly to delete the entry as this could
>>break the iterator:
>>
>>for key in [k for (k, v) in hash.iteritems() if len(v) < 2]:
>>    del hash[key]
>>
>
>
> I'm going to try, but I think that would be overkill: a whole list has to
> be computed!
Yes, but it is smaller than the list returned by hash.keys(), so it should be a win over what you
were doing originally. Plus it avoids a lookup (hash[k]), which may also improve the speed.
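For anyone reading this in a modern Python 3 context (where dict.iteritems() is gone and mutating a dict while iterating over it raises a RuntimeError), the same collect-then-delete pattern might look like this sketch; the sample data is made up for illustration:

```python
# Hypothetical sample data: map keys to lists of values.
hash = {'a': [1], 'b': [1, 2], 'c': []}

# Build the list of keys to delete FIRST, so the dict is not
# mutated while we are still iterating over it.
for key in [k for k, v in hash.items() if len(v) < 2]:
    del hash[key]

print(hash)  # only entries with 2 or more values survive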
BTW I have long assumed that iterating key, value pairs of a dict using iteritems() is faster than
iterating with keys() followed by a lookup, since the former method should be able to avoid actually
hashing the key and looking it up.
I finally wrote a test, and my assumption seems to be correct; using iteritems() is about 1/3 faster
for simple keys.
Here is a simple test:
##
d = dict((i, i) for i in range(10000))

def withItems(d):
    for k, v in d.iteritems():
        pass

def withKeys(d):
    for k in d:
        d[k]

from timeit import Timer

for fn in [withItems, withKeys]:
    name = fn.__name__
    timer = Timer('%s(d)' % name, 'from __main__ import d, %s' % name)
    print name, timer.timeit(1000)
##
I get
withItems 0.980311184801
withKeys 1.37672944466
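Kent's harness is Python 2; a Python 3 translation of the same benchmark might look like this sketch (dict.iteritems() becomes dict.items(), print becomes a function, and Timer's globals parameter, available since Python 3.5, replaces the `from __main__ import` setup string):

```python
from timeit import Timer

d = dict((i, i) for i in range(10000))

def withItems(d):
    # Iterate key/value pairs directly; no per-key hash lookup.
    for k, v in d.items():
        pass

def withKeys(d):
    # Iterate keys, then look each value up, hashing the key again.
    for k in d:
        d[k]

results = {}
for fn in (withItems, withKeys):
    name = fn.__name__
    results[name] = Timer('%s(d)' % name, globals=globals()).timeit(1000)
    print(name, results[name])
```

Absolute numbers will differ from the 2005 figures, and in CPython 3 the gap between the two approaches may be smaller than the roughly 1/3 Kent measured.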
Kent