[Tutor] Loop optimization
Kent Johnson
kent37 at tds.net
Tue Aug 21 02:50:34 CEST 2007
wormwood_3 wrote:
> I ran a few tests, with the following results:
>
> 1. Timing using the time module:
> * Using for loop, src code:
> import time
> start = time.time()
> for word in self.dictcontents:
>     self.potdomains.append(word + suffix1)
>     self.potdomains.append(word + suffix2)
> end = time.time()
> runtime = end - start
> print "Using time(), for loop took %s s" % runtime
>
> ** I obtained the following results (using the full agid-4 dictionary, ~112K entries):
> python domainspotter.py --file resources/agid-4/infl.txt
> Using time(), for loop took 0.132480859756 s
> python domainspotter.py --file resources/agid-4/infl.txt
> Using time(), for loop took 0.143032073975 s
> python domainspotter.py --file resources/agid-4/infl.txt
> Using time(), for loop took 0.135424137115 s
>
> * Using generator, src code:
> def suffixGen(self, words):
>     suffix1 = ".com"
>     suffix2 = ".net"
>     for word in words:
>         yield word + suffix1
>         yield word + suffix2
> def domainify(self):
>     self.potdomains = []
>     words = self.dictcontents
>     import time
>     start = time.time()
>     self.potdomains = list(CheckDomains.suffixGen(self, words))
>     end = time.time()
>     runtime = end - start
>     print "Using time(), generator took %s s" % runtime
>
> ** I obtained the following results (using the full agid-4 dictionary, ~112K entries):
> python domainspotter.py --file resources/agid-4/infl.txt
> Using time(), generator took 0.0830721855164 s
> python domainspotter.py --file resources/agid-4/infl.txt
> Using time(), generator took 0.0818212032318 s
> python domainspotter.py --file resources/agid-4/infl.txt
> Using time(), generator took 0.0830278396606 s
>
>
> This revealed that the generator seemed to be much faster, around 60% faster.
You should try an optimized for loop:
# Look up the bound method and the suffixes once, outside the loop
append_ = self.potdomains.append
s1_ = suffix1
s2_ = suffix2
for word in self.dictcontents:
    append_(word + s1_)
    append_(word + s2_)
This will take out some of the difference at least.
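
To compare the three variants side by side, something like the following sketch might help. It is written for Python 3 (so print is a function, unlike the Python 2 code quoted above), and it uses the timeit module rather than time.time(). The word list is a made-up stand-in for self.dictcontents, and the function names are hypothetical, so the numbers it prints will not match the ones in this thread.

```python
import timeit

# ASSUMPTION: stand-in data; the original loops over self.dictcontents (~112K words).
words = ["word%d" % i for i in range(112000)]
suffix1 = ".com"
suffix2 = ".net"

def plain_loop():
    """The original for loop: attribute lookup on every append."""
    potdomains = []
    for word in words:
        potdomains.append(word + suffix1)
        potdomains.append(word + suffix2)
    return potdomains

def optimized_loop():
    """Same loop with the method and suffixes bound to local names first."""
    potdomains = []
    append_ = potdomains.append  # look up the bound method once
    s1_ = suffix1                # local names are faster to read than globals
    s2_ = suffix2
    for word in words:
        append_(word + s1_)
        append_(word + s2_)
    return potdomains

def suffix_gen(words):
    """Generator version, as in the quoted suffixGen."""
    for word in words:
        yield word + suffix1
        yield word + suffix2

def generator_version():
    return list(suffix_gen(words))

if __name__ == "__main__":
    for func in (plain_loop, optimized_loop, generator_version):
        # Take the best of 3 repeats of 5 calls each to reduce noise.
        best = min(timeit.repeat(func, number=5, repeat=3))
        print("%s: %.4f s for 5 calls" % (func.__name__, best))
```

All three functions build the same list, so any difference in the printed times comes from how the loop is written, not from what it produces.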
Note that if you are using this as part of your domainspotter project
and you will be running a whois request on each of these names, any time
you save in this loop will be completely overshadowed by the time for
the whois request. So in this case the 'best' loop is probably the most
readable one, not the one that shaves .05 seconds off the running time.
It's still fun to play with optimization, but don't take it too
seriously until you know it will make a difference in the final program.
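
A rough back-of-the-envelope version of that argument, as a sketch: the loop timings are the approximate averages from the runs quoted above, but the per-lookup whois time is purely an assumed figure (nothing in this thread measures it), and the helper name loop_share is made up for illustration. It also assumes the lookups run one after another.

```python
def loop_share(loop_saving, n_names, seconds_per_lookup):
    """Fraction of total run time that the loop saving represents."""
    total_lookup_time = n_names * seconds_per_lookup
    return loop_saving / (loop_saving + total_lookup_time)

# Approximate averages from the quoted runs: for loop ~0.137 s, generator ~0.083 s.
loop_saving = 0.137 - 0.083

# Two suffixes for each of the ~112K dictionary entries.
n_names = 2 * 112000

# ASSUMPTION: half a second per whois request; not measured in this thread.
assumed_whois_seconds = 0.5

share = loop_share(loop_saving, n_names, assumed_whois_seconds)
print("Loop saving is %.6f%% of the total run time" % (share * 100))
```

Whatever the real whois latency turns out to be, plugging it in here gives a quick sense of whether the loop is worth tuning at all.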
Kent