Generator slower than iterator?
Lie Ryan
lie.1296 at gmail.com
Tue Dec 16 11:10:54 EST 2008
On Tue, 16 Dec 2008 12:07:14 -0300, Federico Moreira wrote:
> Hi all,
>
> I'm parsing a 4.1GB Apache log to get stats on how many times each IP
> requests something from the server.
>
> The first design of the algorithm was
>
> for line in fileinput.input(sys.argv[1:]):
>     ip = line.split()[0]
>     if match_counter.has_key(ip):
>         match_counter[ip] += 1
>     else:
>         match_counter[ip] = 1
Nitpick: dict.has_key is usually replaced with

    if ip in match_counter: ...
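
For illustration, here is a minimal sketch of the counting loop with the membership test written as `in`. The function name count_ips and the sample lines are made up for this example; the loop body is otherwise the one quoted above.

```python
def count_ips(lines):
    """Count how often each first field (the client IP) occurs."""
    match_counter = {}
    for line in lines:
        ip = line.split()[0]  # first whitespace-separated field
        if ip in match_counter:  # replaces match_counter.has_key(ip)
            match_counter[ip] += 1
        else:
            match_counter[ip] = 1
    return match_counter


# Hypothetical sample lines, not taken from the real log.
sample = [
    '10.0.0.1 - - "GET / HTTP/1.1" 200',
    '10.0.0.2 - - "GET /a HTTP/1.1" 404',
    '10.0.0.1 - - "GET /b HTTP/1.1" 200',
]
print(count_ips(sample))
```

If you prefer, match_counter[ip] = match_counter.get(ip, 0) + 1 collapses the if/else into a single line.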
Also, after investigating your code further, I see that you've used
generators unnecessarily. The first version is simpler, and the
generator version doesn't avoid creating any huge intermediate list,
since iterating over the input already yields one line at a time. You
won't get any performance improvement this way; instead you take a
performance hit from the extra function-call overhead and name lookups.
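
The generator-based version is not quoted in this message, so the sketch below assumes a wrapper of roughly this shape (first_fields is a hypothetical name). It is only meant to show why nothing is gained: both functions consume the input one line at a time, and the generator just adds a layer between the loop and the data.

```python
def count_plain(lines):
    """Count first fields directly in a single loop."""
    match_counter = {}
    for line in lines:
        ip = line.split()[0]
        match_counter[ip] = match_counter.get(ip, 0) + 1
    return match_counter


def first_fields(lines):
    # Hypothetical generator wrapper; yields one IP per input line.
    for line in lines:
        yield line.split()[0]


def count_with_generator(lines):
    """Same counting, but routed through the extra generator."""
    match_counter = {}
    for ip in first_fields(lines):
        match_counter[ip] = match_counter.get(ip, 0) + 1
    return match_counter


sample = ["10.0.0.1 a", "10.0.0.2 b", "10.0.0.1 c"]  # made-up data
assert count_plain(sample) == count_with_generator(sample)
```

To compare the two on real data, the standard library's timeit module can be used; no timings are claimed here.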