python vs. grep

Anton Slesarev slesarev.anton at gmail.com
Wed May 7 06:13:04 EDT 2008


I'm trying to save my time, not CPU cycles :)

I've got a file which I really need to parse:
-rw-rw-r--  1 xxx  xxx  3381564736 May  7 09:29 bigfile

Here are my results:

$ time grep "python" bigfile | wc -l
    2470

real    0m4.744s
user    0m2.441s
sys     0m2.307s

And the Python script:

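import sys

# Expect exactly two arguments: the pattern and the file to search.
if len(sys.argv) != 3:
   print 'grep.py <pattern> <file>'
   sys.exit(1)

f = open(sys.argv[2],'r')

# Plain substring test per line (no regex). Lines keep their own newlines,
# so the trailing comma stops print from adding an extra one.
print ''.join((line for line in f if sys.argv[1] in line)),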

$ time python grep.py "python" bigfile | wc -l
    2470

real    0m37.225s
user    0m34.215s
sys     0m3.009s

Second script:

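import sys

if len(sys.argv) != 3:
   print 'grepwc.py <pattern> <file>'
   sys.exit(1)

# Same as the first script, but with a ~100 MB read buffer (third argument to open).
f = open(sys.argv[2],'r',100000000)

# Count matching lines instead of printing them.
print sum((1 for line in f if sys.argv[1] in line)),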


$ time python grepwc.py "python" bigfile
2470

real    0m39.357s
user    0m34.410s
sys     0m4.491s

40 seconds versus 5. This is really sad...

That was on FreeBSD.
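
To put a number on that gap, the wall-clock ("real") times from the FreeBSD runs above can simply be divided. This is only a small arithmetic sketch; the helper name `slowdown` is made up for illustration, and the constants are copied from the `time` output shown earlier.

```python
def slowdown(python_seconds, grep_seconds):
    """Return how many times longer the Python run took than grep."""
    return python_seconds / grep_seconds


# 'real' times copied from the FreeBSD runs above.
grep_real = 4.744
grep_py_real = 37.225
grepwc_py_real = 39.357

print(round(slowdown(grep_py_real, grep_real), 2))
print(round(slowdown(grepwc_py_real, grep_real), 2))
```

Both ratios come out at roughly eight, so on this file either script takes about eight times as long as grep.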



On Windows (Cygwin):

The size of bigfile is ~50 MB.

$ time grep "python" bigfile | wc -l
51

real    0m0.196s
user    0m0.169s
sys     0m0.046s

$ time python grepwc.py "python" bigfile
51

real    0m25.485s
user    0m2.733s
sys     0m0.375s
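
For anyone repeating these measurements on Python 3, the counting logic of grepwc.py might be sketched as follows. This is an assumption-laden port, not the script that was timed above: the function name `count_matching_lines` is hypothetical, the file is opened in binary mode so that no text decoding is needed, and no timings have been taken for it.

```python
def count_matching_lines(path, pattern):
    """Count the lines of the file at `path` that contain `pattern`.

    Hypothetical helper: same substring test as grepwc.py, no regex.
    """
    # Compare bytes to bytes so the file does not have to be decoded.
    needle = pattern.encode()
    with open(path, 'rb') as f:
        # Iterating over the file yields one line at a time.
        return sum(1 for line in f if needle in line)
```

Calling `count_matching_lines("bigfile", "python")` would then play the role of `python grepwc.py "python" bigfile`.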



