python vs. grep

Wojciech Walczak wojtek.gminick.walczak at gmail.com
Tue May 6 17:33:51 EDT 2008


2008/5/6, Anton Slesarev <slesarev.anton at gmail.com>:
>  But I have some problem with writing performance grep analog.
[...]
>  Python code 3-4 times slower on windows. And as I remember on linux
>  the same situation...
>
>  Buffering in open even increase time.
>
>  Is it possible to increase file reading performance?

The best advice would be not to try to beat grep, but if you really
want to, this is the right place ;)

Here is my code:
$ cat grep.py
import sys

if len(sys.argv) != 3:
   print 'grep.py <pattern> <file>'
   sys.exit(1)

f = open(sys.argv[2],'r')

print ''.join((line for line in f if sys.argv[1] in line)),

$ ls -lh debug.0
-rw-r----- 1 gminick root 4,1M 2008-05-07 00:49 debug.0

---
$ time grep nusia debug.0 |wc -l
26009

real    0m0.042s
user    0m0.020s
sys     0m0.004s
---

---
$ time python grep.py nusia debug.0 |wc -l
26009

real    0m0.077s
user    0m0.044s
sys     0m0.016s
---

---
$ time grep nusia debug.0

real    0m3.163s
user    0m0.016s
sys     0m0.064s
---

---
$ time python grep.py nusia debug.0
[26009 lines here...]
real    0m2.628s
user    0m0.032s
sys     0m0.064s
---

So, printing the results takes 2.6 s for Python and 3.1 s for the original grep.
Surprised? The only reason for this is that we have reduced the number
of write calls in the Python example, which joins all matching lines into
one string and prints it once:

$ strace -ooriggrep.log grep nusia debug.0
$ grep write origgrep.log |wc -l
26009


$ strace -opygrep.log python grep.py nusia debug.0
$ grep write pygrep.log |wc -l
12
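
The difference in write counts can be modelled without strace. The sketch
below is only an illustration, written for Python 3: CountingSink,
grep_line_by_line and grep_joined are made-up names, and the sink counts
Python-level write() calls rather than real system calls, so it shows the
idea and does not reproduce the measurement above.

```python
class CountingSink:
    """Stand-in for stdout that records how often write() is called."""

    def __init__(self):
        self.calls = 0
        self.chunks = []

    def write(self, text):
        self.calls += 1
        self.chunks.append(text)
        return len(text)


def grep_line_by_line(pattern, lines, sink):
    # One write per matching line, like a line-buffered stdout.
    for line in lines:
        if pattern in line:
            sink.write(line)


def grep_joined(pattern, lines, sink):
    # Collect every match first, then hand it over in a single write,
    # which is what the ''.join(...) in grep.py does.
    sink.write(''.join(line for line in lines if pattern in line))


# Hypothetical input: 1000 lines, every second one contains the pattern.
lines = [("nusia %d\n" if i % 2 == 0 else "other %d\n") % i
         for i in range(1000)]

a, b = CountingSink(), CountingSink()
grep_line_by_line("nusia", lines, a)
grep_joined("nusia", lines, b)
print(a.calls, b.calls)
```

Both variants produce the same output; only the number of writes differs.
The price of the joined version is that all matches are held in memory
before anything is printed.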


Wish you luck saving your CPU cycles :)

-- 
Regards,
Wojtek Walczak
http://www.stud.umk.pl/~wojtekwa/


