Can I beat perl at grep-like processing speed?

Tim Smith tim at bytesmith.us
Fri Dec 29 10:51:30 EST 2006



You may not be able to beat Perl's regex speed, but you can take some steps to speed up your Python program by using map and filter.

Here's a modified Python program that does your search faster:

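#!/usr/bin/env python
# Python 2 (print statement, file() builtin), same as the original grep.py
import re
r = re.compile(r'destroy', re.IGNORECASE)

def stripit(x):
  # drop the trailing line ending; the join below puts "\n" back between lines
  return x.rstrip("\r\n")

# filter keeps the matching lines, map strips them, join builds the output in one go
print "\n".join( map(stripit, filter(r.search, file('bigfile'))) )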

# time comparison on my machine, using the modified program
real    0m0.218s
user    0m0.210s
sys     0m0.010s

real    0m0.464s
user    0m0.450s
sys     0m0.010s

# time comparison on my machine, using the original program

real    0m0.224s
user    0m0.220s
sys     0m0.010s

real    0m0.508s
user    0m0.510s
sys     0m0.000s

Also, if you replace the regex with a test like lambda x: x.lower().find("destroy") != -1, you will get really close to Perl's speed. (It's possible Perl even takes this shortcut itself when given such a simple regex.)

# here are the times when doing the search this way
real    0m0.221s
user    0m0.210s
sys     0m0.010s

real    0m0.277s
user    0m0.280s
sys     0m0.000s
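
For reference, the substring test suggested above could be wrapped up roughly like this. It's only a sketch: the name grep_substring and its parameters are made up for illustration, it uses the "in" operator as the idiomatic equivalent of find(...) != -1, and it is written so that it should run on current Python as well. I have not timed this version.

```python
# Sketch only: grep_substring is an illustrative name, not from the original post.

def grep_substring(lines, needle):
    """Return the lines that contain needle (case-insensitive), line endings stripped."""
    needle = needle.lower()
    # 'needle in s' is equivalent to s.find(needle) != -1
    return [line.rstrip("\r\n") for line in lines if needle in line.lower()]


# Small in-memory demo; a real run would pass an open file object instead of a list.
sample = ["Destroy it\r\n", "nothing here\n", "they DESTROYED\n"]
print("\n".join(grep_substring(sample, "destroy")))
```

Any iterable of lines works as the first argument, so an open file such as 'bigfile' can be passed directly in place of the sample list.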

 -- Tim

-- On 12/29/06 "js " <ebgssth at gmail.com> wrote:

> Just my curiosity.
> Can python beats perl at speed of grep-like processing?
> 
> $ wget http://www.gutenberg.org/files/7999/7999-h.zip
> $ unzip 7999-h.zip
> $ cd 7999-h
> $ cat *.htm > bigfile
> $ du -h bigfile
> du -h bigfile
> 8.2M	bigfile
> 
> ---------- grep.pl ----------
> #!/usr/local/bin/perl
> open(F, 'bigfile') or die;
> 
> while(<F>) {
>   s/[\n\r]+$//;
>   print "$_\n" if m/destroy/oi;
> }
> ---------- END ----------
> ---------- grep.py ----------
> #!/usr/bin/env python
> import re
> r = re.compile(r'destroy', re.IGNORECASE)
> 
> for s in file('bigfile'):
>   if r.search(s): print s.rstrip("\r\n")
> ---------- END ----------
> 
> $ time perl grep.pl  > pl.out; time python grep.py > py.out
> real	0m0.168s
> user	0m0.149s
> sys	0m0.015s
> 
> real	0m0.450s
> user	0m0.374s
> sys	0m0.068s
> # I used python2.5 and perl 5.8.6
> -- 
> http://mail.python.org/mailman/listinfo/python-list



