sobering observation, python vs. perl

Ethan Furman ethan at stoneleaf.us
Thu Mar 17 13:26:12 EDT 2016


On 03/17/2016 09:36 AM, Charles T. Smith wrote:

> Yes, your point was to forgo REs despite that they are useful.
> I could have thought the search would have been better as:
>
>      'release[-.:][Rr]eq'
>
> or something else ... you're in a "defend python at all costs!" mode.

No, I'm in the "don't try to write <language X> in Python" mode, and the
"don't use a 10lb sledge when a 6oz hammer will do" mode:

--------------------------------------------------------
# using `in` and printing line as each is found
real	0m1.703s
user	0m0.184s
sys	0m0.260s

# using `in` and printing lines at the end
real	0m0.217s
user	0m0.112s
sys	0m0.068s

# using 're' and printing lines at the end
real	0m0.608s
user	0m0.516s
sys	0m0.060s
--------------------------------------------------------

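As you can see, how you print has a huge impact: printing each line as
it is found took 1.703s of wall-clock time, versus 0.217s for printing
everything at the end.  Hopefully you also noticed that using `re` when
`in` would do made the script almost 3 times slower (0.608s versus
0.217s).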
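
To look at the matching cost on its own, without file I/O or printing in
the way, something like the following `timeit` sketch could be used.  The
`lines` list is synthetic data made up for illustration (not the files
timed above), the function names are hypothetical, and the precompiled
variant is an extra that was not part of the original measurements.  The
numbers will vary by machine and Python version.

```python
import re
import timeit

# Synthetic input: an assumption for illustration, not the files timed above.
lines = (["some log text without the keyword\n"] * 9000
         + ["set timezone to UTC\n"] * 1000)

def with_in():
    # plain substring test
    return [line for line in lines if 'timezone' in line]

def with_re():
    # module-level re.search, as in the `re` script
    return [line for line in lines if re.search('timezone', line)]

pattern = re.compile('timezone')

def with_compiled():
    # same regex, compiled once up front
    return [line for line in lines if pattern.search(line)]

# All three approaches must select exactly the same lines.
assert with_in() == with_re() == with_compiled()

for func in (with_in, with_re, with_compiled):
    elapsed = timeit.timeit(func, number=20)
    print("%-14s %.4f s" % (func.__name__, elapsed))
```

For a fixed string like 'timezone' all three return the same lines, so
the only thing left to compare is the time each one takes.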

--------------------------------------------------------
# using `in` code
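import sys

found = []
for fn in sys.argv[1:]:              # every file named on the command line
    with open(fn) as fh:
        for line in fh:
            if 'timezone' in line:   # plain substring test
                found.append(line)
# lines keep their newlines, so join on '' and print once at the end
print(''.join(found))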
--------------------------------------------------------
# using `re` code
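import sys
import re

found = []
for fn in sys.argv[1:]:              # every file named on the command line
    with open(fn) as fh:
        for line in fh:
            if re.search('timezone', line):   # regex search for a fixed string
                found.append(line)
# lines keep their newlines, so join on '' and print once at the end
print(''.join(found))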
--------------------------------------------------------
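
The script behind the first timing (printing each line as it is found)
is not shown above.  A minimal sketch of what such a variant might look
like, next to the collect-then-print version, is below.  The function
names and the `out` parameter are hypothetical, added only so the two
strategies can be compared side by side.  In the first timing, real time
(1.703s) is far larger than user + sys (0.444s), which is consistent with
the script mostly waiting on the output device rather than computing.

```python
import sys

def grep_print_each(filenames, needle, out=sys.stdout):
    """Write every matching line as soon as it is found."""
    for fn in filenames:
        with open(fn) as fh:
            for line in fh:
                if needle in line:
                    out.write(line)      # one write call per match

def grep_print_at_end(filenames, needle, out=sys.stdout):
    """Collect matching lines and write them in a single call."""
    found = []
    for fn in filenames:
        with open(fn) as fh:
            for line in fh:
                if needle in line:
                    found.append(line)
    out.write(''.join(found))            # one write call in total

# Possible use from a script:
#     grep_print_at_end(sys.argv[1:], 'timezone')
```

Both functions produce the same output; they differ only in how many
times the output stream is written to.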

--
~Ethan~


