Python speed and `pcre'
Gordon McMillan
gmcm at hypernet.com
Tue Aug 31 11:51:04 EDT 1999
François Pinard writes:
> After having translated some code (not big, but not small) from Perl
> to Python, I discover it runs ten times slower. I did not learn to
> profile yet, and I am not sure it would help, after having read that
> profiling gives call counts, but no elapsed time. Is that right?
No. The profiler reports time spent per function as well as call counts.
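A minimal sketch of what the profiler gives you, written for present-day Python (cProfile and pstats from the standard library). The workload `count_matches` and the helper `profile_report` are made up for illustration and are not from the original code:

```python
import cProfile
import io
import pstats
import re


def count_matches(lines):
    # Hypothetical workload: one regex search per line.
    hits = 0
    for line in lines:
        if re.search(r'piggie', line):
            hits += 1
    return hits


def profile_report(func, *args):
    # Run func under the profiler and return (result, report text).
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats('cumulative').print_stats(5)
    return result, out.getvalue()


hits, report = profile_report(count_matches,
                              ["This little piggie had Spam"] * 1000)
print(report)
```

The report has an ncalls column, but also tottime and cumtime columns, so you can see where the time goes and not only how often something was called.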
> My intuition tells me that the amount of regular expression matching
> might provide an explanation. Here is my set of _hypotheses_ (I'm
> not sure):
>
> * Perl keeps compiled all /REGEXP/ not using string interpolation,
> * Python cache for compiled REGEXP is less virtuous than I imagined.
Just the last one used, I think.
[wants to cache the re.compiled regex, but keep the regex string near
its use]
Here's one way of caching (note that I don't do anything like this -
I just put all my patterns at the top and compile them right away.
The intent of a regex is often better communicated through giving it
a meaningful name, rather than staring at the regex itself):
---------------------------------------
import re

ccache = {}    # pattern string -> compiled regex object

def ccompile(regex):
    # Compile each distinct pattern only once; reuse it afterwards.
    if not ccache.has_key(regex):
        ccache[regex] = re.compile(regex)
    return ccache[regex]

def cmatch(regex, s):
    return ccompile(regex).match(s)

def csearch(regex, s):
    return ccompile(regex).search(s)

s = "This little piggie had Spam, This little piggie had None"
mo = csearch('piggie', s)
print mo.group(), 'at', mo.start()
# Search the remainder; this start() is relative to the slice, not to s.
mo = csearch('piggie', s[mo.end():])
print mo.group(), 'at', mo.start()
--------------------------------------------
Now had I been using explicitly compiled regexes, I would not have
had to slice s, since the compiled regex object lets you specify
start and end (without a copy being made). Yes, I could add that to
this code, but why bother?
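
For comparison, a sketch of the "patterns at the top" style with an explicitly compiled regex, written for present-day Python. The name PIGGIE is only an illustration:

```python
import re

# Named and compiled once, at module level.
PIGGIE = re.compile(r'piggie')

s = "This little piggie had Spam, This little piggie had None"

first = PIGGIE.search(s)
# Pass a start position instead of slicing: no copy of s is made,
# and the reported offsets stay relative to the whole string.
second = PIGGIE.search(s, first.end())

print(first.group(), 'at', first.start())
print(second.group(), 'at', second.start())
```

Note the second position here is an offset into s itself, whereas the sliced version above reports an offset into the slice.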
- Gordon