Regex speed
A.M. Kuchling
amk at amk.ca
Fri Oct 29 14:18:39 EDT 2004
On Fri, 29 Oct 2004 20:06:05 +0200,
Reinhold Birkenfeld <reinhold-birkenfeld-nospam at wolke7.net> wrote:
> re1 = re.compile(r"\s*<.*>\s*")
> re2 = re.compile(r".*\((.*)\).*")
> re3 = re.compile(r'^"(.*)"$')
You should post the actual code, because these substitutions could be made
more efficient. For example, why are the bracketing \s* in re1 and the
bracketing .* in re2 there? re3 isn't using re.M, so ^ and $ only anchor at
the ends of the string, and the match is essentially equivalent to the test:
    if s.startswith('"') and s.endswith('"')
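To make the re3 point concrete, here is a rough sketch of the two approaches side by side. The helper names unquote_re and unquote_str are made up for illustration, and since the original code wasn't posted, the assumption is that re3 is used to strip one pair of surrounding double quotes. The length check is an addition of mine so that a lone '"' isn't treated as a quoted string, which matches what the regex does.

```python
import re

re3 = re.compile(r'^"(.*)"$')

def unquote_re(s):
    # Hypothetical helper: strip surrounding quotes using the regex.
    m = re3.match(s)
    return m.group(1) if m else s

def unquote_str(s):
    # Hypothetical helper: same idea using plain string methods.
    # len(s) >= 2 rules out a single '"', which the regex also rejects.
    if len(s) >= 2 and s.startswith('"') and s.endswith('"'):
        return s[1:-1]
    return s

samples = ['"abc"', 'abc', '""', '"', '"a"b"', '']
for s in samples:
    print(repr(s), repr(unquote_re(s)), repr(unquote_str(s)))
```

The two agree on single-line input like the samples above. They can differ when the string contains newlines, because . doesn't match a newline without re.S and $ also matches just before a trailing newline.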
> So my question is: Why is the re module implemented in pure Python?
> Isn't it possible to integrate it into the core or rewrite it in C?
The regex engine *is* implemented in C; look at Modules/_sre.c.
> Is there a Python interface for the PCRE library out there?
PCRE was used from versions 1.5 up to 2.3; it'll be gone in Python 2.4. You
could try 'import pre' to use it, but I don't expect it to be significantly
faster.
--amk