Regex speed

Reinhold Birkenfeld reinhold-birkenfeld-nospam at wolke7.net
Fri Oct 29 14:06:05 EDT 2004


Hello,

I recently ported a simple utility script that analyzes a data file from
Perl to Python. It uses regex substitutions no more complex than

re1 = re.compile(r"\s*<.*>\s*")
re2 = re.compile(r".*\((.*)\).*")
re3 = re.compile(r'^"(.*)"$')

Without these regex substitutions, the two scripts run at nearly the
same speed. With the regexes applied, however, the Python version takes
at least five times as long as the Perl version.
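
For reference, here is a minimal sketch of the kind of substitution pass
I mean, with a rough timing. The helper names clean() and time_clean(),
the replacement strings, and the sample line are made up for
illustration; the real script and data file are not shown here, so the
numbers it prints say nothing about the actual slowdown.

```python
import re
import timeit

# The three patterns from the post, compiled once up front.
re1 = re.compile(r"\s*<.*>\s*")
re2 = re.compile(r".*\((.*)\).*")
re3 = re.compile(r'^"(.*)"$')


def clean(line):
    """Hypothetical helper: apply the three substitutions in turn."""
    line = re1.sub("", line)     # drop a <...> span plus surrounding whitespace
    line = re2.sub(r"\1", line)  # replace the line with the text inside (...)
    line = re3.sub(r"\1", line)  # strip one pair of enclosing double quotes
    return line


def time_clean(sample, number=10000):
    """Return the seconds taken to run clean(sample) `number` times."""
    return timeit.timeit(lambda: clean(sample), number=number)


if __name__ == "__main__":
    # Assumed sample input; the original data format is not known.
    sample = "Some Name (nickname) <user@example.invalid>"
    print(repr(clean(sample)))
    print("%.3f s for 10000 calls" % time_clean(sample))
```

Timing the same loop with the sub() calls commented out should show how
much of the run time is spent in the regexes themselves.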

So my question is: Why is the re module implemented in pure Python?
Isn't it possible to integrate it into the core or rewrite it in C?

--or--

Is there a Python interface for the PCRE library out there?

Thanks

Reinhold

-- 
[Windows is like] the railway: you don't have to take care of anything,
you pay a surcharge for every little thing, the service is lousy,
strangers can board at any time, and it is inflexible and incompatible
with all other means of transport.  -- Florian Diesch in dcoulm


