crimes in Python

Fredrik Lundh effbot at telia.com
Thu Mar 9 12:12:11 EST 2000


ztranger at my-deja.com wrote:
> > It left me with several questions:
> > - on a 2800-record file, the Python program took
> > 8 seconds, while the Perl program took 1.5.  Why?
> > (I tried precompiling all the REs I'm using in the
> > loop; it took me down to 7.9.)

footnote: python stores compiled patterns in a cache.
precompiling saves you some python overhead, but not
that much, as you can see from this example.
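
a minimal sketch of the two styles being compared, assuming a
current Python 3 interpreter.  the sample line, the pattern and the
function names are made up for illustration and are not taken from
the original program; the timings depend on the machine and the
Python version.

```python
import re
import timeit

# hypothetical sample data, not from the original 2800-record file
LINE = "2000-03-09 12:12:11 record 2800 ok"
PATTERN = r"record (\d+)"

# compile once, outside the loop
compiled = re.compile(PATTERN)


def with_module_function():
    # re.search() has to find the compiled pattern in the module's
    # internal cache on every call
    return re.search(PATTERN, LINE).group(1)


def with_precompiled():
    # calling the method on the compiled object skips that lookup
    return compiled.search(LINE).group(1)


if __name__ == "__main__":
    for func in (with_module_function, with_precompiled):
        seconds = timeit.timeit(func, number=100_000)
        print(f"{func.__name__}: {seconds:.3f}s")
```

both functions do the same matching work; any difference in the
printed times is the per-call overhead of the cache lookup.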

> Well, REs are slower in Python. I can't find where
> I read it, but Fredrik Lundh has tested the new re
> module coming with Python 1.6 and it's much faster
> than the current one. (Think it was about 10 times
> as fast on some of the examples).

well, I've done more than just testing it.  I wrote the
darn thing ;-)

the major speed boost comes from much tighter bindings.
the new engine operates directly on Python objects, while
the old one consisted of a Python layer on top of a binding
for an existing library.  for simple patterns on short strings,
the new engine is typically 5-20 times faster.  here's a
(slightly outdated) benchmark:

    http://www.deja.com/=dnc/getdoc.xp?AN=588925502

if we're talking raw engine speed, the current code base
appears to be faster than PCRE on many common patterns,
especially on long strings.  it's still slower on some
constructs, including character sets.

but we have plenty of time to tune it before 1.6 final...

(if everything goes according to plan, an alpha version
should appear in the CVS repository shortly.  initially, the
"re" and "regex" modules will still use the old engines.  to
use the new one, look for a module called "sre")

> I've read a couple of good articles about
> performance lately, but I can't seem to find them
> right now. A couple of hints: 1) profiler module.
> 2) sequence operations (and the string module)
> instead of RE:s when possible. 3) a couple of
> functions (map, filter...) run their loops in C, which
> can be faster than for/while loops. 4) Use
> built-ins.

great summary!
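
hints 2 and 3 can be sketched as below.  this assumes a current
Python 3, where the string module functions have become str
methods; the function names and the sample data are hypothetical.
whether the non-RE or map() variant is actually faster has to be
measured on the real data (hint 1: python -m cProfile yourscript.py).

```python
import re


def fields_with_re(line):
    # regular expression used for a fixed separator
    return re.split(r",", line)


def fields_with_split(line):
    # plain string operation, same result for a fixed separator
    return line.split(",")


def upper_with_loop(words):
    # explicit loop, executed by the interpreter one step at a time
    result = []
    for word in words:
        result.append(word.upper())
    return result


def upper_with_map(words):
    # map() drives the iteration from C code
    return list(map(str.upper, words))


if __name__ == "__main__":
    print(fields_with_split("spam,egg,bacon"))
    print(upper_with_map(["spam", "egg"]))
```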

> /Fredrik

another "/F" on this newsgroup?  scary ;-)

</F>




