re.match() performance

Emanuele D'Arrigo manu3d at gmail.com
Thu Dec 18 08:51:33 EST 2008


Sorry for the previous post, hit the Enter button by mistake... here's
the complete one:

Hi everybody!

I've written the code below to test the difference in performance
between compiled and non-compiled regular expression matching, but I
don't quite understand the results. It appears that the compiled
pattern takes only 2% less time to process the match. Is there some
caching going on in the uncompiled case that prevents me from
noticing its otherwise lower speed?

Manu


------------

import re
import time

## Setup
pattern = "<a>(.*)</a>"
compiledPattern = re.compile(pattern)

longMessage = "<a>"+ "a" * 100000 +"</a>"

numberOfRuns = 1000

## TIMED FUNCTIONS
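# Uncompiled: the pattern string is handed to re.match() on every call.
startTime = time.perf_counter()
for i in range(0, numberOfRuns):
    re.match(pattern, longMessage)
patternMatchingTime = time.perf_counter() - startTime

# Compiled: the pattern object built in the setup section is reused.
startTime = time.perf_counter()
for i in range(0, numberOfRuns):
    compiledPattern.match(longMessage)
compiledPatternMatchingTime = time.perf_counter() - startTime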

ratioCompiledToNot = compiledPatternMatchingTime / patternMatchingTime

## PRINT OUTS
print("")
print("           Pattern Matching Time: " + str(patternMatchingTime))
print("(Compiled) Pattern Matching Time: " + str(compiledPatternMatchingTime))
print("")
print("Ratio Compiled/NotCompiled: " + str(ratioCompiledToNot))
print("")
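
One way to probe the caching guess might be something like the rough
sketch below. It rests on two assumptions of mine: that a much shorter
test string (short_message, a name made up for this sketch) keeps the
matching work from swamping any compile or lookup cost, and that calling
re.purge() before each uncompiled match forces the pattern to be
recompiled every time. The helper names (match_uncompiled,
match_compiled, match_after_purge, time_call) are likewise hypothetical,
and the numbers will vary by machine and Python version.

```python
import re
import timeit

pattern = "<a>(.*)</a>"
compiled_pattern = re.compile(pattern)

# Assumption: a short input, so per-call overhead is not hidden by the match itself.
short_message = "<a>" + "a" * 10 + "</a>"


def match_uncompiled():
    # Module-level call: re checks its internal cache for the pattern first.
    return re.match(pattern, short_message)


def match_compiled():
    # Reuses the already compiled pattern object.
    return compiled_pattern.match(short_message)


def match_after_purge():
    # re.purge() clears the regular expression cache, so this call recompiles.
    re.purge()
    return re.match(pattern, short_message)


def time_call(func, number=2000):
    # Total seconds taken by `number` calls of func.
    return timeit.timeit(func, number=number)


timings = {
    "compiled": time_call(match_compiled),
    "uncompiled (cached)": time_call(match_uncompiled),
    "uncompiled (purged)": time_call(match_after_purge),
}

for label, seconds in timings.items():
    print("%-22s %.6f s" % (label, seconds))
```

If the purged timing comes out clearly larger than the other two, that
would suggest the uncompiled call in my original test is normally served
from a cache rather than recompiled on each iteration.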


