re.match() performance
Peter Otten
__peter__ at web.de
Thu Dec 18 09:19:49 EST 2008
Emanuele D'Arrigo wrote:
> I've written the code below to test the differences in performance
> between compiled and non-compiled regular expression matching but I
> don't quite understand the results. It appears that the compiled
> pattern only takes 2% less time to process the match. Is there some
> caching going on in the uncompiled section that prevents me from
> noticing its otherwise lower speed?
Yes. The module-level functions cache the patterns they compile:
>>> import re
>>> re._cache
{}
>>> re.match("yadda", "")
>>> re._cache
{(<class 'str'>, 'yadda', 0): <_sre.SRE_Pattern object at 0x2ac6e66e9e70>}
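Note that re._cache is a private attribute, and its name and key layout are not guaranteed to stay the same between Python versions. As a rough sketch, the same caching can be observed through the public API only. This assumes CPython's current behaviour, where re.compile() goes through the same cache as re.match(); the two helper names are made up for illustration:

```python
import re


def same_cached_object(pattern):
    """Return True if two re.compile() calls hand back the identical object.

    Assumes CPython, where re.compile() consults the module's pattern cache.
    """
    first = re.compile(pattern)
    second = re.compile(pattern)
    return first is second


def purge_forces_recompile(pattern):
    """Return True if re.purge() makes the next compile build a new object."""
    first = re.compile(pattern)
    re.purge()  # public function that empties the pattern cache
    second = re.compile(pattern)
    # 'first' is still referenced, so a fresh compile must be a distinct object.
    return first is not second


if __name__ == "__main__":
    print(same_cached_object("yadda"))
    print(purge_forces_recompile("yadda"))
```

If both helpers return True, the second re.compile() call was served from the cache, and only re.purge() caused a real recompilation.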
Hint: questions like this are best answered by the source code, and Python
is open source. You don't even have to open an editor:
>>> import inspect
>>> print(inspect.getsource(re.match))
def match(pattern, string, flags=0):
    """Try to apply the pattern at the start of the string, returning
    a match object, or None if no match was found."""
    return _compile(pattern, flags).match(string)
>>> print(inspect.getsource(re._compile))
def _compile(*key):
    # internal: compile pattern
    cachekey = (type(key[0]),) + key
    p = _cache.get(cachekey)
    if p is not None:
        return p
    pattern, flags = key
    if isinstance(pattern, _pattern_type):
        if flags:
            raise ValueError(
                "Cannot process flags argument with a compiled pattern")
        return pattern
    if not sre_compile.isstring(pattern):
        raise TypeError("first argument must be string or compiled pattern")
    p = sre_compile.compile(pattern, flags)
    if len(_cache) >= _MAXCACHE:
        _cache.clear()
    _cache[cachekey] = p
    return p
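So after the first call, the only extra work re.match() does compared with a precompiled pattern is the function call and the cache lookup. A minimal sketch for measuring that difference yourself follows; time_both is a hypothetical helper, not the original poster's benchmark, and the numbers will vary with the machine and Python version:

```python
import re
import timeit


def time_both(pattern, text, number=100_000):
    """Time re.match() with a string pattern against a precompiled pattern.

    Returns the total seconds for `number` calls of each variant.
    """
    compiled = re.compile(pattern)
    # Every call below hits the module's cache after the first one.
    uncompiled = timeit.timeit(lambda: re.match(pattern, text), number=number)
    # The precompiled object skips the cache lookup entirely.
    precompiled = timeit.timeit(lambda: compiled.match(text), number=number)
    return {"re.match": uncompiled, "compiled.match": precompiled}


if __name__ == "__main__":
    for label, seconds in time_both("yadda", "yadda yadda").items():
        print(f"{label}: {seconds:.4f}s")
```

With a cheap pattern and a short string the lookup overhead is a larger share of the total; with an expensive match it mostly disappears, which would be consistent with a difference of only a few percent.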
Peter