Compiling regex inside function?

Diez B. Roggisch deets at nospam.web.de
Mon Aug 3 12:01:14 EDT 2009


Anthra Norell wrote:

> Hi all,
> 
>    I have a regex that has no use outside of a particular function. From
> an encapsulation point of view it should be scoped as restrictively as
> possible. Defining it inside the function certainly works, but if
> re.compile () is run every time the function is called, it isn't such a
> good idea after all. E.g.
> 
> def entries (l):
>         r = re.compile ('([0-9]+) entr(y|ies)')
>         match = r.search (l)
>         if match: return match.group (1)
> 
> So the question is: does "r" get regex-compiled once at py-compile time
> or repeatedly at entries() run time?

This can't be answered with a simple "once" or "every time".

While the statement is executed each time, the resulting pattern object
isn't re-created. Instead, there is a caching mechanism inside the re
module, so unless you create a situation where that cache's limits are
exceeded and pattern objects are removed from it, you essentially pay the
overhead of one function call and a dict lookup. Certainly worth it.
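
A small sketch to illustrate the caching, assuming CPython's re module (the
cache is an implementation detail, not a documented guarantee). The names
PATTERN, first, second and same_object are mine, just for the demonstration:

```python
import re

PATTERN = r'([0-9]+) entr(y|ies)'

def entries(line):
    # re.compile() is called on every invocation, but the module
    # looks the pattern up in its internal cache before compiling.
    r = re.compile(PATTERN)
    match = r.search(line)
    if match:
        return match.group(1)
    return None

# Two compile calls with the same pattern and flags.
first = re.compile(PATTERN)
second = re.compile(PATTERN)

# In CPython this is the very same object while the pattern is cached.
same_object = first is second

result = entries('found 12 entries')
```

If the per-call lookup still bothers you, a module-level name with a leading
underscore keeps the compile out of the function without making the pattern
look public.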

As an additional note: r"" has *nothing* to do with this. That is just a
so-called raw string literal, which has different escaping behavior and
so makes regexes easier to write.
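
To make the raw-string point concrete, here is a short sketch (the variable
names and the sample text are made up for illustration). It only shows how
the literal is read by Python; the regex engine sees whatever string results:

```python
import re

# A raw literal keeps the backslash; a normal literal interprets it.
raw_len = len(r'\n')      # two characters: backslash and 'n'
plain_len = len('\n')     # one character: a newline

# Escaping the backslash by hand gives the same string as the raw form.
same_string = ('\\d+' == r'\d+')

text = 'a cat here'

# r'\b' reaches the regex engine as a word-boundary assertion.
with_raw = re.search(r'\bcat\b', text)

# '\b' in a normal literal is a backspace character, so the pattern
# looks for backspace, 'cat', backspace, which is not in the text.
without_raw = re.search('\bcat\b', text)
```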

Diez




