Curious to see alternate approach on a search/replace via regex

Steven D'Aprano steve+comp.lang.python at pearwood.info
Fri Feb 8 06:43:04 EST 2013


Ian Kelly wrote:

> On Thu, Feb 7, 2013 at 10:57 PM, rh <richard_hubbe11 at lavabit.com> wrote:
>> On Thu, 7 Feb 2013 18:08:00 -0700
>> Ian Kelly <ian.g.kelly at gmail.com> wrote:
>>
>>> Which is approximately 30 times slower, so clearly the regular
>>> expression *is* being cached.  I think what we're seeing here is that
>>> the time needed to look up the compiled regular expression in the
>>> cache is a significant fraction of the time needed to actually execute
>>> it.
>>
>> By "actually execute" you mean to apply the compiled expression
>> to the search or sub? Or do you mean the time needed to compile
>> the pattern into a regex obj?
> 
> The former.  Both are dwarfed by the time needed to compile the pattern.

Surely that depends on the size of the pattern, and the size of the data
being worked on.

Compiling the pattern "s[ai]t" doesn't take much work: it's only six
characters and very simple. Applying it to:

"sazsid"*1000000 + "sat"

on the other hand may be a tad expensive.

Sweeping generalities about the cost of compiling regexes versus searching
with them are risky.
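
If you want to check this on your own machine, here is a rough sketch. It
assumes Python 3.3 or later (for time.perf_counter), and the helper names
time_compile and time_search are made up for illustration. The absolute
numbers will vary with the machine and Python version, so treat them as
indicative only.

```python
import re
import time

PATTERN = "s[ai]t"


def time_compile(pattern):
    """Compile `pattern` from scratch and return (regex, seconds taken)."""
    re.purge()  # empty the re module's cache so compile() does real work
    start = time.perf_counter()
    regex = re.compile(pattern)
    return regex, time.perf_counter() - start


def time_search(regex, text):
    """Search `text` with a precompiled regex; return (match, seconds taken)."""
    start = time.perf_counter()
    match = regex.search(text)
    return match, time.perf_counter() - start


# The only match is the "sat" at the very end, so the search has to scan
# all six million characters that come before it.
text = "sazsid" * 1000000 + "sat"

regex, compile_seconds = time_compile(PATTERN)
match, search_seconds = time_search(regex, text)

print("compile: %.6f s" % compile_seconds)
print("search:  %.6f s (match at offset %d)" % (search_seconds, match.start()))
```

Swap in a short string for text and the balance between the two timings
may well shift, which is the point.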



-- 
Steven



