Why is regex so slow?

Roy Smith roy at panix.com
Tue Jun 18 13:08:52 EDT 2013


On Jun 18, 2013, at 1:01 PM, Skip Montanaro wrote:

>> I don't understand why the first way is so much slower.
> 
> I have no obvious answers, but a couple suggestions:
> 
> 1. Can you anchor the pattern at the beginning of the line?  (use
> match() instead of search())

That's one of the things we tried.  Didn't make any difference.

> 2. Does it get faster if you eliminate the "(.*)" part of the pattern?

Just tried that, it also didn't make any difference.

> It seems that if you find a line matching the first part of the
> pattern, you could just as easily split the line yourself instead of
> creating a group.


At this point, I'm not so much interested in making this faster as understanding why it's so slow.  I'm tempted to open this up as a performance bug against the regex module (which I assume will be rejected, at least for the 2.x series).
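
For anyone who wants to reproduce the comparisons discussed above, here is a minimal timing sketch. The pattern and the sample line are hypothetical stand-ins, since the actual pattern is not shown in this message, and the function names are made up for illustration. It only contrasts search() with a capturing group, match() with the same group, and match() without the group followed by slicing the line by hand.

```python
import re
import timeit

# Hypothetical pattern and input line; the real ones are not shown in this message.
PATTERN_WITH_GROUP = re.compile(r"ERROR: (.*)")
PATTERN_NO_GROUP = re.compile(r"ERROR: ")
LINE = "ERROR: disk full on /dev/sda1"


def with_search(line):
    # search() scans for the pattern anywhere in the line.
    m = PATTERN_WITH_GROUP.search(line)
    return m.group(1) if m else None


def with_match(line):
    # match() only tries the pattern at the start of the line.
    m = PATTERN_WITH_GROUP.match(line)
    return m.group(1) if m else None


def with_manual_split(line):
    # No "(.*)" group: match the prefix, then slice off the rest by hand.
    m = PATTERN_NO_GROUP.match(line)
    return line[m.end():] if m else None


def time_variants(number=100000):
    # Returns elapsed seconds per variant; absolute numbers depend on the machine.
    results = {}
    for func in (with_search, with_match, with_manual_split):
        results[func.__name__] = timeit.timeit(lambda: func(LINE), number=number)
    return results


if __name__ == "__main__":
    for name, seconds in time_variants().items():
        print("%-20s %.4f s" % (name, seconds))
```

All three variants return the same text for a line that starts with the prefix, so any timing difference comes from the matching strategy rather than from the result.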

---
Roy Smith
roy at panix.com



