Why is regex so slow?

MRAB python at mrabarnett.plus.com
Tue Jun 18 15:49:46 EDT 2013


On 18/06/2013 20:21, Roy Smith wrote:
> In article <mailman.3549.1371576854.3114.python-list at python.org>,
> Mark Lawrence  <breamoreboy at yahoo.co.uk> wrote:
>
>> Out of curiosity, have you tried the new regex module from PyPI rather
>> than the stdlib version?  A heck of a lot of work has gone into it; see
>> http://bugs.python.org/issue2636
>
> I just installed that and gave it a shot.  It's *slower* (and with much
> higher variation from run to run).  I'm too exhausted from fighting with
> OpenOffice to get this into some sane spreadsheet format, so here's
> the raw timings:
>
> Built-in re module:
> 0:01.32
> 0:01.33
> 0:01.32
> 0:01.33
> 0:01.35
> 0:01.32
> 0:01.35
> 0:01.36
> 0:01.33
> 0:01.32
>
> regex with flags=V0:
> 0:01.66
> 0:01.53
> 0:01.51
> 0:01.47
> 0:01.81
> 0:01.58
> 0:01.78
> 0:01.57
> 0:01.64
> 0:01.60
>
> regex with flags=V1:
> 0:01.53
> 0:01.57
> 0:01.65
> 0:01.61
> 0:01.83
> 0:01.82
> 0:01.59
> 0:01.60
> 0:01.55
> 0:01.82
>
I reckon that about 1/3 of that time is spent in 
PyArg_ParseTupleAndKeywords, just getting the arguments!

There's a higher initial overhead in using regex than in using string
methods, so working just a line at a time will take longer.
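
A rough way to see that overhead for yourself is sketched below. The data,
the "needle" pattern and the function names are all made up for
illustration; they are not taken from Roy's test, and the timings will
differ from machine to machine. It compares a substring test per line, a
regex search per line, and a single regex call over the whole text.

```python
import re
import timeit

# Hypothetical data: 10,000 short lines, every 100th one containing the needle.
LINES = ["some filler text %d" % i + (" needle" if i % 100 == 0 else "")
         for i in range(10000)]
TEXT = "\n".join(LINES)
PATTERN = re.compile(r"needle")


def count_with_in():
    # Plain substring test, one cheap call per line.
    return sum(1 for line in LINES if "needle" in line)


def count_with_search_per_line():
    # One regex call per line, so the per-call overhead is paid 10,000 times.
    return sum(1 for line in LINES if PATTERN.search(line))


def count_with_findall_once():
    # One regex call over the whole text, so the overhead is paid once.
    return len(PATTERN.findall(TEXT))


if __name__ == "__main__":
    for fn in (count_with_in, count_with_search_per_line,
               count_with_findall_once):
        # Print the total time for 10 runs of each approach.
        print(fn.__name__, timeit.timeit(fn, number=10))
```

All three functions return the same count; only the number of calls into
the matching code changes.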
