RE Module Performance

Chris Angelico rosuav at gmail.com
Fri Jul 12 03:37:04 EDT 2013


On Fri, Jul 12, 2013 at 9:44 AM, Devyn Collier Johnson
<devyncjohnson at gmail.com> wrote:
> I recently saw an email in this mailing list about the RE module being made
> slower. I no longer have that email. However, I have viewed the source for the
> RE module, but I did not see any code that would slow down the script for no
> valid reason. Can anyone explain what that user meant or if I missed that
> part of the module?
>
> Can the RE module be optimized in any way or at least the "re.sub" portion?

There was a post by Steven D'Aprano [1] in which he referred to it
busy-waiting just to make regular expressions slower than the
alternatives, but his tongue was firmly in his cheek at the time. As
to real performance questions, a variety of alternatives have been
proposed, including, I think, the regex module [2], which is supposed
to outperform 're' by quite a margin. Since I tend towards other
solutions, though, I can't quote personal results or hard figures.

If re.sub can be optimized and you can see a way to do so, post a
patch to the bug tracker; if it improves performance and doesn't have
any ridiculous costs to it, it'll probably be accepted.
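
Any claim that a change "improves performance" needs a measurement
behind it. The following is only a minimal sketch of how re.sub might
be timed against a precompiled pattern and a plain str.replace using
the standard timeit module. The sample text, the pattern, and the
function names are all assumptions made up for illustration; real
results depend heavily on the pattern and the input. The third-party
regex module [2] is meant to be usable in place of 're', so it could be
added as another candidate, but it is left out here to keep the sketch
standard-library only.

```python
import re
import timeit

# Hypothetical sample input and pattern, chosen only for illustration.
TEXT = "the quick brown fox jumps over the lazy dog " * 1000
PATTERN = r"o"
REPLACEMENT = "0"

# Compile once up front so the pattern lookup is not part of the timing.
COMPILED = re.compile(PATTERN)


def with_re_sub(text=TEXT):
    """Module-level call; 're' fetches the compiled pattern from its cache."""
    return re.sub(PATTERN, REPLACEMENT, text)


def with_compiled(text=TEXT):
    """Call sub() directly on the precompiled pattern object."""
    return COMPILED.sub(REPLACEMENT, text)


def with_str_replace(text=TEXT):
    """Plain string replacement; only valid because the pattern is a literal."""
    return text.replace("o", REPLACEMENT)


def benchmark(number=200, repeat=3):
    """Return the best-of-`repeat` time in seconds for each candidate."""
    candidates = {
        "re.sub": with_re_sub,
        "compiled.sub": with_compiled,
        "str.replace": with_str_replace,
    }
    results = {}
    for name, func in candidates.items():
        # min() of several runs is the usual way to reduce timing noise.
        results[name] = min(timeit.repeat(func, number=number, repeat=repeat))
    return results


if __name__ == "__main__":
    for name, seconds in sorted(benchmark().items(), key=lambda kv: kv[1]):
        print("{0:<14} {1:.4f} s".format(name, seconds))
```

Running the same harness before and after a proposed change to re.sub,
on several realistic patterns rather than one toy pattern, gives the
kind of figures a patch on the bug tracker would need.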

[1] http://mail.python.org/pipermail/python-list/2013-July/651818.html
[2] https://pypi.python.org/pypi/regex

ChrisA


