[Python-Dev] re performance
Jakub Wilk
jwilk at jwilk.net
Sun Jan 29 05:18:23 EST 2017
* Armin Rigo <armin.rigo at gmail.com>, 2017-01-28, 12:44:
>The theoretical kind of regexp is about giving a "yes/no" answer, whereas the
>concrete "re" or "regexp" modules give a match object, which lets you ask for
>the subgroups' location, for example. Strange as it may seem, I am not aware
>of a way to do that using the linear-time approach of the theory---if it
>answers "yes", then you have no way of knowing *where* the subgroups matched.
>
>Another issue is that the theoretical engine has no notion of
>greedy/non-greedy matching.
RE2 has linear execution time, and it supports both capture groups and
greedy/non-greedy matching.
The implementation is explained in this article:
https://swtch.com/~rsc/regexp/regexp3.html
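
To make the point concrete, here is a minimal sketch of a Pike-style NFA
simulation, one known way to track capture positions and greedy/non-greedy
preference in a single pass over the input. It is not RE2's code. The
instruction names, the pike_vm function, and the hand-compiled GREEDY and
LAZY programs are assumptions made up for illustration, and matching is
anchored at the start of the text.

```python
# Hypothetical instruction set (illustrative names only):
#   ('char', c)      consume character c
#   ('split', x, y)  try pc x first, then pc y (x has higher priority)
#   ('jmp', x)       jump to pc x
#   ('save', n)      store the current position in capture slot n
#   ('match',)       report a match

def pike_vm(prog, text):
    """Match prog against text from position 0; return capture slots or None."""
    nslots = max((ins[1] for ins in prog if ins[0] == 'save'), default=-1) + 1

    def add_thread(threads, seen, pc, pos, caps):
        # Follow empty transitions in priority order. Each pc is added at
        # most once per input position, which bounds the work per character.
        if pc in seen:
            return
        seen.add(pc)
        op = prog[pc]
        if op[0] == 'jmp':
            add_thread(threads, seen, op[1], pos, caps)
        elif op[0] == 'split':
            add_thread(threads, seen, op[1], pos, caps)
            add_thread(threads, seen, op[2], pos, caps)
        elif op[0] == 'save':
            new_caps = list(caps)          # copy, so threads never share writes
            new_caps[op[1]] = pos
            add_thread(threads, seen, pc + 1, pos, new_caps)
        else:                              # 'char' or 'match': wait for input
            threads.append((pc, caps))

    current = []
    add_thread(current, set(), 0, 0, [None] * nslots)
    matched = None
    for pos in range(len(text) + 1):
        nxt, seen = [], set()
        for pc, caps in current:
            op = prog[pc]
            if op[0] == 'match':
                matched = caps
                break                      # drop lower-priority threads
            if pos < len(text) and op[0] == 'char' and op[1] == text[pos]:
                add_thread(nxt, seen, pc + 1, pos + 1, caps)
        current = nxt
        if not current:
            break
    return matched

def compile_two_stars(first_is_greedy):
    """Hand-compiled program for (a*)(a*) or (a*?)(a*)."""
    first_split = ('split', 2, 4) if first_is_greedy else ('split', 4, 2)
    return [
        ('save', 0),        # 0: group 1 start
        first_split,        # 1: loop or leave, in priority order
        ('char', 'a'),      # 2
        ('jmp', 1),         # 3
        ('save', 1),        # 4: group 1 end
        ('save', 2),        # 5: group 2 start
        ('split', 7, 9),    # 6: greedy a*
        ('char', 'a'),      # 7
        ('jmp', 6),         # 8
        ('save', 3),        # 9: group 2 end
        ('match',),         # 10
    ]

GREEDY = compile_two_stars(True)    # (a*)(a*)
LAZY = compile_two_stars(False)     # (a*?)(a*)

print(pike_vm(GREEDY, "aaa"))  # [0, 3, 3, 3]: group 1 spans 0-3, group 2 is empty
print(pike_vm(LAZY, "aaa"))    # [0, 0, 0, 3]: group 1 is empty, group 2 spans 0-3
```

The thread list holds at most one entry per instruction at each input
position, so the number of thread steps is bounded by roughly
len(text) * len(prog). The order of the two targets in each split is what
encodes greedy versus non-greedy.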
--
Jakub Wilk