[Python-Dev] re performance

Greg Ewing greg.ewing at canterbury.ac.nz
Sun Jan 29 16:38:22 EST 2017


Armin Rigo wrote:

> The theoretical kind of regexp is about giving a "yes/no" answer,
> whereas the concrete "re" or "regexp" modules give a match object,
> which lets you ask for the subgroups' location, for example.
> 
> Another issue is that the theoretical engine has no notion of
> greedy/non-greedy matching.

Subgroup locations and greedy/non-greedy matching aren't part of the
classical theory of REs that is usually taught, but it should be possible
to do them in linear time. They can be done for context-free languages
using e.g. an LALR parser, and regular languages are a subset of
context-free languages.
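
To make the linear-time point concrete, here is a minimal sketch. It uses
a different technique from the LALR parser mentioned above: a Pike-style
simulation that steps a list of NFA threads through the text, each thread
carrying its own group positions. The instruction tuples and the names
pike_match and tag_program are made up for this illustration, and the
program is compiled by hand rather than from a pattern string. The order
of the two targets in a "split" decides greedy versus non-greedy.

```python
def pike_match(prog, text, nslots=2):
    """Run prog on text, anchored at position 0.

    Returns the saved positions of the highest-priority match, or None.
    At most len(prog) threads are alive per character, so the work is
    bounded by len(text) * len(prog) thread steps.
    """
    def add_thread(threads, seen, pc, caps, pos):
        # Follow instructions that consume no input, in priority order.
        if pc in seen:           # a higher-priority thread already got here
            return
        seen.add(pc)
        op = prog[pc]
        if op[0] == "jmp":
            add_thread(threads, seen, op[1], caps, pos)
        elif op[0] == "split":   # op[1] is preferred over op[2]
            add_thread(threads, seen, op[1], caps, pos)
            add_thread(threads, seen, op[2], caps, pos)
        elif op[0] == "save":    # record a group boundary in slot op[1]
            caps = caps[:op[1]] + (pos,) + caps[op[1] + 1:]
            add_thread(threads, seen, pc + 1, caps, pos)
        else:                    # "char", "any" or "match": thread waits here
            threads.append((pc, caps))

    current = []
    add_thread(current, set(), 0, (None,) * nslots, 0)
    matched = None
    for pos in range(len(text) + 1):
        nxt, seen = [], set()
        for pc, caps in current:
            op = prog[pc]
            if op[0] == "match":
                matched = caps
                break            # drop the lower-priority threads
            if pos < len(text) and (
                op[0] == "any" or (op[0] == "char" and op[1] == text[pos])
            ):
                add_thread(nxt, seen, pc + 1, caps, pos + 1)
        current = nxt
        if not current:
            break
    return matched


def tag_program(greedy):
    """Hand-compiled program for <(.+)> (greedy) or <(.+?)> (non-greedy).

    Note that "any" here matches every character, newlines included.
    """
    loop = ("split", 2, 4) if greedy else ("split", 4, 2)
    return [
        ("char", "<"),   # 0
        ("save", 0),     # 1: group 1 starts
        ("any",),        # 2
        loop,            # 3: repeat "any" or leave the loop
        ("save", 1),     # 4: group 1 ends
        ("char", ">"),   # 5
        ("match",),      # 6
    ]


text = "<a><b>"
print(pike_match(tag_program(greedy=True), text))   # (1, 5)
print(pike_match(tag_program(greedy=False), text))  # (1, 2)
```

The two results are the start and end of group 1, the same values that
re.match(...).span(1) gives for the corresponding patterns on this input.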

-- 
Greg

