Utility to locate errors in regular expressions

Roy Smith roy at panix.com
Fri May 24 09:40:12 EDT 2013


In article <mailman.2065.1369401265.3114.python-list at python.org>,
 Devin Jeanpierre <jeanpierreda at gmail.com> wrote:

> On Fri, May 24, 2013 at 8:58 AM, Malte Forkel <malte.forkel at berlin.de> wrote:
> > As a first step, I am looking for a parser for Python regular
> > expressions, or a Python regex grammar to create a parser from.
> 
> The sre_parse module is undocumented, but very usable.
> 
> > But maybe my idea is flawed? Or a similar (or better) tool already
> > exists? Any advice will be highly appreciated!
> 
> I think your task is made problematic by the possibility that no
> single part of the regexp causes a match failure. What causes failure
> depends on what branches are chosen with the |, *, +, ?, etc.
> operators -- it might be a different character/subexpression for each
> branch. And then there's exponentially many possible branches.

That's certainly true.  The full power of regex makes stuff like this
very hard to do in the general case.  That being said, people tend to
write regexen that match hunks of text from left to right, so the point
where a match stops making progress often points at the faulty part.

So, in theory, it's probably an intractable problem.  But, in practice, 
such a tool would actually be useful in a large set of real-life cases.
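
To make the left-to-right idea concrete, here is a rough sketch of what
such a tool might do.  It is only a heuristic, not anything from the
standard library: the function name longest_matching_prefix is made up
for this example, and it only uses re.compile, re.error and
Pattern.match.  It chops characters off the end of the pattern until
what is left both compiles and matches at the start of the text.  A
truncated pattern can compile with a different meaning than intended
(an "a|" prefix of "a|b" matches the empty string, for instance), so
treat the result as a hint about where to look.

```python
import re


def longest_matching_prefix(pattern, text):
    """Return (prefix, match) for the longest prefix of `pattern` that
    compiles and matches at the start of `text`, or ("", None).

    Hypothetical helper for illustration only; a crude heuristic.
    """
    for end in range(len(pattern), 0, -1):
        prefix = pattern[:end]
        try:
            compiled = re.compile(prefix)
        except re.error:
            # The cut landed inside a group, character class, or escape,
            # so this prefix is not a valid regex on its own.
            continue
        match = compiled.match(text)
        if match:
            return prefix, match
    return "", None


if __name__ == "__main__":
    pattern = r"(\d{4})-(\d{2})-(\d{2})"
    text = "2013-05-XX"
    prefix, match = longest_matching_prefix(pattern, text)
    if match is None:
        print("no prefix of the pattern matches")
    else:
        print("pattern matched up to:", prefix)
        print("unmatched pattern tail:", pattern[len(prefix):])
        print("unmatched text:", text[match.end():])
```

For a pattern written left to right, the unmatched tail of the pattern
and the unmatched text are usually the place to start looking.  For
patterns that lean heavily on alternation or backtracking, the answer
can be misleading, for the reasons Devin gives above.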


