Utility to locate errors in regular expressions

Roy Smith roy at panix.com
Fri May 24 09:12:16 EDT 2013


In article <mailman.2062.1369400329.3114.python-list at python.org>,
 Malte Forkel <malte.forkel at berlin.de> wrote:

> Finding out why a regular expression does not match a given string can
> be very tedious. I would like to write a utility that identifies the
> sub-expression causing the non-match. My idea is to use a parser to
> create a tree representing the complete regular expression. Then I could
> simplify the expression by dropping sub-expressions one by one from
> right to left and from bottom to top until the remaining regex matches.
> The last sub-expression dropped should be (part of) the problem.
> 
> As a first step, I am looking for a parser for Python regular
> expressions, or a Python regex grammar to create a parser from.
> 
> But maybe my idea is flawed? Or a similar (or better) tool already
> exists? Any advice will be highly appreciated!

I think this would be a really cool tool.  The debugging process I've 
always used is essentially what you describe.  I start by trying 
progressively shorter sub-patterns until I get a match, then 
incrementally add back little bits of the original pattern until it no 
longer matches.  
With luck, the problem will become obvious at that point.
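
A minimal sketch of the shortening step, for illustration only.  It 
truncates the pattern character by character rather than walking a parse 
tree as proposed above, so it is a cruder stand-in for that idea.  The 
function name longest_matching_prefix is hypothetical; only the standard 
re module is used.

```python
import re

def longest_matching_prefix(pattern, text):
    """Return the longest prefix of `pattern` that compiles and matches
    at the start of `text`. Whatever follows that prefix in the original
    pattern is a good place to start looking for the problem."""
    for end in range(len(pattern), -1, -1):
        candidate = pattern[:end]
        try:
            compiled = re.compile(candidate)
        except re.error:
            # The cut landed inside a group, character class, or escape.
            continue
        if compiled.match(text):
            return candidate
    return ""  # not reached in practice: the empty pattern always matches

pattern = r"\d{3}-\d{4} ext"
text = "555-1234 x42"
good = longest_matching_prefix(pattern, text)
print("matches up to:", repr(good))
print("suspect part: ", repr(pattern[len(good):]))
```

Here the suspect part comes out as 'ext', since the text has 'x42' at that 
point.  Character-level truncation can mislead when the cut changes the 
meaning of what came before (for example, by dropping a quantifier), 
which is one reason a real parser would do better.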

Having a tool which automated this would be really useful.

Of course, most of the Python user community are wimps and shy away from 
big hairy regexes [ducking and running].


