[Python-Dev] Re: pre-PEP [corrected]: Complete, Structured Regular Expression Group Matching

Mike Coleman mkc at mathdogs.com
Fri Aug 13 03:43:01 CEST 2004


Erik Heneryd <erik at heneryd.com> writes:
> Well, I guess that if you want structmatch into the stdlib you'll have to show
> that it's better than its alternatives.  Including those parser packages.

As a practical matter, this may well be the case.  I look at it kind of like

    'structmatch' is to (a grammar parser) 
    as 
    'sed' is to (a full scripting language)

That is, it has its niche, but it is certainly not as general or powerful as a
full parsing package.  I'm not sure, either, exactly how to show that it's
better.  That said, I use sed all the time, not because it's better than
full scripting languages, but because it is a closer fit for the problem I'm
addressing.

> You'd still have to do a real implementation.  If it can't be done without
> rewriting a whole lot of code, that would be a problem.

I agree.  Being busy and lazy, I'm trying to get a bead first on whether this
would be a wasted effort.

> Hmm... think this is the wrong approach.  Your PEP is not just about
> "structured matching", it tries to deal with a couple of issues and I think it
> would be better to address them separately, one by one:
> 
> * Parsing/scanning - this is mostly what's been discussed so far...
> 
> * Capturing repeated groups - IMO nice-to-have (tm) but not something I would
> lose sleep over.  Hard to do.
> 
> * Partial matches - would be great for debugging more complex regexes. Why not
> a general re.RAISE flag raising an exception on failure?

This is true.  For me, the second is fundamental--it's why I'm bothering with
this.  The third is a useful add-on, and as you suggest could probably be
added orthogonally to several of the existing methods.
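
To illustrate the "orthogonal add-on" idea, here is a minimal sketch of a
raise-on-failure wrapper around the existing re.match.  The names
MatchFailed and match_or_raise are made up for this example and are not part
of the re module, and this sketch only reports that the match failed, not how
far the match got before failing.

```python
import re


class MatchFailed(Exception):
    """Hypothetical exception for illustration; not part of the re module."""

    def __init__(self, pattern, string):
        super().__init__("pattern %r did not match %r" % (pattern, string))
        self.pattern = pattern
        self.string = string


def match_or_raise(pattern, string, flags=0):
    """Hypothetical helper: like re.match, but raise instead of returning None."""
    m = re.match(pattern, string, flags)
    if m is None:
        raise MatchFailed(pattern, string)
    return m
```

The same wrapping could presumably be applied to re.search or to compiled
pattern methods, which is the sense in which the feature looks separable from
the rest of the proposal.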

The first I'm not sure about.  I don't think re.structmatch does
scanning--that's not really the problem it tries to solve.  As for "parsing",
I guess it depends on what you mean by that.

Certainly it would be possible to address the "repeated groups" point without
the whole structured return value thing, but I'm not seeing what would be
better.
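
To make the "repeated groups" point concrete, here is a small sketch of what
the existing re module does with a group inside a repetition, along with one
possible two-pass workaround.  The helper name repeated_fields and the
comma-separated input format are assumptions chosen for the example; this is
not the proposed structmatch interface.

```python
import re

line = "a,b,c"

# A group inside a repetition keeps only its final match.
m = re.fullmatch(r"(\w+)(?:,(\w+))*", line)
first_and_last = m.groups()  # the middle field "b" is not retained


def repeated_fields(text):
    """Hypothetical helper: validate the whole string, then re-scan for fields."""
    # Pass 1: check that the string has the expected overall shape.
    if re.fullmatch(r"\w+(?:,\w+)*", text) is None:
        return None
    # Pass 2: pull out every repetition with a second, simpler pattern.
    return re.findall(r"\w+", text)
```

This works for a flat list, but the two patterns have to be kept in sync by
hand, and the approach gets awkward once the repetitions are nested.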

Mike


