need some regular expression help

Diez B. Roggisch deets at nospam.web.de
Sun Oct 8 08:26:11 EDT 2006


Mirco Wahab schrieb:
> Thus spoke Diez B. Roggisch (on 2006-10-08 10:49):
>> Certainly true, and it always gives me a hard time because I don't know
>> to which extend a regular expression nowadays might do the job because
>> of these extensions. It was so much easier back in the old times....
> 
> Right, in perl, this would be a no-brainer,
> its documented all over the place, like:
> 
>    my $re;
> 
>    $re = qr{
>             (?:
>               (?> [^\\()]+ | \\. )
>                |
>              \( (??{ $re }) \)
>             )*
>             }xs;
> 
> where you have a 'delayed execution'
> of the
> 
>   (??{ $re })
> 
> which in the end makes the whole a thing
> recursive one, it gets expanded and
> executed if the match finds its way
> to it.
> 
> Above regex will match balanced parens,
> as in:
> 
>    my $good = 'a + (b / (c - 2)) * (d ^ (e+f))  ';
>    my $bad1 = 'a + (b / (c - 2)  * (d ^ (e+f))  ';
>    my $bad2 = 'a + (b / (c - 2)) * (d) ^ (e+f) )';
> 
> if you do:
> 
>    print "ok \n" if $good =~ /^$re$/;
>    print "ok \n" if $bad1 =~ /^$re$/;
>    print "ok \n" if $bad2 =~ /^$re$/;
> 
> 
> This in some depth documented e.g. in
> http://japhy.perlmonk.org/articles/tpj/2004-summer.html
> (topic: Recursive Regexes)

That clearly is a recursive grammar rule, and thus it can't be regular 
anymore :) But first of all, I find it ugly - the clean separation of 
lexical and syntactical analysis is better here, IMHO - and secondly, 
what are the properties of that parsing? Is it LL(k), LR(k), backtracking?

Diez



More information about the Python-list mailing list