regular expression for nested parentheses

John Machin sjmachin at lexicon.net
Mon Dec 10 01:26:27 EST 2007


On Dec 10, 12:22 pm, MRAB <goo... at mrabarnett.plus.com> wrote:
> On Dec 9, 10:12 pm, John Machin <sjmac... at lexicon.net> wrote:
>
> > On Dec 10, 8:53 am, Noah Hoffman <noah.hoff... at gmail.com> wrote:
>
> > > On Dec 9, 1:41 pm, John Machin <sjmac... at lexicon.net> wrote:
>
> > > > A pattern that can validly be described as a "regular expression"
> > > > cannot count and thus can't match balanced parentheses. Some "RE"
> > > > engines provide a method of tagging a sub-pattern so that a match must
> > > > include balanced () (or [] or {}); Python's doesn't.
>
> > > Okay, thanks for the clarification. So recursion is not possible using
> > > python regular expressions?
>
> > > > Ummm ... even if Python's re engine did do what you want, wouldn't you
> > > > need flags=re.VERBOSE in there?
>
> > > Ah, thanks for letting me know about that flag; but removing
> > > whitespace as I did with the no_ws lambda expression should also work,
> > > no?
>
> > Under a very limited definition of "work". That technique would not
> > produce correct answers on patterns that contain any *significant*
> > whitespace e.g. you want to match "foo" and "bar" separated by one or
> > more spaces (but not tabs, newlines etc) ....
> > pattern = r"""
> > foo
> > [ ]+
> > bar
> > """
>
> You can also escape a literal space:
>
> pattern = r"""
> foo
> \ +
> bar
> """

I know that. *Any* method of putting in a literal significant space is
clobbered by the OP's "trick" of removing *all* whitespace instead of
using the VERBOSE flag, which also permits comments:
pattern = r"""
\ + # ugly
[ ]+ # not quite so ugly
"""
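
To make the difference concrete, here is a minimal sketch. The helper strip_all_whitespace is a hypothetical stand-in for the OP's no_ws lambda (its definition is not shown in this message), and the constant names are mine. It compares stripping all whitespace against compiling with re.VERBOSE:

```python
import re

# The two spellings of "one or more literal spaces" from this thread.
BRACKET = r"""
foo
[ ]+
bar
"""

ESCAPED = r"""
foo
\ +
bar
"""

# Same pattern with a comment, which only re.VERBOSE understands.
COMMENTED = r"""
foo
[ ]+   # one or more spaces, not tabs or newlines
bar
"""


def strip_all_whitespace(pattern):
    """Hypothetical stand-in for the OP's no_ws: delete every whitespace char."""
    return re.sub(r"\s+", "", pattern)


# Stripping turns "\ +" into "\+", i.e. a literal plus sign.
stripped_escaped = strip_all_whitespace(ESCAPED)
print(repr(stripped_escaped))
print(re.fullmatch(stripped_escaped, "foo  bar"))   # no match
print(re.fullmatch(stripped_escaped, "foo+bar"))    # matches the wrong text

# Stripping turns "[ ]+" into "[]+", which leaves the character set unterminated.
try:
    re.compile(strip_all_whitespace(BRACKET))
    bracket_compiles = True
except re.error as exc:
    bracket_compiles = False
    print("re.error:", exc)

# re.VERBOSE ignores layout whitespace but keeps escaped or bracketed spaces.
for source in (BRACKET, ESCAPED, COMMENTED):
    rx = re.compile(source, re.VERBOSE)
    print(bool(rx.fullmatch("foo   bar")), bool(rx.fullmatch("foo\tbar")))
```

With the stripping approach the escaped-space pattern silently matches "foo+bar" and the bracketed one fails to compile at all; with re.VERBOSE all three variants accept spaces and reject a tab.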


