regular expression for nested parentheses

MRAB google at mrabarnett.plus.com
Sun Dec 9 20:22:22 EST 2007


On Dec 9, 10:12 pm, John Machin <sjmac... at lexicon.net> wrote:
> On Dec 10, 8:53 am, Noah Hoffman <noah.hoff... at gmail.com> wrote:
>
>
>
> > On Dec 9, 1:41 pm, John Machin <sjmac... at lexicon.net> wrote:
>
> > > A pattern that can validly be described as a "regular expression"
> > > cannot count and thus can't match balanced parentheses. Some "RE"
> > > engines provide a method of tagging a sub-pattern so that a match must
> > > include balanced () (or [] or {}); Python's doesn't.
>
> > Okay, thanks for the clarification. So recursion is not possible using
> > python regular expressions?
>
> > > Ummm ... even if Python's re engine did do what you want, wouldn't you
> > > need flags=re.VERBOSE in there?
>
> > Ah, thanks for letting me know about that flag; but removing
> > whitespace as I did with the no_ws lambda expression should also work,
> > no?
>
> Under a very limited definition of "work". That technique would not
> produce correct answers on patterns that contain any *significant*
> whitespace e.g. you want to match "foo" and "bar" separated by one or
> more spaces (but not tabs, newlines etc) ....
> pattern = r"""
> foo
> [ ]+
> bar
> """

In verbose mode you can also escape a literal space with a backslash
instead of putting it in a character class:

pattern = r"""
foo
\ +
bar
"""


