regular expression for nested parentheses

John Machin sjmachin at lexicon.net
Sun Dec 9 16:41:17 EST 2007


On Dec 10, 8:13 am, Noah Hoffman <noah.hoff... at gmail.com> wrote:
> I have been trying to write a regular expression that identifies a
> block of text enclosed by (potentially nested) parentheses. I've found
> solutions using other regular expression engines (for example, my text
> editor, BBEdit, which uses the PCRE library), but have not been able
> to replicate it using python's re module.

A pattern that can validly be described as a "regular expression"
cannot count and thus can't match balanced parentheses. Some "RE"
engines provide a method of tagging a sub-pattern so that a match must
include balanced () (or [] or {}); Python's doesn't.

Looks like you need a parser; try pyparsing.

[snip]
> py> rexp = r"""(?P<parens>
> ...     \(
> ...         (?>
> ...             (?> [^()]+ ) |
> ...             (?P>parens)
> ...         )*
> ...     \)
> ... )"""
> py> print re.findall(no_ws(rexp), text)

Ummm ... even if Python's re engine did do what you want, wouldn't you
need flags=re.VERBOSE in there?




More information about the Python-list mailing list