lexing nested parenthesis

Jonathan Hogg jonathan at onegoodidea.com
Tue Jul 30 07:42:24 EDT 2002


On 29/7/2002 4:57, in article
mailman.1027915180.16513.python-list at python.org, "Dave Cinege"
<dcinege at psychosis.com> wrote:

> IE
> if 1 and (var1 or ?(-d /etc/)):
> 
> I want to find ?(.*), but not runneth under or over.

While the general case (arbitrarily nested parentheses) is impossible with
regular expressions, as other people have already described eloquently, the
particular example you gave is trivial:

>>> import re
>>> 
>>> text = 'if 1 and (var1 or ?(-d /etc/)):'
>>> 
>>> matcher = re.compile( r'\?\(([^()]*)\)' )
>>> match = matcher.search( text )
>>> match.group( 0 )
'?(-d /etc/)'
>>> print( match.group( 1 ) )
-d /etc/
>>> 

If you know that you can't have parentheses *within* the ?(...) construct,
then parentheses outside of it are of little importance. You can also write
fairly complex regular expressions that allow parentheses inside the ?(...)
if they are within single quotes. For instance:

>>> text = r"if foo and (bar or ?(-d 'silly\')dir'))"
>>> 
>>> matcher = re.compile( r"\?\((([^()']+|'([^'\\]|\\'|\\\\)*')*)\)" )
>>> match = matcher.search( text )
>>> match.group( 0 )
"?(-d 'silly\\')dir')"
>>> print( match.group( 1 ) )
-d 'silly\')dir'
>>> 

However, if you mean that you want to find expressions of the form:

    ?( ... ( ... ) ... )

then you're gonna have to write a parser.
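
For what it's worth, the "parser" for this can be very small. What follows is
only a rough sketch, not a full solution: the function name find_balanced and
its opener argument are made up for illustration, and it assumes that every
parenthesis counts towards nesting (it does not know about quoted strings).
It scans forward from each ?( and keeps a depth counter until the matching
close parenthesis turns up.

```python
def find_balanced(text, opener='?('):
    """Return the contents of each opener ... ) span, allowing nested parens.

    Hypothetical helper: quoting is NOT handled, every '(' and ')' counts.
    """
    results = []
    pos = text.find(opener)
    while pos != -1:
        start = pos + len(opener)   # first character after the opener
        depth = 1                   # the '(' in the opener is already open
        i = start
        while i < len(text) and depth > 0:
            if text[i] == '(':
                depth += 1
            elif text[i] == ')':
                depth -= 1
            i += 1
        if depth != 0:
            break                   # ran off the end: unbalanced, give up
        results.append(text[start:i - 1])   # drop the closing ')'
        pos = text.find(opener, i)  # resume after this span
    return results
```

Calling find_balanced( 'if 1 and (var1 or ?(-d (/etc/ or /usr/) x)):' ) gives
back a one-element list holding the text between the outer ?( and its
matching close parenthesis. Handling quotes as well would mean tracking a
second piece of state alongside the depth counter.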

Jonathan



