creating a pattern using a previous match and a count of the number of '('s in it
MRAB
google at mrabarnett.plus.com
Tue Jan 27 10:57:55 EST 2009
me wrote:
> <code>
> I'm new to regexes and trying to get a list of all my C++ methods with balanced
> parentheses, as follows.
>
>
> #find all c++ method prototypes with a '::' in the middle
> #upto and including the 1st closing parenthesis
> pattern_upto_1st_closed_parenth = re.compile('\w+::\w+\([^)]*\)')
> match_upto_1st_closed_parenth = re.findall(pattern_upto_1st_closed_parenth,txt)
> num_of_protos = len(match_upto_1st_closed_parenth)
>
> for i in range (0,num_of_protos-1):
This should actually be range(0, num_of_protos).
> num_of_open_parenths = match_upto_1st_closed_parenth[i].count('(')
>
> #expand the pattern to get all of the prototype
> #ie upto the last closed parenthesis
> #saying something like
> pattern = re.compile(\
> 'match_upto_1st_closed_parenth[i]+\
> (([^)]*\)){num_of_open_parenths-1}'\
> )
> #====================================================================
> #HELP!!!!!! I'm not sure how to incorporate:
> #1 'match_upto_1st_closed_parenth[i]' into the above extended pattern???
> #2 the count 'num_of_open_parenths' instead of a literal ???
> #====================================================================
>
>
>
>
> #=======================================
> #if I could do it, something like this would appear to offer the neatest solution
> pattern_upto_last_balanced_parenthesis = re.compile('
> (\w+::\w+\([^)]*\))\
> ([^)]*\)){\1.count('(')-1}
> ')
> #=======================================
>
> Should I be using regexs to do this?
>
> I've only put \ line extensions to separate the pattern components to assist
> readability
>
> Thx
> </code>
>
Not necessarily the best way, but:
import re

# txt holds the C++ source text, as in the original post.
methods = []
# The pattern for the start of the method's header.
start_pattern = re.compile(r'\w+::\w+\([^)]*\)')
# Start at the beginning of the text.
pos = 0
while True:
    # Search for the start of the next method's header.
    start_match = start_pattern.search(txt, pos)
    if not start_match:
        break
    # Match up to the end of the method's header: as many ')' as there
    # are '(' in the start match, counted from the start of the header.
    end_pattern = re.compile(r'(?:[^)]*\)){%d}' %
                             start_match.group().count('('))
    end_match = end_pattern.match(txt, start_match.start())
    if not end_match:
        # Not enough ')' left in the text; stop.
        break
    methods.append(txt[start_match.start() : end_match.end()])
    # Continue the next search from where we left off.
    pos = end_match.end()
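
On "Should I be using regexs to do this?": counting the '(' in the first match only sees parentheses that open before the first ')'. A parameter such as int (*cb)(int) opens another one later, so the count can come up short. A rough alternative, as a sketch only, is to use the regex just to find 'Class::method(' and then track the nesting depth by hand. The names find_method_headers and HEADER_START are made up for this example, and it assumes no parentheses hide inside string literals or comments.

```python
import re

# Hypothetical name: start of a header, up to and including the first '('.
HEADER_START = re.compile(r'\w+::\w+\(')

def find_method_headers(txt):
    """Return 'Class::method(...)' substrings with balanced parentheses."""
    headers = []
    pos = 0
    while True:
        m = HEADER_START.search(txt, pos)
        if not m:
            break
        depth = 1          # the '(' already consumed by HEADER_START
        i = m.end()
        # Walk forward until the opening parenthesis is closed again.
        while i < len(txt) and depth > 0:
            if txt[i] == '(':
                depth += 1
            elif txt[i] == ')':
                depth -= 1
            i += 1
        if depth != 0:
            # Ran off the end of the text without balancing; stop.
            break
        headers.append(txt[m.start():i])
        # Continue the next search from where we left off.
        pos = i
    return headers
```

For example, find_method_headers(txt) on text containing "void Foo::bar(int (*cb)(int), char c) { }" should give back the whole "Foo::bar(int (*cb)(int), char c)", including the trailing ", char c)" that a fixed count of ')' would miss.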