creating a pattern using a previous match and a count of the number of '('s in it

MRAB google at mrabarnett.plus.com
Tue Jan 27 10:57:55 EST 2009


me wrote:
> <code>
> I'm new to regexes and trying to get a list of all my C++ methods with balanced 
> parentheses, as follows.
> 
> 
> #find all c++ method prototypes with a '::' in the middle 
> #upto and including the 1st closing parenthesis
> pattern_upto_1st_closed_parenth  = re.compile('\w+::\w+\([^)]*\)') 
> match_upto_1st_closed_parenth    = 
> re.findall(pattern_upto_1st_closed_parenth,txt)
> num_of_protos = len(match_upto_1st_closed_parenth)
> 
> for i in range (0,num_of_protos-1):                  
This should actually be range(0, num_of_protos): range() stops before its end
value, so num_of_protos-1 would skip the last prototype.

>    num_of_open_parenths = match_upto_1st_closed_parenth[i].count('(')
>    
>    #expand the pattern to get all of the prototype 
>    #ie upto the last closed parenthesis
>    #saying something like
>    pattern = re.compile(\
>                         'match_upto_1st_closed_parenth[i]+\ 
>                         (([^)]*\)){num_of_open_parenths-1}'\
>                        )
>    #====================================================================
>    #HELP!!!!!! I'm not sure how to incorporate:
>    #1 'match_upto_1st_closed_parenth[i]' into the above extended pattern???
>    #2 the count 'num_of_open_parenths' instead of a literal ???
>    #====================================================================
> 
> 
> 
> 
> #=======================================
> #if I could do it, something like this would appear to offer the neatest solution
> pattern_upto_last_balanced_parenthesis  = re.compile('
> 									(\w+::\w+\([^)]*\))\			
>                                  					([^)]*\)){\1.count('(')-1}		
>                                                               		') 		
> #=======================================
> 
> Should I be using regexs to do this?
> 
> I've only put \ line extensions to separate the pattern components to assist 
> readability
> 
> Thx
> </code>
> 
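On the two specific questions (how to put a previous match and a count into a
new pattern): a minimal sketch, assuming the earlier match should be taken
literally. re.escape() quotes any regex metacharacters in the matched text, and
the count can be formatted into the {n} quantifier with %d. The helper name
extended_pattern and the sample text are made up for illustration.

```python
import re

def extended_pattern(prefix, extra_closes):
    """Match `prefix` literally, then `extra_closes` more runs ending in ')'."""
    # re.escape() stops '(' , ')' , '*' etc. in the prefix acting as regex syntax.
    return re.compile(re.escape(prefix) + r'(?:[^)]*\)){%d}' % extra_closes)

# Hypothetical sample input.
txt = "void Foo::bar(int x = max(1, 2), int y) {}"

# Everything up to and including the first ')'.
first = re.search(r'\w+::\w+\([^)]*\)', txt).group()

# One ')' has been seen already, so ask for one fewer than the number of '('.
pattern = extended_pattern(first, first.count('(') - 1)
result = pattern.search(txt).group()
print(result)
```

For this sample the result is the whole header, Foo::bar(int x = max(1, 2), int y).
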
Not necessarily the best way, but:

import re

methods = []
# The pattern for the start of the method's header.
start_pattern = re.compile(r'\w+::\w+\([^)]*\)')
# Start at the beginning of the text.
pos = 0
while True:
    # Search for the start of the next method's header.
    start_match = start_pattern.search(txt, pos)
    if not start_match:
        break
    # Match from the start of the header up to as many ')' as there
    # are '(' in the first part. Anchoring at start_match.start() keeps
    # any ')' that precedes the header out of the count.
    end_pattern = re.compile(r'(?:[^)]*\)){%d}' % start_match.group().count('('))
    end_match = end_pattern.match(txt, start_match.start())
    if not end_match:
        # Not enough ')' left in the text.
        break
    methods.append(end_match.group())
    # Continue the next search from where we left off.
    pos = end_match.end()
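
Counting the '(' in the first part is only a heuristic: a header such as
Foo::bar(int (*cb)(int), char c) opens another parenthesis after the first ')',
so the count comes up short. If truly balanced parentheses are needed, one
alternative (a different technique from the regex-only approach above) is to
use the regex just to find the start and then track nesting depth by hand. A
rough sketch; the function name find_method_headers is made up, and it assumes
no parentheses appear inside comments or string literals in the headers.

```python
import re

def find_method_headers(txt):
    """Return 'Class::method(...)' spans whose parentheses are balanced."""
    start_pattern = re.compile(r'\w+::\w+\(')
    methods = []
    pos = 0
    while True:
        start_match = start_pattern.search(txt, pos)
        if not start_match:
            break
        depth = 1                  # the '(' matched by start_pattern
        i = start_match.end()
        # Walk forward until the opening '(' has been closed.
        while i < len(txt) and depth:
            if txt[i] == '(':
                depth += 1
            elif txt[i] == ')':
                depth -= 1
            i += 1
        if depth:                  # ran off the end of the text: unbalanced
            break
        methods.append(txt[start_match.start():i])
        pos = i
    return methods

# Hypothetical sample input.
sample = "void Foo::bar(int (*cb)(int), char c) {}\nint A::b() { return 0; }"
headers = find_method_headers(sample)
print(headers)
```
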


