[Tutor] Help with re module maybe?

Kent Johnson kent37 at tds.net
Sat Nov 20 04:39:29 CET 2004


In general, regular expressions are a poor fit for parsing when you have 
to keep some state to know what to do. In this case you have to keep 
track of how deeply the parentheses are nested in order to know how to 
handle a plus sign.
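
To make that concrete, here is a small sketch of what goes wrong without 
any state. The regular expression is only an illustrative guess of mine 
(split on a plus sign unless the next parenthesis character ahead of it 
is a closing one), not a recommended pattern. It copes with one level of 
parentheses but fails as soon as they nest:

```python
import re

# Plain str.split knows nothing about parentheses.
naive = "x**2+sin(x**2+2*x)".split("+")

# Illustrative pattern (an assumption, not a recommendation):
# split on '+' unless the next parenthesis character ahead is ')'.
pattern = re.compile(r"\+(?![^()]*\))")

flat = pattern.split("x**2+sin(x**2+2*x)")
nested = pattern.split("((a+b)+(c+d))+e")

print(naive)   # ['x**2', 'sin(x**2', '2*x)']
print(flat)    # ['x**2', 'sin(x**2+2*x)']
print(nested)  # ['((a+b)', '(c+d))', 'e']
```

The lookahead only sees the nearest parenthesis, so it cannot tell that 
the middle plus sign in the last example is still inside the outer pair.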

This problem is simple enough that a small state machine that counts 
open and close parentheses does the job. For more complicated state, a 
parsing library such as pyparsing is very helpful.

I encourage you to learn about the re module, though. It is often handy. 
This HOW-TO might help you get started:
http://www.amk.ca/python/howto/regex/

Kent

Some people, when confronted with a problem, think “I know, I’ll use 
regular expressions.” Now they have two problems.
	 --Jamie Zawinski, in comp.emacs.xemacs


def splitter(s):
    ''' Split a string on each plus sign that is not inside parentheses '''

    parenCount = 0  # Counts current nesting
    start = 0       # Start of current run

    for i, c in enumerate(s):
        if c == '+' and parenCount == 0:
            yield s[start:i]
            start = i+1
        elif c == '(':
            parenCount += 1
        elif c == ')':
            parenCount -= 1

    # Yield any leftovers
    if start < len(s):
        yield s[start:]

test = [
    "x**2+sin(x**2+2*x)",
    'abcd',
    '(a+b)+(c+d)',
    '((a+b)+(c+d))+e'
]

for s in test:
    print(s, list(splitter(s)))


prints:
x**2+sin(x**2+2*x) ['x**2', 'sin(x**2+2*x)']
abcd ['abcd']
(a+b)+(c+d) ['(a+b)', '(c+d)']
((a+b)+(c+d))+e ['((a+b)+(c+d))', 'e']



Jacob S. wrote:
> Okay,
> 
>     say I have a string "x**2+sin(x**2+2*x)" and I want to split it at the
> addition sign. However, I don't want to split it at the addition signs
> inside the parentheses. How do I go about doing this?
> 
> It would go something along the lines of
> 
> a = a.split("+") # if and only if + not inside parenthesis
> 
> That should be enough information to help...
> I think the re module might help. Any pointers to a good walkthrough of
> the re module would be helpful, as would any suggestions or a more
> detailed way to do this (maybe a specific place in the re module?).
> 
> Thanks in advance,
> Jacob Schmidt
> 