Regex help...pretty please?

MooMaster ntv1534 at gmail.com
Wed Aug 23 14:47:02 EDT 2006


I'm trying to develop a little script that does some string
manipulation. I have a few hundred strings that currently look like
this:

cond(a,b,c)

and I want them to look like this:

cond(c,a,b)

but it gets a little more complicated because the conds themselves may
have conds within, like the following:

cond(0,cond(c,cond(e,cond(g,h,(a<f)),(a<d)),(a<b)),(a<1))

What I want to do in this case is move the last parameter to the front
and then work backwards all the way out (if you're thinking recursion
too, I'm vindicated) so that it ends up looking like this:

cond((a<1), 0, cond((a<b),c,cond((a<d), e, cond((a<f), g, h))))
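
For reference, the nested rewrite described above can be sketched without any regular expressions by splitting the arguments only at commas that sit outside parentheses. This is a minimal sketch, not a definitive solution: the names split_top_level and rotate_cond are hypothetical (they do not appear in the script below), and it assumes each string, and each argument it recurses into, is exactly one cond(...) call with three arguments. Anything else is returned unchanged.

```python
def split_top_level(body):
    """Split a comma-separated argument string, ignoring commas inside parentheses."""
    args, depth, start = [], 0, 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            # A comma at depth 0 separates two arguments of the outer call.
            args.append(body[start:i].strip())
            start = i + 1
    args.append(body[start:].strip())
    return args


def rotate_cond(expr):
    """Rewrite cond(a,b,c) as cond(c,a,b), recursing into arguments that are conds."""
    expr = expr.strip()
    # Assumption: only a string that is exactly one cond(...) call is rewritten.
    if not (expr.startswith("cond(") and expr.endswith(")")):
        return expr
    args = split_top_level(expr[len("cond("):-1])
    if len(args) != 3:
        return expr  # not the three-argument shape: leave untouched
    a, b, c = [rotate_cond(arg) for arg in args]
    return "cond(%s,%s,%s)" % (c, a, b)


nested = "cond(0,cond(c,cond(e,cond(g,h,(a<f)),(a<d)),(a<b)),(a<1))"
rewritten = rotate_cond(nested)
# rewritten == "cond((a<1),0,cond((a<b),c,cond((a<d),e,cond((a<f),g,h))))"
```

The result matches the target string above apart from the spaces after the commas.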

Furthermore, the conds may be multiplied by an expression, such as the
following:

cond(-1,1,f)*((float(e)*(2**4))+(float(d)*8)+(float(c)*4)+(float(b)*2)+float(a))

Here, all I want to do is switch the parameters of the conds without
touching the expression, like so:

cond(f,-1,1)*((float(e)*(2**4))+(float(d)*8)+(float(c)*4)+(float(b)*2)+float(a))
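
One way to leave the trailing expression alone is to find the parenthesis that closes the cond by counting nesting depth, then rewrite only what lies inside it. The sketch below is illustrative only: matching_paren is a hypothetical helper name, and the plain split(",") is safe here only because the three arguments in this example contain no commas of their own.

```python
def matching_paren(text, open_index):
    """Return the index of the ')' that closes the '(' at open_index."""
    depth = 0
    for i in range(open_index, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced parentheses")


line = "cond(-1,1,f)*((float(e)*(2**4))+(float(d)*8)+(float(c)*4)+(float(b)*2)+float(a))"

open_index = line.index("cond(") + len("cond")  # position of the cond's "("
close_index = matching_paren(line, open_index)  # position of its matching ")"

inner = line[open_index + 1:close_index]  # the cond's arguments only
tail = line[close_index + 1:]             # the multiplied expression, untouched

# Assumption: no nested commas in these arguments, so a plain split is enough.
a, b, c = inner.split(",")
result = "cond(%s,%s,%s)%s" % (c, a, b, tail)
```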

So that's the gist of my problem statement. I immediately thought that
regular expressions would provide an elegant solution. I would go
through the string cond by cond, stripping each cond and its
parentheses off, until I got to the lowest level, then move the
parameters and work backwards. That thought process became this:
-------------------------------------CODE--------------------------------------------------------
import re

def swap(left, middle, right):
    left = left.replace("(", "")
    right = right.replace(")", "")
    temp = left
    left = right
    right = temp
    temp = middle
    middle = right
    right = temp
    whole = 'cond(' + left + ',' + middle + ',' + right + ')'
    return whole

def condReplacer(string):
    #regex = re.compile(r'cond\(.*,.*,.+\)')
    regex = re.compile(r'cond\(.*,.*,.+?\)')
    if not regex.search(string):
        print "whole string is: " + string
        [left, middle, right] = string.split(',')
        right = right.replace('\'', ' ')
        string = swap(left.strip(), middle.strip(), right.strip())
        print "the new string is:" + string
        return string
    else:
        more_conds = regex.search(string)
        temp_string = more_conds.group()
        firstParen = temp_string.find('(')
        temp_string = temp_string[firstParen:]
        print "there are more conditionals!" + temp_string
        condReplacer(temp_string)

def lineReader(file):
    for line in file:
        regex = r'cond\(.*,.*,.+\)?'
        if re.search(regex, line, re.DOTALL):
            condReplacer(line)

if __name__ == "__main__":
    input_file = open("only_conds2.txt", 'r')
    lineReader(input_file)
-------------------------------------CODE--------------------------------------------------------

I think my problem lies in my regular expression. If I use the one
that is commented out, the search is greedy, and in my test case where
a conditional is multiplied by an expression I grab the expression too,
like so:

INPUT:

cond(-1,1,f)*((float(e)*(2**4))+(float(d)*8)+(float(c)*4)+(float(b)*2)+float(a))
OUTPUT:
whole string is: (-1,1,f)*((float(e)*(2**4))+(float(d)*8)+(float(c)*4)+(float(b)*2)+float(a))
the new string is:cond(f*((float(e*(2**4+(float(d*8+(float(c*4+(float(b*2+float(a,-1,1)

when all I really want to do is grab the part associated with the cond.
If I do a non-greedy search instead, I avoid that problem but stop too
early when I have an expression like this:

INPUT:
cond(a,b,(abs(c) >= d))
OUTPUT:
whole string is: (a,b,(abs(c)
the new string is:cond((abs(c,a,b)
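
Both failures can be reproduced in isolation by running the two patterns from the script against the two inputs. The variable names below are made up for illustration. As far as I can tell, neither quantifier can count parentheses: the greedy pattern runs to the last ")" in the string, and the non-greedy one stops at the first ")" after the second comma.

```python
import re

# The two patterns from the script: commented-out (greedy) and active (non-greedy).
greedy = re.compile(r'cond\(.*,.*,.+\)')
lazy = re.compile(r'cond\(.*,.*,.+?\)')

with_expr = "cond(-1,1,f)*((float(e)*(2**4))+(float(d)*8)+(float(c)*4)+(float(b)*2)+float(a))"
nested_parens = "cond(a,b,(abs(c) >= d))"

# Greedy: .+ extends to the last ")" in the string, so the multiplied
# expression is swallowed along with the cond.
greedy_on_expr = greedy.search(with_expr).group()      # the entire input string

# Non-greedy: .+? stops at the first ")" after the second comma, which is
# right for the first input but cuts the second one short.
lazy_on_expr = lazy.search(with_expr).group()          # 'cond(-1,1,f)'
lazy_on_nested = lazy.search(nested_parens).group()    # 'cond(a,b,(abs(c)'
```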

Can anyone help me with the regular expression? Is this even the best
approach to take? Anyone have any thoughts? 

Thanks for your time!



