Breaking up Strings correctly:

Gerard Flanagan grflanagan at yahoo.co.uk
Mon Apr 9 13:29:08 EDT 2007


On Apr 9, 1:19 pm, "Michael Yanowitz" <m.yanow... at kearfott.com> wrote:
> Hello:
>
>    I have been searching for an easy solution, and hopefully one
> has already been written, so I don't want to reinvent the wheel:
>
>    Suppose I have a string of expressions such as:
> "((($IP = "127.1.2.3") AND ($AX < 15)) OR (($IP = "127.1.2.4") AND ($AY != 0)))"
>   I would like to split up into something like:
> [ "OR",
>   "(($IP = "127.1.2.3") AND ($AX < 15))",
>   "(($IP = "127.1.2.4") AND ($AY != 0))" ]
>
>      which I may or may not then decide to split further into:
> [ "OR",
>   ["AND", "($IP = "127.1.2.3")", "($AX < 15)"],
>   ["AND", "($IP = "127.1.2.4")", "($AY != 0)"] ]
>
>   Is there an easy way to do this?

Looking into infix-to-prefix conversion algorithms might help you.
The following seems to work with the example you give, but I haven't
tested it beyond that:


data = '''
((($IP = "127.1.2.3") AND ($AX < 15)) OR
(($IP = "127.1.2.4") AND ($AY != 0)))
'''

import tokenize
from cStringIO import StringIO

opstack = []
valstack = []
s = ''
# tokenize the string
g = tokenize.generate_tokens(StringIO(data).readline)
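for _, tokval, _, _, _  in g:
    if tokval in ['(', ')', 'AND', 'OR']:
        if tokval != ')':
            # '(' and the operators wait on the operator stack
            opstack.append(tokval)
        else:
            # ')' ends the operand being collected, if there is one
            if s:
                valstack.append(s)
                s = ''
            # reduce any pending operators back to the matching '('
            while opstack[-1] != '(':
                op = opstack.pop()
                rhs = valstack.pop()
                lhs = valstack.pop()
                valstack.append([op, lhs, rhs])
            opstack.pop()   # discard the '(' itself
    else:
        # anything else is a piece of an operand such as $AX<15
        s += tokval.strip()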

print valstack

[['OR', ['AND', '$IP="127.1.2.3"', '$AX<15'], ['AND',
'$IP="127.1.2.4"', '$AY!=0']]]
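
For anyone who would rather not lean on the tokenize module (which is
meant for Python source and only passes '$' through as an error token),
here is a rough sketch of the same two-stack idea with a small regular
expression doing the tokenizing. It is written for Python 3. The names
TOKEN_RE and to_prefix are made up for this illustration, and the
pattern assumes operands contain no parentheses and that quoted strings
contain no embedded double quotes. Like the version above, it expects a
fully parenthesised expression and will raise IndexError on unbalanced
input.

```python
import re

# Assumed token shapes: a parenthesis, a double-quoted string, or any
# other run of characters that are not whitespace, parentheses or quotes.
TOKEN_RE = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')


def to_prefix(expr):
    """Turn a fully parenthesised AND/OR expression into nested prefix lists."""
    opstack = []    # pending '(' markers and operators
    valstack = []   # finished operands and sub-expressions
    parts = []      # pieces of the operand currently being read
    for tok in TOKEN_RE.findall(expr):
        if tok in ('(', 'AND', 'OR'):
            opstack.append(tok)
        elif tok == ')':
            # a ')' closes the operand being collected, if any
            if parts:
                valstack.append(''.join(parts))
                parts = []
            # reduce pending operators back to the matching '('
            while opstack[-1] != '(':
                op = opstack.pop()
                rhs = valstack.pop()
                lhs = valstack.pop()
                valstack.append([op, lhs, rhs])
            opstack.pop()   # discard the '(' itself
        else:
            parts.append(tok)
    return valstack


data = '''
((($IP = "127.1.2.3") AND ($AX < 15)) OR
(($IP = "127.1.2.4") AND ($AY != 0)))
'''

print(to_prefix(data))
```

With the example string this should give the same nested list as shown
above. Because a quoted string is kept as one token, a value such as
"AND" inside quotes is not mistaken for an operator.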

Gerard



