Help improve program for parsing simple rules

Fri Apr 17 08:37:27 EDT 2009

On Apr 16, 3:59 pm, Aaron Brady <castiro... at gmail.com> wrote:
> On Apr 16, 10:57 am, prueba... at latinmail.com wrote:
>
> > Another interesting task for those who are looking for an
> > interesting problem:
> > I inherited a rule system that checks the outputs of programmers'
> > programs that are to be ported: given some simple rules and the values, it
> > has to determine whether the program is still working correctly and give
> > the details of what the values are. If you have a better idea of how
> > to do this kind of parsing please chime in. I am using tokenize but
> > that might be more complex than it needs to be. This is what I have
> > come up with so far:
>
> > rules=[
> >          '( A - B ) = 0',
> >          '(A + B + C + D + E + F + G + H + I) = J',
> >          '(A + B + C + D + E + F + G + H) = I',
> >          '(A + B + C + D + E + F) = G',
> >          '(A + B + C + D + E) = (F + G + H + I + J)',
> >          '(A + B + C + D + E) = (F + G + H + I)',
> >          '(A + B + C + D + E) = F',
> >          '(A + B + C + D) = (E + F + G + H)',
> >          '(A + B + C) = (D + E + F)',
> >          '(A + B) = (C + D + E + F)',
> >          '(A + B) = (C + D)',
> >          '(A + B) = (C - D + E - F - G + H + I + J)',
> >          '(A + B) = C',
> >          '(A + B) = 0',
> >          '(A+B+C+D+E) = (F+G+H+I+J)',
> >          '(A+B+C+D) = (E+F+G+H)',
> >          '(A+B+C+D)=(E+F+G+H)',
> >          '(A+B+C)=(D+E+F)',
> >          '(A+B)=(C+D)',
> >          '(A+B)=C',
> >          '(A-B)=C',
> >          '(A/(B+C))',
> >          '(G + H) = I',
> >          '-0.99 LE ((A+B+C)-(D+E+F+G)) LE 0.99',
> >          '-0.99 LE (A-(B+C)) LE 0.99',
> >          '-1000.00 LE A LE 0.00',
> snip
> > def main():
> >     for cur_rule in rules[20:26]:
> >         tokens=get_tokens(cur_rule)
> >         normal=replace_comps(tokens,COMP_REPLACERS)
> >         subst=replace_names(normal,vars_)
> >         groups=split_seccions(subst,COMP_REPLACERS.values())
> >         rep=all_seccions(groups)
> >         rep_out=''.join(x[0]+x[1] for x in rep)
> >         calc=calc_seccions(rep)
> >         calc_out=''.join(x[0]+x[1] for x in calc)
> >         deltas=calc_deltas(calc)
> >         result=eval(calc_out,{},{})
>
> snip
>
> You are using 'eval' which isn't safe if its inputs are dangerous.  If
> you are controlling the inputs, you might be interested in the
> optional arguments to 'eval'.
>
> >>> a= '-1000.00 < A < 0.00'
> >>> eval( a, { 'A': -100 } )
> True
> >>> eval( a, { 'A': -1000 } )
> False
>
> The syntax is slightly different for Python 2.  For the replacement of
> 'LE', I assume you require spaces on both sides.
>
> >>> a= '-1000.00 LE A LE 0.00'
> >>> b= a.replace( ' LE ', ' <= ' )
> >>> b
> '-1000.00 <= A <= 0.00'
> >>> eval( b, { 'A': -1000 } )
> True
> >>> eval( b, { 'A': -1001 } )
> False
>
> If you need something more flexible, the 'ast' module gives you more
> options.  The string has to be valid Python source to start with.
>
> FYI, have you checked order of precedence in your custom rule set to
> match Python's?

I know about the safety implications of eval. The rules are defined by
the programmers, so I don't worry too much about it at this point. They
should know better than to mess up their application server. Unless
there is some easier way to do it, I am willing to take the risk.
Precedence is standard math; we can always add some extra parentheses
to the rules. I don't think the old system would mind.
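
As a rough illustration of the eval route, here is a minimal sketch that
rewrites a rule into a Python expression and evaluates it against a
dictionary of values. The names normalize, check and COMPARISONS are
hypothetical; only LE appears in the rules shown, so the other keywords in
the mapping are assumptions, as is reading a lone '=' as equality. Passing
an empty __builtins__ only narrows what the expression can reach and should
not be mistaken for a sandbox.

```python
import re

# Hypothetical keyword mapping: only LE occurs in the rules shown,
# the remaining entries are assumed for illustration.
COMPARISONS = {'LE': '<=', 'LT': '<', 'GE': '>=', 'GT': '>',
               'EQ': '==', 'NE': '!='}

def normalize(rule):
    """Translate a rule string into a Python expression string."""
    # Replace keyword comparators, matching whole words only.
    pattern = r'\b(' + '|'.join(COMPARISONS) + r')\b'
    expr = re.sub(pattern, lambda m: COMPARISONS[m.group(1)], rule)
    # Assumption: a lone '=' in a rule means equality, so make it '=='.
    return re.sub(r'(?<![<>=!])=(?!=)', '==', expr)

def check(rule, values):
    """Evaluate a rule with the given variable values."""
    # An empty __builtins__ limits name lookup; it is not a security boundary.
    return eval(normalize(rule), {'__builtins__': {}}, dict(values))

if __name__ == '__main__':
    print(normalize('-1000.00 LE A LE 0.00'))
    print(check('(A+B)=C', {'A': 1, 'B': 2, 'C': 3}))
```

Note that rules such as '(A+B)=C' compare with exact equality, which may
be too strict if the values are floats.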

I thought about using eval with a locals dictionary, but they want
output of the intermediate values. I want to avoid the situation where
the intermediate output does not match what eval is doing.
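
One way to keep the reported intermediate values and the final result from
drifting apart might be to produce both in a single pass over the parsed
expression, using the ast module mentioned above. The following is a
minimal sketch, not a drop-in replacement for the existing code: it assumes
Python 3.8 or later (for ast.Constant and ast.get_source_segment), assumes
the rule has already been rewritten with Python operators (<= instead of
LE, == instead of =), and the names evaluate, BIN_OPS and CMP_OPS are
hypothetical.

```python
import ast
import operator

# Only the operators that appear in the rules are supported.
BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
CMP_OPS = {ast.Eq: operator.eq, ast.NotEq: operator.ne,
           ast.Lt: operator.lt, ast.LtE: operator.le,
           ast.Gt: operator.gt, ast.GtE: operator.ge}

def evaluate(expr, values):
    """Return (result, steps); steps holds (source text, value) pairs
    for every arithmetic and comparison node, innermost first."""
    steps = []

    def visit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return values[node.id]
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -visit(node.operand)
        if isinstance(node, ast.BinOp) and type(node.op) in BIN_OPS:
            result = BIN_OPS[type(node.op)](visit(node.left), visit(node.right))
        elif (isinstance(node, ast.Compare)
              and all(type(op) in CMP_OPS for op in node.ops)):
            # Unlike Python itself, every operand is evaluated (no short-circuit).
            operands = [visit(n) for n in [node.left] + node.comparators]
            result = all(CMP_OPS[type(op)](a, b)
                         for op, a, b in zip(node.ops, operands, operands[1:]))
        else:
            # Calls, attribute access and anything else are rejected.
            raise ValueError('unsupported syntax: ' + type(node).__name__)
        steps.append((ast.get_source_segment(expr, node), result))
        return result

    return visit(ast.parse(expr, mode='eval').body), steps

if __name__ == '__main__':
    result, steps = evaluate('-0.99 <= (A-(B+C)) <= 0.99',
                             {'A': 10, 'B': 4, 'C': 5.5})
    for text, value in steps:
        print(text, '->', value)
```

Because the same walk computes the result and records each step, there is
no separate eval call whose behaviour could differ from the printed
intermediates. Rejecting every node type outside the small whitelist also
sidesteps most of the eval safety question.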