quiz about symbolic manipulation

Bengt Richter bokr at oz.net
Fri Dec 13 19:05:19 EST 2002


On 13 Dec 2002 05:48:22 -0800, mis6 at pitt.edu (Michele Simionato) wrote:

>bokr at oz.net (Bengt Richter) wrote in message news:<atasp6$loc$0 at 216.39.172.122>...
>> >
>> >In[1]:= square[square[x+y]+z]+square[x+w]/.square[x_] -> x^2
>> >
>> >               2                      2
>> >Out[1]= (w + x)  + (z + square[x + y])
>>          ^^^^^^^^   ^^^^^^^^^^^^^^^^^^^^
>>             |                   |
>>             +-----------+       |
>>                         |       |
>>                      vvvvvvvv   |
>> [2] "((x+y)**2+z)**2+(x+w)**2"  |
>>      ^^^^^^^^^^^^^^^            |
>>             |                   |
>>             +-------------------+
>> 
>> What am I missing?
> 
>The "/." substitution operator of Mathematica does not work
>recursively:
>you see that the result still contain the expression square[x+y] and
D'oh ;-/

>that
>you should apply "/." again to get rid of this second square.
>Of course, as Carl pointed out, you could use the "//." operator, but
>this
>is not the point: I would like to do this in Python. My example
>(perhaps
>not too happily chosen) simply wanted to point out the need for a
>recursive
>algorithm. Still I think the solution should be not too difficult
>using the
>compiler module, but I have problems in understanding this module
>since the documentation is still rather poor and not so clear (to me
>at least).

Well, for a toy problem, parsing according to python grammar is probably
overkill. I.e.,

'square(square(x+y)+z)+square(x+w)'

becomes

['eval_input',
 ['testlist',
  ['test',
   ['and_test',
    ['not_test',
     ['comparison',
      ['expr',
       ['xor_expr',
        ['and_expr',
         ['shift_expr',
          ['arith_expr',
           ['term',
            ['factor',
             ['power',
              ['atom', ['NAME', 'square']],
              ['trailer',
               ['LPAR', '('],
               ['arglist',
                ['argument',
                 ['test',
                  ['and_test',
                   ['not_test',
                    ['comparison',
                     ['expr',
                      ['xor_expr',
                       ['and_expr',
                        ['shift_expr',
                         ['arith_expr',
                          ['term',
                           ['factor',
                            ['power',
                             ['atom',
                              ['NAME', 'square']],
                             ['trailer',
                              ['LPAR', '('],
                              ['arglist',
                               ['argument',
                                ['test',
                                 ['and_test',
                                  ['not_test',
                                   ['comparison',
                                    ['expr',
                                     ['xor_expr',
                                      ['and_expr',
                                       ['shift_expr',
                                        ['arith_expr',
                                         ['term',
                                          ['factor',
                                           ['power',
                                            ['atom',
                                             ['NAME',
                                              'x']]]]],
                                         ['PLUS',
                                          '+'],
                                         ['term',
                                          ['factor',
                                           ['power',
                                            ['atom',
                                             ['NAME',
                                              'y']]]]]]]]]]]]]]]],
                              ['RPAR', ')']]]]],
                          ['PLUS', '+'],
                          ['term',
                           ['factor',
                            ['power',
                             ['atom',
                              ['NAME', 'z']]]]]]]]]]]]]]]],
               ['RPAR', ')']]]]],
           ['PLUS', '+'],
           ['term',
            ['factor',
             ['power',
              ['atom', ['NAME', 'square']],
              ['trailer',
               ['LPAR', '('],
               ['arglist',
                ['argument',
                 ['test',
                  ['and_test',
                   ['not_test',
                    ['comparison',
                     ['expr',
                      ['xor_expr',
                       ['and_expr',
                        ['shift_expr',
                         ['arith_expr',
                          ['term',
                           ['factor',
                            ['power',
                             ['atom',
                              ['NAME', 'x']]]]],
                          ['PLUS', '+'],
                          ['term',
                           ['factor',
                            ['power',
                             ['atom',
                              ['NAME', 'w']]]]]]]]]]]]]]]],
               ['RPAR', ')']]]]]]]]]]]]]]],
 ['NEWLINE', ''],
 ['ENDMARKER', '']]

which compiles to

          0 SET_LINENO               0
          3 LOAD_NAME                0 (square)
          6 LOAD_NAME                0 (square)
          9 LOAD_NAME                1 (x)
         12 LOAD_NAME                2 (y)
         15 BINARY_ADD
         16 CALL_FUNCTION            1
         19 LOAD_NAME                3 (z)
         22 BINARY_ADD
         23 CALL_FUNCTION            1
         26 LOAD_NAME                0 (square)
         29 LOAD_NAME                1 (x)
         32 LOAD_NAME                4 (w)
         35 BINARY_ADD
         36 CALL_FUNCTION            1
         39 BINARY_ADD
         40 RETURN_VALUE

Probably rolling your own parser and using the tokenizer would make sense. E.g., the tokenizer
will get you

 >>> import tokenize, StringIO
 >>> def prtoks(expr):
 ...     tokens = tokenize.generate_tokens(StringIO.StringIO(expr).readline)
 ...     for t in tokens: print '%10s: %s' %(token.tok_name[t[0]], t[1])
 ...
 >>> e = 'square(square(x+y)+z)+square(x+w)'
 >>> prtoks(e)
       NAME: square
         OP: (
       NAME: square
         OP: (
       NAME: x
         OP: +
       NAME: y
         OP: )
         OP: +
       NAME: z
         OP: )
         OP: +
       NAME: square
         OP: (
       NAME: x
         OP: +
       NAME: w
         OP: )
  ENDMARKER:

Or maybe just recognize the ops yourself in the tokens:

 >>> def tokens(expr):
 ...     tokens = tokenize.generate_tokens(StringIO.StringIO(expr).readline)
 ...     return [t[1] for t in tokens]
 ...
 >>> tokens(e)
 ['square', '(', 'square', '(', 'x', '+', 'y', ')', '+', 'z', ')', '+', 'square', '(', 'x', '+',
 'w', ')', '']


You said in your other post:
______________________________________________

e="square(square(x+y)+z)+square(x+w)"

I would like to define a function

def substitute(math_expr,**subs):
    #.... something here
    result result

such that when I call

print substitute(e, square="x -> x**2")

I obtain

"((x+y)**2+z)**2+(x+w)**2"
______________________________________________

what does
    square="x -> x**2"
mean in general? I gather here it means that
    square(something)
should be rewritten as
    (something)**2
perhaps without parentheses if something is a single term,
but what other kinds of **subs do you anticipate?

How about this (using tokens() defined above), specifying the
function parameter rewrite via a format string:

 >>> from __future__ import generators
 >>> def funrew(expr, funname, rewfmt):
 ...     toks = tokens(expr)
 ...     while toks:
 ...         if toks[0]!=funname:
 ...             yield toks.pop(0)
 ...             continue
 ...         toks.pop(0) # dump funname
 ...         # find and process parenthesized call parameter expression
 ...         i = ump = 0
 ...         for t in toks:
 ...             i += 1
 ...             if t =='(': ump += 1
 ...             if t == ')':
 ...                 ump -= 1
 ...                 if not ump: break
 ...         rest = toks[i:]
 ...         yield (rewfmt % ''.join(funrew(''.join(toks[1:i-1]), funname, rewfmt)))
 ...         toks = rest
 ...
 >>> def substitute(expr, funname, rewfmt):
 ...     return ''.join(funrew(e, funname,rewfmt))
 ...
 >>> e
 'square(square(x+y)+z)+square(x+w)'
 >>> substitute(e, 'square','(%s)**2')
 '((x+y)**2+z)**2+(x+w)**2'

Well, this is not so exciting for functions with multiple parameters. How
might you want to rewrite those? Probably isolate terms and rewrite each term
and then feed that as a tuple to a rewfmt? If you can assume no terms containing
nested commas (probably reasonable for math expressions) then it shouldn't be
too hard. But much beyond that and you may want to use the Python parser after all ;-)

Anyway, I'll leave the rest as an exercise ;-)

Regards,
Bengt Richter



More information about the Python-list mailing list