RegExp question

Tue Apr 11 15:20:46 EDT 2006

"Michael McGarry" <michael.mcgarry at gmail.com> wrote in message
news:1144781090.622493.252460 at t31g2000cwb.googlegroups.com...
> Hi,
>
> I would like to form a regular expression to find a few different
> tokens (and, or, xor) followed by some variable number of whitespace
> (i.e., tabs and spaces) followed by a hash mark (i.e., #). What would
> be the regular expression for this?
>
> Thanks for any help,
>
> Michael
>
pyparsing is not a regular-expression library, but it fits this case well
because it skips whitespace between tokens by default.  Your expression would
look like:

oneOf("and or xor") + Literal("#")
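
If you would rather stay with the re module, as the question asked, a rough
equivalent might look like the sketch below.  It assumes that "variable number
of whitespace" means zero or more spaces or tabs (change * to + to require at
least one) and that the keywords should match only as whole words.  The names
TOKEN_HASH and find_tokens are made up for this illustration.

```python
import re

# Assumption: zero or more spaces/tabs may separate the keyword from '#'.
# \b keeps "and" from matching inside a longer word such as "band".
TOKEN_HASH = re.compile(r"\b(and|or|xor)[ \t]*#")

def find_tokens(text):
    """Return (keyword, start offset) for each keyword followed by '#'."""
    return [(m.group(1), m.start()) for m in TOKEN_HASH.finditer(text)]

sample = "z = (a and b) and #XVAL;\nq = z xor\t#YVAL;\nband #X"
print(find_tokens(sample))
```

Only the "and" and "xor" that sit directly before a hash mark are reported;
the "and" inside the parentheses and the one inside "band" are skipped.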

Here's a complete example:

from pyparsing import oneOf, Literal, Word, alphanums, line, col

pattern = oneOf("and or xor") + Literal("#")

testString = """
  z = (a and b) and #XVAL;
  q = z xor #YVAL;
"""

# use scanString to locate matches
for tokens, start, end in pattern.scanString(testString):
    print(tokens[0], tokens.asList())
    print(line(start, testString))
    # put a caret under the column where the match begins
    print(" " * (col(start, testString) - 1) + "^")
    print()
print()

# use transformString to locate matches and substitute values
subs = {
    'XVAL': 0,
    'YVAL': True,
    }

def replaceSubs(st, loc, toks):
    # toks is [keyword, '#', name]; look the name up in subs
    try:
        return toks[0] + " " + str(subs[toks[2]])
    except KeyError:
        # unknown name: returning None keeps the parsed tokens as they are
        return None

pattern2 = (pattern + Word(alphanums)).setParseAction(replaceSubs)
print(pattern2.transformString(testString))

-----------------
Prints:
and ['and', '#']
  z = (a and b) and #XVAL;
                ^

xor ['xor', '#']
  q = z xor #YVAL;
        ^

  z = (a and b) and 0;
  q = z xor True;
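
For comparison, the substitution step could also be sketched with re.sub.
This is only an approximation of the pyparsing version: it assumes the name
after the hash mark is a run of letters and digits (mirroring Word(alphanums))
and that zero or more spaces or tabs may precede the hash mark.  SUB_PATTERN,
replace_sub and substitute are names invented for this example.

```python
import re

# Same lookup table as in the pyparsing example.
subs = {"XVAL": 0, "YVAL": True}

# Assumption: the name after '#' consists of letters and digits only.
SUB_PATTERN = re.compile(r"\b(and|or|xor)[ \t]*#([A-Za-z0-9]+)")

def replace_sub(match):
    keyword, name = match.groups()
    if name in subs:
        return keyword + " " + str(subs[name])
    return match.group(0)  # unknown name: leave the text untouched

def substitute(text):
    return SUB_PATTERN.sub(replace_sub, text)

print(substitute("z = (a and b) and #XVAL;\nq = z xor #YVAL;\np = q or #ZVAL;"))
```

Names missing from subs, such as ZVAL here, pass through unchanged.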

Download pyparsing at http://pyparsing.sourceforge.net.

-- Paul