embedding executable code in a regular expression in Python

Paul McGuire ptmcg at austin.rr.com
Mon Jul 17 04:09:10 EDT 2006


Avi Kak wrote:
> Folks,
>
> Does regular expression processing in Python allow for executable
> code to be embedded inside a regular expression?
>
> For example, in Perl the following two statements
>
> $regex = qr/hello(?{print "saw hello\n"})mello(?{print "saw mello\n"})/;
> "jellohellomello"  =~  /$regex/;
>
> will produce the output
>
>   saw hello
>   saw mello
>

Not nearly so terse, but perhaps easier to follow, here is a pyparsing
version.  Pyparsing parse actions are intended to do just what you ask.
Parse actions may be defined to take no arguments, one argument (which
will be passed the list of matching token strings), two arguments (the
match location and the matching tokens), or three arguments (the
original source string, the match location, and the tokens).  Parse
actions are very good for transforming input text into modified output
form, such as a "background-color" to "backgroundColor" transform.  The
BoaConstructor team used pyparsing to implement a version upgrade that
transformed user source to a new version of wx (involving a variety of
such changes).
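
As a rough illustration of that kind of transform, here is a minimal
sketch using a one-argument parse action with transformString.  The
names hyphenated and to_camel_case, and the sample input, are made up
for this example, and the rule is deliberately simplified (it handles
only runs of letters joined by hyphens).

```python
from pyparsing import Combine, OneOrMore, Word, alphas

# hypothetical expression: letters-only words joined by hyphens, no spaces
hyphenated = Combine(Word(alphas) + OneOrMore("-" + Word(alphas)))

# hypothetical one-argument parse action: receives the matched tokens
def to_camel_case(tokens):
    parts = tokens[0].split("-")
    # keep the first part as-is, capitalize each following part
    return parts[0] + "".join(p.capitalize() for p in parts[1:])

hyphenated.setParseAction(to_camel_case)

# transformString replaces each match with the parse action's return value
result = hyphenated.transformString("background-color: red; font-size: 12px")
print(result)
```

Text that does not match the expression should pass through unchanged,
so only the hyphenated names are rewritten.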

Here is your jello/mello program, with two variations of parse actions.

-- Paul

from pyparsing import oneOf

instr = "jellorelohellomellofellowbellowmello"
searchTerm = oneOf( ["jello","mello"] )

# simple parse action, just echoes matched text
def echoMatchedText(tokens):
    print("saw %s" % tokens[0])

searchTerm.setParseAction( echoMatchedText )
searchTerm.searchString(instr)

# modified parse action, prints location too
# (deliberately redefines echoMatchedText with a 2-argument signature)
def echoMatchedText(loc,tokens):
    print("saw %s at locn %d" % (tokens[0], loc))

searchTerm.setParseAction( echoMatchedText )
searchTerm.searchString(instr)

Prints out:
saw jello
saw mello
saw mello
saw jello at locn 0
saw mello at locn 14
saw mello at locn 31
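
For comparison, here is a rough sketch of the same search using only
the standard re module.  As far as I know, re has no counterpart to
Perl's (?{...}) construct, but looping over re.finditer and calling a
function for each match gives much the same effect.  The names on_match
and seen are invented for this illustration.

```python
import re

instr = "jellorelohellomellofellowbellowmello"
pattern = re.compile(r"jello|mello")

seen = []  # hypothetical list collecting (location, text) pairs

# hypothetical callback, playing the role of a parse action
def on_match(match):
    seen.append((match.start(), match.group()))
    print("saw %s at locn %d" % (match.group(), match.start()))

# call the callback once per non-overlapping match, left to right
for m in pattern.finditer(instr):
    on_match(m)
```

Unlike the Perl version, the callback runs only after a complete match
has been found, not partway through the pattern.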



