how to handle repetitive regexp match checks

Jeff Shannon jeffshannon at gmail.com
Fri Mar 18 13:34:30 EST 2005


Matt Wette wrote:

> 
> Over the last few years I have converted from Perl and Scheme to
> Python.  There is one task that I do often that is really slick in Perl
> but escapes me in Python.  I read in a text line from a file and check
> it against several regular expressions and do something once I find a 
> match.
> For example, in Perl ...
> 
>     if ($line =~ /struct {/) {
>       do something
>     } elsif ($line =~ /typedef struct {/) {
>       do something else
>     } elsif ($line =~ /something else/) {
>     } ...
> 
> I am having difficulty doing this cleanly in Python.  Can anyone help?
> 
>     rx1 = re.compile(r'struct {')
>     rx2 = re.compile(r'typedef struct {')
>     rx3 = re.compile(r'something else')
> 
>     m = rx1.match(line)
>     if m:
>       do something
>     else:
>       m = rx2.match(line)
>       if m:
>         do something
>       else:
>         m = rx3.match(line)
>         if m:
>           do something
>         else:
>           error

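If you don't need the match object as part of "do something", you
could do a fairly literal translation of the Perl.  Note that match()
only matches at the start of the line, while Perl's =~ matches
anywhere in it; use search() if you want the Perl behavior: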

if rx1.match(line):
    ...  # do something
elif rx2.match(line):
    ...  # do something else
elif rx3.match(line):
    ...  # do other thing
else:
    raise ValueError("...")
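
If you do need the match object inside the branches, one option on
Python 3.8 or later is an assignment expression, which binds and tests
the match in a single step and so keeps the flat if/elif shape.  The
following is only a sketch under that version assumption; the function
name classify, the pattern names, and the returned tuples are made up
for illustration.

```python
import re

# The brace is escaped here only to make the literal intent explicit.
rx_typedef = re.compile(r'typedef struct \{')
rx_struct = re.compile(r'struct \{')

def classify(line):
    """Return a (kind, end_offset) tuple for the first pattern that matches."""
    # Each branch assigns the match object to m and tests it at once.
    if m := rx_typedef.match(line):
        return ("typedef", m.end())
    elif m := rx_struct.match(line):
        return ("struct", m.end())
    else:
        raise ValueError("no pattern matched: %r" % line)
```

Because m is bound in the condition itself, each branch can use
m.group(), m.end() and so on without any extra nesting.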

Alternatively, if each of the "do something" phrases can be easily 
reduced to a function call, then you could do something like:

import re

def do_something(line, match): ...
def do_something_else(line, match): ...
def do_other_thing(line, match): ...

# Each entry pairs a compiled pattern with the function that handles it.
table = [
    (re.compile(r'struct {'), do_something),
    (re.compile(r'typedef struct {'), do_something_else),
    (re.compile(r'something else'), do_other_thing),
]

for pattern, func in table:
    m = pattern.match(line)
    if m:
        func(line, m)
        break
else:
    raise ValueError("...")

The for/else pattern may look a bit odd, but the key feature here is 
that the else clause only runs if the for loop terminates normally -- 
if you break out of the loop, the else does *not* run.
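
To see the for/else behavior in isolation, here is a small runnable
sketch of the table-driven approach wrapped in a function.  The handler
names, their return strings, and the dispatch function itself are
hypothetical stand-ins for the real "do something" code.

```python
import re

def handle_struct(line, match):
    return "struct at %d" % match.start()

def handle_typedef(line, match):
    return "typedef at %d" % match.start()

# Pattern/handler pairs, tried in order.
TABLE = [
    (re.compile(r'struct \{'), handle_struct),
    (re.compile(r'typedef struct \{'), handle_typedef),
]

def dispatch(line):
    for pattern, func in TABLE:
        m = pattern.match(line)
        if m:
            result = func(line, m)
            break  # skips the else clause below
    else:
        # Reached only when the loop ran through every entry without a break.
        raise ValueError("no pattern matched: %r" % line)
    return result
```

Calling dispatch("struct {") returns the struct handler's result, while
a line that matches nothing in the table falls through to the else
clause and raises ValueError.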

Jeff Shannon



