Best way to extract from regex in if statement

Nick Craig-Wood nick at craig-wood.com
Thu Apr 16 03:14:10 EDT 2009


Paul McGuire <ptmcg at austin.rr.com> wrote:
>  On Apr 3, 9:26 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
> > bwgoudey <bwgou... at gmail.com> writes:
> > > elif re.match("^DATASET:\s*(.+) ", line):
> > >         m=re.match("^DATASET:\s*(.+) ", line)
> > >         print m.group(1)
> >
> > Sometimes I like to make a special class that saves the result:
> >
> >   class Reg(object):   # illustrative code, not tested
> >      def match(self, pattern, line):
> >         self.result = re.match(pattern, line)
> >         return self.result
> >
>  I took this a little further, *and* lightly tested it too.
> 
>  Since this idiom makes repeated references to the input line, I added
>  that to the constructor of the matching class.
> 
>  By using __call__, I made the created object callable, taking the
>  regular expression as its lone argument and returning a boolean
>  indicating match success or failure.  The result of the re.match call
>  is saved in self.matchresult.
> 
>  By using __getattr__, the created object proxies for the results of
>  the re.match call.
> 
>  I think the resulting code looks pretty close to the original C or
>  Perl idiom of cascading "elif (c=re_expr_match("..."))" blocks.
> 
>  (I thought about caching previously seen REs, or adding support for
>  compiled REs instead of just strings - after all, this idiom usually
>  occurs in a loop while iterating over some large body of text.  It turns
>  out that the re module already caches previously compiled REs, so I
>  left my own caching out in favor of what the std lib already does.)
> 
> 
>  import re
> 
>  class REmatcher(object):
>      def __init__(self,sourceline):
>          self.line = sourceline
>      def __call__(self, regexp):
>          self.matchresult = re.match(regexp, self.line)
>          self.success = self.matchresult is not None
>          return self.success
>      def __getattr__(self, attr):
>          return getattr(self.matchresult, attr)
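
A rough sketch of how the REmatcher class quoted above might be used in a
cascade of tests. The class body is copied from the quoted post; the
classify() helper, the SIZE pattern, and the sample lines are made up for
illustration only.

```python
import re

class REmatcher(object):
    # Copied from the quoted post: the line is bound once, and each call
    # tries one pattern against it.
    def __init__(self, sourceline):
        self.line = sourceline
    def __call__(self, regexp):
        self.matchresult = re.match(regexp, self.line)
        self.success = self.matchresult is not None
        return self.success
    def __getattr__(self, attr):
        # Proxy unknown attributes (group, groups, ...) to the match object.
        return getattr(self.matchresult, attr)

def classify(line):
    # Hypothetical helper: returns a (kind, value) tuple for a line.
    m = REmatcher(line)
    if m(r"DATASET:\s*(\S+)"):
        return ("dataset", m.group(1))
    elif m(r"SIZE:\s*(\d+)"):
        return ("size", int(m.group(1)))
    return ("unknown", None)

print(classify("DATASET: abc 123"))
print(classify("SIZE: 42"))
```

One thing to watch with this design: the object has to be called at least
once before any proxied attribute is touched, since __getattr__ looks up
self.matchresult, which does not exist until the first call.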

That is quite similar to the one I use...

"""
Matcher class encapsulating a call to re.search for ease of use in conditionals.
"""

import re

class Matcher(object):
    """
    Matcher class

    m = Matcher()

    if m.search(r'add (\d+) (\d+)', line):
        do_add(m[0], m[1])
    elif m.search(r'mult (\d+) (\d+)', line):
        do_mult(m[0], m[1])
    elif m.search(r'help (\w+)', line):
        show_help(m[0])

    """
    def search(self, r, s):
        """
        Do a regular expression search and return the match object
        (None if there was no match).
        """
        self.value = re.search(r, s)
        return self.value
    def __getitem__(self, n):
        """
        Return the n'th matched () item.

        Note that n counts from 0, so the first matched item is matcher[0]
        """
        return self.value.group(n+1)
    def groups(self):
        """
        Return all the matched () items.
        """
        return self.value.groups()
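
A self-contained sketch of the docstring example in action. The Matcher
class is repeated without docstrings so the block runs on its own; run()
is a hypothetical dispatcher standing in for do_add, do_mult and show_help,
which are not defined in this post.

```python
import re

class Matcher(object):
    def search(self, r, s):
        # Remember the match object so later m[n] lookups can use it.
        self.value = re.search(r, s)
        return self.value
    def __getitem__(self, n):
        # m[0] is the first () group, i.e. group(1) of the match object.
        return self.value.group(n + 1)
    def groups(self):
        return self.value.groups()

def run(line):
    # Hypothetical dispatcher, for illustration only.
    m = Matcher()
    if m.search(r'add (\d+) (\d+)', line):
        return int(m[0]) + int(m[1])
    elif m.search(r'mult (\d+) (\d+)', line):
        return int(m[0]) * int(m[1])
    elif m.search(r'help (\w+)', line):
        return "help for " + m[0]
    return None

print(run("add 2 3"))
print(run("mult 4 5"))
print(run("help add"))
```

Groups come back as strings, hence the int() conversions before doing
arithmetic on them.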

-- 
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick


