checking a string against multiple patterns

Duncan Booth duncan.booth at invalid.invalid
Tue Dec 18 12:39:11 EST 2007


tomasz <tmkmarc at googlemail.com> wrote:

> Is there an alternative to it? Am I missing something? Python doesn't
> have special variables $1, $2 (right?) so you must assign the result
> of a match to a variable, to be able to access the groups.

Look for repetition in your code and remove it; that will almost always
remove the nesting as well. Alternatively, combine your regular expressions
into one large expression and branch on which groups actually matched.
Using named groups means the rest of your code doesn't break just because
you had to change one part of the regex.
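
A small sketch of the named-group point. The two patterns and the sample string are made up for illustration and are not from the original question. Adding a group at the front of a regex shifts every group number after it, while the names stay put:

```python
import re

# Hypothetical patterns, for illustration only.
# Original version: the value is group 2.
old = re.compile(r'(?P<key>\w+)=(?P<value>\w+)')
# Later an optional prefix group is added in front, so the value is now group 3.
new = re.compile(r'(?P<prefix>--)?(?P<key>\w+)=(?P<value>\w+)')

s = 'colour=red'
m_old = old.match(s)
m_new = new.match(s)

# Access by number silently changes meaning after the edit...
by_number = (m_old.group(2), m_new.group(2))              # ('red', 'colour')
# ...while access by name gives the same answer for both patterns.
by_name = (m_old.group('value'), m_new.group('value'))    # ('red', 'red')
```

Code that reads group 2 has to be updated when the regex changes; code that reads the group called value does not.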

e.g. This would handle your example, but it is just one way to do it:

import re
from string import Template

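def sub(patterns, s):
    """Return the replacement for the first pattern that matches s."""
    for pat, repl in patterns:
        # re.match only matches at the start of the string.
        m = re.match(pat, s)
        if m:
            # Named groups fill in the $name placeholders of the template.
            return Template(repl).substitute(m.groupdict())
    # No pattern matched: return the input unchanged.
    return s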

PATTERNS = [
    (r'(?P<start>.*?)(?P<b>b+)', 'start=$start, b=$b'),
    (r'(?P<a>a+)(?P<tail>.*)$', 'Got a: $a, tail=$tail'),
    (r'(?P<c>c+)', 'starts with c: $c'),
]

>>> sub(PATTERNS, 'abc')
'start=a, b=b'
>>> sub(PATTERNS, 'is a something')
'is a something'
>>> sub(PATTERNS, 'a something')
'Got a: a, tail= something'
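
The other suggestion, one large expression with a branch on which groups matched, might look roughly like this. It is only a sketch: the names COMBINED and describe are invented here, and it relies on Python trying the alternatives left to right, so the order matches the list above.

```python
import re

# The same three patterns joined into a single alternation.
# Each alternative keeps its own named groups.
COMBINED = re.compile(
    r'(?P<start>.*?)(?P<b>b+)'
    r'|(?P<a>a+)(?P<tail>.*)$'
    r'|(?P<c>c+)'
)

def describe(s):
    """Match s once, then branch on which named group took part."""
    m = COMBINED.match(s)
    if m is None:
        # Nothing matched: return the input unchanged.
        return s
    # A group that did not take part in the match is None.
    if m.group('b') is not None:
        return 'start=%s, b=%s' % (m.group('start'), m.group('b'))
    if m.group('a') is not None:
        return 'Got a: %s, tail=%s' % (m.group('a'), m.group('tail'))
    return 'starts with c: %s' % m.group('c')
```

This gives the same results as sub(PATTERNS, ...) for the three strings above, with a single match call and no nested ifs.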



