regex question: backreferences in brackets

Alex Martelli aleax at aleax.it
Fri Dec 28 11:22:25 EST 2001


"Jeremy Jones" <cypher_dpg at yahoo.com> wrote in message
news:mailman.1009552952.16557.python-list at python.org...
    ...
"""
import re

match_string = 'XXX|1|22|333|4444:'
test_compile = re.compile(r'XXX(.)[^|]{1}\1[^|]{2}\1[^|]{3}\1[^|]{4}(.)')
mymatch = test_compile.match(match_string)
if mymatch:
    print("Found a match")
    print(mymatch.group(0))
else:
    print("No match found")
"""

The strings that I am trying to match consist of 3 specific
characters followed by some delimiter, followed by N characters
other than the delimiter, followed by the delimiter, etc.  The above code
snippet works and matches the string perfectly.  I guess my question is
this:  how can I match any character except the delimiter without
knowing the delimiter beforehand?  I am
thinking of using the backreference to the delimiter (i.e.


Simplest might be something like:

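import re

def clevermatch(match_string):
    # the delimiter is whatever character follows the 3-character prefix
    try: delim = re.escape(match_string[3])
    except IndexError: return None
    # substitute the escaped delimiter for the placeholder '|' in the pattern
    myre = r'XXX(.)[^|]{1}\1[^|]{2}\1[^|]{3}\1[^|]{4}(.)'.replace('|', delim)
    return re.match(myre, match_string)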

i.e., tweak the re pattern appropriately before using it as a re.
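
To see the idea on a few delimiters, here is a minimal sketch. The names
TEMPLATE, LOOKAHEAD, match_by_substitution and match_by_lookahead are made up
for illustration and are not from the original post. The second pattern is an
alternative not proposed above: since \1 is not treated as a backreference
inside [...], a negative lookahead on \1 can stand in for "any character
except the delimiter".

```python
import re

# Pattern from the post; '|' is a placeholder for the delimiter in each class.
TEMPLATE = r'XXX(.)[^|]{1}\1[^|]{2}\1[^|]{3}\1[^|]{4}(.)'

# Assumed alternative: (?!\1). consumes one character only if it is not the
# captured delimiter. Note '.' skips newlines, unlike a negated class.
LOOKAHEAD = (r'XXX(.)(?:(?!\1).){1}\1(?:(?!\1).){2}\1'
             r'(?:(?!\1).){3}\1(?:(?!\1).){4}(.)')

def match_by_substitution(s):
    """Fill the escaped delimiter (character at index 3) into TEMPLATE."""
    if len(s) < 4:
        return None
    return re.match(TEMPLATE.replace('|', re.escape(s[3])), s)

def match_by_lookahead(s):
    """Match with one fixed pattern; the delimiter is captured as group 1."""
    return re.match(LOOKAHEAD, s)

if __name__ == '__main__':
    for sample in ('XXX|1|22|333|4444:',
                   'XXX,1,22,333,4444:',
                   'XXX^1^22^333^4444:',
                   'XXX|1|2|2|333:'):      # wrong field widths, should fail
        a = match_by_substitution(sample)
        b = match_by_lookahead(sample)
        print(sample, bool(a), bool(b))
```

The substitution version builds a fresh pattern per input, while the lookahead
version can be compiled once and reused; which is preferable depends on how
many distinct delimiters are expected.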


Alex





