regex question: backreferences in brackets
Alex Martelli
aleax at aleax.it
Fri Dec 28 11:22:25 EST 2001
"Jeremy Jones" <cypher_dpg at yahoo.com> wrote in message
news:mailman.1009552952.16557.python-list at python.org...
...
"""
match_string = 'XXX|1|22|333|4444:'
test_compile = re.compile(r'XXX(.)[^|]{1}\1[^|]{2}\1[^|]{3}\1[^|]{4}(.)')
mymatch = test_compile.match(match_string)
if mymatch:
    print "Found a match"
    print mymatch.group(0)
else:
    print "No match found"
"""
The strings that I am trying to match consist of 3 specific
characters followed by some delimiter, followed by N characters
other than the delimiter, followed by the delimiter, etc. The above code
snippet works and matches the string perfectly. I guess my question is
this: how can I match any character except the delimiter without
knowing the delimiter beforehand (which I won't)? I am
thinking of using the backreference to the delimiter (i.e.
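Simplest might be something like:
import re
def clevermatch(match_string):
    try:
        # Swap the literal '|' in the pattern for the actual (escaped) delimiter.
        myre = r'XXX(.)[^|]{1}\1[^|]{2}\1[^|]{3}\1[^|]{4}(.)'.replace(
            '|', re.escape(match_string[3]))
    except IndexError:
        return None
    else:
        return re.match(myre, match_string)
i.e., tweak the re pattern text (plugging in the actual delimiter, taken
from index 3 of the string) before using it as a re.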
Alex
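For comparison, here is a sketch of a different technique from the one in the reply above: a negative lookahead on the backreference. Inside [...] Python's re treats \1 as a character escape rather than a group reference, so [^\1] cannot mean "anything but the delimiter"; (?:(?!\1).) can stand in for it. The names NOT_DELIM, PATTERN and match_fields are made up for illustration, and the code assumes Python 3.

```python
import re

# Hypothetical building block: "one character that does not start a copy
# of group 1". It plays the role of [^<delimiter>] in the original pattern.
# Note: '.' does not match a newline unless re.DOTALL is passed.
NOT_DELIM = r'(?:(?!\1).)'

# Same shape as the pattern in the question: 'XXX', a captured delimiter,
# then fields of 1, 2, 3 and 4 non-delimiter characters separated by that
# delimiter, then one final captured character.
PATTERN = re.compile(
    r'XXX(.)'
    + NOT_DELIM + r'{1}\1'
    + NOT_DELIM + r'{2}\1'
    + NOT_DELIM + r'{3}\1'
    + NOT_DELIM + r'{4}(.)'
)


def match_fields(match_string):
    """Return a match object, or None if the string does not fit."""
    return PATTERN.match(match_string)


if __name__ == '__main__':
    for s in ('XXX|1|22|333|4444:', 'XXX,1,22,333,4444;', 'XXX|1|2|333|4444:'):
        m = match_fields(s)
        print(s, '->', m.group(0) if m else 'No match found')
```

The trade-off, as far as I can tell, is that the pattern is compiled once and no escaping of the delimiter is needed, at the cost of a less readable regular expression.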