regex question: backreferences in brackets
Jeremy Jones
cypher_dpg at yahoo.com
Fri Dec 28 10:23:50 EST 2001
From the Python Library Reference on regular expressions, I read (in section 4.2.1 - concerning backreferences):
"""
\number
<snip> Inside the "[" and "]" of a character class, all numeric escapes are treated as characters.
"""
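To make that rule concrete, here is a small sketch. The pattern and test strings are made up for illustration, and it assumes the behavior of the current re module as I understand it: inside brackets, \1 is read as the character with octal code 1, not as a reference to group 1.

```python
import re

# Inside [...], \1 is a numeric escape for the character chr(1),
# so [^\1] means "anything except chr(1)", not "anything except group 1".
pattern = re.compile(r'(.)[^\1]')

# The second '|' is accepted even though group 1 captured '|'.
print(bool(pattern.match('||')))

# chr(1) is the one character the class rejects.
print(bool(pattern.match('|\x01')))
```

If that reading is right, it would explain why a bracketed \1 appears to "sort of work": almost every character passes the class, including the delimiter itself.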
I want to be able to match any character other than a character that I already matched and have a named (or numbered) group for. For example:
"""
import re

match_string = 'XXX|1|22|333|4444:'
# Group 1 captures the delimiter; each [^|]{n} is a field of n non-'|' characters.
test_compile = re.compile(r'XXX(.)[^|]{1}\1[^|]{2}\1[^|]{3}\1[^|]{4}(.)')
mymatch = test_compile.match(match_string)
if mymatch:
    print("Found a match")
    print(mymatch.group(0))
else:
    print("No match found")
"""
The strings I am trying to match consist of 3 specific characters, followed by some delimiter, followed by N characters other than the delimiter, followed by the delimiter again, and so on. The above code snippet works and matches the string perfectly. My question is this: how can I match any character except the delimiter when I don't know in advance what the delimiter will be?

I am thinking of using a backreference to the delimiter (i.e. \1), but you can't put that inside the brackets. I tried putting \1 in the brackets and it sort of worked, but when I changed one of the numbers to a |, it still matched, which isn't what I want (with the | hard-coded in the brackets, that altered string correctly fails to match). I also tried a named group, (?P<delimiter>.) rather than (.), with (?P=delimiter) inside the brackets, and had no luck. Any suggestions? TIA.
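For reference, one direction that might do this is a negative lookahead in place of the bracketed class. This is only a sketch, not a confirmed answer: the NOT_DELIM name and the extra test strings are hypothetical, and it assumes the delimiter is a single character that is not a newline.

```python
import re

# (?:(?!\1).) reads as "one character, as long as the text captured by
# group 1 (the delimiter) does not start at this position".
NOT_DELIM = r'(?:(?!\1).)'

test_compile = re.compile(
    r'XXX(.)' + NOT_DELIM + r'{1}\1' + NOT_DELIM + r'{2}\1'
    + NOT_DELIM + r'{3}\1' + NOT_DELIM + r'{4}(.)'
)

candidates = (
    'XXX|1|22|333|4444:',  # the original example string
    'XXX,1,22,333,4444:',  # same layout with a different delimiter
    'XXX|1|2||333|4444:',  # delimiter appearing inside a field
)
for candidate in candidates:
    print(candidate, bool(test_compile.match(candidate)))
```

The lookahead does not consume any text, so each field character is still matched by the plain "." that follows it. The backreference sits outside any brackets, which is where \1 keeps its meaning as a group reference.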
Jeremy Jones