Making regex suck less
jepler at unpythonic.net
Mon Sep 2 20:37:40 EDT 2002
On Mon, Sep 02, 2002 at 09:23:18PM +1000, John La Rooy wrote:
> Carl Banks wrote:
>
> >>It would be more likely to look like this (I haven't put too much
> >>thought into this)
> >
> >
> >No kidding.
> >
> >
> >
> >>"anything,anything,anything,same_as_3rd,same_as_2nd,same_as_1st"
> >>or would you like to suggest something else?
> >
> >
> >How about:
> >
> >pattern = Group(Any()) + Group(Any()) + Group(Any()) \
> > + GroupRef(3) + GroupRef(2) + GroupRef(1)
> >
> Err, semantically that's exactly the same as the re and my suggestion
> only the syntax is different. It's still nothing like saying
>
> pattern = "6 character palindrome"
Do you mean something like this?
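def palindrome_re(n):
    # One group per character up to and including the middle one.
    pat = ["(.)" * ((n+1)//2)]
    # Backreferences to the first n//2 groups, in reverse order.
    for i in range(n//2, 0, -1):
        pat.append("\\%d" % i)
    return "".join(pat)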
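As a quick check of what this function builds, here is a minimal sketch. It assumes Python 3 (hence `//` for integer division), and the trailing "$" anchor is an addition for the demonstration, not part of the function itself.

```python
import re

def palindrome_re(n):
    # One capturing group per character up to and including the middle one,
    # then backreferences to the first n//2 groups in reverse order.
    pat = ["(.)" * ((n + 1) // 2)]
    for i in range(n // 2, 0, -1):
        pat.append("\\%d" % i)
    return "".join(pat)

print(palindrome_re(6))  # (.)(.)(.)\3\2\1
print(palindrome_re(7))  # (.)(.)(.)(.)\3\2\1

# "$" is added here so that only strings of exactly n characters match.
six = re.compile(palindrome_re(6) + "$")
print(bool(six.match("abccba")))  # True
print(bool(six.match("abccbb")))  # False
```

For odd n the middle character gets its own group but no backreference, which is why the 7-character pattern has four groups and only three backreferences.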
With a little work, you can extend this to use named groups and named
backrefs as well, so that you can use it as a building block for larger
patterns:
def Any(): return "."
def Group(s, g): return "(?P<%s>%s)" % (g, s)
def Backref(g): return "(?P=%s)" % g
def Or(*args): return "|".join(args)

def palindrome_re(n, p):
    # Groups are named p1, p2, ... so that several palindromes can coexist.
    pat = [Group(Any(), "%s%d" % (p, i+1)) for i in range((n+1)//2)]
    for i in range(n//2, 0, -1):
        pat.append(Backref("%s%d" % (p, i)))
    return "".join(pat)
I think that building REs in functions is a great approach for more
complex REs.
>>> import re
>>> q = re.compile(palindrome_re(7, "a") + palindrome_re(6, "b"))
>>> q.match("abcdcbaxyzzyx")
<_sre.SRE_Match object at 0x401c4f00>
>>> _.groupdict()
{'a1': 'a', 'a2': 'b', 'a3': 'c', 'a4': 'd', 'b1': 'x', 'b2': 'y', 'b3': 'z'}
>>> q = re.compile(Or(palindrome_re(7, "a"), palindrome_re(6, "b")))
>>> q.match('abccbb')
>>> q.match("abcdcba")
<_sre.SRE_Match object at 0x401c4f00>
>>> q.match("abccba")
<_sre.SRE_Match object at 0x402e5020>
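The prefix argument is what lets two palindromes share one pattern. The sketch below, assuming Python 3's re module, shows that distinct prefixes combine cleanly while a reused prefix is rejected at compile time, because a named group may only be defined once per pattern. The helper definitions are repeated so the block runs on its own.

```python
import re

def Any(): return "."
def Group(s, g): return "(?P<%s>%s)" % (g, s)
def Backref(g): return "(?P=%s)" % g

def palindrome_re(n, p):
    # Groups are named p1, p2, ...; backreferences run in reverse order.
    pat = [Group(Any(), "%s%d" % (p, i + 1)) for i in range((n + 1) // 2)]
    for i in range(n // 2, 0, -1):
        pat.append(Backref("%s%d" % (p, i)))
    return "".join(pat)

# Distinct prefixes give distinct group names, so the pieces concatenate.
q = re.compile(palindrome_re(3, "a") + palindrome_re(4, "b"))
print(q.match("abaxyyx").groupdict())

# Reusing the prefix defines group "a1" twice, which re refuses to compile.
try:
    re.compile(palindrome_re(3, "a") + palindrome_re(4, "a"))
except re.error as exc:
    print("re.error:", exc)
```

Choosing the prefix per call keeps each building block independent, at the cost of the caller having to pick names that do not collide.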
Jeff