regex help for a newbie
F. Petitjean
littlejohn.75 at news.noos.fr
Thu Apr 8 05:01:44 EDT 2004
On 6 Apr 2004 22:38:24 GMT, Marco Herrn <herrn at gmx.net> wrote:
> On 2004-04-06, marco <marco at reimeika.ca> wrote:
>> Marco Herrn <herrn at gmx.net> writes:
>>
>>> On 2004-04-06, marco <marco at reimeika.ca> wrote:
>>> > Marco Herrn <herrn at gmx.net> writes:
>>> >> the parts in a recursive function. So the thing I want to achieve here
>>> >> is to extract %(BBB%(CCC)BBB) and %(DDD).
>>> >
>>
>> Does the "aaa"-type string really show up three times? Or is it actually:
>>
>> "maybeeggs%(BBB%(CCC)BBB)maybeham%(DDD)maybespam"
>
> Yes, it is this. I just used the same strings to indicate the nesting
> levels. All strings in this expression are arbitrary strings.
>
>> (but I doubt it -- I guess you'll need a real parser :)
>
> Yes, I already realized that :-)
>
> Marco
>
A solution without any re or parser:
the basic idea is nesting. Wrapping parsplit in a true recursive
function is left as an exercise to the reader.
#! /usr/bin/env python
# -*- coding: iso-8859-1 -*-
#
# parparse.py
#
class NestingParenError(Exception):
    """Parens %( ) do not match"""

def parsplit(s, begin='%(', end=')'):
    """returns before, inside, after or s, None, None
    raises NestingParenError if begin, end pairs are not nested"""
    pbegin = s.find(begin)
    if pbegin == -1:
        return s, None, None
    before = s[:pbegin]
    # rfind pairs the first begin with the last end: pure nesting only
    pend = s.rfind(end)
    # also rejects an end that occurs only before the first begin
    if pend < pbegin + len(begin):
        raise NestingParenError("in '%s' '%s' found without matching '%s'" %
                                (s, begin, end))
    inside = s[pbegin+len(begin):pend]
    return before, inside, s[pend+len(end):]

def usage(s):
    """Typical use of parsplit"""
    before, inside, after = parsplit(s)
    if inside is None:
        print("'%s' has no %%( ) part" % (s,))
        return
    # process :
    print("before %s\ninside %s\nafter %s" % (before, inside, after))
    while inside:
        before, inside, after = parsplit(inside)
        # process :
        print("before %s\ninside %s\nafter %s" % (before, inside, after))

if __name__ == '__main__':
    # basic tests
    s1 = """aaaa a%(bbb bbb%(iiii) ccc)dddd"""
    print("nested case %s" % (s1,))
    usage(s1)
    print("")
    print("")
    usage("""0123before%()""")
    print("")
    usage("""%(inside)""")
    print("")
    usage("""%()after""")
    print("")
    s2 = """without closing %( paren"""
    s3 = """without opening ) paren"""
    try:
        usage(s2)
    except NestingParenError as e:
        print(e)
    print("")
    usage(s3)
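
For the exercise mentioned above, here is one possible sketch of a recursive wrapper. The name parnest and its nested-tuple return value are assumptions made for illustration, not part of parparse.py. The block carries a compact copy of parsplit so that it runs on its own; the guard against an end marker that precedes the begin marker is specific to this sketch.

```python
class NestingParenError(Exception):
    """Parens %( ) do not match"""


def parsplit(s, begin='%(', end=')'):
    """Compact copy of parsplit: (before, inside, after) or (s, None, None)."""
    pbegin = s.find(begin)
    if pbegin == -1:
        return s, None, None
    pend = s.rfind(end)
    # Guard for this sketch: the end marker must come after the begin marker.
    if pend < pbegin + len(begin):
        raise NestingParenError("in %r %r found without matching %r"
                                % (s, begin, end))
    return s[:pbegin], s[pbegin + len(begin):pend], s[pend + len(end):]


def parnest(s, begin='%(', end=')'):
    """Hypothetical recursive wrapper around parsplit.

    Returns s unchanged when it holds no begin marker, otherwise a tuple
    (before, nested, after) where nested is parnest applied to the inside.
    """
    before, inside, after = parsplit(s, begin, end)
    if inside is None:
        # Base case: nothing left to unwrap at this level.
        return before
    return before, parnest(inside, begin, end), after


tree = parnest("aaaa a%(bbb bbb%(iiii) ccc)dddd")
```

Here tree is ('aaaa a', ('bbb bbb', 'iiii', ' ccc'), 'dddd'). Like parsplit itself, this only covers pure nesting: because rfind pairs the first begin with the last end, sibling groups such as %(BBB)x%(DDD) are not separated, and that case would probably need a scanner that counts nesting depth.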
Hope that helps
Regards