regex help for a newbie

F. Petitjean littlejohn.75 at news.noos.fr
Thu Apr 8 05:01:44 EDT 2004


On 6 Apr 2004 22:38:24 GMT, Marco Herrn <herrn at gmx.net> wrote:
> On 2004-04-06, marco <marco at reimeika.ca> wrote:
>> Marco Herrn <herrn at gmx.net> writes:
>>
>>> On 2004-04-06, marco <marco at reimeika.ca> wrote:
>>> > Marco Herrn <herrn at gmx.net> writes:
>>> >> the parts in a recursive function. So the thing I want to achieve here
>>> >> is to extract %(BBB%(CCC)BBB) and %(DDD).
>>> >
>>
>> Does the "aaa"-type string really show up three times? Or is it actually:
>>
>> "maybeeggs%(BBB%(CCC)BBB)maybeham%(DDD)maybespam"
> 
> Yes, it is this. I just used the same strings to indicate the nesting
> levels. All strings in this expression are arbitrary strings.
> 
>> (but I doubt it -- I guess you'll need a real parser :)
> 
> Yes, I already realized that :-)
> 
> Marco
> 
A solution without any re or parser:
the basic idea is nesting. Wrapping parsplit as a true recursive
function is left as an exercise to the reader.

#! /usr/bin/env python
# -*- coding: iso-8859-1 -*-
#
#  parparse.py
#
class NestingParenError(Exception):
    """Parens %( ) do not match"""

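def parsplit(s, begin='%(', end=')'):
    """Split s around its outermost begin ... end pair.

    Returns (before, inside, after), or (s, None, None) if s contains
    no begin marker.  The first begin is paired with the last end, so
    nesting is handled but sibling groups are not separated.
    Raises NestingParenError if no end follows the first begin."""
    pbegin = s.find(begin)
    if pbegin == -1:
        return s, None, None
    before = s[:pbegin]
    # search for the closing marker only after the opening one, so that
    # a stray end marker in front of begin is not mistaken for a match
    pend = s.rfind(end, pbegin + len(begin))
    if pend == -1:
        raise NestingParenError("in '%s' '%s' found without matching '%s'" %
            (s, begin, end))
    inside = s[pbegin+len(begin):pend]
    return before, inside, s[pend+len(end):]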

def usage(s):
    """Typical use of parsplit"""
    before, inside, after = parsplit(s)
    if inside is None:
        print "'%s' has no %%( ) part" % (s,)
        return
    # process :
    print "before %s\ninside %s\nafter %s" % (before, inside, after)
    while inside:
        before, inside, after = parsplit(inside)
        # process :
        print "before %s\ninside %s\nafter %s" % (before, inside, after)

if __name__ == '__main__':
    # basic tests
    s1 = """aaaa a%(bbb bbb%(iiii) ccc)dddd"""
    print "nested case %s" % (s1,)
    usage(s1)
    print
    print
    usage("""0123before%()""")
    print
    usage("""%(inside)""")
    print
    usage("""%()after""")
    print
    s2 = """without closing %( paren"""
    s3 = """without opening ) paren"""
    try:
        usage(s2)
    except NestingParenError, e:
        print e
    print
    usage(s3)
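
One possible take on the exercise, as a rough sketch only. parsplit pairs
the first %( with the last ), so sibling groups such as %(BBB%(CCC)BBB) and
%(DDD) from the quoted example would be merged into one. The sketch below
therefore swaps in a different technique: it counts nesting depth while
scanning, instead of using find/rfind. The names top_level_groups and
nested_groups are made up for illustration, the code is written for
Python 3 (unlike the Python 2 script above), and it assumes that a plain
")" never appears inside a group except as a closing marker.

```python
class NestingParenError(Exception):
    """Raised when an opening marker is never closed."""


def top_level_groups(s, begin='%(', end=')'):
    """Return the contents of every outermost begin ... end group in s."""
    groups = []
    depth = 0        # how many groups are currently open
    start = None     # index just past the outermost opening marker
    i = 0
    while i < len(s):
        if s.startswith(begin, i):
            if depth == 0:
                start = i + len(begin)
            depth += 1
            i += len(begin)
        elif depth > 0 and s.startswith(end, i):
            depth -= 1
            if depth == 0:
                groups.append(s[start:i])
            i += len(end)
        else:
            # ordinary character, or an end marker outside any group,
            # which is treated as plain text
            i += 1
    if depth != 0:
        raise NestingParenError("in %r %r found without matching %r"
                                % (s, begin, end))
    return groups


def nested_groups(s):
    """Recursively pair each outermost group with its own inner groups."""
    return [(inner, nested_groups(inner)) for inner in top_level_groups(s)]


if __name__ == '__main__':
    s = "maybeeggs%(BBB%(CCC)BBB)maybeham%(DDD)maybespam"
    print(top_level_groups(s))   # ['BBB%(CCC)BBB', 'DDD']
    print(nested_groups(s))
```

Each entry returned by nested_groups is a pair (text of the group, list of
the groups nested inside it), so the recursion ends naturally when a group
contains no further %( marker.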

Hope that helps
Regards


