Newbie Class/Counter question

Paul McGuire ptmcg at austin.rr._bogus_.com
Tue Mar 14 22:07:19 EST 2006


"ProvoWallis" <gshepherd281281 at yahoo.com> wrote in message
news:1142383961.199561.137480 at i39g2000cwa.googlegroups.com...
> Hi,
>
> I've always struggled with classes and this one is no exception.
>
> I'm working in an SGML file and I want to renumber a couple of elements
> in the hierarchy based on the previous level.
>
> E.g.,
>
> My document looks like this
>
> <level1>A. Title Text
> <level2>1. Title Text
> <level2>1. Title Text
> <level2>1. Title Text
> <level1>B. Title Text
> <level2>1. Title Text
> <level2>1. Title Text
>
> but I want to change the numbering of the second level to sequential
> numbers like 1, 2, 3, etc. so my output would look like this
>
> <level1>A. Title Text
> <level2>1. Title Text
> <level2>2. Title Text
> <level2>3. Title Text
> <level1>B. Title Text
> <level2>1. Title Text
> <level2>2. Title Text
>
> This is what I've come up with on my own but it doesn't work. I was
> hoping someone could critique this and point me in the right or better
> direction.
>
> Thanks,
>
> Greg
>
> ###
>
>
> def Fix(m):
>
>      new = m.group(1)
>
>      class ReplacePtSubNumber(object):
>
>           def __init__(self):
>                self._count = 0
>                self._ptsubtwo_re = re.compile(r'<pt-sub2
> no=\"[0-9]\">', re.IGNORECASE| re.UNICODE)
>               # self._ptsubone_re = re.compile(r'<pt-sub1',
> re.IGNORECASE| re.UNICODE)
>
>           def sub(self, new):
>                return self._ptsubtwo_re.sub(self._ptsubNum, new)
>
>           def _ptsubNum(self, match):
>                self._count +=1
>                return '<pt-sub2 no="%s">' % (self._count)
>
>
>      new = ReplacePtSubNumber().sub(new)
>      return '<pt-sub1%s<pt-sub1' % (new)
>
> data = re.sub(r'(?i)(?m)(?s)<pt-sub1(.*?)<pt-sub1', Fix, data)
>

This may not be as elegant as your RE approach, but it seems more readable
to me.  Using pyparsing, we can define search patterns and attach callbacks
that are invoked when a match is found; each callback can return modified
text to replace the original.  The running code matches your text sample,
and I've also included commented-out statements that match your source code
sample.

Download pyparsing at http://pyparsing.sourceforge.net.

-- Paul


testData = """<level1>A. Title Text
<level2>1. Title Text
<level2>1. Title Text
<level2>1. Title Text
<level1>B. Title Text
<level2>1. Title Text
<level2>1. Title Text
"""

from pyparsing import CaselessLiteral, Literal, Word, nums

class Fix(object):
    def __init__(self):
        self.curItem = 0

    def resetCurItem(self,s,l,t):
        self.curItem = 0

    def nextCurItem(self,s,l,t):
        self.curItem += 1
        return "<level2>%d." % self.curItem
        # return '<pt-sub2 no="%d">' % self.curItem

    def fixText(self,data):
        # set up patterns for searching
        lev1 = Literal("<level1>")
        lev2 = Literal("<level2>") + Word(nums) + "."
        # lev1 = CaselessLiteral("<pt-sub1>")
        # lev2 = CaselessLiteral('<pt-sub2 no="') + Word(nums) + '">'

        # when level 1 encountered, reset the cur item counter
        lev1.setParseAction(self.resetCurItem)

        # when level 2 encountered, use next cur item counter value
        lev2.setParseAction(self.nextCurItem)

        patterns = (lev1 | lev2)
        return patterns.transformString( data )

f = Fix()
print(f.fixText(testData))

prints:
<level1>A. Title Text
<level2>1. Title Text
<level2>2. Title Text
<level2>3. Title Text
<level1>B. Title Text
<level2>1. Title Text
<level2>2. Title Text
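
For comparison, the same reset-and-count idea can probably be done in a
single pass with only the standard re module.  This is just a sketch under
the assumption that the input looks like the text sample above; the class
name Renumberer and its method names are made up for illustration and are
not part of pyparsing or of the original code.

```python
import re


class Renumberer(object):
    """Hypothetical helper: renumber <level2> items, restarting at each <level1>."""

    # One pattern covers both tags; group 1 is set only for a level-2 match.
    _tag_re = re.compile(r"<level1>|(<level2>)\d+\.")

    def __init__(self):
        self.count = 0

    def _replace(self, match):
        if match.group(1) is None:
            # A <level1> tag starts a new section: reset the counter
            # and leave the tag itself unchanged.
            self.count = 0
            return match.group(0)
        # A <level2> tag: emit the next sequential number.
        self.count += 1
        return "<level2>%d." % self.count

    def fix(self, text):
        self.count = 0
        return self._tag_re.sub(self._replace, text)


sample = """<level1>A. Title Text
<level2>1. Title Text
<level2>1. Title Text
<level2>1. Title Text
<level1>B. Title Text
<level2>1. Title Text
<level2>1. Title Text
"""

result = Renumberer().fix(sample)
print(result)
```

Because one pattern sees both tags in document order, the counter can be
reset at each level-1 tag without first slicing the text into sections.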




