Tag parsing in python

Paul McGuire ptmcg at austin.rr.com
Sun Aug 29 00:23:13 EDT 2010


On Aug 28, 11:14 am, agnibhu <dee... at gmail.com> wrote:
> Hi all,
>
> I'm a newbie in python. I'm trying to create a library for parsing
> certain keywords.
> For example say I've key words like abc: bcd: cde: like that... So the
> user may use like
> abc: How are you bcd: I'm fine cde: ok
>
> So I've to extract the "How are you" and "I'm fine" and "ok"..and
> assign them to abc:, bcd: and cde: respectively.. There may be
> combination of keywords introduced in future. like abc: xy: How are
> you
> So new keywords qualifying the other keywords so on..
> So I would like to know the python way of doing this. Is there any
> library already existing for making my work easier. ?
>
> ~
> Agnibhu

Here's how pyparsing can parse your keyword/tags:

from pyparsing import (Combine, Word, alphas, Group, OneOrMore, empty,
                       SkipTo, LineEnd)

text1 = "abc: How are you bcd: I'm fine cde: ok"
text2 = "abc: xy: How are you"

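# a tag is a word immediately followed by a colon, e.g. "abc:"
tag = Combine(Word(alphas)+":")
# one or more tags, then a body running up to the next tag or end of line;
# 'empty' skips the whitespace before the body so SkipTo starts at its text
tag_defn = (Group(OneOrMore(tag))("tag") + empty +
            SkipTo(tag | LineEnd())("body"))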

for text in (text1, text2):
    print(text)
    for td in tag_defn.searchString(text):
        print(td.dump())
    print("")

Prints:

abc: How are you bcd: I'm fine cde: ok
[['abc:'], 'How are you']
- body: How are you
- tag: ['abc:']
[['bcd:'], "I'm fine"]
- body: I'm fine
- tag: ['bcd:']
[['cde:'], 'ok']
- body: ok
- tag: ['cde:']

abc: xy: How are you
[['abc:', 'xy:'], 'How are you']
- body: How are you
- tag: ['abc:', 'xy:']
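
For comparison, roughly the same tag/body split can be sketched with only
the standard library's re module. This is not the pyparsing approach
above, just an illustration: parse_tags and TAG_DEFN are made-up names,
and the sketch assumes a tag is a purely alphabetic word ending in a
colon and that a body never spans more than one line.

```python
import re

# One or more "word:" tags, then a body that runs lazily up to the next
# tag or the end of the line (the lookahead does not consume the next tag).
TAG_DEFN = re.compile(
    r"((?:[A-Za-z]+:\s*)+)(.*?)(?=\s*[A-Za-z]+:|$)", re.MULTILINE
)

def parse_tags(text):
    """Return a list of (tags, body) pairs found in text."""
    results = []
    for match in TAG_DEFN.finditer(text):
        # split the leading run of tags back into individual tags
        tags = tuple(re.findall(r"[A-Za-z]+:", match.group(1)))
        results.append((tags, match.group(2).strip()))
    return results

for tags, body in parse_tags("abc: How are you bcd: I'm fine cde: ok"):
    print(tags, body)
```

With "abc: xy: How are you" as input, this should give a single pair
whose tags are ('abc:', 'xy:') and whose body is 'How are you'.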



Now here's how to take pyparsing a step further and use those tags as
substitution macros:

from pyparsing import (Forward, MatchFirst, Literal, And, replaceWith,
                       FollowedBy)

# now combine macro detection with substitution
macros = {}
macro_substitution = Forward()
def make_macro_sub(tokens):
    macros[tuple(tokens.tag)] = tokens.body

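    # rebuild the substitution expression from every macro seen so far;
    # ~FollowedBy(tag) keeps a tag that leads into another tag from
    # being expanded on its own
    macro_substitution << MatchFirst(
            [(Literal(k[0]) if len(k)==1
                else And([Literal(kk) for kk in k])).setParseAction(replaceWith(v))
                    for k,v in macros.items()] ) + ~FollowedBy(tag)

    # return an empty string so the definition itself is removed from the output
    return ""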
tag_defn.setParseAction(make_macro_sub)

scan_pattern = macro_substitution | tag_defn

test_text = (text1 + "\nBob said, 'abc:?' I said, 'bcd:.'" + text2 +
             "\nThen Bob said 'abc: xy:?'")

print(test_text)
print(scan_pattern.transformString(test_text))


Prints:

abc: How are you bcd: I'm fine cde: ok
Bob said, 'abc:?' I said, 'bcd:.'abc: xy: How are you
Then Bob said 'abc: xy:?'

Bob said, 'How are you?' I said, 'I'm fine.'
Then Bob said 'How are you?'
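
The substitution step on its own can also be sketched with the standard
library, assuming the macros have already been collected into a dict
keyed by tuples of tags (as the macros dict above is). This is an
illustration rather than the pyparsing code above: expand_macros is a
made-up helper, tags are assumed to be alphabetic words ending in a
colon, and the negative lookahead stands in for ~FollowedBy(tag).

```python
import re

TAG = r"[A-Za-z]+:"

def expand_macros(text, macros):
    """Replace each tag sequence in macros with its body, unless another tag follows."""
    if not macros:
        return text
    # Try the longest tag sequences first, so ('abc:', 'xy:') wins over ('abc:',).
    keys = sorted(macros, key=len, reverse=True)
    alternatives = [r"\s*".join(re.escape(t) for t in key) for key in keys]
    # (?!...) rejects a match that is immediately followed by one more tag.
    pattern = re.compile("(" + "|".join(alternatives) + r")(?!\s*" + TAG + ")")

    def substitute(match):
        # rebuild the dict key from the tags that were actually matched
        key = tuple(re.findall(TAG, match.group(1)))
        return macros[key]

    return pattern.sub(substitute, text)

macros = {
    ("abc:",): "How are you",
    ("bcd:",): "I'm fine",
    ("abc:", "xy:"): "How are you",
}
print(expand_macros("Bob said, 'abc:?' I said, 'bcd:.'", macros))
```

Sorting the keys by length makes the choice between 'abc:' and
'abc: xy:' independent of dict ordering.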



