Tag parsing in python
Paul McGuire
ptmcg at austin.rr.com
Sun Aug 29 00:23:13 EDT 2010
On Aug 28, 11:14 am, agnibhu <dee... at gmail.com> wrote:
> Hi all,
>
> I'm a newbie in python. I'm trying to create a library for parsing
> certain keywords.
> For example say I've key words like abc: bcd: cde: like that... So the
> user may use like
> abc: How are you bcd: I'm fine cde: ok
>
> So I've to extract the "How are you" and "I'm fine" and "ok"..and
> assign them to abc:, bcd: and cde: respectively.. There may be
> combination of keywords introduced in future. like abc: xy: How are
> you
> So new keywords qualifying the other keywords so on..
> So I would like to know the python way of doing this. Is there any
> library already existing for making my work easier. ?
>
> ~
> Agnibhu
Here's how pyparsing can parse your keyword/tags:

from pyparsing import (Combine, Word, alphas, Group, OneOrMore, empty,
                       SkipTo, LineEnd)

text1 = "abc: How are you bcd: I'm fine cde: ok"
text2 = "abc: xy: How are you"

# a tag is a word of letters immediately followed by ':'
tag = Combine(Word(alphas) + ":")

# one or more tags, then the body text up to the next tag or end of line;
# 'empty' skips the whitespace between the last tag and the body
tag_defn = (Group(OneOrMore(tag))("tag") + empty
            + SkipTo(tag | LineEnd())("body"))

for text in (text1, text2):
    print(text)
    for td in tag_defn.searchString(text):
        print(td.dump())
    print()

Prints:
abc: How are you bcd: I'm fine cde: ok
[['abc:'], 'How are you']
- body: How are you
- tag: ['abc:']
[['bcd:'], "I'm fine"]
- body: I'm fine
- tag: ['bcd:']
[['cde:'], 'ok']
- body: ok
- tag: ['cde:']

abc: xy: How are you
[['abc:', 'xy:'], 'How are you']
- body: How are you
- tag: ['abc:', 'xy:']
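
For comparison, here is a minimal sketch of the same "tags, then body up to the next tag or end of line" rule using only the standard re module. It is not pyparsing and not part of the code above; the names TAG, TAG_DEFN and parse_tags are made up for illustration, and it assumes a tag is a run of ASCII letters followed by a colon.

```python
import re

# Assumption for this sketch: a tag is a run of ASCII letters followed by ":".
TAG = r"[A-Za-z]+:"

# One or more whitespace-separated tags, then a lazily matched body that
# stops just before the next tag or at the end of the line.
TAG_DEFN = re.compile(
    r"(?P<tags>{tag}(?:\s+{tag})*)[ \t]*(?P<body>.*?)[ \t]*(?={tag}|$)".format(tag=TAG),
    re.MULTILINE,
)

def parse_tags(text):
    """Return a list of (tags, body) pairs, with tags as a tuple of strings."""
    return [(tuple(m.group("tags").split()), m.group("body"))
            for m in TAG_DEFN.finditer(text)]

for tags, body in parse_tags("abc: How are you bcd: I'm fine cde: ok"):
    print(tags, body)
```

Keeping the tags as a tuple means a combination such as abc: xy: can be used directly as a dictionary key, which is the same choice the macro code in this post makes.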
Now here's how to further use pyparsing to actually use those tags as
substitution macros:

from pyparsing import (Forward, MatchFirst, Literal, And, replaceWith,
                       FollowedBy)

# now combine macro detection with substitution
macros = {}
macro_substitution = Forward()

def make_macro_sub(tokens):
    # remember the body under its tag sequence, e.g. ('abc:', 'xy:')
    macros[tuple(tokens.tag)] = tokens.body

    # define macro substitution; longer tag sequences are tried first so
    # that 'abc: xy:' is not shadowed by 'abc:', and a macro is not
    # substituted when another tag follows it
    macro_substitution << MatchFirst(
        [(Literal(k[0]) if len(k) == 1
          else And([Literal(kk) for kk in k])).setParseAction(replaceWith(v))
         for k, v in sorted(macros.items(), key=lambda kv: -len(kv[0]))]
        ) + ~FollowedBy(tag)

    # remove the definition itself from the transformed text
    return ""

tag_defn.setParseAction(make_macro_sub)

# try substituting a known macro first, else treat the tags as a definition
scan_pattern = macro_substitution | tag_defn

test_text = (text1 + "\nBob said, 'abc:?' I said, 'bcd:.'" + text2 +
             "\nThen Bob said 'abc: xy:?'")

print(test_text)
print(scan_pattern.transformString(test_text))

Prints:
abc: How are you bcd: I'm fine cde: ok
Bob said, 'abc:?' I said, 'bcd:.'abc: xy: How are you
Then Bob said 'abc: xy:?'
Bob said, 'How are you?' I said, 'I'm fine.'
Then Bob said 'How are you?'
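
The define-or-substitute logic above can also be sketched in plain Python, which may make the control flow easier to follow. This is only an illustration under stated assumptions, not the pyparsing implementation: expand, TAGS_RE and BODY_RE are hypothetical names, a tag is assumed to be ASCII letters plus a colon, and a run of tags is looked up as a whole (so abc: xy: is one key).

```python
import re

# Assumption for this sketch: a tag is a run of ASCII letters followed by ":".
TAG = r"[A-Za-z]+:"
TAGS_RE = re.compile(r"{tag}(?:\s+{tag})*".format(tag=TAG))
# Body text after a definition: up to the next tag or the end of the line.
BODY_RE = re.compile(
    r"[ \t]*(?P<body>.*?)[ \t]*(?={tag}|$)".format(tag=TAG), re.MULTILINE)

def expand(text):
    """Scan text once: known tag runs are replaced by their body,
    unknown tag runs define a new macro and are removed from the output."""
    macros = {}
    out = []
    pos = 0
    while True:
        m = TAGS_RE.search(text, pos)
        if m is None:
            out.append(text[pos:])
            break
        out.append(text[pos:m.start()])
        key = tuple(m.group().split())
        if key in macros:
            # use of an existing macro: substitute its body
            out.append(macros[key])
            pos = m.end()
        else:
            # definition: record the body and drop it from the output
            d = BODY_RE.match(text, m.end())
            macros[key] = d.group("body")
            pos = d.end()
    return "".join(out), macros

sample = ("abc: How are you bcd: I'm fine cde: ok"
          "\nBob said, 'abc:?' I said, 'bcd:.'"
          "abc: xy: How are you"
          "\nThen Bob said 'abc: xy:?'")
expanded, macros = expand(sample)
print(expanded)
```

Because definitions are recorded as they are encountered, a macro only takes effect for text that comes after its definition, as in the transformString version.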