Tag parsing in python

agnibhu deepud at gmail.com
Mon Aug 30 02:35:26 EDT 2010


On Aug 29, 5:43 pm, Paul McGuire <pt... at austin.rr.com> wrote:
> On Aug 28, 11:23 pm, Paul McGuire <pt... at austin.rr.com> wrote:
>
>
>
> > On Aug 28, 11:14 am, agnibhu <dee... at gmail.com> wrote:
>
> > > Hi all,
>
> > > I'm a newbie in Python. I'm trying to create a library for parsing
> > > certain keywords.
> > > For example, say I have keywords like abc: bcd: cde: and so on. The
> > > user may write something like
> > > abc: How are you bcd: I'm fine cde: ok
>
> > > So I have to extract "How are you", "I'm fine" and "ok", and
> > > assign them to abc:, bcd: and cde: respectively. Combinations of
> > > keywords may be introduced in future, like abc: xy: How are
> > > you
> > > so that new keywords qualify the other keywords, and so on.
>
> I got to thinking more about your keywords-qualifying-keywords
> example, and I thought this would be a good way to support
> locale-specific tags.  I also thought about how one might want to have
> tags within tags, to be substituted later, requiring an "abc::" escaped
> form of "abc:", so that the tag is substituted with the value of tag
> "abc:" as a late binding.
>
> Wasn't too hard to modify what I posted yesterday, and now I rather
> like it.
>
> -- Paul
>
> # tag_substitute.py
>
> from pyparsing import (Combine, Word, alphas, FollowedBy, Group,
>     OneOrMore, empty, SkipTo, LineEnd, Optional, Forward, MatchFirst,
>     Literal, And, replaceWith)
>
> tag = Combine(Word(alphas) + ~FollowedBy("::") + ":")
> tag_defn = (Group(OneOrMore(tag))("tag") + empty
>             + SkipTo(tag | LineEnd())("body")
>             + Optional(LineEnd().suppress()))
>
> # now combine macro detection with substitution
> macros = {}
> macro_substitution = Forward()
> def make_macro_sub(tokens):
>     # unescape '::' and substitute any embedded tags
>     tag_value = macro_substitution.transformString(
>         tokens.body.replace("::",":"))
>
>     # save this tag and value (or overwrite previous)
>     macros[tuple(tokens.tag)] = tag_value
>
>     # define overall macro substitution expression
>     macro_substitution << MatchFirst(
>             [(Literal(k[0]) if len(k)==1
>                 else And([Literal(kk) for kk in k])
>                 ).setParseAction(replaceWith(v))
>                     for k,v in macros.items()] ) + ~FollowedBy(tag)
>
>     # return empty string, so macro definitions don't show up in final
>     # expanded text
>     return ""
>
> tag_defn.setParseAction(make_macro_sub)
>
> # define pattern for macro scanning
> scan_pattern = macro_substitution | tag_defn
>
> sorry = """\
> nm: Dave
> sorry: en: I'm sorry, nm::, I'm afraid I can't do that.
> sorry: es: Lo siento nm::, me temo que no puedo hacer eso.
> Hal said, "sorry: en:"
> Hal dijo, "sorry: es:" """
> print(scan_pattern.transformString(sorry))
>
> Prints:
>
> Hal said, "I'm sorry, Dave, I'm afraid I can't do that."
> Hal dijo, "Lo siento Dave, me temo que no puedo hacer eso."

Thanks, all, for the great solutions. I'm happy to see the
responses.
I'll try these out and post a reply soon.
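
For the simple case in the original question, a minimal sketch using only
the standard `re` module (not the pyparsing approach quoted above) might
look like the following. The names `TAG_RE` and `parse_keywords` are made
up for illustration. It assumes a keyword is a run of letters followed by
a single colon, and that an escaped "abc::" is left in the body untouched.

```python
import re

# Assumption: a keyword is letters followed by exactly one ':'.
# The (?!:) lookahead skips the escaped form "abc::".
TAG_RE = re.compile(r"\b([A-Za-z]+:)(?!:)")

def parse_keywords(text):
    """Map each keyword (or run of qualifying keywords) to its body text."""
    # With a capturing group, split() returns
    # [text_before_first_tag, tag1, body1, tag2, body2, ...]
    parts = TAG_RE.split(text)
    result = {}
    pending = []  # keywords seen since the last non-empty body
    for tag, body in zip(parts[1::2], parts[2::2]):
        pending.append(tag)
        body = body.strip()
        if body:
            # Consecutive keywords with nothing between them form one key.
            result[tuple(pending)] = body
            pending = []
    # Trailing keywords with no body are dropped in this sketch.
    return result

print(parse_keywords("abc: How are you bcd: I'm fine cde: ok"))
print(parse_keywords("abc: xy: How are you"))
```

Keys are tuples so that a qualified keyword such as "abc: xy:" stays
distinct from plain "abc:". This sketch does no substitution of embedded
tags.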

Thanks once again,
Agnibhu..


