[Tutor] Parsing problem

Sat Jul 23 15:33:10 CEST 2005

Hmmm... just a quick update, I've been poking around and I'm obviously 
making some error of logic. 

Given a line - 

f = "j = { line = { foo = 10 bar = 20 } }"

And given the following code - 

import pyparsing as pp

select = pp.Forward()
select << (
    pp.Word(pp.printables) + pp.Suppress("=") + pp.Suppress("{") +
    pp.OneOrMore( (pp.Word(pp.printables) + pp.Suppress("=") +
                   pp.Word(pp.printables) ) | select ) + pp.Suppress("}") )

select.parseString(f) gives - 

(['j', 'line', '{', 'foo', '10', 'bar', '20'], {})

So I've got a bracket sneaking through there. Argh. My brain hurts. 

Is the | operator an exclusive or? 

Befuddled, 

Liam Clarke
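
On the | question: as far as I know, pyparsing's | builds a MatchFirst, which tries the alternatives left to right and takes the first one that matches. It is not an exclusive or, and ^ is the longest-match Or. The stray bracket looks like a separate issue: pp.printables includes "{", "}" and "=", so the value Word can swallow the opening brace before select is ever tried. Below is a minimal sketch of one way around that, assuming keys and values never contain those three characters. The names word_chars, token and pair are made up for this illustration.

```python
import pyparsing as pp

# "|" (MatchFirst) takes the first alternative that matches;
# "^" (Or) tries them all and keeps the longest match.
first = (pp.Literal("ab") | pp.Literal("abc")).parseString("abc").asList()
longest = (pp.Literal("ab") ^ pp.Literal("abc")).parseString("abc").asList()

# Word(printables) accepts any non-blank character, braces included.
leak = pp.Word(pp.printables).parseString("{ foo").asList()

# Assumption: keys and values never contain "=", "{" or "}".
word_chars = "".join(c for c in pp.printables if c not in "={}")
token = pp.Word(word_chars)

select = pp.Forward()
pair = token + pp.Suppress("=") + token
# Try the nested form first, then fall back to a plain key = value pair.
select <<= (token + pp.Suppress("=") + pp.Suppress("{")
            + pp.OneOrMore(select | pair) + pp.Suppress("}"))

f = "j = { line = { foo = 10 bar = 20 } }"
result = select.parseString(f).asList()
print(first, longest, leak)
print(result)
```

With the sample line above, result should come out as ['j', 'line', 'foo', '10', 'bar', '20'], with no brace in it. Note that this flattens the nesting. Wrapping the pieces in pp.Group would keep the structure.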

On 7/23/05, Liam Clarke <cyresse at gmail.com> wrote:
> 
> Howdy, 
> 
> I've attempted to follow your lead and have started from scratch, I could 
> just copy and paste your solution (which works pretty well), but I want to 
> understand what I'm doing *grin*
> 
> However, I've been hitting a couple of ruts in the path to enlightenment. 
> Is there a way to tell pyparsing to treat specific escaped characters 
> as just a slash followed by a letter? For the time being I've converted all 
> backslashes to forward slashes, as it was choking on \a in a file path.
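
One possible cause, assuming the path sat in an ordinary Python string literal rather than being read from a file: Python itself turns \a into the BEL control character before pyparsing ever sees the text. A raw string literal keeps the backslash as typed. The path below is made up purely for illustration.

```python
# Hypothetical path, chosen only because it contains "\a".
plain = "c:\apps"    # "\a" is an escape: becomes the BEL character (0x07)
raw = r"c:\apps"     # raw literal: backslash and "a" both survive

print(repr(plain))   # shows '\x07' where the backslash was expected
print(repr(raw))     # shows the backslash, doubled by repr()
print(len(plain), len(raw))
```

Text read from a file with open(...).read() is not processed this way, so if the data comes from disk the backslashes should arrive intact.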
> 
> But my latest hitch takes this form (apologies for the large traceback):
> 
> Traceback (most recent call last):
>   File "<interactive input>", line 1, in ?
>   File "parse.py", line 336, in parse
>     parsedEntries = dicts.parseString(test_data)
>   File "c:\python24\Lib\site-packages\pyparsing.py", line 616, in parseString
>     loc, tokens = self.parse( instring.expandtabs(), 0 )
>   File "c:\python24\Lib\site-packages\pyparsing.py", line 558, in parse
>     loc,tokens = self.parseImpl( instring, loc, doActions )
>   File "c:\python24\Lib\site-packages\pyparsing.py", line 1518, in parseImpl
>     return self.expr.parse( instring, loc, doActions )
>   File "c:\python24\Lib\site-packages\pyparsing.py", line 558, in parse
>     loc,tokens = self.parseImpl( instring, loc, doActions )
>   File "c:\python24\Lib\site-packages\pyparsing.py", line 1367, in parseImpl
>     loc, exprtokens = e.parse( instring, loc, doActions )
>   File "c:\python24\Lib\site-packages\pyparsing.py", line 558, in parse
>     loc,tokens = self.parseImpl( instring, loc, doActions )
>   File "c:\python24\Lib\site-packages\pyparsing.py", line 1518, in parseImpl
>     return self.expr.parse( instring, loc, doActions )
>   File "c:\python24\Lib\site-packages\pyparsing.py", line 560, in parse
>     raise ParseException, ( instring, len(instring), self.errmsg, self )
> 
> ParseException: Expected "}" (at char 9909), (line:325, col:5)
> 
> The offending code can be found here (includes the data) - 
> http://www.rafb.net/paste/results/L560wx80.html
> 
> It's like pyparsing isn't recognising a lot of my "}"'s, as if I add 
> another one, it throws the same error, same for adding another two...
> 
> No doubt I've done something silly, but any help in finding the tragic 
> flaw would be much appreciated. I need to get a parsingResults object out so 
> I can learn how to work with the basic structure!
> 
> Much regards,
> 
> Liam Clarke
> 
> On 7/21/05, Paul McGuire <paul at alanweberassociates.com> wrote:
> > 
> > Liam, Kent, and Danny -
> > 
> > It sure looks like pyparsing is taking on a life of its own! I can see I
> > am no longer the only one pitching pyparsing at some of these applications!
> > 
> > Yes, Liam, it is possible to create dictionary-like objects, that is,
> > ParseResults objects that have named values in them. I looked into your
> > application, and the nested assignments seem very similar to a ConfigParser
> > type of structure. Here is a pyparsing version that handles the test data
> > in your original post (I kept Danny Yoo's recursive list values, and added
> > recursive dictionary entries):
> > 
> > --------------------------
> > import pyparsing as pp
> > 
> > # A list value is a quoted string, a bare word, or a nested { ... } list.
> > listValue = pp.Forward()
> > listSeq = ( pp.Suppress('{') + pp.Group(pp.ZeroOrMore(listValue)) +
> >             pp.Suppress('}') )
> > listValue << ( pp.dblQuotedString.setParseAction(pp.removeQuotes) |
> >                pp.Word(pp.alphanums) | listSeq )
> > 
> > keyName = pp.Word( pp.alphas )
> > 
> > # An entry is key = value, where the value may itself be a { ... } of entries.
> > entries = pp.Forward()
> > entrySeq = ( pp.Suppress('{') + pp.Group(pp.OneOrMore(entries)) +
> >              pp.Suppress('}') )
> > entries << pp.Dict(
> >     pp.OneOrMore(
> >         pp.Group( keyName + pp.Suppress('=') + (entrySeq | listValue) ) ) )
> > --------------------------
> > 
> > 
> > Dict is one of the most confusing classes to use, and there are some
> > examples in the examples directory that comes with pyparsing (see
> > dictExample2.py), but it is still tricky. Here is some code to access your
> > input test data, repeated here for easy reference:
> > 
> > --------------------------
> > testdata = """\
> > country = {
> > tag = ENG 
> > ai = {
> > flags = { }
> > combat = { DAU FRA ORL PRO }
> > continent = { }
> > area = { }
> > region = { "British Isles" "NorthSeaSea" "ECAtlanticSea" "NAtlanticSea"
> > "TagoSea" "WCAtlanticSea" } 
> > war = 60
> > ferocity = no
> > }
> > }
> > """
> > parsedEntries = entries.parseString(testdata)
> > 
> > # Recursively print the nested results, one space of indent per level.
> > def dumpEntries(dct,depth=0):
> >     keys = dct.keys()
> >     keys.sort()
> >     for k in keys:
> >         print (' '*depth) + '- ' + k + ':',
> >         if isinstance(dct[k],pp.ParseResults):
> >             if dct[k][0].keys():
> >                 print
> >                 dumpEntries(dct[k][0],depth+1)
> >             else:
> >                 print dct[k][0]
> >         else:
> >             print dct[k]
> > 
> > dumpEntries( parsedEntries )
> > 
> > print
> > print parsedEntries.country[0].tag
> > print parsedEntries.country[0].ai[0].war
> > print parsedEntries.country[0].ai[0].ferocity 
> > --------------------------
> > 
> > This will print out:
> > 
> > --------------------------
> > - country:
> >  - ai:
> >   - area: []
> >   - combat: ['DAU', 'FRA', 'ORL', 'PRO']
> >   - continent: []
> >   - ferocity: no
> >   - flags: []
> >   - region: ['British Isles', 'NorthSeaSea', 'ECAtlanticSea',
> > 'NAtlanticSea', 'TagoSea', 'WCAtlanticSea']
> >   - war: 60
> >  - tag: ENG
> > 
> > ENG
> > 60
> > no
> > --------------------------
> > 
> > But I really dislike having to dereference those nested values using the
> > 0'th element. So I'm going to fix pyparsing so that in the next release,
> > you'll be able to reference the sub-elements as:
> > 
> > print parsedEntries.country.tag 
> > print parsedEntries.country.ai.war
> > print parsedEntries.country.ai.ferocity
> > 
> > This *may* break some existing code, but Dict is not heavily used, based on
> > feedback from users, and this may make it more useful in general, especially
> > when data parses into nested Dicts.
> > 
> > Hope this sheds more light than confusion!
> > -- Paul McGuire
> > 
> 
> 
> 
> -- 
> 'There is only one basic human right, and that is to do as you damn well 
> please. 
> And with it comes the only basic human duty, to take the consequences.' 
