trouble pyparsing

the.theorist the.theorist at gmail.com
Wed Jan 4 19:53:45 EST 2006


Hey, I'm trying my hand at pyparsing a log file (named l.log):
FIRSTLINE

PROPERTY1      DATA1
PROPERTY2      DATA2

PROPERTYS LIST
        ID1     data1
        ID2     data2

        ID1     data11
        ID2     data12

SECTION

So I wrote up a small bit of code (named p.py):
from pyparsing import *
import sys

toplevel = Forward()

firstLine = Word('FIRSTLINE')
property  = (Word('PROPERTY1') + Word(alphanums)) ^ (Word('PROPERTY2')
+ Word(alphanums))

id        = (Word('ID1') + Word(alphanums)) ^ (Word('ID2') +
Word(alphanums))
plist     = Word('PROPERTYS LIST') + ZeroOrMore( id )

toplevel << firstLine
toplevel << OneOrMore( property )
toplevel << plist

par = toplevel

print toplevel.parseFile(sys.argv[1])

The problem is that I get the following error:
Traceback (most recent call last):
  File "./p.py", line 23, in ?
    print toplevel.parseFile(sys.argv[1])
  File "/home/erich/tap/lib/python/pyparsing.py", line 833, in parseFile
    return self.parseString(file_contents)
  File "/home/erich/tap/lib/python/pyparsing.py", line 622, in parseString
    loc, tokens = self.parse( instring.expandtabs(), 0 )
  File "/home/erich/tap/lib/python/pyparsing.py", line 564, in parse
    loc,tokens = self.parseImpl( instring, loc, doActions )
  File "/home/erich/tap/lib/python/pyparsing.py", line 1743, in parseImpl
    return self.expr.parse( instring, loc, doActions )
  File "/home/erich/tap/lib/python/pyparsing.py", line 564, in parse
    loc,tokens = self.parseImpl( instring, loc, doActions )
  File "/home/erich/tap/lib/python/pyparsing.py", line 1511, in parseImpl
    loc, resultlist = self.exprs[0].parse( instring, loc, doActions )
  File "/home/erich/tap/lib/python/pyparsing.py", line 568, in parse
    loc,tokens = self.parseImpl( instring, loc, doActions )
  File "/home/erich/tap/lib/python/pyparsing.py", line 1068, in parseImpl
    raise exc
pyparsing.ParseException: Expected W:(PROP...) (at char 0), (line:1, col:1)

I fiddled around with this for quite a while, and it looks like, because
"PROPERTYS LIST" follows one of ['PROPERTY1', 'PROPERTY2'], pyparsing
grabs the overlapping text 'PROPERTY', so that it only has 'S', 'LIST'
left when it goes looking for the next thing to parse.
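
A minimal sketch for checking this guess without pyparsing installed. It
assumes, as the pyparsing documentation describes, that Word(chars) matches a
run of characters drawn from the set `chars` rather than the literal string.
The helpers `word_like` and `match_at_start` are hypothetical stand-ins built
on the standard `re` module, not pyparsing calls, and they ignore pyparsing's
whitespace skipping:

```python
import re

def word_like(chars):
    """Rough model of Word(chars): one or more characters taken from `chars`.
    Hypothetical helper for illustration; not part of pyparsing."""
    return re.compile("[" + re.escape(chars) + "]+")

def match_at_start(chars, text):
    """Return the prefix of `text` made only of characters in `chars`,
    or None if the very first character is not in the set."""
    m = word_like(chars).match(text)
    return m.group(0) if m else None

# The character set of 'PROPERTY1' does not contain 'S', so the match stops there.
print(match_at_start("PROPERTY1", "PROPERTYS LIST"))  # PROPERTY

# It does not contain '2' either, so 'PROPERTY2' is only partly consumed.
print(match_at_start("PROPERTY1", "PROPERTY2"))       # PROPERTY

# 'F' is not among the characters of 'PROPERTYS LIST', so nothing matches at char 0.
print(match_at_start("PROPERTYS LIST", "FIRSTLINE"))  # None
```

If that assumption holds, the last case would line up with the
"Expected W:(PROP...) (at char 0)" message above, and an exact-string matcher
such as pyparsing's Literal or Keyword may be closer to what the grammar
intends.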

Is this a fundamental limitation of pyparsing, or am I just using it
wrong? (I haven't tried SimpleParse yet.)



