help with pyparsing

Paul McGuire ptmcg at austin.rr.com
Mon Dec 10 02:24:59 EST 2007


On Dec 9, 11:01 pm, Prabhu Gurumurthy <pguru... at gmail.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> All,
>
> I have the following lines that I would like to parse in python using
> pyparsing, but have some problems forming the grammar.
>
> Line in file:
> table <ALINK> const { 207.135.103.128/26, 207.135.112.64/29 }
> table <INTRANET> persist { ! 10.200.2/24, 10.200/22 }
> table <RFC_1918> const { 192.168/16, ! 172.24.1/29, 172.16/12, 169.254/16 }
> table <DIALER> persist { 10.202/22 }
> table <RAVPN> const { 10.206/22 }
> table <KS> const {   \
>    10.205.1/24,      \
>    169.136.241.68,   \
>    169.136.241.70,   \
>    169.136.241.71,   \
>    169.136.241.72,   \
>    169.136.241.75,   \
>    169.136.241.76,   \
>    169.136.241.77,   \
>    169.136.241.78,   \
>    169.136.241.79,   \
>    169.136.241.81,   \
>    169.136.241.82,   \
>    169.136.241.85 }
>
> I have the following grammar defn.
>
> tableName = Word(alphanums + "-" + "_")
> leftClose = Suppress("<")
> rightClose = Suppress(">")
> key = Suppress("table")
> tableType = Regex("persist|const")
> ip4Address = OneOrMore(Word(nums + "."))
> ip4Network = Group(ip4Address + Optional(Word("/") +
> OneOrMore(Word(nums))))
> temp = ZeroOrMore("\\" + "\n")
> tableList = OneOrMore(Optional("\\") |
>                ip4Network | ip4Address | Suppress(",") | Literal("!"))
> leftParen = Suppress("{")
> rightParen = Suppress("}")
>
> table = key + leftClose + tableName + rightClose + tableType + \
>                   leftParen + tableList + rightParen
>
> I cannot seem to match sixth line in the file above, i.e table name with
> KS, how do I form the grammar for it, BTW, I still cannot seem to ignore
> comments using table.ignore(Literal("#") + restOfLine), I get a parse error.
>
> Any help appreciated.
> Thanks
> Prabhu

Prabhu -

This is a good start, but here are some suggestions:

1. ip4Address = OneOrMore(Word(nums + "."))

Word(nums+".") will read any contiguous set of characters in the
string nums+".", so OneOrMore is not necessary for reading in an
ip4Address.  Just use:

ip4Address = Word(nums + ".")
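A quick way to check this is to parse a single address on its own. The snippet below is only a minimal sketch: the sample address is taken from the ALINK table above, and it uses the same camelCase method names as the rest of this post (newer pyparsing releases also provide snake_case spellings such as parse_string).

```python
from pyparsing import Word, nums

# Word matches a contiguous run of characters drawn from the given set,
# so a single Word is enough to consume the whole dotted address.
ip4Address = Word(nums + ".")

tokens = ip4Address.parseString("207.135.103.128").asList()
print(tokens)  # ['207.135.103.128']
```

Note that this expression is deliberately permissive: it accepts any run of digits and dots, which is what lets it match shorthand forms like 10.200 as well.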


2. ip4Network = Group(ip4Address + Optional(Word("/") +
OneOrMore(Word(nums))))

Same comment here: Word(nums) already reads all of the digits after
the "/", so OneOrMore is not needed for the part that follows the
ip4Address:

ip4Network = Group(ip4Address + Optional(Word("/") + Word(nums)))
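As a small sketch of how the Optional part behaves, here is the same expression applied to one entry with a "/" suffix and one without. Both sample strings come from the tables in the original message; the variable names with_prefix and bare_host are just illustrative.

```python
from pyparsing import Group, Optional, Word, nums

ip4Address = Word(nums + ".")
# The "/" and the digits after it are optional; Group keeps each
# network's tokens together in their own sublist.
ip4Network = Group(ip4Address + Optional(Word("/") + Word(nums)))

with_prefix = ip4Network.parseString("10.200/22").asList()
bare_host = ip4Network.parseString("169.136.241.68").asList()

print(with_prefix)  # [['10.200', '/', '22']]
print(bare_host)    # [['169.136.241.68']]
```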


3. tableList = OneOrMore(Optional("\\") |
               ip4Network | ip4Address | Suppress(",") | Literal("!"))

The list of ip4Networks is just a comma-delimited list, with some
entries preceded with a '!' character.  It is simpler to use
pyparsing's built-in helper, delimitedList, as in:

tableList = Group( delimitedList(Group("!"+ip4Network)|ip4Network) )
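To see what delimitedList gives back, the sketch below parses just the contents of the braces from the INTRANET line. The definitions are restated so the snippet runs on its own; the variable name entries is only for illustration.

```python
from pyparsing import Group, Optional, Word, delimitedList, nums

ip4Address = Word(nums + ".")
ip4Network = Group(ip4Address + Optional(Word("/") + Word(nums)))

# delimitedList matches comma-separated items and drops the commas.
# A negated entry is wrapped in an extra Group so the "!" stays
# attached to the network it applies to.
tableList = Group(delimitedList(Group("!" + ip4Network) | ip4Network))

entries = tableList.parseString("! 10.200.2/24, 10.200/22").asList()
print(entries)
# [[['!', ['10.200.2', '/', '24']], ['10.200', '/', '22']]]
```

Putting the negated alternative first matters here, since the plain ip4Network alternative cannot match an entry that starts with "!".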


Yes, I know, you are saying, "but what about all those backslashes?"
The backslashes look like they are just there as line continuations.
We can define an ignore expression, so that the table expression, and
all of its contained expressions, will ignore '\' characters as line
continuations:

table.ignore( Literal("\\") + LineEnd() )

And I'm not sure why you had trouble with ignoring '#' + restOfLine;
it works fine in the program below.
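Here is a small sketch of the two ignore expressions working together on a shortened version of the KS table. The comment line in the sample text is made up for the test, and the grammar is condensed from the definitions discussed above so the snippet is self-contained.

```python
from pyparsing import (Group, LineEnd, Literal, Optional, Regex, Suppress,
                       Word, alphanums, delimitedList, nums, restOfLine)

ip4Address = Word(nums + ".")
ip4Network = Group(ip4Address + Optional(Word("/") + Word(nums)))
tableList = Group(delimitedList(Group("!" + ip4Network) | ip4Network))
table = (Suppress("table") + Suppress("<") + Word(alphanums + "-_")
         + Suppress(">") + Regex("persist|const")
         + Suppress("{") + tableList + Suppress("}"))

# Skip backslash-newline continuations and '#' comments wherever they occur.
table.ignore(Literal("\\") + LineEnd())
table.ignore(Literal("#") + restOfLine)

sample = (
    "# hypothetical comment line\n"
    "table <KS> const { \\\n"
    "   10.205.1/24, \\\n"
    "   169.136.241.68 }\n"
)
parsed = table.parseString(sample).asList()
print(parsed)
# ['KS', 'const', [['10.205.1', '/', '24'], ['169.136.241.68']]]
```

Because ignore() is called on the outermost expression, it is passed down to the expressions inside it, so the continuations are skipped even in the middle of the delimited list.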

If you make these changes, your program will look something like this:

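from pyparsing import (Group, LineEnd, Literal, OneOrMore, Optional, Regex,
    Suppress, Word, alphanums, delimitedList, nums, restOfLine)

tableName = Word(alphanums + "-" + "_")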
leftClose = Suppress("<")
rightClose = Suppress(">")
key = Suppress("table")
tableType = Regex("persist|const")
ip4Address = Word(nums + ".")
ip4Network = Group(ip4Address + Optional(Word("/") + Word(nums)))
tableList = Group(delimitedList(Group("!"+ip4Network)|ip4Network))
leftParen = Suppress("{")
rightParen = Suppress("}")

table = key + leftClose + tableName + rightClose + tableType + \
                  leftParen + tableList + rightParen
table.ignore(Literal("\\") + LineEnd())
table.ignore(Literal("#") + restOfLine)

# parse the input (line holds the full text of the file), and pprint
# the results
from pprint import pprint
result = OneOrMore(table).parseString(line)
pprint(result.asList())

Prints out:
['ALINK',
 'const',
 [['207.135.103.128', '/', '26'], ['207.135.112.64', '/', '29']],
 'INTRANET',
 'persist',
 [['!', ['10.200.2', '/', '24']], ['10.200', '/', '22']],
 'RFC_1918',
 'const',
 [['192.168', '/', '16'],
  ['!', ['172.24.1', '/', '29']],
  ['172.16', '/', '12'],
  ['169.254', '/', '16']],
 'DIALER',
 'persist',
 [['10.202', '/', '22']],
 'RAVPN',
 'const',
 [['10.206', '/', '22']],
 'KS',
 'const',
 [['10.205.1', '/', '24'],
  ['169.136.241.68'],
  ['169.136.241.70'],
  ['169.136.241.71'],
  ['169.136.241.72'],
  ['169.136.241.75'],
  ['169.136.241.76'],
  ['169.136.241.77'],
  ['169.136.241.78'],
  ['169.136.241.79'],
  ['169.136.241.81'],
  ['169.136.241.82'],
  ['169.136.241.85']]]

-- Paul


