help with pyparsing
Prabhu Gurumurthy
pgurumur at gmail.com
Mon Dec 10 11:04:32 EST 2007
Paul McGuire wrote:
> On Dec 9, 11:01 pm, Prabhu Gurumurthy <pguru... at gmail.com> wrote:
>>
>> All,
>>
>> I have the following lines that I would like to parse in python using
>> pyparsing, but have some problems forming the grammar.
>>
>> Line in file:
>> table <ALINK> const { 207.135.103.128/26, 207.135.112.64/29 }
>> table <INTRANET> persist { ! 10.200.2/24, 10.200/22 }
>> table <RFC_1918> const { 192.168/16, ! 172.24.1/29, 172.16/12, 169.254/16 }
>> table <DIALER> persist { 10.202/22 }
>> table <RAVPN> const { 10.206/22 }
>> table <KS> const { \
>> 10.205.1/24, \
>> 169.136.241.68, \
>> 169.136.241.70, \
>> 169.136.241.71, \
>> 169.136.241.72, \
>> 169.136.241.75, \
>> 169.136.241.76, \
>> 169.136.241.77, \
>> 169.136.241.78, \
>> 169.136.241.79, \
>> 169.136.241.81, \
>> 169.136.241.82, \
>> 169.136.241.85 }
>>
>> I have the following grammar definition:
>>
>> tableName = Word(alphanums + "-" + "_")
>> leftClose = Suppress("<")
>> rightClose = Suppress(">")
>> key = Suppress("table")
>> tableType = Regex("persist|const")
>> ip4Address = OneOrMore(Word(nums + "."))
>> ip4Network = Group(ip4Address + Optional(Word("/") +
>> OneOrMore(Word(nums))))
>> temp = ZeroOrMore("\\" + "\n")
>> tableList = OneOrMore(Optional("\\") |
>> ip4Network | ip4Address | Suppress(",") | Literal("!"))
>> leftParen = Suppress("{")
>> rightParen = Suppress("}")
>>
>> table = key + leftClose + tableName + rightClose + tableType + \
>> leftParen + tableList + rightParen
>>
>> I cannot seem to match the sixth table in the file above, i.e. the one
>> named KS. How do I form the grammar for it? Also, I still cannot seem to
>> ignore comments using table.ignore(Literal("#") + restOfLine); I get a
>> parse error.
>>
>> Any help appreciated.
>> Thanks
>> Prabhu
>
> Prabhu -
>
> This is a good start, but here are some suggestions:
>
> 1. ip4Address = OneOrMore(Word(nums + "."))
>
> Word(nums+".") will read any contiguous set of characters in the
> string nums+".", so OneOrMore is not necessary for reading in an
> ip4Address. Just use:
>
> ip4Address = Word(nums + ".")
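To see point 1 in action, here is a minimal sketch (it assumes pyparsing is installed and uses the camelCase method names from this thread; the sample address comes from the KS table above). Both expressions yield the same single token, because Word already consumes the whole run of digits and dots. Note that Word(nums + ".") does no validation, so it would equally accept a string such as "1..2".

```python
from pyparsing import Word, OneOrMore, nums

# Word matches one or more contiguous characters from the given set.
ip4Address = Word(nums + ".")

# The original form, wrapped in a redundant OneOrMore.
wrapped = OneOrMore(Word(nums + "."))

plain_tokens = ip4Address.parseString("169.136.241.68").asList()
wrapped_tokens = wrapped.parseString("169.136.241.68").asList()

print(plain_tokens)    # ['169.136.241.68']
print(wrapped_tokens)  # ['169.136.241.68']
```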
>
>
> 2. ip4Network = Group(ip4Address + Optional(Word("/") +
> OneOrMore(Word(nums))))
>
> Same comment: OneOrMore is not needed around the Word(nums) that
> follows the "/":
>
> ip4Network = Group(ip4Address + Optional(Word("/") + Word(nums)))
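As a quick check of point 2, the sketch below (assuming pyparsing is installed; the inputs are taken from the KS table) parses one entry with a prefix length and one without. The Optional part simply contributes no tokens when the "/" is absent, and Group keeps each entry in its own sub-list.

```python
from pyparsing import Word, Group, Optional, nums

ip4Address = Word(nums + ".")
# An address, optionally followed by "/" and a prefix length.
ip4Network = Group(ip4Address + Optional(Word("/") + Word(nums)))

with_prefix = ip4Network.parseString("10.205.1/24").asList()
without_prefix = ip4Network.parseString("169.136.241.68").asList()

print(with_prefix)     # [['10.205.1', '/', '24']]
print(without_prefix)  # [['169.136.241.68']]
```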
>
>
> 3. tableList = OneOrMore(Optional("\\") |
> ip4Network | ip4Address | Suppress(",") |
> Literal("!"))
>
> The list of ip4Networks is just a comma-delimited list, with some
> entries preceded with a '!' character. It is simpler to use
> pyparsing's built-in helper, delimitedList, as in:
>
> tableList = Group( delimitedList(Group("!"+ip4Network)|ip4Network) )
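The delimitedList suggestion can be tried on its own with the contents of the INTRANET table. This is a minimal sketch assuming pyparsing is installed; delimitedList suppresses the commas by default, and the alternative starting with "!" is tried first, so negated entries end up in their own nested group.

```python
from pyparsing import Word, Group, Optional, delimitedList, nums

ip4Address = Word(nums + ".")
ip4Network = Group(ip4Address + Optional(Word("/") + Word(nums)))

# Comma-separated entries, each optionally preceded by "!".
tableList = Group(delimitedList(Group("!" + ip4Network) | ip4Network))

entries = tableList.parseString("! 10.200.2/24, 10.200/22").asList()[0]

print(entries)
# [['!', ['10.200.2', '/', '24']], ['10.200', '/', '22']]
```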
>
>
> Yes, I know, you are saying, "but what about all those backslashes?"
> The backslashes look like they are just there as line continuations.
> We can define an ignore expression, so that the table expression, and
> all of its contained expressions, will ignore '\' characters as line
> continuations:
>
> table.ignore( Literal("\\") + LineEnd() )
>
> And I'm not sure why you had trouble with ignoring '#' + restOfLine,
> it works fine in the program below.
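To isolate how the two ignore expressions behave, here is a stripped-down sketch (assuming pyparsing is installed). It is not the table grammar itself: the `items` expression and the sample text, including its comment line, are made up for illustration. Backslash-newline pairs and "#" comments are skipped wherever they occur between tokens.

```python
from pyparsing import Word, Literal, LineEnd, delimitedList, restOfLine, nums

# Hypothetical simplified grammar: a comma-separated list of addresses,
# with "/" folded into the word characters to keep the example short.
items = delimitedList(Word(nums + "./"))

# Skip a backslash that is immediately followed by the end of the line.
items.ignore(Literal("\\") + LineEnd())
# Skip "#" comments running to the end of the line.
items.ignore(Literal("#") + restOfLine)

text = "\n".join([
    "10.205.1/24, \\",
    "# a comment between entries",
    "169.136.241.68, \\",
    "169.136.241.70",
])

addresses = items.parseString(text).asList()
print(addresses)  # ['10.205.1/24', '169.136.241.68', '169.136.241.70']
```

Newer pyparsing releases also provide snake_case spellings such as parse_string; the camelCase names used throughout this thread are kept here for consistency.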
>
> If you make these changes, your program will look something like this:
>
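> from pyparsing import (Word, Suppress, Regex, Group, Optional, OneOrMore,
>     Literal, LineEnd, delimitedList, restOfLine, alphanums, nums)
>
> tableName = Word(alphanums + "-" + "_")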
> leftClose = Suppress("<")
> rightClose = Suppress(">")
> key = Suppress("table")
> tableType = Regex("persist|const")
> ip4Address = Word(nums + ".")
> ip4Network = Group(ip4Address + Optional(Word("/") + Word(nums)))
> tableList = Group(delimitedList(Group("!"+ip4Network)|ip4Network))
> leftParen = Suppress("{")
> rightParen = Suppress("}")
>
> table = key + leftClose + tableName + rightClose + tableType + \
> leftParen + tableList + rightParen
> table.ignore(Literal("\\") + LineEnd())
> table.ignore(Literal("#") + restOfLine)
>
> # parse the input text (line holds the file contents shown above)
> # and pprint the results
> result = OneOrMore(table).parseString(line)
> from pprint import pprint
> pprint(result.asList())
>
> Prints out:
> ['ALINK',
>  'const',
>  [['207.135.103.128', '/', '26'], ['207.135.112.64', '/', '29']],
>  'INTRANET',
>  'persist',
>  [['!', ['10.200.2', '/', '24']], ['10.200', '/', '22']],
>  'RFC_1918',
>  'const',
>  [['192.168', '/', '16'],
>   ['!', ['172.24.1', '/', '29']],
>   ['172.16', '/', '12'],
>   ['169.254', '/', '16']],
>  'DIALER',
>  'persist',
>  [['10.202', '/', '22']],
>  'RAVPN',
>  'const',
>  [['10.206', '/', '22']],
>  'KS',
>  'const',
>  [['10.205.1', '/', '24'],
>   ['169.136.241.68'],
>   ['169.136.241.70'],
>   ['169.136.241.71'],
>   ['169.136.241.72'],
>   ['169.136.241.75'],
>   ['169.136.241.76'],
>   ['169.136.241.77'],
>   ['169.136.241.78'],
>   ['169.136.241.79'],
>   ['169.136.241.81'],
>   ['169.136.241.82'],
>   ['169.136.241.85']]]
>
> -- Paul
Awesome, thanks a lot. I will try it today and let you know how it
goes.
Thanks again.
Prabhu