Newbie code review of parsing program Please

Paul McGuire ptmcg at austin.rr.com
Mon Nov 17 09:01:47 EST 2008


On Nov 16, 12:53 pm, len <lsumn... at gmail.com> wrote:
> On Nov 16, 12:40 pm, "Mark Tolonen" <M8R-yft... at mailinator.com> wrote:
>
>
> > You might want to check out the pyparsing library.
>
> > -Mark
>
> Thanks Mark, I will check it out right now.
>
> Len

Len -

Here is a rough pyparsing starter for your problem:

from pyparsing import *

COMP = Optional("USAGE IS") + oneOf("COMP COMPUTATIONAL")
PIC = oneOf("PIC PICTURE") + Optional("IS")
PERIOD,LPAREN,RPAREN = map(Suppress,".()")

ident = Word(alphanums.upper()+"_-")
integer = Word(nums).setParseAction(lambda t:int(t[0]))
lineNum = Suppress(Optional(LineEnd()) + LineStart() + Word(nums))

rep = LPAREN + integer + RPAREN
repchars = "X" + rep
repchars.setParseAction(lambda tokens: ['X']*tokens[1])
strdecl = Combine(OneOrMore(repchars | "X"))

SIGN = Optional("S")
repdigits = "9" + rep
repdigits.setParseAction(lambda tokens: ['9']*tokens[1])
intdecl = SIGN("sign") + \
                Combine(OneOrMore(repdigits | "9"))("intpart")
realdecl = SIGN("sign") + \
                Combine(OneOrMore(repdigits | "9"))("intpart") + "V" + \
                Combine(OneOrMore(repdigits | "9"))("realpart")

type = Group((strdecl | realdecl | intdecl) +
                Optional(COMP("COMP")))

fieldDecl = lineNum + "05" + ident("name") + \
                PIC + type("type") + PERIOD
structDecl = lineNum + "01" + ident("name") + PERIOD + \
                OneOrMore(Group(fieldDecl))("fields")

It prints out:

SALESMEN-RECORD
   SALESMEN-NO ['999']
   SALESMEN-NAME ['XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX']
   SALESMEN-TERRITORY ['XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX']
   SALESMEN-QUOTA ['S', '9999999', 'COMP']
   SALESMEN-1ST-BONUS ['S', '99999', 'V', '99', 'COMP']
   SALESMEN-2ND-BONUS ['S', '99999', 'V', '99', 'COMP']
   SALESMEN-3RD-BONUS ['S', '99999', 'V', '99', 'COMP']
   SALESMEN-4TH-BONUS ['S', '99999', 'V', '99', 'COMP']
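
To see the repeat-count expansion and the results names in isolation, here is a minimal sketch that parses a single PIC string instead of a whole record. The input "S9(5)V99" and the helper name `digits` are my own assumptions for illustration; everything else follows the grammar above.

```python
from pyparsing import Combine, OneOrMore, Optional, Suppress, Word, nums

LPAREN, RPAREN = map(Suppress, "()")
integer = Word(nums).setParseAction(lambda t: int(t[0]))
rep = LPAREN + integer + RPAREN

# "9(5)" parses to ["9", 5]; the parse action replaces that with five "9" tokens
repdigits = ("9" + rep).setParseAction(lambda t: ["9"] * t[1])

# Combine joins the expanded tokens back into a single string
# ("digits" is a hypothetical helper name, not part of the program above)
digits = Combine(OneOrMore(repdigits | "9"))

realdecl = Optional("S")("sign") + digits("intpart") + "V" + digits("realpart")

result = realdecl.parseString("S9(5)V99")
print(result.asList())
print(result.sign, len(result.intpart), len(result.realpart))
```

The token list should have the same shape as the SALESMEN-1ST-BONUS line above, minus the trailing 'COMP', and the named results give direct access to the sign and to each run of digits.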

I too have some dim, dark memories of COBOL.  I seem to recall having
to infer from the number of digits in an integer or real what size the
number would be.  I don't have that logic implemented, but here is an
extension to the above program, which shows you where you could put
this kind of type inference logic (insert this code before the call to
searchString):

class TypeDefn(object):
    @staticmethod
    def intType(tokens):
        self = TypeDefn()
        self.str = "int(%d)" % (len(tokens.intpart),)
        self.isSigned = bool(tokens.sign)
        return self
    @staticmethod
    def realType(tokens):
        self = TypeDefn()
        self.str = "real(%d.%d)" % (len(tokens.intpart),
                                    len(tokens.realpart))
        self.isSigned = bool(tokens.sign)
        return self
    @staticmethod
    def charType(tokens):
        self = TypeDefn()
        # strdecl is a Combine, so tokens[0] holds the expanded string
        self.str = "char(%d)" % len(tokens[0])
        self.isSigned = False
        self.isComp = False
        return self
    def __repr__(self):
        return ("+-" if self.isSigned else "") + self.str
intdecl.setParseAction(TypeDefn.intType)
realdecl.setParseAction(TypeDefn.realType)
strdecl.setParseAction(TypeDefn.charType)

This prints:

SALESMEN-RECORD
   SALESMEN-NO [int(3)]
   SALESMEN-NAME [char(30)]
   SALESMEN-TERRITORY [char(30)]
   SALESMEN-QUOTA [+-int(7), 'COMP']
   SALESMEN-1ST-BONUS [+-real(5.2), 'COMP']
   SALESMEN-2ND-BONUS [+-real(5.2), 'COMP']
   SALESMEN-3RD-BONUS [+-real(5.2), 'COMP']
   SALESMEN-4TH-BONUS [+-real(5.2), 'COMP']
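
The size inference itself, which the program above does not implement, might look something like the sketch below. It assumes the IBM-style rule for binary (COMP) fields as I remember it: 1 to 4 digits take 2 bytes, 5 to 9 take 4 bytes, and 10 to 18 take 8 bytes. Other compilers may differ, so check yours. The function name `comp_storage_bytes` is made up for this example.

```python
def comp_storage_bytes(int_digits, frac_digits=0):
    """Guess the byte size of a binary (COMP) field from its digit count.

    Assumption: IBM-style sizing (1-4 digits -> 2 bytes, 5-9 -> 4 bytes,
    10-18 -> 8 bytes). Treat this as a starting point, not a spec.
    """
    total = int_digits + frac_digits
    if total < 1 or total > 18:
        raise ValueError("unsupported digit count: %d" % total)
    if total <= 4:
        return 2
    if total <= 9:
        return 4
    return 8


print(comp_storage_bytes(7))     # e.g. S9(7) COMP
print(comp_storage_bytes(5, 2))  # e.g. S9(5)V99 COMP
```

A call like this could go inside TypeDefn.intType and TypeDefn.realType, using len(tokens.intpart) and len(tokens.realpart) as the arguments.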

You can post more questions about pyparsing on the Discussion tab of
the pyparsing wiki home page.

Best of luck!
-- Paul


