Newbie code review of parsing program Please

len lsumnler at gmail.com
Sun Nov 16 13:53:59 EST 2008


On Nov 16, 12:40 pm, "Mark Tolonen" <M8R-yft... at mailinator.com> wrote:
> "len" <lsumn... at gmail.com> wrote in message
>
> news:fc3ef718-edc4-4892-8418-3eeff0975edc at u18g2000pro.googlegroups.com...
>
> > I have created the following program to read a text file which happens
> > to be a COBOL field definition.  The program then writes a file
> > containing a list definition which I can later copy and paste into a
> > Python program.  I will eventually expand the program to also output
> > an SQL script to create the corresponding table in MySQL.
>
> > The program still needs a little work; it does not handle the following
> > items yet:
>
> > 1.  It does not handle OCCURS yet.
> > 2.  It does not handle REDEFINES yet.
> > 3.  GROUP structures will need work.
> > 4.  Does not create SQL script yet.
>
> > I anticipate that any files created by this program may need manual
> > tweaking, but I have a large number of COBOL file definitions which I
> > may need to work with, and this seemed like a better solution than
> > typing each list definition and SQL create script by hand.
>
> > What I would like is if some kind soul could review my code and give me
> > some suggestions on how I might improve it.  I think the use of regular
> > expressions might cut the code down or at least simplify the parsing,
> > but I'm just starting to read those chapters in the book ;)
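
A minimal sketch of how a regular expression could do the field parsing in one step. The pattern `FIELD_RE` and the helper `parse_field` are hypothetical names, not part of the program quoted here. The sketch assumes every field line has a six-digit sequence number, a two-digit level, and a PIC clause of the form shown in the sample input, and it treats a signed field without COMP as 'Xint', which is only a guess.

```python
import re

# Hypothetical pattern for lines such as:
#   001000     05  SALESMEN-1ST-BONUS         PIC S9(5)V99 COMP.
FIELD_RE = re.compile(
    r"""^\s*\d{6}\s+                        # six-digit sequence number
        (?P<level>\d{2})\s+                 # level number, e.g. 05
        (?P<name>[A-Z0-9-]+)\s+             # field name
        PIC\s+
        (?P<sign>S?)                        # optional sign
        (?P<kind>[X9])\((?P<digits>\d+)\)   # X(30) or 9(3)
        (?:V(?P<dec>9+))?                   # optional implied decimals
        (?:\s+(?P<usage>COMP))?             # optional COMP usage
        \s*\.\s*$                           # terminating period
    """,
    re.VERBOSE,
)


def parse_field(line):
    """Return (name, size, type, decimals), or None if the line is not a field."""
    m = FIELD_RE.match(line)
    if m is None:
        return None
    digits = int(m.group("digits"))
    dec = len(m.group("dec") or "")
    name = m.group("name").replace("-", "_").lower()
    if m.group("usage"):
        # Same sizing rule as the original program: digits // 2 + 1 bytes.
        return (name, (digits + dec) // 2 + 1, "Pdec", dec)
    kind = "Xstr" if m.group("kind") == "X" else "Xint"
    return (name, digits, kind, 0)
```

Lines that carry no PIC clause, such as the FD and 01 lines, simply fail to match and come back as None, so the caller can skip them.
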
>
> > *** SAMPLE INPUT FILE ***
>
> > 000100 FD  SALESMEN-FILE
> > 000200     LABEL RECORDS ARE STANDARD
> > 000300     VALUE OF FILENAME IS "SALESMEN".
> > 000400
> > 000500 01  SALESMEN-RECORD.
> > 000600     05  SALESMEN-NO                PIC 9(3).
> > 000700     05  SALESMEN-NAME              PIC X(30).
> > 000800     05  SALESMEN-TERRITORY         PIC X(30).
> > 000900     05  SALESMEN-QUOTA             PIC S9(7) COMP.
> > 001000     05  SALESMEN-1ST-BONUS         PIC S9(5)V99 COMP.
> > 001100     05  SALESMEN-2ND-BONUS         PIC S9(5)V99 COMP.
> > 001200     05  SALESMEN-3RD-BONUS         PIC S9(5)V99 COMP.
> > 001300     05  SALESMEN-4TH-BONUS         PIC S9(5)V99 COMP.
>
> > *** PROGRAM CODE ***
>
> > #!/usr/bin/python
>
> > import sys
>
> > f_path = '/home/lenyel/Bruske/MCBA/Internet/'
> > f_name = sys.argv[1]
>
> > fd = open(f_path + f_name, 'r')
>
> > def fmtline(fieldline):
> >    size = ''
> >    type = ''
> >    dec = ''
> >    codeline = []
> >    if fieldline.count('COMP.') > 0:
> >        left = fieldline[3].find('(') + 1
> >        right = fieldline[3].find(')')
> >        num = fieldline[3][left:right].lstrip()
> >        if fieldline[3].count('V'):
> >            left = fieldline[3].find('V') + 1
> >            dec = len(fieldline[3][left:])
> >            size = ((int(num) + dec) // 2) + 1
> >        else:
> >            size = (int(num) // 2) + 1
> >            dec = 0
> >        type = 'Pdec'
> >    elif fieldline[3][0] in ('X', '9'):
> >        dec = 0
> >        left = fieldline[3].find('(') + 1
> >        right = fieldline[3].find(')')
> >        size = int(fieldline[3][left:right].lstrip('0'))
> >        if fieldline[3][0] == 'X':
> >            type = 'Xstr'
> >        else:
> >            type = 'Xint'
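> >    else:
> >        # assumed intent: signed display fields such as S9(5); the
> >        # original test for 'X' here could never be true
> >        dec = 0
> >        left = fieldline[3].find('(') + 1
> >        right = fieldline[3].find(')')
> >        size = int(fieldline[3][left:right].lstrip('0'))
> >        if fieldline[3][0] == 'S':
> >            type = 'Xint'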
> >    codeline.append(fieldline[1].replace('-', '_').replace('.', '').lower())
> >    codeline.append(size)
> >    codeline.append(type)
> >    codeline.append(dec)
> >    return codeline
>
> > wrkfd = []
> > rec_len = 0
>
> > for line in fd:
> >    if line[6:7] == '*':    # drop comment lines
> >        continue
> >    newline = line.split()
> >    if len(newline) <= 1:   # drop blank lines
> >        continue
> >    newline = newline[1:]
> >    if 'FILENAME' in newline:
> >        filename = newline[-1].replace('"','').lower()
> >        filename = filename.replace('.','')
> >        output = open('/home/lenyel/Bruske/MCBA/Internet/' + filename + '.fd', 'w')
> >        code = filename + ' = [\n'
> >        output.write(code)
> >    elif newline[0].isdigit() and 'PIC' in newline:
> >        wrkfd.append(fmtline(newline))
> >        rec_len += wrkfd[-1][1]
>
> > fd.close()
>
> > fmtfd = []
>
> > for wrkline in wrkfd[:-1]:
> >    fmtline = str(tuple(wrkline)) + ',\n'
> >    output.write(fmtline)
>
> > fmtline = tuple(wrkfd[-1])
> > fmtline = str(fmtline) + '\n'
> > output.write(fmtline)
>
> > lastline = ']\n'
> > output.write(lastline)
>
> > lenrec = filename + '_len = ' + str(rec_len)
> > output.write(lenrec)
>
> > output.close()
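
The sizing rule that fmtline applies to COMP fields can be pulled out into a small helper and checked against the sample record. This is only a sketch: `packed_size` is a hypothetical name, and whether COMP fields are really stored this way depends on the COBOL compiler in use.

```python
def packed_size(digits, decimals=0):
    """Bytes under the program's rule: total digits // 2, plus one."""
    return (digits + decimals) // 2 + 1


# Field sizes for the sample SALESMEN record:
# 9(3), X(30), X(30), S9(7) COMP, and four S9(5)V99 COMP fields.
sizes = [3, 30, 30, packed_size(7)] + [packed_size(5, 2)] * 4
record_len = sum(sizes)  # 83 for the sample record
```

Using `//` keeps the result an integer on any Python version, whereas a plain `/` only does so where integer division is the default.
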
>
> > *** RESULTING OUTPUT ***
>
> > salesmen = [
> > ('salesmen_no', 3, 'Xint', 0),
> > ('salesmen_name', 30, 'Xstr', 0),
> > ('salesmen_territory', 30, 'Xstr', 0),
> > ('salesmen_quota', 4, 'Pdec', 0),
> > ('salesmen_1st_bonus', 4, 'Pdec', 2),
> > ('salesmen_2nd_bonus', 4, 'Pdec', 2),
> > ('salesmen_3rd_bonus', 4, 'Pdec', 2),
> > ('salesmen_4th_bonus', 4, 'Pdec', 2)
> > ]
> > salesmen_len = 83
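
A rough sketch of how the planned SQL script could be produced from a list in the format above. The function name `create_table_sql` and the type mapping are assumptions, not something the program does today: 'Xstr' becomes CHAR, 'Xint' becomes INT or BIGINT depending on digit count, and 'Pdec' becomes DECIMAL sized from the byte count (a field of n bytes holds at most 2n - 1 digits under the program's sizing rule).

```python
def create_table_sql(table, fields):
    """Build a CREATE TABLE statement from (name, size, type, decimals) tuples."""
    columns = []
    for name, size, kind, dec in fields:
        if kind == "Xstr":
            sql_type = "CHAR(%d)" % size
        elif kind == "Xint":
            # Assumed mapping: up to 9 digits fits INT, otherwise BIGINT.
            sql_type = "INT" if size <= 9 else "BIGINT"
        else:
            # 'Pdec': size is in bytes, so 2 * size - 1 digits at most.
            sql_type = "DECIMAL(%d, %d)" % (2 * size - 1, dec)
        columns.append("    %s %s" % (name, sql_type))
    return "CREATE TABLE %s (\n%s\n);\n" % (table, ",\n".join(columns))


salesmen = [
    ("salesmen_no", 3, "Xint", 0),
    ("salesmen_name", 30, "Xstr", 0),
    ("salesmen_1st_bonus", 4, "Pdec", 2),
]
script = create_table_sql("salesmen", salesmen)
```

The column types would still need a manual look, since the list records byte sizes for packed fields and the original digit counts are not kept.
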
>
> > If you find this code useful please feel free to use any or all of it
> > at your own risk.
>
> > Thanks
> > Len S
>
> You might want to check out the pyparsing library.
>
> -Mark

Thanks Mark, I will check it out right now.

Len


