parsing

Terry Reedy tjreedy at udel.edu
Wed Jun 23 12:03:13 EDT 2004


"Todd Moyer" <tmoyer at inventa.com> wrote in message
news:mailman.44.1088000278.27577.python-list at python.org...
>
> I would like to use Python to parse a *python-like* data description
> language.  That is, it would have its own keywords, but would have a
> syntax like Python.  For instance:
>
> Ob1 ('A'):
>     Ob2 ('B'):
>         Ob3 ('D')
>         Ob3 ('E')
>     Ob2 ('C')
>
> I'm looking for the ':' and indentation to provide nested execution so I
> can use a description like the one above to construct an object tree.
>
> In looking at the parser and tokenize sections of the Python Language
> Services (http://docs.python.org/lib/language.html), it looks as though
> this will only parse Python keywords.  Is there a way to tap into Python
> parsing at a lower level so that I can use it to parse my own keywords?
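
On the "lower level" part of the question: as far as I know, the tokenize
module is purely lexical, so it reports any identifier as a NAME token
whether or not it is a Python keyword, and it emits INDENT and DEDENT tokens
for block structure. A rough sketch along those lines, written for Python 3,
might look like this. The function name build_tree and the
(keyword, argument, children) tuple layout are my own assumptions, and it
only handles the simple "Name ('arg')" form shown above.

```python
import ast
import io
import tokenize

SOURCE = """\
Ob1 ('A'):
    Ob2 ('B'):
        Ob3 ('D')
        Ob3 ('E')
    Ob2 ('C')
"""


def build_tree(text):
    """Return a nested list of (keyword, argument, children) tuples.

    Hypothetical helper: it relies only on tokenize's NAME, STRING,
    INDENT and DEDENT tokens and ignores everything else.
    """
    root = []
    stack = [root]  # stack[-1] is the child list currently being filled
    name = None
    for tok in tokenize.generate_tokens(io.StringIO(text).readline):
        if tok.type == tokenize.NAME:
            name = tok.string
        elif tok.type == tokenize.STRING:
            # One node per "Name ('arg')" pair; literal_eval strips the quotes.
            stack[-1].append((name, ast.literal_eval(tok.string), []))
        elif tok.type == tokenize.INDENT:
            # Descend into the children of the most recently added node.
            stack.append(stack[-1][-1][2])
        elif tok.type == tokenize.DEDENT:
            stack.pop()
    return root


if __name__ == "__main__":
    print(build_tree(SOURCE))
```

Mapping each keyword to a real class would then be a dictionary lookup on
the first element of each tuple.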

Perhaps the following, copied from an article in another thread, will
help:
From: "Bram Stolk" <bram at nospam.sara.nl>
Subject: Re: Parsing C Preprocessor files
(I have not checked his code and results, just copy and paste)
================
I would like to thank the people who responded to my question about
preprocessor parsing. However, I think I will just roll my own, as I
found out that it takes a mere 16 lines of code to create a #ifdef tree.

I simply used a combination of lists and tuples. A tuple denotes a #if
block (startline,body,endline). A body is a list of lines/tuples.

This will parse the following text:

Top level line
#if foo
on foo level
#if bar
on bar level
#endif
#endif
#ifdef bla
on bla level
#ifdef q
q
#endif
#if r
r
#endif
#endif

into:

['Top level line\n',
 ('#if foo\n',
  ['on foo level\n',
   ('#if bar\n', ['on bar level\n'], '#endif\n')],
  '#endif\n'),
 ('#ifdef bla\n',
  ['on bla level\n',
   ('#ifdef q\n', ['q\n'], '#endif\n'),
   ('#if r\n', ['r\n'], '#endif\n')],
  '#endif\n')]

Which is very suitable for me.

Code is:

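def parse_block(lines):
    """Consume lines; return plain lines and (start, body, end) tuples."""
    retval = []
    while lines:
        line = lines.pop(0)
        if "#if" in line:
            # Opening directive (#if, #ifdef, #ifndef): parse the nested
            # body, then take the #endif the recursive call pushed back.
            # Assumes every #if has a matching #endif.
            body = parse_block(lines)
            endline = lines.pop(0)
            retval.append((line, body, endline))
        elif "#endif" in line:
            # Closing directive: hand it back to the caller and stop.
            lines.insert(0, line)
            return retval
        else:
            retval.append(line)
    return retval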

And pretty printing with indentation is easy:

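import sys

def traverse_block(block, indent=0):
    """Write the tree to stdout, one tab per nesting level."""
    for item in block:
        if isinstance(item, tuple):
            # (startline, body, endline): directives stay at this level,
            # the body is printed one level deeper.
            sys.stdout.write(indent * "\t" + item[0])
            traverse_block(item[1], indent + 1)
            sys.stdout.write(indent * "\t" + item[2])
        else:
            # Lines keep their own trailing newline, so none is added.
            sys.stdout.write(indent * "\t" + item)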



