Help needed with nested parsing of file into objects

richard pullenjenna10 at gmail.com
Tue Jun 5 15:57:44 EDT 2012


On Jun 4, 3:20 pm, richard <pullenjenn... at gmail.com> wrote:
> On Jun 4, 3:14 pm, Alain Ketterlin <al... at dpt-info.u-strasbg.fr>
> wrote:
>
> > richard <pullenjenn... at gmail.com> writes:
> > > Hi guys, I am having a bit of difficulty finding the best approach /
> > > solution to parsing a file into a list of objects / nested objects. Any
> > > help would be greatly appreciated.
>
> > > #file format to parse .txt
> > > [code]
> > > An instance of TestArray
> > >  a=a
> > >  b=b
> > >  c=c
> > >  List of 2 A elements:
> > >   Instance of A element
>
> > [...]
>
> > Below is a piece of code that seems to work on your data. It builds a
> > raw tree; I leave it to you to adapt it and build the objects you want. The
> > assumption is that the number of leading blanks faithfully denotes
> > depth.
>
> > As noted in another message, you're probably better off using an
> > existing syntax (json, python literals, yaml, xml, ...)
>
> > -- Alain.
>
> > #!/usr/bin/env python
>
> > import sys
> > import re
>
> > RE = re.compile("( *)(.*)")
> > stack = [("-",[])] # tree nodes are: (head,[children])
> > for line in sys.stdin:
> >     matches = RE.match(line)
> >     if len(matches.group(2)) > 0:
> >         depth = 1 + len(matches.group(1))
> >         while len(stack) > depth:
> >             stack[-2][1].append(stack[-1])
> >             del stack[-1]
> >             pass
> >         stack.append( (matches.group(2),[]) )
> >         pass
> >     pass
> > while len(stack) > 1:
> >     stack[-2][1].append(stack[-1])
> >     del stack[-1]
> >     pass
>
> > print(stack)
>
> Thank you both for your replies. Unfortunately it is a pre-existing
> file format imposed by an external system that I can't
> change. Thank you for the code snippet.

Hi guys, I am still struggling to adapt the code that was posted to me on
this forum so that it produces output in the format shown below. This is
what I have so far. Any help will be greatly appreciated.

Output I am trying to achieve:
parsed = [
    {
      "a":"a",
      "b":"b",
      "c":"c",
      "A_elements":[
          {
            "a":1,
            "b":2,
            "c":3
          },
          {
             "a":1,
             "b":2,
             "c":3
          }
       ],
      "B_elements":[
          {
            "a":1,
            "b":2,
            "c":3,
            "C_elements":[
                 {
                     "a":1,
                     "b":2,
                     "c":3
                  },
                  {
                      "a":1,
                      "b":2,
                      "c":3
                  }
             ]
          }
       ]
    },

    {
      "a":"1",
      "b":"2",
      "c":"3",
    }

]

File format (cannot be changed):

An instance of TestArray
 a=a
 b=b
 c=c
 List of 2 A elements:
  Instance of A element
   a=1
   b=2
   c=3
  Instance of A element
   d=1
   e=2
   f=3
 List of 1 B elements
  Instance of B element
   a=1
   b=2
   c=3
   List of 2 C elements
    Instance of C element
     a=1
     b=2
     c=3
    Instance of C element
     a=1
     b=2
     c=3

An instance of TestArray
 a=1
 b=2
 c=3

import re

def test_parser(filename):
    class Stanza:
        def __init__(self, values):
            # values is a list of (attribute name, value) pairs
            for attr, val in values:
                setattr(self, attr, val)

    def build(couple):
        # couple is a raw tree node: (head line, list of built children)
        head, children = couple
        if "=" in head:
            attr, val = head.split("=", 1)
            return attr, val
        elif head.startswith("Instance of"):
            match = re.search(r"Instance of (.+) element", head)
            return ("attr_%s" % match.group(1), Stanza(children))
        elif head.startswith("List of"):
            # \d+ so that counts of 10 or more also match
            match = re.search(r"List of \d+ (.+) elements", head)
            return ("%s_elements" % match.group(1), children)
        elif head.startswith("An instance of"):
            # top-level record (note the lowercase "instance")
            return Stanza(children)

    RE = re.compile("( *)(.*)")
    stack = [("-", [])]  # tree nodes are: (head, [children])
    with open(filename, "r") as fo:
        for line in fo:
            matches = RE.match(line)
            if len(matches.group(2)) > 0:
                depth = 1 + len(matches.group(1))
                while len(stack) > depth:
                    stack[-2][1].append(build(stack[-1]))
                    del stack[-1]
                stack.append((matches.group(2), []))
    while len(stack) > 1:
        # nodes still open at end of file need build() as well
        stack[-2][1].append(build(stack[-1]))
        del stack[-1]
    return stack

stanzas = test_parser("test.txt")
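
Since the target output is plain dicts and lists rather than objects, one
possible approach is to keep a stack of the currently open containers and
fill them directly while reading. The following is only a minimal sketch
under some assumptions: the names parse_lines, convert and LIST_RE are made
up for illustration, indentation is assumed to be spaces only, the colon
after "elements" is treated as optional, and purely numeric values are
turned into ints (the desired output is not consistent on that point).

```python
import re

# Matches "List of 2 A elements:" with or without the trailing colon.
LIST_RE = re.compile(r"List of \d+ (.+) elements:?$")


def convert(value):
    """Assumption: all-digit values become ints, everything else stays a string."""
    return int(value) if value.isdigit() else value


def parse_lines(lines):
    """Turn indentation-structured lines into nested dicts and lists."""
    result = []  # top-level list, one dict per "An instance of ..." block
    # Each stack entry is (indent, container); a container is a dict or a list.
    stack = [(-1, result)]
    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # blank lines only separate blocks
        indent = len(raw) - len(raw.lstrip(" "))
        # Close every container opened at this depth or deeper.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        list_match = LIST_RE.match(text)
        if list_match:
            # "List of N X elements" opens a list stored under "X_elements".
            children = []
            parent[list_match.group(1) + "_elements"] = children
            stack.append((indent, children))
        elif "=" in text:
            key, value = text.split("=", 1)
            parent[key] = convert(value)
        else:
            # "An instance of X" / "Instance of X element" opens a new dict.
            record = {}
            parent.append(record)
            stack.append((indent, record))
    return result
```

A file object can be passed in directly, for example
"with open("test.txt") as fo: parsed = parse_lines(fo)". Malformed input
(such as a key=value line directly under a "List of" line) is not handled
and will raise an exception.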


