Help needed with nested parsing of file into objects

richard pullenjenna10 at gmail.com
Mon Jun 4 08:57:14 EDT 2012


Hi guys, I am having a bit of difficulty finding the best approach /
solution to parsing a file into a list of objects / nested objects. Any
help would be greatly appreciated.

#format of the .txt file to parse
[code]
An instance of TestArray
 a=a
 b=b
 c=c
 List of 2 A elements:
  Instance of A element
   a=1
   b=2
   c=3
  Instance of A element
   d=1
   e=2
   f=3
 List of 1 B elements
  Instance of B element
   a=1
   b=2
   c=3
   List of 2 C elements
    Instance of C element
     a=1
     b=2
     c=3
    Instance of C element
     a=1
     b=2
     c=3

An instance of TestArray
 a=1
 b=2
 c=3
[/code]

Expected output:
A list of 2 TestArray objects, these being the parents. The first one has an
attribute holding a list of the 2 Instance of A objects (the parent's
children) and another attribute holding a list of just the 1 child Instance
of B object. That child object in turn has an attribute holding a list of
the 2 Instance of C objects. The nesting could go deeper; this is just an
example. An instance of TestArray may or may not have any nesting at all,
as illustrated by the second TestArray. Basically I just want to create a
list of objects, where each object may or may not contain more nested
objects as attributes, but I need a generic way to do it that works for
any depth.

#the resulting list of objects, with the objects printed as dicts

[code]
parsed = [
    {
      "a":"a",
      "b":"b",
      "c":"c",
      "A_elements":[
          {
            "a":1,
            "b":2,
            "c":3
          },
          {
             "d":1,
             "e":2,
             "f":3
          }
       ],
      "B_elements":[
          {
            "a":1,
            "b":2,
            "c":3,
            "C_elements":[
                 {
                     "a":1,
                     "b":2,
                     "c":3
                  },
                  {
                      "a":1,
                      "b":2,
                      "c":3
                  }
             ]
          }
       ]
    },

    {
      "a":"1",
      "b":"2",
      "c":"3",
    }

]

[/code]

#This is what I have so far. It works with the 2nd instance, but I can't
figure out the best way to handle the multi-nested objects.

[code]
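import re

def test_parser(filename):
    # Collects one parentStanza per "An instance of TestArray" header.
    parent_stanza = None
    stanzas = []

    class parentStanza:
        pass

    with open(filename) as fo:
        for line in fo:
            # strip() discards the indentation, so nested "a=1" lines
            # overwrite the parent's attributes of the same name.
            line = line.strip()
            if re.search("An instance of TestArray", line):
                if parent_stanza:
                    stanzas.append(parent_stanza)
                parent_stanza = parentStanza()
            if parent_stanza and "=" in line:
                # Split on the first "=" only, so values may contain "=".
                attr, val = line.split("=", 1)
                setattr(parent_stanza, attr, val)
    # Append the final stanza, if the file contained any.
    if parent_stanza:
        stanzas.append(parent_stanza)
    return stanzas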

stanzas = test_parser("test.txt")

import pprint
for stanza in stanzas:
    pprint.pprint(stanza.__dict__)
    n=raw_input("paused")
[/code]
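
One possible way to handle arbitrary depth is to keep the indentation
instead of stripping it, and to track the currently open blocks on a
stack. The following is only a sketch under some assumptions: the file is
indented with spaces, deeper nesting always means more leading spaces,
header lines look exactly like the ones in the sample above, and values
are kept as strings. The names parse_nested, INSTANCE_RE and LIST_RE are
made up for illustration, and the "_elements" key suffix is copied from
the expected output. Malformed input (for example an attribute line
before any header) is not handled gracefully.

```python
import pprint
import re

# Header patterns, inferred from the sample file in this post.
INSTANCE_RE = re.compile(r"^(?:An instance|Instance) of (\w+)(?: element)?$")
LIST_RE = re.compile(r"^List of (\d+) (\w+) elements:?$")


def parse_nested(lines):
    """Parse indented lines into a list of (possibly nested) dicts."""
    top_level = []
    stack = []  # (indent, container) pairs for the blocks still open

    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # blank lines only separate top-level instances
        indent = len(raw) - len(raw.lstrip(" "))

        # Close every block that is not an ancestor of this line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1] if stack else top_level

        list_match = LIST_RE.match(text)
        if list_match:
            # "List of N X elements": new list stored on the parent dict.
            children = []
            parent[list_match.group(2) + "_elements"] = children
            stack.append((indent, children))
        elif INSTANCE_RE.match(text):
            # "Instance of X element": new dict appended to the parent list.
            obj = {}
            parent.append(obj)
            stack.append((indent, obj))
        elif "=" in text:
            # "key=value": attribute of the innermost open instance.
            key, value = text.split("=", 1)
            parent[key.strip()] = value.strip()
        else:
            raise ValueError("unrecognised line: %r" % text)
    return top_level


# Small usage example with a shortened version of the sample file.
SAMPLE = """\
An instance of TestArray
 a=a
 List of 1 B elements
  Instance of B element
   b=2
   List of 2 C elements
    Instance of C element
     c=3
    Instance of C element
     c=4

An instance of TestArray
 a=1
"""

pprint.pprint(parse_nested(SAMPLE.splitlines()))
```

To parse a real file, an open file object could be passed in place of
SAMPLE.splitlines(), since the function only iterates over lines. The
dicts could be turned into objects afterwards if attribute access is
preferred.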


