Help needed with nested parsing of file into objects

richard pullenjenna10 at gmail.com
Tue Jun 5 17:21:11 EDT 2012


On Jun 5, 9:40 pm, Alain Ketterlin <al... at dpt-info.u-strasbg.fr>
wrote:
> richard <pullenjenn... at gmail.com> writes:
>
> [I'm leaving the data in the message in case anybody has trouble going
> up-thread.]
>
> > Hi guys, I'm still struggling to get the code that was posted to me on
> > this forum to work in my favour and produce the output in the format
> > shown above. This is what I have so far. Any help will be greatly
> > appreciated.
>
> > Output I am trying to achieve:
> > parsed = [
> >     {
> >       "a":"a",
> >       "b":"b",
> >       "c":"c",
> >       "A_elements":[
> >           {
> >             "a":1,
> >             "b":2,
> >             "c":3
> >           },
> >           {
> >              "a":1,
> >              "b":2,
> >              "c":3
> >           }
> >        ],
> >       "B_elements":[
> >           {
> >             "a":1,
> >             "b":2,
> >             "c":3,
> >             "C_elements":[
> >                  {
> >                      "a":1,
> >                      "b":2,
> >                      "c":3
> >                   },
> >                   {
> >                       "a":1,
> >                       "b":2,
> >                       "c":3
> >                   }
> >              ]
> >           }
> >        ]
> >     },
>
> >     {
> >       "a":"1",
> >       "b":"2",
> >       "c":"3",
> >     }
>
> > ]
>
> > File format (cannot be changed):
>
> > An instance of TestArray
> >  a=a
> >  b=b
> >  c=c
> >  List of 2 A elements:
> >   Instance of A element
> >    a=1
> >    b=2
> >    c=3
> >   Instance of A element
> >    d=1
> >    e=2
> >    f=3
> >  List of 1 B elements
> >   Instance of B element
> >    a=1
> >    b=2
> >    c=3
> >    List of 2 C elements
> >     Instance of C element
> >      a=1
> >      b=2
> >      c=3
> >     Instance of C element
> >      a=1
> >      b=2
> >      c=3
>
> > An instance of TestArray
> >  a=1
> >  b=2
> >  c=3
>
> > import re
>
> > def test_parser(filename):
> >     class Stanza:
> >         def __init__(self, values):
> >             for attr, val in values:
> >                 setattr(self, attr, val)
>
> >     def build(couple):
> >         if "=" in couple[0]:
> >             attr, val = couple[0].split("=")
> >             return attr,val
> >         elif "Instance of" in couple[0]:
> >             match = re.search("Instance of (.+) element", couple[0])
> >             return ("attr_%s" % match.group(1),Stanza(couple[1]))
> >         elif "List of" in couple[0]:
> >             match = re.search(r"List of \d (.+) elements", couple[0])
> >             return ("%s_elements" % match.group(1),couple[1])
>
> You forgot one case:
>
>     def build(couple):
>         if "=" in couple[0]:
>             attr, val = couple[0].split("=")
>             return attr,val
>         elif "Instance of" in couple[0]:
>             #match = re.search("Instance of (.+) element", couple[0])
>             #return ("attr_%s" % match.group(1),Stanza(couple[1]))
>             return dict(couple[1])
>         elif "An instance of" in couple[0]: # you forgot that case
>             return dict(couple[1])
>         elif "List of" in couple[0]:
>             match = re.search(r"List of \d (.+) elements", couple[0])
>             return ("%s_elements" % match.group(1),couple[1])
>         else:
>             pass # put a test here
>
> >     fo = open(filename, "r")
> >     RE = re.compile("( *)(.*)")
> >     stack = [("-",[])]
> >     for line in fo:
> >         matches = RE.match(line)
> >         if len(matches.group(2)) > 0:
> >             depth = 1 + len(matches.group(1))
> >             while len(stack) > depth:
> >                 stack[-2][1].append(build(stack[-1]))
> >                 del stack[-1]
> >             stack.append( (matches.group(2),[]) )
> >     while len(stack) > 1:
> >         stack[-2][1].append(stack[-1])
>
> Change this to:
>
>           stack[-2][1].append(build(stack[-1])) # call build() here also
>
> >         del stack[-1]
> >     return stack
>
> Actually the first and only element of stack is a container: all you
> need is the second element of the only tuple in stack, so:
>
>       return stack[0][1]
>
> and this is your list. If you need it pretty printed, you'll have to
> work the hierarchy.
>
> -- Alain.
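
Putting the corrections quoted above together, a minimal sketch might look like the following. It is an illustration under stated assumptions, not the code anyone in the thread actually ran: the names parse_lines, LINE_RE, LIST_RE and SAMPLE are made up here, the input is read from a string instead of a file, the list pattern uses \d+ instead of \d so that a multi-digit count would also match, and the final "put a test here" branch is filled in with a ValueError. Values stay strings ("1", not 1) because nothing converts them.

```python
import re

# Hypothetical names; the patterns follow the ones quoted in the thread.
LINE_RE = re.compile(r"( *)(.*)")
LIST_RE = re.compile(r"List of \d+ (.+) elements")


def build(couple):
    """Turn one (header, children) pair from the stack into a value."""
    header, children = couple
    if "=" in header:
        attr, val = header.split("=", 1)
        return attr, val
    elif "Instance of" in header or "An instance of" in header:
        # children is a list of (key, value) pairs at this point
        return dict(children)
    elif "List of" in header:
        match = LIST_RE.search(header)
        return "%s_elements" % match.group(1), children
    # Assumption: fail loudly instead of silently returning None.
    raise ValueError("unrecognised line: %r" % header)


def parse_lines(lines):
    stack = [("-", [])]
    for line in lines:
        matches = LINE_RE.match(line)
        if len(matches.group(2)) > 0:  # skip blank lines
            depth = 1 + len(matches.group(1))
            while len(stack) > depth:
                stack[-2][1].append(build(stack[-1]))
                del stack[-1]
            stack.append((matches.group(2), []))
    while len(stack) > 1:
        stack[-2][1].append(build(stack[-1]))  # build() here as well
        del stack[-1]
    return stack[0][1]


# Shortened sample in the same layout as the file shown above.
SAMPLE = """\
An instance of TestArray
 a=a
 b=b
 List of 1 B elements
  Instance of B element
   a=1
   List of 2 C elements
    Instance of C element
     a=1
    Instance of C element
     a=2

An instance of TestArray
 a=1
"""

parsed = parse_lines(SAMPLE.splitlines())
```

With a real file, parse_lines(open(filename)) should behave the same way, since "." in LINE_RE does not match the trailing newline.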

Hi Alain, thanks for the reply. I have amended the code and am busy
debugging, but the stack I get back just returns [None, None]. Also, I
should have been clearer when I mentioned the format above: the dicts
are actually objects instantiated from classes, and were printed out as
obj.__dict__ just for representation purposes. So where you have
replaced the following, I presume this was because of my format
confusion. Thanks

>         elif "Instance of" in couple[0]:
>             match = re.search("Instance of (.+) element", couple[0])
>             return ("attr_%s" % match.group(1),Stanza(couple[1])) #instantiating new object and setting attributes
>
with
>         elif "Instance of" in couple[0]:
>             #match = re.search("Instance of (.+) element", couple[0])
>             #return ("attr_%s" % match.group(1),Stanza(couple[1]))
>             return dict(couple[1])
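
If objects are wanted instead of dicts, one possible variant of build() is sketched below. It assumes the Stanza class from the original post and returns the object itself (not an ("attr_X", Stanza) tuple) so that each list holds objects directly; that choice is an assumption, not something confirmed in the thread. A Python function that falls through to "pass" returns None implicitly, so a result of [None, None] would be consistent with build() not recognising the two top-level "An instance of" lines. The sketch raises an error in that case so the offending header shows up while debugging.

```python
import re


class Stanza:
    """Plain attribute holder, as in the original post."""

    def __init__(self, values):
        for attr, val in values:
            setattr(self, attr, val)


def build(couple):
    header, children = couple
    if "=" in header:
        attr, val = header.split("=", 1)
        return attr, val
    elif "Instance of" in header or "An instance of" in header:
        # Assumption: return the object itself, not an ("attr_X", obj) tuple.
        return Stanza(children)
    elif "List of" in header:
        match = re.search(r"List of \d+ (.+) elements", header)
        return "%s_elements" % match.group(1), children
    else:
        # Without an explicit return or raise, this branch would yield None.
        raise ValueError("unrecognised line: %r" % header)


# Hand-built (header, children) pairs, in the shape the stack holds them.
inner = build(("Instance of A element", [("a", "1"), ("b", "2")]))
name, items = build(("List of 1 A elements:", [inner]))
outer = build(("An instance of TestArray", [("a", "a"), (name, items)]))
```

Here vars(inner) gives the same dict view as inner.__dict__, and outer.A_elements is a list of Stanza objects.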



