Help needed with nested parsing of file into objects

richard pullenjenna10 at gmail.com
Tue Jun 5 17:30:53 EDT 2012


On Jun 5, 10:21 pm, richard <pullenjenn... at gmail.com> wrote:
> On Jun 5, 9:40 pm, Alain Ketterlin <al... at dpt-info.u-strasbg.fr>
> wrote:
>
> > richard <pullenjenn... at gmail.com> writes:
>
> > [I'm leaving the data in the message in case anybody has trouble going
> > up-thread.]
>
> > > Hi guys, I'm still struggling to get the code that was posted to me on
> > > this forum to work in my favour and produce the output in the format
> > > shown above. This is what I have so far. Any help will be greatly
> > > appreciated.
>
> > > Output I am trying to achieve:
> > > parsed = [
> > >     {
> > >       "a":"a",
> > >       "b":"b",
> > >       "c":"c",
> > >       "A_elements":[
> > >           {
> > >             "a":1,
> > >             "b":2,
> > >             "c":3
> > >           },
> > >           {
> > >              "a":1,
> > >              "b":2,
> > >              "c":3
> > >           }
> > >        ],
> > >       "B_elements":[
> > >           {
> > >             "a":1,
> > >             "b":2,
> > >             "c":3,
> > >             "C_elements":[
> > >                  {
> > >                      "a":1,
> > >                      "b":2,
> > >                      "c":3
> > >                   },
> > >                   {
> > >                       "a":1,
> > >                       "b":2,
> > >                       "c":3
> > >                   }
> > >              ]
> > >           }
> > >        ]
> > >     },
>
> > >     {
> > >       "a":"1",
> > >       "b":"2",
> > >       "c":"3",
> > >     }
>
> > > ]
>
> > > File format (cannot be changed):
>
> > > An instance of TestArray
> > >  a=a
> > >  b=b
> > >  c=c
> > >  List of 2 A elements:
> > >   Instance of A element
> > >    a=1
> > >    b=2
> > >    c=3
> > >   Instance of A element
> > >    d=1
> > >    e=2
> > >    f=3
> > >  List of 1 B elements
> > >   Instance of B element
> > >    a=1
> > >    b=2
> > >    c=3
> > >    List of 2 C elements
> > >     Instance of C element
> > >      a=1
> > >      b=2
> > >      c=3
> > >     Instance of C element
> > >      a=1
> > >      b=2
> > >      c=3
>
> > > An instance of TestArray
> > >  a=1
> > >  b=2
> > >  c=3
>
> > > def test_parser(filename):
> > >     class Stanza:
> > >         def __init__(self, values):
> > >             for attr, val in values:
> > >                 setattr(self, attr, val)
>
> > >     def build(couple):
> > >         if "=" in couple[0]:
> > >             attr, val = couple[0].split("=")
> > >             return attr,val
> > >         elif "Instance of" in couple[0]:
> > >             match = re.search("Instance of (.+) element", couple[0])
> > >             return ("attr_%s" % match.group(1),Stanza(couple[1]))
> > >         elif "List of" in couple[0]:
> > >             match = re.search("List of \d (.+) elements", couple[0])
> > >             return ("%s_elements" % match.group(1),couple[1])
>
> > You forgot one case:
>
> >     def build(couple):
> >         if "=" in couple[0]:
> >             attr, val = couple[0].split("=")
> >             return attr,val
> >         elif "Instance of" in couple[0]:
> >             #match = re.search("Instance of (.+) element", couple[0])
> >             #return ("attr_%s" % match.group(1),Stanza(couple[1]))
> >             return dict(couple[1])
> >         elif "An instance of" in couple[0]: # you forgot that case
> >             return dict(couple[1])
> >         elif "List of" in couple[0]:
> >             match = re.search("List of \d (.+) elements", couple[0])
> >             return ("%s_elements" % match.group(1),couple[1])
> >         else:
> >             pass # put a test here
>
> > >     fo = open(filename, "r")
> > >     RE = re.compile("( *)(.*)")
> > >     stack = [("-",[])]
> > >     for line in fo:
> > >         matches = RE.match(line)
> > >         if len(matches.group(2)) > 0:
> > >             depth = 1 + len(matches.group(1))
> > >             while len(stack) > depth:
> > >                 stack[-2][1].append(build(stack[-1]))
> > >                 del stack[-1]
> > >             stack.append( (matches.group(2),[]) )
> > >     while len(stack) > 1:
> > >         stack[-2][1].append(stack[-1])
>
> > Change this to:
>
> >           stack[-2][1].append(build(stack[-1])) # call build() here also
>
> > >         del stack[-1]
> > >     return stack
>
> > Actually the first and only element of stack is a container: all you
> > need is the second element of the only tuple in stack, so:
>
> >       return stack[0][1]
>
> > and this is your list. If you need it pretty printed, you'll have to
> > work the hierarchy.
>
> > -- Alain.
>
> Hi Alain, thanks for the reply. I amended the code and am busy
> debugging, but the stack I get back just returns [None, None]. Also,
> I should have been clearer when I mentioned the format above: the dicts
> are actually objects instantiated from classes, printed out as
> obj.__dict__ just for representation purposes. So where you
> replaced the following, I presume it was because of my format
> confusion. Thanks
>
> >         elif "Instance of" in couple[0]:
> >             match = re.search("Instance of (.+) element", couple[0])
> >             return ("attr_%s" % match.group(1),Stanza(couple[1])) #instantiating new object and setting attributes
>
> with
> >         elif "Instance of" in couple[0]:
> >             #match = re.search("Instance of (.+) element", couple[0])
> >             #return ("attr_%s" % match.group(1),Stanza(couple[1]))
> >             return dict(couple[1])

Sorry, silly mistake made with "An instance" and "Instance of". Amended
code below with the fix:

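        if "=" in couple[0]:
            # split on the first "=" only, so a value may itself contain "="
            attr, val = couple[0].split("=", 1)
            return attr, val
        elif re.search(r"Instance of .+", couple[0]):
            #match = re.search("Instance of (.+) element", couple[0])
            #return ("attr_%s" % match.group(1),Stanza(couple[1]))
            return dict(couple[1])
        elif re.search(r"An instance of .+", couple[0]):
            return dict(couple[1])
        elif "List of" in couple[0]:
            # raw string so "\d" is a regex digit class; \d+ also allows counts of 10 or more
            match = re.search(r"List of \d+ (.+) elements", couple[0])
            return ("%s_elements" % match.group(1), couple[1])
        else:
            # fail loudly rather than silently returning None
            raise ValueError("unrecognised line: %r" % couple[0])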
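
For anyone following along, here is a minimal, self-contained sketch of the
whole parser with the changes from this thread folded in. It is not the exact
code from either message: parse_lines, INSTANCE_RE, LIST_RE and the inline
SAMPLE string are names made up for illustration, the data is read from a
string rather than from a file opened by name, and all values stay strings
(so "1", not 1). It assumes one space of indentation per nesting level, as in
the sample file.

```python
import re

# Sample data copied from the file format shown earlier in the thread.
SAMPLE = """\
An instance of TestArray
 a=a
 b=b
 c=c
 List of 2 A elements:
  Instance of A element
   a=1
   b=2
   c=3
  Instance of A element
   d=1
   e=2
   f=3
 List of 1 B elements
  Instance of B element
   a=1
   b=2
   c=3
   List of 2 C elements
    Instance of C element
     a=1
     b=2
     c=3
    Instance of C element
     a=1
     b=2
     c=3

An instance of TestArray
 a=1
 b=2
 c=3
"""

LINE_RE = re.compile(r"( *)(.*)")                       # indentation, content
INSTANCE_RE = re.compile(r"(?:An instance|Instance) of ")
LIST_RE = re.compile(r"List of \d+ (.+) elements")


def build(couple):
    """Turn one (header, children) pair into its parsed form."""
    header, children = couple
    if "=" in header:
        attr, val = header.split("=", 1)
        return attr, val
    if INSTANCE_RE.match(header):
        # children are (key, value) pairs, so they convert straight to a dict
        return dict(children)
    match = LIST_RE.search(header)
    if match:
        return "%s_elements" % match.group(1), children
    raise ValueError("unrecognised line: %r" % header)


def parse_lines(lines):
    """Stack-based parse: stack depth mirrors indentation depth."""
    stack = [("-", [])]                                 # dummy root container
    for line in lines:
        matches = LINE_RE.match(line.rstrip())
        if matches.group(2):                            # skip blank lines
            depth = 1 + len(matches.group(1))
            while len(stack) > depth:                   # close finished stanzas
                stack[-2][1].append(build(stack[-1]))
                del stack[-1]
            stack.append((matches.group(2), []))
    while len(stack) > 1:                               # close what is still open
        stack[-2][1].append(build(stack[-1]))
        del stack[-1]
    return stack[0][1]


parsed = parse_lines(SAMPLE.splitlines())
```

parsed is then a list with one dict per top-level TestArray stanza;
pprint.pprint(parsed) from the standard library shows the hierarchy.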

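On the point that the dicts are really objects shown via obj.__dict__: one
possible way to get objects back is to wrap the nested dicts once parsing is
done. This is only a sketch under that assumption. to_object is a hypothetical
helper, the small tree below is made-up test data, and Stanza is the simple
attribute holder from the original code.

```python
class Stanza(object):
    """Plain attribute holder, as in the original test_parser code."""

    def __init__(self, values):
        for attr, val in values:
            setattr(self, attr, val)


def to_object(node):
    """Recursively wrap dicts in Stanza objects; lists are converted item by item."""
    if isinstance(node, dict):
        return Stanza((key, to_object(val)) for key, val in node.items())
    if isinstance(node, list):
        return [to_object(item) for item in node]
    return node                                  # plain values are kept as they are


# Made-up nested dict in the same shape the parser produces.
tree = {"a": "1", "B_elements": [{"b": "2", "C_elements": [{"c": "3"}]}]}
obj = to_object(tree)
```

After this, obj.B_elements[0].C_elements[0].c gives the innermost value, and
obj.__dict__ shows the attributes of the top-level object.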

