Help needed with nested parsing of file into objects

richard pullenjenna10 at gmail.com
Tue Jun 5 17:09:58 EDT 2012


On Jun 5, 9:40 pm, Alain Ketterlin <al... at dpt-info.u-strasbg.fr>
wrote:
> richard <pullenjenn... at gmail.com> writes:
>
> [I'm leaving the data in the message in case anybody has troubles going
> up-thread.]
>
> > Hi guys, I'm still struggling to get the code that was posted to me on
> > this forum to work in my favour and produce the output in the format
> > shown above. This is what I have so far. Any help will be greatly
> > appreciated.
>
> > Output I am trying to achieve:
> > parsed = [
> >     {
> >       "a":"a",
> >       "b":"b",
> >       "c":"c",
> >       "A_elements":[
> >           {
> >             "a":1,
> >             "b":2,
> >             "c":3
> >           },
> >           {
> >              "a":1,
> >              "b":2,
> >              "c":3
> >           }
> >        ],
> >       "B_elements":[
> >           {
> >             "a":1,
> >             "b":2,
> >             "c":3,
> >             "C_elements":[
> >                  {
> >                      "a":1,
> >                      "b":2,
> >                      "c":3
> >                   },
> >                   {
> >                       "a":1,
> >                       "b":2,
> >                       "c":3
> >                   }
> >              ]
> >           }
> >        ]
> >     },
>
> >     {
> >       "a":"1",
> >       "b":"2",
> >       "c":"3",
> >     }
>
> > ]
>
> > File format (cannot be changed):
>
> > An instance of TestArray
> >  a=a
> >  b=b
> >  c=c
> >  List of 2 A elements:
> >   Instance of A element
> >    a=1
> >    b=2
> >    c=3
> >   Instance of A element
> >    d=1
> >    e=2
> >    f=3
> >  List of 1 B elements
> >   Instance of B element
> >    a=1
> >    b=2
> >    c=3
> >    List of 2 C elements
> >     Instance of C element
> >      a=1
> >      b=2
> >      c=3
> >     Instance of C element
> >      a=1
> >      b=2
> >      c=3
>
> > An instance of TestArray
> >  a=1
> >  b=2
> >  c=3
>
> > def test_parser(filename):
> >     class Stanza:
> >         def __init__(self, values):
> >             for attr, val in values:
> >                 setattr(self, attr, val)
>
> >     def build(couple):
> >         if "=" in couple[0]:
> >             attr, val = couple[0].split("=")
> >             return attr,val
> >         elif "Instance of" in couple[0]:
> >             match = re.search("Instance of (.+) element", couple[0])
> >             return ("attr_%s" % match.group(1),Stanza(couple[1]))
> >         elif "List of" in couple[0]:
> >             match = re.search(r"List of \d+ (.+) elements", couple[0])
> >             return ("%s_elements" % match.group(1),couple[1])
>
> You forgot one case:
>
>     def build(couple):
>         if "=" in couple[0]:
>             attr, val = couple[0].split("=")
>             return attr,val
>         elif "Instance of" in couple[0]:
>             #match = re.search("Instance of (.+) element", couple[0])
>             #return ("attr_%s" % match.group(1),Stanza(couple[1]))
>             return dict(couple[1])
>         elif "An instance of" in couple[0]: # you forgot that case
>             return dict(couple[1])
>         elif "List of" in couple[0]:
>             match = re.search(r"List of \d+ (.+) elements", couple[0])
>             return ("%s_elements" % match.group(1),couple[1])
>         else:
>             pass # put a test here
>
> >     fo = open(filename, "r")
> >     RE = re.compile("( *)(.*)")
> >     stack = [("-",[])]
> >     for line in fo:
> >         matches = RE.match(line)
> >         if len(matches.group(2)) > 0:
> >             depth = 1 + len(matches.group(1))
> >             while len(stack) > depth:
> >                 stack[-2][1].append(build(stack[-1]))
> >                 del stack[-1]
> >             stack.append( (matches.group(2),[]) )
> >     while len(stack) > 1:
> >         stack[-2][1].append(stack[-1])
>
> Change this to:
>
>           stack[-2][1].append(build(stack[-1])) # call build() here also
>
> >         del stack[-1]
> >     return stack
>
> Actually the first and only element of stack is a container: all you
> need is the second element of the only tuple in stack, so:
>
>       return stack[0][1]
>
> and this is your list. If you need it pretty printed, you'll have to
> work the hierarchy.
>
> -- Alain.

Hi Alain, thanks for the reply. Regarding the missing case "An
Instance of": I'm not sure where or how that comes into play, since the
case I put in originally, "Instance of", is in the file and is already
being handled by the previous branch. Also, when I run the final
solution I get [None, None] as the final stack. I'm debugging it now to
see what is going wrong. Sorry, I should have been clearer about the
format mentioned above. The objects are being printed out as dicts, so
where you put in

        elif "An Instance of" in couple[0]:
            return dict(couple[1])

should that still be the following?

        elif "Instance of" in couple[0]:
            match = re.search("Instance of (.+) element", couple[0])
            # instantiate a new Stanza object and set its attributes
            return ("attr_%s" % match.group(1), Stanza(couple[1]))
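
For anyone following along, here is a minimal, self-contained sketch of the parser with the changes suggested up-thread folded in. It is not the exact code from either message: the SAMPLE string, the parse() wrapper, the close_top() helper and the header_marker parameter are assumptions made for illustration, so that the example runs without a file and the top-level test can be varied. It also hints at one possible cause of the [None, None] result, offered as a guess rather than a diagnosis: the `in` test on strings is case-sensitive and the file says "An instance of" with a lowercase "i", so neither "Instance of" nor "An Instance of" matches the top-level line, and build() falls through and returns None once per top-level block.

```python
import json
import re

# Sample input copied from the file format quoted above.
SAMPLE = """\
An instance of TestArray
 a=a
 b=b
 c=c
 List of 2 A elements:
  Instance of A element
   a=1
   b=2
   c=3
  Instance of A element
   d=1
   e=2
   f=3
 List of 1 B elements
  Instance of B element
   a=1
   b=2
   c=3
   List of 2 C elements
    Instance of C element
     a=1
     b=2
     c=3
    Instance of C element
     a=1
     b=2
     c=3

An instance of TestArray
 a=1
 b=2
 c=3
"""

LINE_RE = re.compile(r"( *)(.*)")
LIST_RE = re.compile(r"List of \d+ (.+) elements")


def build(couple, header_marker):
    """Turn one (header line, children) pair into a value for its parent."""
    header, children = couple
    if "=" in header:
        attr, val = header.split("=", 1)
        return attr, val
    if "Instance of" in header or header_marker in header:
        return dict(children)  # nested element or top-level block
    if "List of" in header:
        return "%s_elements" % LIST_RE.search(header).group(1), children
    return None  # unrecognised header, same effect as `else: pass`


def close_top(stack, header_marker):
    """Pop the innermost open item and attach its built value to its parent."""
    child = stack.pop()
    stack[-1][1].append(build(child, header_marker))


def parse(text, header_marker="An instance of"):
    """Parse the indented text into a list with one entry per top-level block."""
    stack = [("-", [])]  # sentinel container for the top-level blocks
    for line in text.splitlines():
        indent, content = LINE_RE.match(line).groups()
        if content:
            depth = 1 + len(indent)  # one space of indentation per level
            while len(stack) > depth:
                close_top(stack, header_marker)
            stack.append((content, []))
    while len(stack) > 1:
        close_top(stack, header_marker)
    return stack[0][1]


parsed = parse(SAMPLE)
print(json.dumps(parsed, indent=2))  # pretty-print the nested dicts

# Capital "I" in the marker: the top-level header line never matches.
print(parse(SAMPLE, header_marker="An Instance of"))
```

All values stay strings in this sketch; turning "1" into 1 as in the target output would need an extra conversion step. Returning Stanza objects instead of dicts would only change the dict(children) line.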



More information about the Python-list mailing list