xml aggregator

Gerard Flanagan grflanagan at yahoo.co.uk
Mon Jul 10 09:20:58 EDT 2006


> Gerard Flanagan wrote:
> > Gerard Flanagan wrote:
> > > kepioo wrote:
> > > > Hi all,
> > > >
> > > > I am trying to write an xml aggregator, but so far, i've been failing
> > > > miserably.
> > > >
> > > > what i want to do :
> > > >
> > > > i have entries, in a list format : [[[key1,value],[key2,value],
> > > > [key3,value]], value]
> > > >
> > > > example :
> > > > [[["route","23"],["equip","jr2"],["time","3pm"]],"my first value"]
> > > > [[["route","23"],["equip","jr1"],["time","3pm"]],"my second value"]
> > > > [[["route","23"],["equip","jr2"],["time","3pm"]],"my third value"]
> > > > [[["route","24"],["equip","jr2"],["time","3pm"]],"my fourth value"]
> > > > [[["route","25"],["equip","jr2"],["time","3pm"]],"my fifth value"]
> > > >
> > >
> > > [snip example data]
> > >
> > > >
> > > >
> > > > If anyone has an idea of implementation or any code ( i was trying with
> > > > ElementTree...
> > > >
> > >
> > > (You should have posted the code you tried)
> > >
> > > The code below might help (though you should test it more than I have).
> > > The 'findall' function comes from here:
> > >
> > >     http://gflanagan.net/site/python/elementfilter/elementfilter.py
> > >
> > > it's not the elementtree one.
> > >
> >
> > Sorry, elementfilter.py was a bit broken - fixed now.  Use the current
> > one and change the code I posted to:
> >
> >     [...]
> >     existing_route = findall(results, "route[@id==%s]" % routeid)  # changed line
> >     if existing_route:
> >         route = existing_route[0]
> >         existing_equip = findall(route, "equip[@id=='%s']" % equipid)
> >         if existing_equip:
> >     [...]
> >
> > ie. don't quote the route id since it's numeric.

kepioo wrote:
> thanks a lot for the code.
>
> It was not working the first time (do not recognize item and
> existing_time --

Apologies, I ran the code from PythonWin which remembers names that
were previously declared though deleted - should have run it as a
script.

> i changed item by r[-1] and existing_time by
> existing_equip).
>

'item' was wrong but not the other two.  (I'm assuming your data is
regular - ie. all the records have the same number of fields)

change the for loop to the following:

8<------------------------------------------------------

for routeid, equipid, timeid, data in records:
    route, equip, time = None, None, None
    # Walk down route -> equip -> time, reusing any element that already exists.
    existing_route = findall(results, "route[@id==%s]" % routeid)
    if existing_route:
        route = existing_route[0]
        existing_equip = findall(route, "equip[@id==%s]" % equipid)
        if existing_equip:
            equip = existing_equip[0]
            existing_time = findall(equip, "time[@id==%s]" % timeid)
            if existing_time:
                time = existing_time[0]
    # Create whichever levels are missing.  Test against None explicitly:
    # an Element with no children evaluates as false.
    if route is None:
        route = SubElement(results, 'route', id=routeid)
    if equip is None:
        equip = SubElement(route, 'equip', id=equipid)
    if time is None:
        time = SubElement(equip, 'time', id=timeid)
    # Each record adds one <data> element under its time element.
    dataitem = SubElement(time, 'data')
    dataitem.text = data

8<------------------------------------------------------
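
For comparison, here is a minimal sketch of the same grouping that uses only
the standard library's xml.etree.ElementTree, without the elementfilter
'findall'.  It assumes the records are still in the nested form from the
original post; the helper names get_or_create and aggregate are made up for
this illustration.  It compares the id attribute directly, so it does not
depend on any filter syntax or on an element's truth value.

```python
import xml.etree.ElementTree as ET

# Sample records in the nested form from the original post (illustrative only).
raw_records = [
    [[["route", "23"], ["equip", "jr2"], ["time", "3pm"]], "my first value"],
    [[["route", "23"], ["equip", "jr1"], ["time", "3pm"]], "my second value"],
    [[["route", "23"], ["equip", "jr2"], ["time", "3pm"]], "my third value"],
    [[["route", "24"], ["equip", "jr2"], ["time", "3pm"]], "my fourth value"],
]


def get_or_create(parent, tag, id_value):
    """Return the child of parent with this tag and id, creating it if absent."""
    for child in parent.findall(tag):
        if child.get("id") == id_value:
            return child
    return ET.SubElement(parent, tag, id=id_value)


def aggregate(records):
    """Group records under nested elements, one level per [key, value] pair."""
    results = ET.Element("results")
    for pairs, data in records:
        node = results
        for tag, id_value in pairs:
            node = get_or_create(node, tag, id_value)
        ET.SubElement(node, "data").text = data
    return results


results = aggregate(raw_records)
```

Because the inner loop follows whatever [key, value] pairs a record carries,
this version does not require every record to have exactly three levels.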

> however, it is not producing the result i expected, as in it doesn't
> group by same category the elements, it creates a new block of xml
>
[...]

The changes above should give you what you want. Remember, as I wrote
in the previous post, the filter should be:

    "[@id==%s]"

not

    "[@id=='%s']"

ie. no single quotes needed.
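
Note that the unquoted '==' form is the elementfilter syntax.  If the
standard library's own findall is used instead (an assumption; it is not
what the code above calls), its path language takes a single '=' and a
quoted value.  A small sketch with made-up sample XML:

```python
import xml.etree.ElementTree as ET

# Tiny sample tree, for illustration only.
results = ET.fromstring(
    '<results>'
    '<route id="23"><equip id="jr2"/><equip id="jr1"/></route>'
    '<route id="24"><equip id="jr2"/></route>'
    '</results>'
)

routeid, equipid = "23", "jr2"

# Standard-library predicate: single '=' and the value in quotes.
routes = results.findall("route[@id='%s']" % routeid)

# Predicates can also be chained along a path.
equips = results.findall(
    "route[@id='%s']/equip[@id='%s']" % (routeid, equipid)
)
```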

With the above amended code I get:

<results>
    <route id="23">
        <equip id="jr2">
            <time id="3pm">
                <data>my first value</data>
                <data>my third value</data>
            </time>
        </equip>
        <equip id="jr1">
            <time id="3pm">
                <data>my second value</data>
            </time>
        </equip>
    </route>
    <route id="24">
        <equip id="jr2">
            <time id="3pm">
                <data>my fourth value</data>
            </time>
        </equip>
    </route>
    <route id="25">
        <equip id="jr2">
            <time id="3pm">
                <data>my fifth value</data>
            </time>
            <time id="4pm">
                <data>my sixth value</data>
            </time>
        </equip>
    </route>
</results>
------------------------------------
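
ElementTree's tostring writes everything on one line, so indented output
like the listing above needs a formatting step.  One possible way, sketched
here with the standard library's xml.dom.minidom (an assumption, not
necessarily how the listing above was produced):

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Build a small tree by hand, for illustration only.
results = ET.Element("results")
route = ET.SubElement(results, "route", id="23")
equip = ET.SubElement(route, "equip", id="jr2")
time_el = ET.SubElement(equip, "time", id="3pm")
ET.SubElement(time_el, "data").text = "my first value"

# Serialize, re-parse with minidom, and indent by four spaces per level.
pretty = minidom.parseString(ET.tostring(results)).toprettyxml(indent="    ")
print(pretty)
```

The minidom output also starts with an XML declaration line, which the
listing above does not show.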

all the best

Gerard

ps. this newsgroup prefers that you don't top-post.



