[XML-SIG] Learning to use elementtree
Doran, Harold
HDoran at air.org
Tue Apr 8 15:48:05 CEST 2008
Thanks. I'm piecing this together slowly, but I did get the following to
work.
Test.py
from xml.etree.ElementTree import ElementTree as ET

f = open('test.txt', 'w')
et = ET(file='out_g4r_b.xml')

for statentityref in \
        et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
    print >> f, statentityref.attrib['id']
    for statentityref in statentityref.findall('statentityref'):
        for statval in statentityref.findall('statval'):
            print >> f, statentityref.attrib['id'], '\t', \
                statval.attrib['type'], '\t', statval.attrib['value']
f.close()
And this gives output like:
13963
0.000000 UncollapsedMeanScore 23.863636
0.000000 ScorePtPct 0.018333
0.000000 ScorePtBiserial -0.496309
0.000000 ScorePtAdjBiserial -0.452588
1.000000 UncollapsedMeanScore 34.941426
1.000000 ScorePtPct 0.981667
1.000000 ScorePtBiserial 0.496309
1.000000 ScorePtAdjBiserial 0.452588
omit ScorePtPct 0.000000
omit ScorePtBiserial -99999.990000
omit ScorePtAdjBiserial -99999.990000
13962
0.000000 UncollapsedMeanScore 29.305195
0.000000 ScorePtPct 0.256667
0.000000 ScorePtBiserial -0.484469
0.000000 ScorePtAdjBiserial -0.425165
1.000000 UncollapsedMeanScore 36.614350
1.000000 ScorePtPct 0.743333
1.000000 ScorePtBiserial 0.484469
1.000000 ScorePtAdjBiserial 0.425165
omit ScorePtPct 0.000000
omit ScorePtBiserial -99999.990000
omit ScorePtAdjBiserial -99999.990000
...
This is almost exactly what I want, and I can live with it if needed.
What would be most convenient, however, is to format the output as
follows:
13963 0.000000 UncollapsedMeanScore 23.863636
13963 0.000000 ScorePtPct 0.018333
13963 0.000000 ScorePtBiserial -0.496309
13963 0.000000 ScorePtAdjBiserial -0.452588
13963 1.000000 UncollapsedMeanScore 34.941426
13963 1.000000 ScorePtPct 0.981667
13963 1.000000 ScorePtBiserial 0.496309
13963 1.000000 ScorePtAdjBiserial 0.452588
I think this may be what Cliff meant by name collisions. That is, the
number 13963 comes from the id attribute of the outer statentityref, but
0.000000 and 1.000000 also come from the id attribute of the
statentityref nested inside it. So, I'm a bit confused as to how to go
about printing them out side by side.
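
One possible way to get the side-by-side layout is to give the outer and
inner statentityref elements different loop variable names, so the outer
id is still available inside the inner loop. The following is only a
sketch under assumptions: it uses Python 3 syntax rather than the
Python 2 print statement above, the sample XML (including the root
element name) is invented to match the paths used in this thread, and
the names item, score_point, ITEM_PATH and flatten are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data: the real file's root element and full
# structure are not shown in this thread, so this is an assumption.
SAMPLE = """<root>
  <admin><responseanalyses><analysis><analysisdata>
    <statentityref id="13963">
      <statentityref id="0.000000">
        <statval type="UncollapsedMeanScore" value="23.863636"/>
        <statval type="ScorePtPct" value="0.018333"/>
      </statentityref>
      <statentityref id="1.000000">
        <statval type="UncollapsedMeanScore" value="34.941426"/>
      </statentityref>
    </statentityref>
  </analysisdata></analysis></responseanalyses></admin>
</root>"""

ITEM_PATH = 'admin/responseanalyses/analysis/analysisdata/statentityref'


def flatten(tree):
    """Return (outer id, inner id, type, value) tuples, one per statval."""
    rows = []
    for item in tree.findall(ITEM_PATH):
        # Search relative to the outer element, under a different name,
        # so item.attrib['id'] is not overwritten by the inner loop.
        for score_point in item.findall('statentityref'):
            for statval in score_point.findall('statval'):
                rows.append((item.attrib['id'],
                             score_point.attrib['id'],
                             statval.attrib['type'],
                             statval.attrib['value']))
    return rows


tree = ET.parse(io.StringIO(SAMPLE))
rows = flatten(tree)
for row in rows:
    print('\t'.join(row))
```

To write to a file instead of the screen, the same loop could pass
file=f to print(), where f is an open file object.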
> -----Original Message-----
> From: Stefan Behnel [mailto:stefan_ml at behnel.de]
> Sent: Monday, April 07, 2008 8:32 AM
> To: Doran, Harold
> Cc: J. Cliff Dyer; xml-sig at python.org
> Subject: Re: [XML-SIG] Learning to use elementtree
>
> Hi,
>
> Doran, Harold wrote:
> > Well, I think I'm getting close. But, I think this is similar to the
> > problem I had when I started. This seems to create a huge data file
> > with all information under the first item, and then again all
> > information under the second item and so forth.
> >
> > for statentityref in \
> >         et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
> >     print >> f, statentityref.attrib['id']
> >     for statentityref in \
> >             et.findall('admin/responseanalyses/analysis/analysisdata/statentityref/statentityref'):
> >         for statval in statentityref.findall('statval'):
> >             print >> f, statentityref.attrib['id'], '\t', \
> >                 statval.attrib['type'], '\t', statval.attrib['value']
>
> I think you should read the previous post again. You are
> nesting three loops here where two would do what you want.
>
> Stefan
>
>
> >> -----Original Message-----
> >> From: J. Cliff Dyer [mailto:jcd at unc.edu]
> >> Sent: Wednesday, April 02, 2008 3:36 PM
> >> To: Doran, Harold
> >> Cc: xml-sig at python.org
> >> Subject: Re: [XML-SIG] Learning to use elementtree
> >>
> >> On Wed, 2008-04-02 at 15:28 -0400, Doran, Harold wrote:
> >>> Indeed, navigating the xml is tough (for me). I have been able to get
> >>> the following to work. I put in "Sub Element" to indicate the new
> >>> section of data. But, from looking at the text output, one doesn't
> >>> know which item these sub elements belong to. I think the solution is
> >>> to create an index like 13965-0 to show that this is the
> >>> subinformation from the item above it. That seems to be where I am
> >>> getting stuck. Although, I am open to other suggestions on how to best
> >>> represent the output.
> >>>
> >>> from xml.etree.ElementTree import ElementTree as ET
> >>>
> >>> filename = raw_input("Please enter the AM XML file: ")
> >>> new_file = raw_input("Save this file as: ")
> >>>
> >>> # create a new file defined by the user
> >>> f = open(new_file, 'w')
> >>>
> >>> et = ET(file=filename)
> >>>
> >>> for statentityref in \
> >>>         et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
> >>>     for statval in statentityref.findall('statval'):
> >>>         print >> f, statentityref.attrib['id'], '\t', \
> >>>             statval.attrib['type'], '\t', statval.attrib['value']
> >>>
> >>> f.write("\n\n")
> >>> f.write("Sub Element\n\n")
> >>>
> >>> for statentityref in \
> >>>         et.findall('admin/responseanalyses/analysis/analysisdata/statentityref/statentityref'):
> >>>     for statval in statentityref.findall('statval'):
> >>>         print >> f, statentityref.attrib['id'], '\t', \
> >>>             statval.attrib['type'], '\t', statval.attrib['value']
> >>> f.close()
> >> Do you want your second statentityref loop to be based on its parent
> >> statentityref? If so, you need to nest it in the original loop, and
> >> use an xpath relative to your outer statentityref (and watch for name
> >> collisions).
>
>