Parsing XML file using Python
Andrew Clover
and-google at doxdesk.com
Fri Mar 5 12:30:46 EST 2004
antonyliu2002 at yahoo.com (chad) wrote:
> <tag1>This</tag1>
> <tag2>is</tag2>
> <tag3>a</tag3>
> <tag1>test</tag1>
> I need to write "This is a test"
Assuming the tags sit inside a single root element, with no nested tags
(in which case you'd have to specify the problem more completely) and no
entity reference issues, any DOM Level 1 implementation can do this, e.g.
with minidom:
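from xml.dom import minidom

# inputFilename and outputFilename are paths you supply
doc = minidom.parse(inputFilename)
parent = doc.documentElement
# keep only element children, skipping whitespace text nodes
children = [child for child in parent.childNodes
            if child.nodeType == child.ELEMENT_NODE]
# assumes each element holds exactly one text node
content = ' '.join(child.firstChild.nodeValue for child in children)
with open(outputFilename, 'w', encoding='utf-8') as fp:
    fp.write(content)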
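To try the approach above without any files, here is a minimal sketch that
runs on the sample from the question. The <root> wrapper and the function
name join_child_text are assumptions for illustration: the fragment as
posted has no single enclosing element, and a well-formed XML document
needs one.

```python
from xml.dom import minidom

# Hypothetical wrapper: the posted fragment has no root element, so one is
# added here to make the document well-formed.
SAMPLE = """<root>
<tag1>This</tag1>
<tag2>is</tag2>
<tag3>a</tag3>
<tag1>test</tag1>
</root>"""


def join_child_text(xml_text):
    """Join the text of each element child of the root with single spaces."""
    doc = minidom.parseString(xml_text)
    parent = doc.documentElement
    words = []
    for child in parent.childNodes:
        # Skip the whitespace text nodes that sit between the elements.
        if child.nodeType != child.ELEMENT_NODE:
            continue
        # An empty element such as <tag/> has no firstChild; skip it.
        if child.firstChild is not None:
            words.append(child.firstChild.nodeValue)
    return ' '.join(words)


print(join_child_text(SAMPLE))
```

The check on firstChild is the only addition to the logic: it keeps an
empty element from raising an AttributeError.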
For more complicated structures, the 'textContent' property in DOM Level 3
might be of use. (Insert standard pxdom plug here.)
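As far as I know minidom itself does not expose 'textContent', so for
nested tags a small recursive helper can approximate it. This is a rough
sketch, not the DOM Level 3 definition; the name text_content and the
sample markup are made up for illustration.

```python
from xml.dom import minidom


def text_content(node):
    """Concatenate all text beneath node, in document order."""
    if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE):
        return node.data
    # Comments and processing instructions have no child nodes, so they
    # contribute nothing here.
    return ''.join(text_content(child) for child in node.childNodes)


doc = minidom.parseString('<root><a>This <b>is</b></a> <c>a test</c></root>')
print(text_content(doc.documentElement))
```

Unlike the firstChild version, this keeps whatever whitespace is in the
source, so you may want to normalise spaces afterwards.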
--
Andrew Clover
mailto:and at doxdesk.com
http://www.doxdesk.com/