Parsing xml file using python

Andrew Clover and-google at doxdesk.com
Fri Mar 5 12:30:46 EST 2004


antonyliu2002 at yahoo.com (chad) wrote:

> <tag1>This</tag1>
>     <tag2>is</tag2>
>        <tag3>a</tag3>
> <tag1>test</tag1>

> I need to write "This is a test"

Assuming the elements sit inside a single root element, with no nested
tags (in which case you'd have to specify the problem more completely)
and no entity reference issues, any DOM Level 1 implementation can do
this, e.g. with minidom:

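  from xml.dom import minidom

  # inputFilename and outputFilename are paths you supply
  doc = minidom.parse(inputFilename)

  # Keep only the element children of the root; the whitespace between
  # tags shows up as text nodes, which are skipped here
  parent = doc.documentElement
  children = [child for child in parent.childNodes
              if child.nodeType == child.ELEMENT_NODE]
  # Each element is assumed to hold exactly one text node
  content = ' '.join(child.firstChild.nodeValue for child in children)

  # Text mode, since content is a string rather than bytes
  with open(outputFilename, 'w') as fp:
      fp.write(content)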
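
To try the same approach without any files, here is a minimal sketch that
parses the sample from a string. The <root> wrapper, the SAMPLE constant and
the join_child_text name are assumptions made for illustration, not part of
the original question. It also skips empty elements, whose firstChild is
None.

```python
from xml.dom import minidom

# Hypothetical sample: the quoted elements wrapped in an assumed <root> element.
SAMPLE = """<root>
<tag1>This</tag1>
    <tag2>is</tag2>
       <tag3>a</tag3>
<tag1>test</tag1>
</root>"""


def join_child_text(xml_text):
    """Join the text of the root's element children with single spaces."""
    doc = minidom.parseString(xml_text)
    parent = doc.documentElement
    words = []
    for child in parent.childNodes:
        if child.nodeType != child.ELEMENT_NODE:
            continue  # skip whitespace text nodes between the elements
        first = child.firstChild
        # An empty element such as <tag2/> has no firstChild, so skip it.
        if first is not None and first.nodeType == first.TEXT_NODE:
            words.append(first.nodeValue)
    return ' '.join(words)


print(join_child_text(SAMPLE))  # prints: This is a test
```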

For more complicated structures, the 'textContent' property in DOM Level 3
might be of use. (Insert standard pxdom plug here.)
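
For a DOM that does not offer that property, something close to it can be
approximated by walking the subtree and concatenating the text nodes. The
sketch below is only an approximation under that assumption: text_content is
a hypothetical helper name, and it is meant to be called on elements, not to
reproduce every case in the DOM Level 3 definition.

```python
from xml.dom import minidom


def text_content(node):
    """Approximate DOM Level 3 textContent for an element subtree."""
    # Text and CDATA nodes contribute their own data.
    if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE):
        return node.nodeValue
    # Comments and processing instructions are left out of the result.
    if node.nodeType in (node.COMMENT_NODE, node.PROCESSING_INSTRUCTION_NODE):
        return ''
    # Anything else: concatenate whatever the children contribute.
    return ''.join(text_content(child) for child in node.childNodes)


doc = minidom.parseString("<p>This <b>is <i>a</i></b> test<!-- note --></p>")
print(text_content(doc.documentElement))  # prints: This is a test
```

Unlike the firstChild version above, this copes with nested tags, but it
keeps whitespace exactly as it appears in the source, so indented input may
need its whitespace normalised afterwards.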

-- 
Andrew Clover
mailto:and at doxdesk.com
http://www.doxdesk.com/


