[XML-SIG] dom.minidom getting the text content of a node
Rick Hurst
rick.hurst at gmail.com
Thu Dec 9 16:21:07 CET 2004
Hi,
I'm trying to migrate my blogworksXML blog to Zope/Plone. All the
blog content is stored in XML files, and I am trying to walk through it
using minidom and extract the relevant info. I can walk through and read
attributes, but I can't get at the text content stored inside the nodes
that hold the title and body text.
The node in question looks like this:
<blogtitle><![CDATA[Plone: remove member self registration]]></blogtitle>
I'm trying the following (I'm a Python newbie, BTW):
from xml.dom.minidom import parse, parseString

dom1 = parse('foo.xml')
for node in dom1.getElementsByTagName("blog"):
    id = node.getAttribute("id")
    print(id)
    for contentNode in node.getElementsByTagName("text"):
        for titleNode in node.getElementsByTagName("blogtitle"):
            print(titleNode.nodeName)   # prints "blogtitle"
            print(titleNode.nodeType)   # prints 1
            # print(titleNode.data)     # AttributeError: Element instance has no attribute 'data'
            print(titleNode.nodeValue)  # prints "None"
Is there a way of doing this with minidom, or do I need to be using a
different parser? Any advice appreciated!
The XML is as follows:
<?xml version="1.0" encoding="ISO-8859-1"?>
<baef version="1.0">
<blog id="1087570241010">
<author>
<authorname>rick</authorname>
<authormail></authormail>
</author>
<information>
<commentthread>1087570241010</commentthread>
<timestamp>1087570241</timestamp>
<language>en</language>
<categories>
<category id=""></category>
</categories>
</information>
<text>
<blogtitle>
<![CDATA[Plone: remove member self registration]]></blogtitle>
<blogbody><![CDATA[I wanted to set up a plone site with no (rest of
post removed for clarity)]]></blogbody>
</text>
</blog>
</baef>
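For reference, a minimal sketch of one way this might be done with minidom alone. In the DOM, an Element carries no text of its own (hence nodeValue being None); the text sits in its child Text and CDATASection nodes, each of which has a data attribute. The helper name get_text and the inline SAMPLE string (a trimmed copy of the XML above, without the XML declaration) are made up for illustration and are not part of minidom.

```python
from xml.dom.minidom import parseString

# Trimmed copy of the sample document above (illustrative only).
SAMPLE = """<baef version="1.0">
<blog id="1087570241010">
<text>
<blogtitle>
<![CDATA[Plone: remove member self registration]]></blogtitle>
<blogbody><![CDATA[I wanted to set up a plone site]]></blogbody>
</text>
</blog>
</baef>"""


def get_text(element):
    """Hypothetical helper: join the data of the element's direct
    Text and CDATASection children, with outer whitespace stripped."""
    parts = []
    for child in element.childNodes:
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
            parts.append(child.data)
    return "".join(parts).strip()


dom = parseString(SAMPLE)
titles = []
for blog in dom.getElementsByTagName("blog"):
    for title_node in blog.getElementsByTagName("blogtitle"):
        title = get_text(title_node)
        titles.append(title)
        print(blog.getAttribute("id") + ": " + title)
```

The same helper could be pointed at the blogbody element. It only looks at direct children, so it would need to recurse if the text were wrapped in further nested elements.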
--
Rick Hurst
http://hypothecate.co.uk