[XML-SIG] DOM seems incomplete
Markus Jostock
markus.jostock at softwareag.com
Tue Aug 10 14:17:55 CEST 2004
Hi
Thanks for the hint, but stripping whitespace does not seem to help:
Trying to access a child node still raises an exception because the child
does not exist (i.e. firstChild is None).
print doc.firstChild.firstChild.nodeName
causes an exception:
Traceback (most recent call last):
  File "TUsecaseCreateEmptyDoc.py", line 54, in test01
    print structure.firstChild.firstChild.nodeName
AttributeError: 'NoneType' object has no attribute 'nodeName'
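
One way to see what the parser actually built is to walk the tree and list every node with its type. The following is only a minimal sketch: it uses the standard library's xml.dom.minidom instead of the Sax2 reader (a swapped-in parser, so results with 4DOM may differ), and the names XML, describe and tree_lines are made up for illustration.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Same markup as in the original question, written as one string literal.
XML = ('<MYXML><DOCUMENT><DOCAT INFO="" STATUS="PRV">'
       '<DOCAT.HEAD.LK><LINK DOC="!NEW!"/></DOCAT.HEAD.LK>'
       '<RESAT.LK><LINK DOC="!NEW!"/></RESAT.LK>'
       '</DOCAT></DOCUMENT></MYXML>')

# Human-readable labels for the node types we expect to meet.
KINDS = {
    Node.DOCUMENT_NODE: "document",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
}


def describe(node, depth=0, lines=None):
    """Collect one line per node: indentation, node type, node name."""
    if lines is None:
        lines = []
    kind = KINDS.get(node.nodeType, "other")
    lines.append("%s%s %s" % ("  " * depth, kind, node.nodeName))
    for child in node.childNodes:
        describe(child, depth + 1, lines)
    return lines


doc = parseString(XML)
tree_lines = describe(doc)
print("\n".join(tree_lines))

# Guard against a missing child instead of dereferencing None.
first = doc.firstChild.firstChild
if first is None:
    print("MYXML has no children")
else:
    print(first.nodeName)
```

If the listing shows "text" entries between the elements, whitespace nodes are in play; if an element has no entries below it at all, the children really are missing from the tree.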
Kind regards
Markus
Alexandre CONRAD wrote:
> Markus Jostock wrote:
>
>> Hi
>>
>> I am parsing a string into a DOM. That works without problems. But
>> when I want to access children of the first element, there seem to be
>> none. But pretty printing shows them.
>>
>> Maybe you have an idea what might be going wrong?
>>
>> Thanks in advance for some clues.
>>
>> Kind regards
>> Markus
>>
>>
>> The string I parse:
>> string = ('<MYXML><DOCUMENT><DOCAT INFO="" STATUS="PRV">'
>>           '<DOCAT.HEAD.LK><LINK DOC="!NEW!" /></DOCAT.HEAD.LK>'
>>           '<RESAT.LK><LINK DOC="!NEW!" /></RESAT.LK>'
>>           '</DOCAT></DOCUMENT></MYXML>')
>>
>> Parsing works without errors:
>> from xml.dom.ext.reader import Sax2
>> reader = Sax2.Reader()
>> doc = reader.fromString(string)
>>
>> When I pretty print it, it looks ok:
>> from xml.dom.ext import PrettyPrint
>> PrettyPrint(doc)
>> prints:
>> <?xml version='1.0' encoding='UTF-8'?>
>> <MYXML>
>> <DOCUMENT>
>> <DOCAT INFO='' STATUS='PRV'>
>> <DOCAT.HEAD.LK>
>> <LINK DOC='!NEW!'/>
>> </DOCAT.HEAD.LK>
>> <RESAT.LK>
>> <LINK DOC='!NEW!'/>
>> </RESAT.LK>
>> </DOCAT>
>> </DOCUMENT>
>> </MYXML>
>>
>> Accessing doc.firstChild is ok:
>> print doc.firstChild.nodeName prints MYXML
>>
>> But if I want to access further children of <MYXML>, there are none:
>> print doc.firstChild.childNodes prints <NodeList at c43968: []> or
>> print doc.firstChild.firstChild prints None
>>
>> Where have my children gone?
>
>
> Because the document is PrettyPrint'ed, the parser treats the newlines and
> whitespace (indentation) as text nodes. Try
>
> 'print doc.firstChild.firstChild.firstChild'. You should find your node
> (I think; you may have to add one more firstChild).
>
> In my case, I want to keep the XML file PrettyPrint'ed. So I parse the
> PrettyPrint'ed file and strip out newlines and whitespace before I do
> anything with it:
>
> import xml.dom.ext
> from xml.dom.ext.reader import Sax2
>
> def openDoc(self, xml_file):
>     # Create Reader object
>     reader = Sax2.Reader()
>     # Parse the document
>     doc = reader.fromStream(xml_file)
>     # Strip whitespace-only text nodes from doc
>     xml.dom.ext.StripXml(doc)
>     return doc
>
> Now I can work with my 'doc' without worrying about whitespace.
> When I write it back to disk, I pretty-print it again:
>
> def write_xml(self, doc, xml_file):
>     # Open XML file in write mode
>     f = open(xml_file, "w")
>     # PrettyPrint writes to the stream it is given and returns nothing
>     xml.dom.ext.PrettyPrint(doc, f)
>     # Close file
>     f.close()
>
> Regards,
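
The strip-then-pretty-print round trip described in the quoted message can also be tried without PyXML. This is a rough sketch under stated assumptions: it uses xml.dom.minidom from the standard library, and strip_whitespace is a hypothetical helper written here to stand in for xml.dom.ext.StripXml, not that function itself.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def strip_whitespace(node):
    """Remove text nodes that contain only whitespace, in place."""
    # Iterate over a copy because childNodes changes as we remove nodes.
    for child in list(node.childNodes):
        if child.nodeType == Node.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_whitespace(child)


# A small pretty-printed document (shortened sample, for illustration only).
PRETTY = """<MYXML>
  <DOCUMENT>
    <DOCAT INFO="" STATUS="PRV"/>
  </DOCUMENT>
</MYXML>"""

doc = parseString(PRETTY)

# Before stripping, the first child of MYXML is the indentation text node.
before = doc.firstChild.firstChild.nodeType

strip_whitespace(doc)

# After stripping, the first child of MYXML is the DOCUMENT element.
after = doc.firstChild.firstChild.nodeName

# Re-indent only when serialising, so the in-memory tree stays clean.
pretty_again = doc.toprettyxml(indent="  ")
print(pretty_again)
```

Keeping the tree free of whitespace nodes and indenting only at write time means firstChild and childNodes always refer to elements while the document is being edited.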