[Fwd: Re: [XML-SIG] DOM seems incomplete]
Alexandre CONRAD
aconrad.tlv at magic.fr
Tue Aug 10 13:45:51 CEST 2004
Forgot to send to the list...
-------- Original Message --------
Subject: Re: [XML-SIG] DOM seems incomplete
Date: Tue, 10 Aug 2004 12:40:22 +0200
From: Alexandre CONRAD <aconrad.tlv at magic.fr>
To: Markus Jostock <markus.jostock at softwareag.com>
References: <41189C74.8050902 at softwareag.com>
Markus Jostock wrote:
> Hi
>
> I am parsing a string into a DOM. That works without problems. But when
> I want to access children of the first element, there seem to be none.
> But pretty printing shows them.
>
> Maybe you have an idea what might be going wrong?
>
> Thanks in advance for some clues.
>
> Kind regards
> Markus
>
>
> The string I parse:
> string = '<MYXML><DOCUMENT><DOCAT INFO=""
> STATUS="PRV"><DOCAT.HEAD.LK><LINK DOC="!NEW!"
> /></DOCAT.HEAD.LK><RESAT.LK><LINK DOC="!NEW!"
> /></RESAT.LK></DOCAT></DOCUMENT></MYXML>'
>
> Parsing works without errors:
> from xml.dom.ext.reader import Sax2
> reader = Sax2.Reader()
> doc = reader.fromString(string)
>
> When I pretty print it, it looks ok:
> from xml.dom.ext import PrettyPrint
> PrettyPrint(doc)
> prints:
> <?xml version='1.0' encoding='UTF-8'?>
> <MYXML>
> <DOCUMENT>
> <DOCAT INFO='' STATUS='PRV'>
> <DOCAT.HEAD.LK>
> <LINK DOC='!NEW!'/>
> </DOCAT.HEAD.LK>
> <RESAT.LK>
> <LINK DOC='!NEW!'/>
> </RESAT.LK>
> </DOCAT>
> </DOCUMENT>
> </MYXML>
>
> Accessing doc.firstChild is ok:
> print doc.firstChild.nodeName prints MYXML
>
> But if I want to access further children of <MYXML>, there are none:
> print doc.firstChild.nodeList prints <NodeList at c43968: []> or
> print doc.firstChild.firstChild prints None
>
> Where are my children gone?
Because you are PrettyPrint'ing, the newlines and whitespace
(indentation) get parsed as text nodes. Try
'print doc.firstChild.firstChild.firstChild'. You should find your node
(I think; you may have to add one more firstChild).
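
To illustrate the whitespace-as-text-node effect described above, here is a minimal sketch. It assumes the standard library's xml.dom.minidom rather than the PyXML Sax2 reader used in this thread, and the sample strings and helper names (first_child_summary, element_children) are made up for the example.

```python
from xml.dom import minidom

# Same structure twice: once compact, once indented the way a pretty printer would write it.
compact = '<MYXML><DOCUMENT><DOCAT STATUS="PRV"/></DOCUMENT></MYXML>'
indented = '<MYXML>\n  <DOCUMENT>\n    <DOCAT STATUS="PRV"/>\n  </DOCUMENT>\n</MYXML>'


def first_child_summary(source):
    """Return (is_text_node, nodeName) for the first child of the root element."""
    root = minidom.parseString(source).documentElement
    child = root.firstChild
    return child.nodeType == child.TEXT_NODE, child.nodeName


def element_children(node):
    """Return only the element children of node, skipping text nodes."""
    return [c for c in node.childNodes if c.nodeType == c.ELEMENT_NODE]


print(first_child_summary(compact))   # (False, 'DOCUMENT')
print(first_child_summary(indented))  # (True, '#text')
```

In the indented case the first child of the root is the newline-plus-indentation text node, so filtering on ELEMENT_NODE (as element_children does) is one way to reach the actual elements without stripping the document first.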
In my case, I want to keep the XML file PrettyPrint'ed. So I parse the
PrettyPrint'ed file and strip out the newlines and whitespace before I
do anything with it:
    import xml.dom.ext
    from xml.dom.ext.reader import Sax2

    def openDoc(self, xml_file):
        # Create Reader object
        reader = Sax2.Reader()
        # Parse the document
        doc = reader.fromStream(xml_file)
        # Strip out white spaces from doc
        xml.dom.ext.StripXml(doc)
        return doc
Now I can play around with my 'doc' without worrying about whitespace.
When I write it back to disk, I pretty print it again:
    def write_xml(self, doc, xml_file):
        # Open XML file in write mode
        f = open(xml_file, "w")
        # PrettyPrint writes to the stream it is given and returns nothing,
        # so pass it the open file instead of wrapping it in f.write()
        xml.dom.ext.PrettyPrint(doc, f)
        # Close file
        f.close()
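
For readers without PyXML, the strip-then-pretty-print round trip above can be sketched with the standard library. This is only an approximation under stated assumptions: it uses xml.dom.minidom, and strip_whitespace is a hypothetical stand-in for xml.dom.ext.StripXml (it drops whitespace-only text nodes and may not match StripXml's exact behaviour). The names load_doc, dump_doc and sample are also made up for the example.

```python
from xml.dom import minidom

# Hypothetical sample input, indented like pretty-printed output.
sample = '<MYXML>\n  <DOCUMENT>\n    <DOCAT STATUS="PRV"/>\n  </DOCUMENT>\n</MYXML>'


def strip_whitespace(node):
    """Recursively remove text nodes that contain only whitespace."""
    # Iterate over a copy, because children are removed during the loop.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        elif child.nodeType == child.ELEMENT_NODE:
            strip_whitespace(child)
    return node


def load_doc(source):
    """Parse an XML string and strip indentation-only text nodes."""
    doc = minidom.parseString(source)
    strip_whitespace(doc)
    return doc


def dump_doc(doc):
    """Serialize the document back to an indented string."""
    return doc.toprettyxml(indent="  ")


doc = load_doc(sample)
print(doc.documentElement.firstChild.nodeName)  # DOCUMENT
```

After load_doc, firstChild on each element leads straight to the next element, and dump_doc(doc) can be written back to disk and reloaded the same way.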
Regards,
--
Alexandre CONRAD - TLV
Research & Development
tel : +33 1 30 80 55 05
fax : +33 1 30 56 55 06
6, rue de la plaine
78860 - SAINT NOM LA BRETECHE
FRANCE