[Fwd: Re: [XML-SIG] DOM seems incomplete]

Alexandre CONRAD aconrad.tlv at magic.fr
Tue Aug 10 13:45:51 CEST 2004


Forgot to send to the list...

-------- Original Message --------
Subject: Re: [XML-SIG] DOM seems incomplete
Date: Tue, 10 Aug 2004 12:40:22 +0200
From: Alexandre CONRAD <aconrad.tlv at magic.fr>
To: Markus Jostock <markus.jostock at softwareag.com>
References: <41189C74.8050902 at softwareag.com>



Markus Jostock wrote:
> Hi
> 
> I am parsing a string into a DOM. That works without problems. But when 
> I want to access children of the first element, there seem to be none. 
> But pretty printing shows them.
> 
> Maybe you have an idea what might be going wrong?
> 
> Thanks in advance for some clues.
> 
> Kind regards
>    Markus
> 
> 
> The string I parse:
> string = '<MYXML><DOCUMENT><DOCAT INFO="" 
> STATUS="PRV"><DOCAT.HEAD.LK><LINK DOC="!NEW!" 
> /></DOCAT.HEAD.LK><RESAT.LK><LINK DOC="!NEW!" 
> /></RESAT.LK></DOCAT></DOCUMENT></MYXML>'
> 
> Parsing works without errors:
>    from xml.dom.ext.reader import Sax2
>    reader = Sax2.Reader()
>    doc = reader.fromString(string)
> 
> When I pretty print it, it looks ok:
>    from xml.dom.ext import PrettyPrint
>    PrettyPrint(doc)
> prints:
> <?xml version='1.0' encoding='UTF-8'?>
> <MYXML>
>    <DOCUMENT>
>        <DOCAT INFO='' STATUS='PRV'>
>            <DOCAT.HEAD.LK>
>                <LINK DOC='!NEW!'/>
>            </DOCAT.HEAD.LK>
>            <RESAT.LK>
>                <LINK DOC='!NEW!'/>
>            </RESAT.LK>
>        </DOCAT>
>    </DOCUMENT>
> </MYXML>
> 
> Accessing doc.firstChild is ok:
> print doc.firstChild.nodeName  prints MYXML
> 
> But if I want to access further children of <MYXML>, there are none:
> print doc.firstChild.nodeList prints <NodeList at c43968: []> or
> print doc.firstChild.firstChild prints None
> 
> Where are my children gone?

Newlines and indentation between elements (such as those PrettyPrint
adds) are parsed as text nodes, so the first child of an element can be
a whitespace text node rather than the element you expect. Try

'print doc.firstChild.firstChild.firstChild'. You should find your node
(I think; you may have to add one more firstChild).
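
To illustrate the whitespace point, here is a minimal sketch. It uses the
standard library's xml.dom.minidom rather than the PyXML Sax2 reader from
this thread (that substitution is my assumption, made so the snippet runs
on a current Python), and the helper name first_child_summary is
hypothetical. It compares the first child of the root element for an
indented and a compact version of a similar document.

```python
from xml.dom import minidom

# Indented input: there is a newline plus spaces right after <MYXML>
pretty = """<MYXML>
   <DOCUMENT>
      <DOCAT INFO="" STATUS="PRV"/>
   </DOCUMENT>
</MYXML>"""

# Same structure with no whitespace between the tags
compact = '<MYXML><DOCUMENT><DOCAT INFO="" STATUS="PRV"/></DOCUMENT></MYXML>'


def first_child_summary(xml_text):
    """Return (nodeType, nodeName) of the root element's first child."""
    root = minidom.parseString(xml_text).documentElement
    child = root.firstChild
    return child.nodeType, child.nodeName


pretty_type, pretty_name = first_child_summary(pretty)
compact_type, compact_name = first_child_summary(compact)

print(pretty_name)   # '#text': the whitespace before <DOCUMENT>
print(compact_name)  # 'DOCUMENT'
```

With the indented input, the element you want is the second child, not
the first, which is why an extra step is needed when walking the tree.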

In my case, I want to keep the XML file pretty-printed on disk. So I
parse the pretty-printed file and strip out the newlines and whitespace
before I do anything with it:

import xml.dom.ext
from xml.dom.ext.reader import Sax2

def openDoc(self, xml_file):
     # Create Reader object
     reader = Sax2.Reader()
     # Parse the document
     doc = reader.fromStream(xml_file)
     # Strip whitespace-only text nodes from doc (modifies doc in place)
     xml.dom.ext.StripXml(doc)
     return doc
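
For anyone without PyXML installed, the same idea can be sketched with
the standard library's xml.dom.minidom. This is only an approximation of
what I use StripXml for here, not a reimplementation of it, and the
function name strip_whitespace_nodes is hypothetical. It removes every
text node that contains nothing but whitespace.

```python
from xml.dom import Node, minidom


def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace."""
    # Iterate over a copy, because childNodes changes as nodes are removed
    for child in list(node.childNodes):
        if child.nodeType == Node.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_whitespace_nodes(child)
    return node


pretty = """<MYXML>
   <DOCUMENT>
      <DOCAT INFO="" STATUS="PRV"/>
   </DOCUMENT>
</MYXML>"""

doc = minidom.parseString(pretty)
before = doc.documentElement.firstChild.nodeName

strip_whitespace_nodes(doc)
after = doc.documentElement.firstChild.nodeName

print(before)  # '#text'
print(after)   # 'DOCUMENT'
```

Text nodes that carry real content are left untouched, including any
leading or trailing spaces they have.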

Now I can work with my 'doc' without worrying about whitespace.
When I write it back to disk, I pretty-print it again:

def write_xml(self, doc, xml_file):
     # Open XML file in write mode
     f = open(xml_file, "w")
     # PrettyPrint writes to the stream it is given and returns nothing,
     # so pass it the open file instead of writing its return value
     xml.dom.ext.PrettyPrint(doc, f)
     # Close file
     f.close()
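
A rough equivalent of the write step with the standard library, again
assuming xml.dom.minidom in place of PyXML. The function name
write_pretty, the two-space indent, and the temporary output path are
all made up for the example.

```python
import os
import tempfile
from xml.dom import minidom


def write_pretty(doc, path, indent="  "):
    """Serialize `doc` with indentation and write it to `path`."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(doc.toprettyxml(indent=indent))


# A compact tree, i.e. one with no whitespace-only text nodes in it
doc = minidom.parseString(
    '<MYXML><DOCUMENT><LINK DOC="!NEW!"/></DOCUMENT></MYXML>'
)

path = os.path.join(tempfile.mkdtemp(), "out.xml")
write_pretty(doc, path)

with open(path, encoding="utf-8") as f:
    written = f.read()
print(written)
```

If the tree still contains whitespace-only text nodes, toprettyxml will
typically add its own indentation on top of them and produce extra blank
lines, which is one more reason to strip before writing.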

Regards,
-- 
Alexandre CONRAD - TLV
Research & Development
tel : +33 1 30 80 55 05
fax : +33 1 30 56 55 06
6, rue de la plaine
78860 - SAINT NOM LA BRETECHE
FRANCE





