[XML-SIG] Serializing DOM tree to file

Nickolay Kolev nmkolev@uni-bonn.de
Sat, 17 May 2003 09:17:48 +0200


>     from xml.dom.ext.c14n import Canonicalize
>     s = Canonicalize(node)         # returns output as a string
>     Canonicalize(node, sys.stdout) # anything with a write() method


This does return something quite different than PrettyPrint. I was going
to ask my new question in a separate thread, but both issues seem to be
related, so I will ask it here.

My situation: I have just started playing with Python and am trying to
write a simple messaging (blogging) system as an exercise. It uses XML
files as a data source. They have the following layout:

<?xml version="1.0" encoding="UTF-8"?>
<post>
	<date>16 May 2003</date>
	<category>Internal</category>
	<author>The Boss</author>
	<title>Raise</title>
	<message>You all get a salary raise.</message>
</post>
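
For illustration, a post file with this layout can be read back with the
standard library's xml.dom.minidom. This is only a minimal sketch: the
helper name read_post and the SAMPLE_POST string are hypothetical and not
part of the system described here.

```python
import xml.dom.minidom

# Hypothetical sample matching the layout shown above.
SAMPLE_POST = """<?xml version="1.0" encoding="UTF-8"?>
<post>
	<date>16 May 2003</date>
	<category>Internal</category>
	<author>The Boss</author>
	<title>Raise</title>
	<message>You all get a salary raise.</message>
</post>"""


def read_post(xml_text):
    """Return a dict mapping each child element name to its text content."""
    doc = xml.dom.minidom.parseString(xml_text)
    fields = {}
    for child in doc.documentElement.childNodes:
        # Skip the whitespace-only text nodes that sit between elements.
        if child.nodeType == child.ELEMENT_NODE:
            fields[child.tagName] = "".join(
                n.data for n in child.childNodes if n.nodeType == n.TEXT_NODE
            )
    return fields


post = read_post(SAMPLE_POST)
```

Using documentElement rather than firstChild avoids depending on the root
element being the very first node of the document.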

It could hardly be any simpler. Every time a message is posted, a post
object is constructed. It has a method for exporting itself as XML:

def formatXML(self):

		self.formattedXML += '<?xml version="1.0" encoding="UTF-8"?>' \
		+ '\n<post>' \
		+ '\n\t<date>'     + self.date      +  '</date>' \
		+ '\n\t<category>' + self.category  +  '</category>' \
		+ '\n\t<author>'   + self.author    +  '</author>' \
		+ '\n\t<title>'    + self.title     +  '</title>' \
		+ '\n\t<message>'  + self.message   +  '</message>' \
		+ '\n</post>'
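
One thing plain string concatenation does not handle is field text that
contains characters such as & or <, which would make the result
ill-formed. A minimal sketch of the same layout with escaping, assuming
the standard xml.sax.saxutils.escape function; the name format_post_xml
and its arguments are hypothetical, not the method used above:

```python
from xml.sax.saxutils import escape


def format_post_xml(date, category, author, title, message):
    """Build the post XML as a string, escaping &, < and > in each field."""
    fields = [
        ("date", date),
        ("category", category),
        ("author", author),
        ("title", title),
        ("message", message),
    ]
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', "<post>"]
    for name, value in fields:
        # escape() replaces &, < and > with the corresponding entities.
        lines.append("\t<%s>%s</%s>" % (name, escape(value), name))
    lines.append("</post>")
    return "\n".join(lines)


xml_text = format_post_xml(
    "16 May 2003", "Internal", "The Boss", "Raise", "Profit & loss < last year"
)
```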

This formatted XML string then gets written to an XML file, and the
index file containing the latest posts needs to be updated. The
index.xml looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<index>
	<post>
		...
	</post>

	<post>
		...
	</post>

	...

</index>

The way I do it is with the updateIndex method:

def updateIndex(self):

		self.rebuildXML() #does nothing relevant in this case
				
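		# Parse the new post; firstChild is its <post> root element.
		postElement = xml.dom.minidom.parseString(self.formattedXML).firstChild

		# Read the current index; the file is only read here, so mode 'r' suffices.
		o = open(os.path.join(self.root, 'index.xml'), 'r')
		f = o.read()
		o.close()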

		doc = xml.dom.minidom.parseString(f)
		docRoot = doc.firstChild

		docRoot.insertBefore(postElement, docRoot.firstChild)

		#PrettyPrint(doc)
		#Canonicalize(doc, sys.stdout)

This doc variable now needs to be serialized to the index.xml file.
Let's assume we have only one post element in the index file. Using
PrettyPrint I get what I expect (see above). Using Canonicalize, the
newest post gets inserted before the opening index element. Why is
that?

My other question: is the code above optimal? What can be done to make
it more elegant? How about using importNode? Would you recommend it over
what I do now? Is there somewhere I can read more about importNode
(preferably something tutorial-like)?
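
To make the importNode variant concrete, here is a minimal sketch that
assumes only the standard library's xml.dom.minidom (no PyXML). The
function name add_post_to_index and the two sample strings are
hypothetical, and this is not a claim about why Canonicalize behaves as
described above.

```python
import xml.dom.minidom

# Hypothetical, shortened sample documents.
INDEX_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<index><post><title>Old</title></post></index>"
)
POST_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<post><title>Raise</title></post>"
)


def add_post_to_index(index_text, post_text):
    """Insert the post as the first child of the index root element."""
    index_doc = xml.dom.minidom.parseString(index_text)
    post_doc = xml.dom.minidom.parseString(post_text)
    # importNode(node, deep) returns a copy owned by index_doc;
    # deep=True copies the whole subtree below the <post> element.
    imported = index_doc.importNode(post_doc.documentElement, True)
    root = index_doc.documentElement
    root.insertBefore(imported, root.firstChild)
    return index_doc


doc = add_post_to_index(INDEX_XML, POST_XML)
serialized = doc.toxml()  # the whole document as a single string
```

The imported copy belongs to the index document, so the original post
document is left untouched. The string returned by toxml() could then be
written back to index.xml with an ordinary file write.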

Thank you very much for taking the time!

Best regards,
nmk