XML and namespaces

Alan Kennedy alanmk at hotmail.com
Tue Dec 6 07:58:09 EST 2005


[Fredrik Lundh]
 > my point was that (unless I'm missing something here), there are at
 > least two widely used implementations (libxml2 and the 4DOM domlette
 > stuff) that don't interpret the spec in this way.

Libxml2dom is of alpha quality, according to its CheeseShop page anyway.

http://cheeseshop.python.org/pypi/libxml2dom/0.2.4

This can be seen in its incorrect serialisation of the following valid DOM.

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
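import xml.dom       # for the XMLNS_NAMESPACE constant
import libxml2dom

document = libxml2dom.createDocument(None, "doc", None)
top = document.xpath("*")[0]
# Build a prefixed element in the DAV: namespace and declare its prefix
elem1 = document.createElementNS("DAV:", "myns:href")
elem1.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns:myns", "DAV:")
document.replaceChild(elem1, top)
print document.toString()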
#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Which produces

"""
<?xml version="1.0"?>
<myns:href
   xmlns:myns="DAV:"
   xmlns:xmlns="http://www.w3.org/2000/xmlns/"
   xmlns:myns="DAV:"
/>
"""

That is not even well-formed XML (the xmlns:myns attribute appears
twice), let alone namespace well-formed. Note also the invalid
"xmlns:xmlns" attribute: the Namespaces in XML spec says the xmlns
prefix must not be declared. So I don't accept that libxml2dom's
behaviour is definitive in this case.
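
The well-formedness claim can be checked without libxml2dom. The following is a minimal sketch, assuming CPython's standard expat binding and a parser created without namespace processing, so that only plain XML 1.0 well-formedness is tested. The helper name well_formedness_error is made up for illustration; the input is the output quoted above, copied verbatim.

```python
import xml.parsers.expat

# The serialisation quoted above, copied verbatim.
SERIALISED = """<?xml version="1.0"?>
<myns:href
   xmlns:myns="DAV:"
   xmlns:xmlns="http://www.w3.org/2000/xmlns/"
   xmlns:myns="DAV:"
/>
"""

def well_formedness_error(text):
    """Return expat's error description, or None if the text parses.

    The parser is created without a namespace separator, so namespace
    rules are not applied; only XML 1.0 well-formedness is checked.
    (Hypothetical helper, not part of any library.)
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)
    except xml.parsers.expat.ExpatError as exc:
        return xml.parsers.expat.ErrorString(exc.code)
    return None

result = well_formedness_error(SERIALISED)
print(result)
```

A result other than None means the parser rejected the document before any namespace rules were considered.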

The other DOM you refer to, the 4DOM stuff, was written by a participant 
in this discussion.

Will you accept Apache Xerces 2 for Java as a widely used DOM
implementation? I guarantee that it is far more widely used than either
of the DOMs mentioned.

Download Xerces 2 (I am using Xerces 2.7.1) from

http://www.apache.org/dist/xml/xerces-j/

and run the following code under jython:

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
#
# This is a simple adaptation of the DOMGenerate.java
# sample from the Xerces 2.7.1 distribution.
#
from javax.xml.parsers import  DocumentBuilder, DocumentBuilderFactory
from org.apache.xml.serialize import OutputFormat, XMLSerializer
from java.io import StringWriter

def create_document():
   dbf = DocumentBuilderFactory.newInstance()
   db  = dbf.newDocumentBuilder()
   return db.newDocument()

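def serialise(doc):
   # Serialise the document element to a string using Xerces'
   # own XMLSerializer, with output settings derived from the document.
   out_format = OutputFormat( doc )   # avoid shadowing builtin format()
   outbuf     = StringWriter()
   serial     = XMLSerializer( outbuf, out_format )
   serial.asDOMSerializer()
   serial.serialize(doc.getDocumentElement())
   return outbuf.toString()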

doc = create_document()
root = doc.createElementNS("DAV:", "href")
doc.appendChild( root )
print serialise(doc)
#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Which produces

"""
<?xml version="1.0" encoding="UTF-8"?>
<href/>
"""

As I expected it would.
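
For readers without Xerces or jython to hand, the same experiment can be sketched with xml.dom.minidom from the Python standard library. This is a stand-in only: minidom is not one of the implementations discussed in this thread, and the function name build_unprefixed_href is made up for illustration. It builds the same one-element document and shows the element's namespace as held in the DOM next to what the serialiser writes.

```python
import xml.dom.minidom

def build_unprefixed_href():
    """Build a document whose root is an unprefixed href in DAV:.

    Mirrors the jython code above, but uses minidom instead of Xerces.
    (Hypothetical helper name, chosen for this sketch.)
    """
    doc = xml.dom.minidom.Document()
    root = doc.createElementNS("DAV:", "href")
    doc.appendChild(root)
    return doc

doc = build_unprefixed_href()
root = doc.documentElement

# The namespace is recorded on the node in the DOM ...
print(root.namespaceURI)
# ... while the serialiser writes out only what was explicitly added.
print(root.toxml())
```

No xmlns attribute was set on the element here, which is the same situation as in the Xerces run above.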

-- 
alan kennedy
------------------------------------------------------
email alan:              http://xhaus.com/contact/alan


