XML and namespaces

Alan Kennedy alanmk at hotmail.com
Sun Dec 11 14:59:59 EST 2005


[Paul Boddie]
 > However,
 > wouldn't the correct serialisation of the document be as follows?
 >
 > <?xml version="1.0"?>
 > <href xmlns="DAV:"><no_ns xmlns=""/></href>

Yes, the correct way to override a default namespace is an xmlns="" 
attribute.
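
As a quick check of that serialisation, here is a minimal sketch that parses it with the standard library's xml.dom.minidom and inspects each element's namespaceURI. The choice of minidom and the variable names are assumptions made for illustration; other namespace-aware parsers should agree.

```python
from xml.dom import minidom

# The serialisation proposed in the quoted message.
SERIALISED = '<?xml version="1.0"?><href xmlns="DAV:"><no_ns xmlns=""/></href>'

doc = minidom.parseString(SERIALISED)
href = doc.documentElement
no_ns = href.firstChild

# href picks up the default namespace; the xmlns="" on no_ns undeclares it.
print("href namespace is", repr(href.namespaceURI))
print("no_ns namespace is", repr(no_ns.namespaceURI))
```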

[Paul Boddie]
 > As for the first issue - the presence of the xmlns attribute in the
 > serialised document - I'd be interested to hear whether it is
 > considered acceptable to parse the serialised document and to find that
 > no non-null namespaceURI is set on the href element, given that such a
 > namespaceURI was set when the document was created.

The key issue: should the serialised-then-reparsed document have the 
same DOM "content" (XML InfoSet) if the user did not explicitly create 
the requisite namespace declaration attributes?

My answer: No, it should not be the same.
My reasoning: The user did not explicitly create the attributes
  => The DOM should not automagically create them (according to
     the L2 spec)
  => such attributes should not be serialised
   - The user didn't create them
   - The DOM implementation didn't create them
   - If the serialisation processor creates them, that gives the
     same end result as if the DOM impl had (wrongly) created them.
  => the serialisation is a faithful/naive representation of the
     (not-namespace-well-formed) DOM constructed by the user (who
     omitted required attributes).
  => The reloaded document is a different DOM to the original, i.e.
     it has a different infoset.
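
The chain of reasoning above can be sketched with xml.dom.minidom, which is the L2 implementation under discussion. This is only an illustration, and the variable names are hypothetical: the element is created in the "DAV:" namespace, but no xmlns attribute is ever added by the user.

```python
from xml.dom import minidom

# Create a document whose root element has namespaceURI "DAV:",
# without adding any namespace declaration attribute.
doc = minidom.getDOMImplementation().createDocument("DAV:", "href", None)
original = doc.documentElement

# minidom serialises only what the user put in the tree.
serialised = doc.toxml()

# Reparse the serialised text and compare the two infosets.
reparsed = minidom.parseString(serialised).documentElement

print("Original namespace is", repr(original.namespaceURI))
print("Declaration serialised:", "xmlns" in serialised)
print("Reparsed namespace is", repr(reparsed.namespaceURI))
```

The original element reports "DAV:", while the reparsed element reports no namespace, because no declaration was ever written out.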

The Xerces and Jython snippet I posted the other day demonstrates this. 
If you look closely at that code, the actual DOM implementation and the 
serialisation processor used are from different libraries. The DOM is 
the inbuilt JAXP DOM implementation, Apache Crimson (the example only 
works on JDK 1.4). The serialisation processor is the Apache Xerces 
serialiser. The fact that the xmlns="DAV:" attribute didn't appear in 
the output document shows that BOTH the (Crimson) DOM implementation AND 
the (Xerces) serialiser chose NOT to automagically create the attribute.

If you run that snippet with other DOM implementations, by setting the 
"javax.xml.parsers.DocumentBuilderFactory" property, you'll find the 
same result.

Serialisation and namespace normalisation are both in the realm of DOM 
Level 3, whereas minidom is only L2 compliant. Automagically introducing 
L3 semantics into the L2 implementation is the wrong thing to do.

http://www.w3.org/TR/DOM-Level-3-LS/load-save.html
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/namespaces-algorithms.html

[Paul Boddie]
 > In other words, ...
 >
 > What should the "Namespace is" message produce?

Namespace is None

If you want it to produce,

Namespace is 'DAV:'

and for your code to be portable to other DOM implementations besides 
libxml2dom, then your code should look like:-

 > document = libxml2dom.createDocument(None, "doc", None)
 > top = document.xpath("*")[0]
 > elem1 = document.createElementNS("DAV:", "href")

elem1.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "DAV:")

 > document.replaceChild(elem1, top)
 > elem2 = document.createElementNS(None, "no_ns")

elem2.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "")

 > document.xpath("*")[0].appendChild(elem2)
 > document.toFile(open("test_ns.xml", "wb"))
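
For readers without libxml2dom, the same idea can be sketched against xml.dom.minidom. This is an assumption-laden translation rather than the code above: it builds the root with createDocument instead of replaceChild, uses toxml() instead of toFile(), and the variable names are only illustrative. The two setAttributeNS calls are the ones that matter.

```python
import xml.dom
from xml.dom import minidom

# Root element in the "DAV:" namespace, with its declaration added explicitly.
doc = minidom.getDOMImplementation().createDocument("DAV:", "href", None)
elem1 = doc.documentElement
elem1.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "DAV:")

# Child element in no namespace, explicitly overriding the default.
elem2 = doc.createElementNS(None, "no_ns")
elem2.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "")
elem1.appendChild(elem2)

serialised = doc.toxml()
reparsed = minidom.parseString(serialised)
root = reparsed.documentElement
child = root.firstChild

print(serialised)
print("Namespace is", repr(root.namespaceURI))
print("Child namespace is", repr(child.namespaceURI))
```

Because the declarations are ordinary attributes that the user created, they survive serialisation, and the reparsed tree carries the same namespaces as the original.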

its-not-about-namespaces-its-about-automagic-ly'yrs,

-- 
alan kennedy
------------------------------------------------------
email alan:              http://xhaus.com/contact/alan


