[XML-SIG] SAX: Names with no namespace
Thomas B. Passin
tpassin@home.com
Tue, 20 Feb 2001 20:02:56 -0500
Martin v. Loewis wrote -
> > Actually, I had thought we *had* decided, and None was the
> > consensus.
>
> That is also my recollection - there is even a PEP document somewhere;
> you can get a copy from the archives, or from Tom Passin.
>
I don't recall that anyone actually declared that it was decided, but almost
everyone who posted on this issue agreed that using "None" is the way to go.
I propose that we do declare that it has been decided - Martin, are you
willing to be the temporary benevolent dictator on this?
Here's a copy of the draft PEP:
=============================================
<?xml version='1.0'?>
<xmlpep>
<headers>
<pep_number>xmlpep-1</pep_number>
<pep_title>Values for Null Or Empty Namespace URIs</pep_title>
<pep_version>0.20</pep_version>
<cvs_version_string/>
<list_of_authors>
<author name='Thomas B. Passin' email='tpassin@home.com'/>
</list_of_authors>
<status>Draft</status>
<type>Standards Track</type>
<created>29-Jan-2001</created>
<history>
<post date='29-Jan-2001'/>
<post date='4-Feb-2001'/>
</history>
</headers>
<abstract>
This PEP specifies the proper values of the Namespace URI property
when its value might otherwise appear to be either "null", "None", or the
empty string.
Such Namespace URIs are discussed in SAX[1], DOM2[2], and XML-Namespaces[3].
These three recommendations do not appear to be in full agreement. This
fact, and differences between Java and Python, have led to some confusion
and some disagreement between various implementations supported by PyXML.
The language in these three Recommendations is reviewed.
The recommendation is made to use None as the URI value in all cases where
no URI applies to an element or attribute.
The XMLPEP, when approved, will apply to all namespace-aware software
maintained by the PyXML interest group.
</abstract>
<specification>
<para title='Namespace-aware applications'>
When no namespace has been declared whose scope applies to a
particular element or attribute, the application MUST report the
URI of the namespace of the element or attribute as None. When there is no
namespace prefix, the application MUST report the value of the prefix as
None.
</para>
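<para>
As a rough illustration of this requirement, a namespace-aware component
that receives names from a source using the empty string could map them to
None before reporting them. The helper name report_name is hypothetical and
is not part of any PyXML API; this is only a sketch under that assumption.

```python
def report_name(uri, prefix):
    """Return (uri, prefix) with "no value" normalized to None.

    Hypothetical helper: an empty string coming from an upstream source
    (for example a SAX2-style driver) is treated as "no namespace" or
    "no prefix" and is reported as None.
    """
    if uri == "":
        uri = None
    if prefix == "":
        prefix = None
    return uri, prefix
```

Real URIs and prefixes pass through unchanged, so the helper can be applied
to every name without a separate check.
</para>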
<para title='Namespace-ignorant applications'>
This requirement does not apply for applications that are not
namespace-aware.
</para>
<para title='Applicability'>
This requirement applies to all XML processing software maintained by the
PyXML interest group.
</para>
</specification>
<rationale>
<para title='Definitive Treatment Needed'>
This PEP is needed because of continued uncertainty among various PyXML
developers as to the proper values to use, and because of inconsistency
among various PyXML products. Differences between Python, IDL, and Java
make an unambiguous interpretation difficult.
</para>
<para>
A definitive and consistent treatment is needed so that all the PyXML
software may be made consistent.
</para>
<para title='W3C Namespaces Recommendation'>
The Namespaces Recommendation recognizes that a namespace URI may
be given no value - called "empty" in the Recommendation - even
though a structure for a URI is provided in the document. Two relevant
passages are quoted here:
<quote>Section 2. ...
[Definition:] If the attribute name matches DefaultAttName,
then the namespace name in the attribute value is that of the
default namespace in the scope of the element to which the declaration
is attached. In such a default declaration, the attribute value
may be empty.
</quote>
<quote>5.2 Namespace Defaulting
A default namespace is considered to apply to the element where
it is declared (if that element has no namespace prefix), and to
all elements with no prefix within the content of that element.
If the URI reference in a default namespace declaration is empty,
then unprefixed elements in the scope of the declaration are not
considered to be in any namespace. Note that default namespaces
do not apply directly to attributes.
...The default namespace can be set to the empty string. This has the
same effect, within the scope of the declaration, of there being no
default namespace.
</quote>
</para>
<para>
The term "empty" is not defined further, but in the context of the
Recommendation, it must mean a missing string value. The last
fragment quoted above suggests, but does not require, that an
empty string may be returned for an "empty" URI value.
This has no direct applicability to values returned by implementations, since
1) the word "can" is used, rather than "must", and
2) the Recommendation seems to apply to XML documents,
not to implementations.
</para>
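<para>
The effect of an empty default namespace declaration can be observed with a
small sketch. It assumes the minidom module of a present-day Python standard
library, not the PyXML code under discussion, and the function name
element_namespaces is made up for the example.

```python
from xml.dom import minidom


def element_namespaces(xml_text):
    """Return (tagName, namespaceURI) for every element, in document order."""
    doc = minidom.parseString(xml_text)
    return [(el.tagName, el.namespaceURI)
            for el in doc.getElementsByTagName("*")]
```

Given a document whose root declares a default namespace and whose child
declares xmlns="", the child is reported with a namespaceURI of None in that
library, while a sibling without the empty declaration inherits the default
namespace.
</para>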
<para title='W3C DOM Level 2 Recommendation'>
The W3C DOM Level 2 Recommendation refers to "null" namespaces in
several places. The thrust is clear and consistent: a "null" value
is to be used to indicate a non-existent namespace URI value. Here
are some relevant extracts from the Recommendation:
<quote>Note that because the DOM does no lexical checking, the
empty string will be treated as a real namespace URI in DOM Level 2
methods. Applications must use the value null as the namespaceURI
parameter for methods if they wish to have no namespace.
</quote>
</para>
<para>
The IDL definition for the createAttributeNS() method creates an
attribute with these characteristics:
<quote>
A new Attr object with the following attributes:
Attribute Value
Node.nodeName qualifiedName
Node.namespaceURI namespaceURI
Node.prefix prefix, extracted from qualifiedName,
or null if there is no prefix
Node.localName local name, extracted from qualifiedName
Attr.name qualifiedName
Node.nodeValue the empty string
</quote>
</para>
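<para>
The attribute table quoted above can be checked against a concrete DOM. The
sketch below assumes the minidom module of a present-day Python standard
library as a stand-in for a DOM Level 2 implementation; describe_attr is a
hypothetical helper name.

```python
from xml.dom import minidom


def describe_attr(namespace_uri, qualified_name):
    """Create an attribute with createAttributeNS and report its fields."""
    doc = minidom.Document()
    attr = doc.createAttributeNS(namespace_uri, qualified_name)
    return {
        "nodeName": attr.nodeName,
        "namespaceURI": attr.namespaceURI,
        "prefix": attr.prefix,
        "localName": attr.localName,
        "name": attr.name,
        "nodeValue": attr.nodeValue,
    }
```

For an unprefixed name created with a namespace URI of None, both the prefix
and the namespaceURI come back as None, which is the Python counterpart of
the "null" in the IDL.
</para>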
<para>For the older, non-NS aware createAttribute() method, the
Recommendation says
<quote>...localName, prefix, and namespaceURI set to null. </quote>
</para>
<para>This is typical - a "null" is returned if there is no prefix or
URI.</para>
<para>It is clear that the IDL specifies the use of "null" for empty
namespaces, rather than the empty string. The Java binding does not specify
any particular value.
</para>
<para>
Thus there seems to be nothing in the DOM Recommendation that suggests
that empty strings should be used, and there is clear language that "null"
values should be used.
</para>
<para title='SAX2'>
The SAX2 Java API clearly says that an empty string is to be
returned. The following extracts demonstrate this:
<quote>In SAX2, the startElement and endElement callbacks in a content
handler look like this:
public void startElement (String uri, String localName,
                          String qName, Attributes atts)
    throws SAXException;
public void endElement (String uri, String localName, String qName)
    throws SAXException;
By default, an XML reader will report a Namespace URI and a local name
for every element, in both the start and end handler. Consider the
following example:
<html:hr xmlns:html="http://www.w3.org/1999/xhtml"/>
With the default SAX2 Namespace processing, the XML reader would report
a start and end element event with the Namespace URI
"http://www.w3.org/1999/xhtml" and the local name "hr". The XML
reader might also report the original qName "html:hr", but that
parameter might simply be an empty string.
</quote>
<quote>
<h:hello xmlns:h="http://www.greeting.com/ns/" id="a1"
h:person="David"/>
If namespaces is true and namespace-prefixes is true,
then a SAX2 XML reader will report the following:
an element with the Namespace URI "http://www.greeting.com/ns/",
the local name "hello", and the qName "h:hello";
an attribute with no Namespace URI (empty string),
no local name (empty string), and the qName "xmlns:h";
an attribute with no Namespace URI (empty string), the
local name "id", and the qName "id"; and an attribute
with the Namespace URI "http://www.greeting.com/ns/",
the local name "person", and the qName "h:person".
</quote>
</para>
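<para>
For comparison with the Java behavior quoted above, the following sketch
records what a Python SAX driver reports. It assumes the xml.sax package and
expat driver of a present-day Python standard library, which may differ from
the PyXML drivers of the time; NameRecorder and record_names are names
invented for the example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NameRecorder(ContentHandler):
    """Record the (uri, localname) pairs reported for elements and attributes."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self.attributes = []

    def startElementNS(self, name, qname, attrs):
        # In namespace mode, name is a (uri, localname) tuple.
        self.elements.append(name)
        self.attributes.extend(attrs.getNames())


def record_names(xml_text):
    """Parse xml_text with namespace processing on and return the names seen."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    recorder = NameRecorder()
    parser.setContentHandler(recorder)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return recorder.elements, recorder.attributes
```

Run on the h:hello example, that driver reports the unprefixed id attribute
with a URI of None rather than the empty string, and it does not report the
xmlns:h declaration as an attribute at all.
</para>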
<para title='Discussion of The Three Recommendations'>
To summarize, the Namespace Recommendation is essentially silent
on the subject, the DOM clearly specifies "null" values, and SAX2
clearly specifies the use of empty strings.
</para>
<para title='Arguments Favoring the Use of "None"'>
The "highest" level Recommendation is presumably the DOM.
Python offers a data object similar to "null" - the None object.
None can be tested for in exactly the same way as an empty string,
because both evaluate as false:
<code>if uri:
    doYourThing()
</code>
Alternatively, None can be tested for explicitly, as in:
<code>if uri is not None:
    doYourThing()
</code>
Thus, None is flexible enough to be useful for this purpose.
</para>
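<para>
The two tests are not interchangeable for every input, which a short sketch
makes concrete. The function name classify_uri is hypothetical and exists
only to show the results side by side.

```python
def classify_uri(uri):
    """Show how the two tests treat None, "" and a real URI."""
    return {
        "truthy": bool(uri),             # the "if uri:" test
        "is_not_none": uri is not None,  # the explicit test
    }
```

None fails both tests and a real URI passes both. The empty string fails the
truth test but passes the explicit one, so code that must tolerate both
conventions is safer with the truth test.
</para>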
<para>
Many posts to the PyXML list have favored the use of None,
although not all. Either None or the empty string would seem to
work in this context. "None" agrees with the DOM Recommendation,
and would seem (in a mnemonic sense) to suggest the absence of
a prefix or URI.
</para>
<para title='4DOM Handling of None URIs and Prefixes'>
The 4DOM code will handle a None URI correctly in many places,
since it uses tests like this typical example:
<code>
if namespaceURI and namespaceURI != XML_NAMESPACE:
    # ...
</code>
This code works correctly if the namespaceURI is None.
</para>
<para>Another test used in 4DOM is as follows:
<code>def getElementsByTagNameNS(self, namespaceURI, localName):
    root = self.documentElement
    if root == None:
        return implementation.createNodeList([])
    py = root.getElementsByTagNameNS(namespaceURI, localName)
    if namespaceURI == '*' or namespaceURI == root.namespaceURI:
        if localName == '*' or localName == root.localName:
            py.insert(0, root)
    return py
</code>
The expression "namespaceURI == '*'" also evaluates correctly when
the URI is None.
</para>
<para>If the handling code is consistent throughout 4DOM, then it will handle
None correctly.
</para>
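<para>
The two kinds of test discussed for 4DOM can be exercised in isolation. The
functions below are a sketch written for this purpose, not code taken from
4DOM; the names name_matches and is_foreign_namespace are hypothetical, and
XML_NAMESPACE is assumed to be the constant exported by the xml.dom package
of a present-day Python standard library.

```python
from xml.dom import XML_NAMESPACE


def name_matches(namespaceURI, localName, node_uri, node_local):
    """Wildcard match in the style of the getElementsByTagNameNS test."""
    uri_ok = namespaceURI == '*' or namespaceURI == node_uri
    local_ok = localName == '*' or localName == node_local
    return uri_ok and local_ok


def is_foreign_namespace(namespaceURI):
    """True only for a real URI other than the XML namespace."""
    return bool(namespaceURI and namespaceURI != XML_NAMESPACE)
```

With a URI of None, the wildcard comparison is simply false and the equality
comparison matches only nodes whose own URI is None, so neither test needs a
special case.
</para>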
<para title='SAX2'>
[Need material here]
</para>
</rationale>
<reference_implementation>[Should there be a reference here to one
particular processor, such as xmlproc?]
</reference_implementation>
<notes></notes>
<references></references>
<copyright>This PEP may be used by anyone.</copyright>
</xmlpep>