[XML-SIG] Which DOM implementation?

Mike Brown mike at skew.org
Thu Jun 24 21:09:54 EDT 2004


Derek Fountain wrote:
> Further, PyXML has another DOM package called 4DOM. That looks to be the most 
> compliant of the lot according to the table. Was it donated to the PyXML 
> project by FourThought?

Yes. It is entirely in the PyXML domain now. It is also quite slow.
Some aspects of total conformance are hard to implement, and it is
also coded to support Python 1.5.

Conformance is overrated, by the way, when what you're conforming to is partly 
JavaScript, Java & C-centric junk with no formal, mandatory levels of 
conformance defined (or even an explicit data model).

> Finally, 4Suite appears to have 3 DOM packages available, none of which 
> appears to be especially compliant. I was under the impression that cDomlette 
> was built with speed in mind. I'm not sure about pDOM and FtMD.

To clarify:

The intent is for 4Suite to have just one Domlette: a faster, lighter, 
XPath-friendlier alternative to minidom, and that's basically what it has.

DOM conformance was never a goal, although we do try where it makes sense. 
Where XPath and DOM conflict, XPath wins (e.g. namespace support is mandatory, 
lexical cruft like CDATA sections and unexpanded entity references aren't 
modeled, adjacent text nodes are automatically merged, attribute nodes 
encapsulate their values rather than having text node children, etc.). Where 
DOM L1 was clarified by L2 or L3, we go with the latest. Where DOM APIs are 
excessively Java-ish (e.g. hide as much data as possible and force people to 
use getters and setters), we prefer the Pythonic approach (e.g. just make it 
read-only if you have to, although Domlette nodes do essentially subclass 
xml.dom.Node).
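
For anyone who hasn't run into the DOM behaviours listed above, here is a
minimal sketch using the standard library's xml.dom.minidom (not Domlette,
whose API isn't shown here). It demonstrates three of the DOM habits that an
XPath-oriented model drops: a CDATA section kept as its own node, an attribute
value held in a text node child, and adjacent text nodes left unmerged until
normalize() is called. The helper xpath_style_text is a hypothetical name used
only for illustration; it is not how Domlette does it.

```python
from xml.dom import Node, minidom

doc = minidom.parseString('<r a="v">one<![CDATA[two]]>three</r>')
root = doc.documentElement

# DOM keeps the CDATA section as a separate node between two text nodes.
child_types = [child.nodeType for child in root.childNodes]

# DOM models an attribute's value as a text node child of the attribute node.
attr = root.getAttributeNode("a")
attr_child = attr.childNodes[0]

# Adjacent text nodes created through the API stay separate until normalize().
extra = doc.createElement("x")
extra.appendChild(doc.createTextNode("ab"))
extra.appendChild(doc.createTextNode("cd"))
before = len(extra.childNodes)
extra.normalize()
after = len(extra.childNodes)


def xpath_style_text(element):
    """Hypothetical helper: join text and CDATA children into one string,
    roughly what an XPath-oriented model would present as a single text node."""
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    )
```

Calling xpath_style_text(root) collapses the three child nodes into a single
string, which is roughly the view an XPath processor wants to work with.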

Domlette was originally implemented in Python only, but for speed, a second 
implementation, written as mostly C extensions, was introduced. As it became 
more stable, this C version became the default underlying implementation used 
by the Domlette APIs, but you could always force the use of the other version 
by setting an environment variable. Both implementations are supposed to be 
identical and transparent to you, although as the chart shows, there were some 
slight differences as of 4Suite 1.0a1. I think these have been resolved.
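
The general pattern of a default implementation with an environment-variable
override can be sketched roughly as follows. This is not 4Suite's code: the
class names and the variable name EXAMPLE_FORCE_PYTHON_DOM are made up for the
example, since the real variable name isn't given in this message.

```python
import os


class PurePythonBuilder:
    """Stand-in for a pure-Python implementation (hypothetical)."""
    name = "python"


class CBuilder:
    """Stand-in for a C-extension implementation (hypothetical)."""
    name = "c"


# Hypothetical variable name, chosen only for this example.
FORCE_PYTHON_ENV = "EXAMPLE_FORCE_PYTHON_DOM"


def select_implementation(environ=None):
    """Return the default (C) class unless the override variable is set."""
    env = os.environ if environ is None else environ
    if env.get(FORCE_PYTHON_ENV):
        return PurePythonBuilder
    return CBuilder
```

Callers only ever use whatever select_implementation() returns, so the choice
stays invisible to them as long as both classes expose the same interface.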

Between them, the two implementations have had three names. The Python version was 
called pDomlette through 4Suite 0.12.0a1. Thereafter, it has been called 
FtMiniDom. The C version was introduced in 4Suite 0.11.1 and has always been 
called cDomlette.

The plan is to drop FtMiniDom after the 1.0 release. This shouldn't matter to 
anyone since the APIs don't really expose which implementation is being used, 
and the ability to select one or the other was just a convenience for 
debugging and to ensure that Domlette would be usable for everyone while the C 
version was stabilizing.

See also:

http://4suite.org/docs/timeline.html
http://uche.ogbuji.net/tech/akara/nodes/2003-01-01/domlettes
http://uche.ogbuji.net/tech/akara/nodes/2004-06-19/033124

-Mike


