[XML-SIG] Which DOM implementation?
Mike Brown
mike at skew.org
Thu Jun 24 21:09:54 EDT 2004
Derek Fountain wrote:
> Further, PyXML has another DOM package called 4DOM. That looks to be the most
> compliant of the lot according to the table. Was it donated to the PyXML
> project by FourThought?
Yes. It is entirely in the PyXML domain now. It is also quite slow.
Some aspects of total conformance are hard to implement, and it is
also coded to support Python 1.5.
Conformance is overrated, by the way, when what you're conforming to is partly
JavaScript-, Java-, and C-centric junk with no formal, mandatory levels of
conformance defined (or even an explicit data model).
> Finally, 4Suite appears to have 3 DOM packages available, none of which
> appears to be especially compliant. I was under the impression that cDomlette
> was built with speed in mind. I'm not sure about pDOM and FtMD.
To clarify:
The intent is for 4Suite to have just one Domlette: a faster, lighter,
XPath-friendlier alternative to minidom, and that's basically what it has.
DOM conformance was never a goal, although we do try where it makes sense.
Where XPath and DOM conflict, XPath wins (e.g. namespace support is mandatory,
lexical cruft like CDATA sections and unexpanded entity references aren't
modeled, adjacent text nodes are automatically merged, attribute nodes
encapsulate their values rather than having text node children, etc.). Where
DOM L1 was clarified by L2 or L3, we go with the latest. Where DOM APIs are
excessively Java-ish (e.g. hide as much data as possible and force people to
use getters and setters), we prefer the Pythonic approach (e.g. just make it
read-only if you have to, although Domlette nodes do essentially subclass
xml.dom.Node).
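To make a couple of those XPath-vs-DOM differences concrete, here is a rough sketch using the standard library's xml.dom.minidom, which follows the DOM side of the argument. It is not Domlette code (no Domlette API appears here), and it assumes a current Python 3; the variable names are only for illustration.

```python
from xml.dom import minidom, Node

# Parse a tiny document whose content mixes plain text and a CDATA section.
doc = minidom.parseString("<r>x<![CDATA[y]]>z</r>")
root = doc.documentElement

# DOM view (minidom): the CDATA section is kept as its own node type,
# so the element's children are split into several nodes.
dom_kinds = [child.nodeType for child in root.childNodes]

# XPath view: CDATA is just lexical detail, so the element has a single
# text node whose value is the concatenated character data.
xpath_text = "".join(child.data for child in root.childNodes)

# Adjacent text nodes created through the DOM API stay separate in minidom
# until normalize() is called explicitly; the XPath model never has
# two text nodes side by side.
para = doc.createElement("p")
para.appendChild(doc.createTextNode("foo"))
para.appendChild(doc.createTextNode("bar"))
before = len(para.childNodes)
para.normalize()
after = len(para.childNodes)

print(dom_kinds, repr(xpath_text), before, after)
```

The point is only that a DOM-style tree makes you deal with the split nodes yourself, whereas an XPath-oriented tree like the one described above would hand you the merged text to begin with.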
Domlette was originally implemented in Python only, but for speed, a second
implementation, written as mostly C extensions, was introduced. As it became
more stable, this C version became the default underlying implementation used
by the Domlette APIs, but you could always force the use of the other version
by setting an environment variable. Both implementations are supposed to be
identical and transparent to you, although as the chart shows, there were some
slight differences as of 4Suite 1.0a1. I think these have been resolved.
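As a generic illustration of that arrangement (one public API, two interchangeable back ends, the C one by default, with an environment variable as an override), here is a minimal sketch. Everything in it is hypothetical: EXAMPLE_DOM_IMPL is a made-up variable name, not the one 4Suite reads, and the two parse functions are stand-ins rather than real Domlette code.

```python
import os

def _c_parse(text):
    """Stand-in for the C-extension implementation (hypothetical)."""
    return ("c", text)

def _py_parse(text):
    """Stand-in for the pure-Python implementation (hypothetical)."""
    return ("py", text)

_IMPLEMENTATIONS = {"c": _c_parse, "py": _py_parse}

def select_parse(environ=os.environ):
    """Pick the back end; the C version is the default.

    EXAMPLE_DOM_IMPL is an invented name used only for this sketch.
    Unknown values fall back to the default rather than raising.
    """
    choice = environ.get("EXAMPLE_DOM_IMPL", "c").lower()
    return _IMPLEMENTATIONS.get(choice, _c_parse)

# Callers only ever see `parse`; which back end sits behind it is hidden.
parse = select_parse()
```

Because callers go through the one name, dropping one of the back ends later does not change any calling code, which is the property the override relies on.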
Between them, the two implementations have had three different names. The Python version was
called pDomlette through 4Suite 0.12.0a1. Thereafter, it has been called
FtMiniDom. The C version was introduced in 4Suite 0.11.1 and has always been
called cDomlette.
The plan is to drop FtMiniDom after the 1.0 release. This shouldn't matter to
anyone since the APIs don't really expose which implementation is being used,
and the ability to select one or the other was just a convenience for
debugging and to ensure that Domlette would be usable for everyone while the C
version was stabilizing.
See also:
http://4suite.org/docs/timeline.html
http://uche.ogbuji.net/tech/akara/nodes/2003-01-01/domlettes
http://uche.ogbuji.net/tech/akara/nodes/2004-06-19/033124
-Mike