[XML-SIG] The Zen of DOM

Uche Ogbuji uogbuji@fourthought.com
Wed, 12 Apr 2000 17:44:56 -0600


The problem is that the W3C folks threw quite a curve ball with DOM Level 2.  
A Level 2 document has to be created with prior knowledge of the 
document type and document element.  In Level 1, none of this was needed.  
Since we didn't imagine that most users would have prior knowledge of this, we 
decided that it was unlikely that they would be passing in a document to the 
Sax2 handler, so we took the doc parameter out of the initializer.
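
To make the Level 2 requirement concrete, here is a minimal sketch using
Python's standard xml.dom.minidom rather than 4DOM itself (the namespace URI,
element name and DTD file name are made up for illustration).  Note that the
document type and the document element both have to be named before the
document exists:

```python
from xml.dom import minidom

# Hypothetical namespace and names, chosen only for illustration.
INVENTORY_NS = "http://example.com/ns/inventory"

impl = minidom.getDOMImplementation()

# DOM Level 2: the doctype is created first, from the implementation ...
doctype = impl.createDocumentType("inventory", None, "inventory.dtd")

# ... and the document is created together with its document element.
doc = impl.createDocument(INVENTORY_NS, "inv:inventory", doctype)
root = doc.documentElement
```

A caller who cannot supply these values up front has nothing sensible to pass
in, which is the situation described above.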

That said, there is no technical reason to omit it.  If you are in a situation where 
you can construct a proper DOM Level 2 document and pass it to the Sax2 
handler, there is no reason for us not to let you.

I think I'll go ahead and reinstate the "doc" parameter to the Sax2 handler.  
Look for this in the next version, or I can send you a patch if you're 
desperate.

Another approach is that we could have the Sax2 handler accept a document 
factory object which is invoked to create the document.  This might be an 
unnecessarily heavyweight solution, though.
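
A rough sketch of what the factory idea might look like, written against
Python's standard xml.sax and xml.dom.minidom instead of the actual 4DOM Sax2
code.  DomBuilder, default_factory and parse_string are hypothetical names,
not part of 4DOM.  The handler defers document creation until it has seen the
root element, so the factory receives the information DOM Level 2 needs:

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces
from xml.dom import minidom


def default_factory(namespace_uri, qualified_name):
    # Hypothetical default: a plain minidom document with no doctype.
    impl = minidom.getDOMImplementation()
    return impl.createDocument(namespace_uri, qualified_name, None)


class DomBuilder(ContentHandler):
    """Hypothetical namespace-aware handler that builds a DOM tree."""

    def __init__(self, doc_factory=default_factory):
        super().__init__()
        self._doc_factory = doc_factory
        self.document = None
        self._stack = []

    def startElementNS(self, name, qname, attrs):
        uri, local = name
        if qname is None:  # some parsers do not report the raw qname
            qname = local
        if self.document is None:
            # The root element is known only now, so create the document here.
            self.document = self._doc_factory(uri, qname)
            node = self.document.documentElement
        else:
            node = self.document.createElementNS(uri, qname)
            self._stack[-1].appendChild(node)
        for (attr_uri, attr_local), value in attrs.items():
            node.setAttributeNS(attr_uri, attr_local, value)
        self._stack.append(node)

    def endElementNS(self, name, qname):
        self._stack.pop()

    def characters(self, content):
        if self._stack:
            text = self.document.createTextNode(content)
            self._stack[-1].appendChild(text)


def parse_string(text, doc_factory=default_factory):
    handler = DomBuilder(doc_factory)
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(text.encode("utf-8")))
    return handler.document
```

The factory is just a callable taking the root element's namespace URI and
qualified name, so a caller who wants a custom document class can supply one
without the handler knowing anything about it.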

-- 
Uche Ogbuji                               Senior Software Engineer
uche.ogbuji@fourthought.com               (303)583-9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-9036
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


> I've written a couple of applications that do what you describe and for me
> it is the only alternative when it comes to reading in XML data. The 4DOM
> parser has limited support for this sort of thing, I've found. If you use
> the 4DOM Sax parser (actually a DOM parser, found in Ft/Dom/Ext/Reader), you
> can easily specify that you want to use a custom Document instance. Because
> the Document is also a node factory for the resulting DOM, you can subclass
> the default Document class and parse the XML file using your own custom
> document, with overridden methods for things like creating elements.
> Typically, an overridden element factory method (createElement or
> createElementNS) checks the namespace URI and tag name and if a match is
> found, returns some custom object like you describe, otherwise it simply
> calls the same method of its superclass, Document. The custom object
> returned by the element factory method should probably inherit from the
> Element class, but you could of course override its functionality
> arbitrarily.
> 
> Unfortunately, this is not so easy to accomplish using 4DOM, I've found,
> because the Sax parser previously mentioned doesn't seem to include
> namespace support. There is an alternative parser called Sax2 that includes
> NS support, but for some reason I've been unable to figure out, the ability
> to specify a custom Document instance in the Sax2 parser has been removed. I
> had to hack my way around this problem, resulting in a somewhat unstable
> solution (it seems like the semantics of the Sax2 parser are a bit different
> under Unix and Win32). I had to access a 'private' member variable inside
> the Sax2 parser instance in order to forcibly replace the default Document
> instance before parsing. Perhaps someone working on 4DOM can provide some
> insight into a better way, or at least an explanation of why the custom
> element factory feature has been removed in Sax2.
> 
> I have example code, although I cannot access it from where I am right now.
> Get back to me privately if you want me to send it.
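
The custom-Document technique described in the quoted message could be
sketched roughly as follows.  This uses Python's standard xml.dom.minidom
rather than 4DOM, and WIDGET_NS, WidgetElement and CustomDocument are
hypothetical names invented for the illustration:

```python
from xml.dom import minidom

# Hypothetical namespace used only for this illustration.
WIDGET_NS = "http://example.com/ns/widgets"


class WidgetElement(minidom.Element):
    """Custom element returned for w:widget tags."""

    def describe(self):
        return "widget named %s" % self.getAttribute("name")


class CustomDocument(minidom.Document):
    """Document whose element factory checks namespace URI and tag name."""

    def createElementNS(self, namespaceURI, qualifiedName):
        prefix, _, local_name = qualifiedName.rpartition(":")
        if namespaceURI == WIDGET_NS and local_name == "widget":
            element = WidgetElement(qualifiedName, namespaceURI, prefix or None)
            element.ownerDocument = self
            return element
        # No match: fall back to the superclass factory method.
        return super().createElementNS(namespaceURI, qualifiedName)


doc = CustomDocument()
root = doc.createElementNS(WIDGET_NS, "w:panel")
doc.appendChild(root)

widget = doc.createElementNS(WIDGET_NS, "w:widget")
widget.setAttribute("name", "ok-button")
root.appendChild(widget)
```

Any builder that creates its nodes through the document's own factory
methods would pick up WidgetElement automatically, which is why being able to
hand the handler a document instance matters.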