[XML-SIG] Minidom bugs/questions

Uche Ogbuji uche.ogbuji@fourthought.com
Sun, 04 Feb 2001 21:46:35 -0700


> > Converting my app to use minidom was easy enough, but I found out
> > about a bunch of differences between the two DOM implementations.  Some
> > of these are fine with me (e.g. minidom doesn't preserve comments,
> > doesn't prefix its output with "<?xml version="1.0" ?>" when writing
> > XML output,

minidom should be fixed to put out an XML declaration, preferably with the 
encoding.  This is hardly a burden, and is *highly* recommended XML practice.
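
For readers trying this against the xml.dom.minidom that ships with a current
Python 3 (an assumption; it is not the 2001 code under discussion), a minimal
sketch of getting a declaration that names the encoding looks like this:

```python
from xml.dom import minidom

# Parse a tiny document, then serialize it with an explicit encoding.
doc = minidom.parseString("<root><child/></root>")

# With encoding= given, toxml() returns bytes and the XML declaration
# names that encoding.
xml_bytes = doc.toxml(encoding="utf-8")
print(xml_bytes.decode("utf-8"))
```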

> > minidom returns Unicode strings even for ASCII input).

> > 1. The other DOM has a hasAttributes() predicate; minidom is missing
> >    this and I have to use the more expensive form "if node.attributes".
> 
> Right; that's a bug in minidom: hasAttributes was introduced in "DOM
> Level 2".
> 
> The original idea of minidom was that it should be "minimal"; clearly
> that has not worked out, so we probably should review it carefully to
> achieve completeness (with respect to "DOM 2 Core").
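
As a point of reference for item 1, here is a small sketch of the DOM Level 2
predicate next to the "if node.attributes" workaround.  It assumes the
xml.dom.minidom bundled with a current Python 3, which does provide
hasAttributes(); the sample document is made up for illustration.

```python
from xml.dom import minidom

doc = minidom.parseString('<root id="1"><child/></root>')
root = doc.documentElement
child = root.firstChild

# DOM Level 2 predicate: true only if the element carries attributes.
print(root.hasAttributes())
print(child.hasAttributes())

# The workaround from the report: build the NamedNodeMap and test its length.
print(root.attributes.length > 0)
```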

Well, we should think about exactly what makes minidom "mini".  It's debatable 
whether it is possible to implement all of DOM Level 2 Core and still be 
"mini".  And what about DOM Level 3?

> > 4. In minidom, createDocument() leaves doc.documentElement set to None;
> >    in the other DOM, doc.documentElement is initialized to an Element
> >    node created from the second argument to createDocument().  (Again,
> >    according to Fred, the DOM standard requires the latter.)
> 
> That was a surprise to me. After reading the spec and a number of
> implementations, I think the requirement is much stronger: You MUST
> pass a qualifiedName, only the namespaceURI and the doctype are
> optional. 

Yes.  This is a pain, but it is clearly fundamental to the DOM WG conceptual 
model.
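
A minimal sketch of the conforming call, assuming the xml.dom.minidom in a
current Python 3.  The element name "inventory" is only a placeholder.

```python
from xml.dom.minidom import getDOMImplementation

impl = getDOMImplementation()

# namespaceURI and doctype may be None; the qualified name is supplied,
# so the document element is created along with the document.
doc = impl.createDocument(None, "inventory", None)
root = doc.documentElement
print(root.tagName)
```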

> It appears to be a common trick to allow null in createDocument, so
> that the first element found during parsing can be introduced with
> appendChild, but that appears to be non-conforming (somebody please
> correct me if it is).

I think it is non-conforming, even though 4DOM does this.  Mike or Jeremy will probably 
remind me if I'm missing something.  From what I see of the readers, we don't 
need this convenience.
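
For illustration only, this is roughly what the "null" trick looks like.  The
sketch assumes the xml.dom.minidom in a current Python 3, which as far as I can
tell tolerates all three arguments being None; other implementations may
raise instead.

```python
from xml.dom.minidom import getDOMImplementation

impl = getDOMImplementation()

# Non-conforming convenience: no qualified name, so no root element yet.
doc = impl.createDocument(None, None, None)
empty_root = doc.documentElement

# A reader would attach the first element it sees during parsing.
doc.appendChild(doc.createElement("root"))
print(doc.documentElement.tagName)
```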

> I could try to come up with a separate patch for that issue.
> 
> > 5. When writing XML output from a DOM tree that uses namespace
> >    attributes, minidom doesn't insert the proper "xmlns:<tag>=<URI>"
> >    attributes.  The other DOM gets this right.  (This is a bit tricky
> >    to do, although I've figured a good way to do it which I'll gladly
> >    donate to minidom if it's deemed useful.)
> 
> Yes, that is certainly desirable; minidom should support namespaces
> fully.

Of course if it isn't Level 2 compliant, it needn't do so.  I wouldn't 
consider it unreasonable to have minidom be Level 1 only.  If users want Level 2,
they can install PyXML or another implementation.
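
To make item 5 concrete, here is a sketch of the behaviour and a manual
workaround, assuming the xml.dom.minidom in a current Python 3.  The namespace
URI and the "ex" prefix are invented for the example, and this is not the
approach the original poster offered to donate.

```python
from xml.dom import XMLNS_NAMESPACE
from xml.dom.minidom import getDOMImplementation

NS = "http://example.com/ns"  # hypothetical namespace URI

impl = getDOMImplementation()
doc = impl.createDocument(NS, "ex:root", None)
root = doc.documentElement

# The element knows its namespace, but no declaration is serialized.
before = root.toxml()

# Declaring the prefix by hand, as an attribute in the xmlns namespace.
root.setAttributeNS(XMLNS_NAMESPACE, "xmlns:ex", NS)
after = root.toxml()

print(before)
print(after)
```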

> > 6. When writing XML output from a DOM tree that has a default
> >    namespace, minidom writes <:tag>...</:tag> instead of
> >    <tag>...</tag> like the other DOM, and like I would have expected.
> 
> Certainly a bug. When writing out namespace declarations, dealing with
> the default namespace is really tricky (e.g. when a tree that had
> a default namespace is extended with an element with no namespace).

Horrid bug.  Those are not valid QNames under Namespaces in XML (the prefix is empty).
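
The <:tag> output in item 6 is what a serializer produces when it joins an
empty prefix and a local name with a colon unconditionally.  A minimal sketch
of the guard follows; qualified_name is a hypothetical helper, not a minidom
function.

```python
def qualified_name(prefix, local_name):
    """Join prefix and local name, omitting the colon when there is no prefix."""
    # An empty or missing prefix means the element is in the default
    # namespace (or in none), so the tag is just the local name.
    if prefix:
        return f"{prefix}:{local_name}"
    return local_name


print(qualified_name(None, "tag"))
print(qualified_name("ex", "tag"))
```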


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python