[XML-SIG] Setting the DOCTYPE in a new XML DOM

Sylvain Thenault Sylvain.Thenault@logilab.fr
Mon, 14 Jan 2002 10:52:09 +0100 (CET)


On 11 Jan 2002, Douglas Bates wrote:

> System: Debian 3.0 (testing) GNU/Linux with the python2.2 and
>   python2.2-xml packages installed.
> 
> ||/ Name           Version        Description
> +++-==============-==============-============================================
> ii  python2.2      2.2-2          An interactive object-oriented scripting la
> ii  python2.2-xml  0.6.6-7        XML tools for Python (2.2.x)

You should install PyXML 0.7, which contains a number of bug fixes.

> I have been unable to determine how to set the SYSTEM in the doctype
> of a document read by the PyExpat reader.  I am rather new to this so
> it is possible that I am doing something foolish.  I have mostly been
> following demo's and examples as I haven't been able to track down a
> lot of documentation.  A sample program is
> 
> #!/usr/bin/env python2.2
> 
> from xml.dom.ext.reader.PyExpat import Reader
> from xml.dom.ext import PrettyPrint
> 
> if __name__ == "__main__":
>     reader = Reader()
>     doc = reader.fromUri("/tmp/foo1.xml")
>     PrettyPrint(doc)
> 
> The file /tmp/foo1.xml begins
> 
> <?xml version="1.0"?>
> <!DOCTYPE booklist SYSTEM "file:////home/deepayan/python/book.dtd">
> <booklist>
>   <book>
> 
> but the output file begins
> 
> <?xml version='1.0' encoding='UTF-8'?>
> <!DOCTYPE booklist>
> <booklist>
>   <book>
> 
> Can anyone tell me what I do to maintain the SYSTEM designation?

This is a bug in pyexpat. It should work if you use xmlproc instead of
pyexpat to generate your DOM tree.
 
> Also, how do I set the SYSTEM designation when I create a new
> document, say as in
> 
>     def __init__(self, dom = None):
>         '''Initialize from an existing document object model
>         or create a new DOM.
>         '''
>         if dom:
>             self.doc = dom
>             self.vol = dom.getElementsByTagName('volume')[0]
>         else:
>             from xml.dom.DOMImplementation import implementation
>             self.doc = implementation.createDocument(None, None, None)
>             self.vol = self.doc.createElement('volume')
>             self.doc.appendChild(self.vol)

As stated in the DOM Level 2 spec, the only way to attach a doctype to a
document node is at document creation time (the corresponding doctype
attribute of the document is read-only):

from xml.dom import EMPTY_NAMESPACE
from xml.dom.DOMImplementation import implementation
# createDocumentType(qualifiedName, publicId, systemId)
doctype = implementation.createDocumentType('XMI', None, 'uml13.dtd')
doc = implementation.createDocument(EMPTY_NAMESPACE, 'XMI', doctype)
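
For readers without PyXML, the same two DOM Level 2 calls can be tried
with the standard library. This is only a sketch, assuming
xml.dom.minidom's getDOMImplementation() as a stand-in for the PyXML
implementation object used above; the 'XMI' and 'uml13.dtd' values are
simply reused from that example:

```python
from xml.dom.minidom import getDOMImplementation

# Assumption: stdlib minidom stands in for the PyXML implementation object.
impl = getDOMImplementation()

# Arguments are: qualified name, public ID, system ID.
doctype = impl.createDocumentType('XMI', None, 'uml13.dtd')

# The doctype is handed over when the document is created.
doc = impl.createDocument(None, 'XMI', doctype)

print(doc.doctype.systemId)  # system identifier stored on the doctype node
print(doc.toxml())           # serialized form includes the DOCTYPE declaration
```

The doctype node ends up as a child of the document, so it is written
out together with the root element when the document is serialized.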

Note that createDocument returns a document that already has its root
element (the first two arguments of createDocument are the namespaceURI
and the qname of the _root_ element), so the following line

self.doc = implementation.createDocument(None, 'volume', None)  

does the same job as your three lines. Moreover, some Python DOM
implementations accept None for both namespaceURI and qname in
createDocument and then return a document without a root element, while
others do not (minidom, for example).
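
To illustrate the point about the root element, here is a small sketch,
again assuming the standard library's xml.dom.minidom rather than the
PyXML implementation; the 'chapter' child is a hypothetical name added
only for illustration:

```python
from xml.dom.minidom import getDOMImplementation

# Assumption: stdlib minidom stands in for the PyXML implementation object.
impl = getDOMImplementation()

# A single call creates the document together with its <volume> root.
doc = impl.createDocument(None, 'volume', None)

# No createElement/appendChild is needed for the root; just fetch it.
vol = doc.documentElement

# 'chapter' is a hypothetical child element, not from the original post.
vol.appendChild(doc.createElement('chapter'))

print(doc.toxml())
```

Passing the root name to createDocument avoids depending on how a given
implementation treats a None qname.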

Hope that helps.

Regards,

-- 
Sylvain Thenault

  LOGILAB           http://www.logilab.org