[XML-SIG] Building a DOM tree

Jeff.Johnson@icn.siemens.com
Thu, 25 Mar 1999 11:07:23 -0500



[Carsten Oberscheid]
>>Assuming that each Node object can be a member only of one single DOM
>>tree, wouldn't it be possible to replace the _parent_relation member
>>of the document element by one global _parent_relation dictionary on
>>module level?
>>
>>   xml.dom.core._parent_relation == { id(childNode): parentNode, ... }
>>
>>This would make the reference to the document node obsolete.
>>get_parentNode() returns xml.dom.core._parent_relation[id(self)],
>>insertChild() and removeChild() must take care of the global dictionary
>>as they do with the document element's dictionary now.

[A.M. Kuchling]
>    Hmm... hmmm... no, I can't think of any reason that wouldn't
>work.  Nodes can only have a single parent, and you can't mix nodes
>from two different document trees (unless you're Fred Drake), so key
>collisions aren't possible.  That would mean there's a single
>dictionary with lots of keys, testing Python's dictionary code a bit
>more, but dictionaries are supposed to handle that sort of thing, so
>it shouldn't cause any problems.  Shouldn't cause any problems for
>threading, either.  Hmmm...

I don't know that much about the inner workings of Python, so this may be a
dumb question.  How and when would the entries in the global dictionary be
released?  Is removeChild called for all nodes when I drop my last reference
to a DOM document node?  I call appendChild a lot, but I usually don't call
removeChild; I just throw away the whole tree.

It seems to me that if I were to call the following code, I would run out
of memory with the proposed dictionary.

def processAlotOfFiles():
     fr = FileReader()
     while 1:
          dom = fr.readFile('test.xml')

Did I miss something?

Also, I have been bitten several times by the fact that I can't move a
node from one tree to another.  I figured cloneNode() would allow it, but it
doesn't.  Could we come up with a function to move a node (or copy it) from
one tree to another?  One simple example of why I need this is when I
have to break up a large HTML file into smaller files.  One big tree -->
many small trees.  I do it now by writing the HTML, HEAD and BODY tags as
plain text to a file and inserting the output of HtmlLineariser.linearise()
between them.  Not the most elegant solution.  While I'm complaining, is
there a good reason that HtmlWriter closes the file passed to it?  Because
of that, I have to build the HTML string in memory and write it to the file
myself.

def writeStack(self,stack,head,body,fileName):
     f = open(fileName,'w')
     f.write('<HTML>\n')
     util2.writeHtmlNode(head,f)

     # Should copy the body node to get the attributes of the original.
     #for a,v in self.getAttributes()
     #    self.body
     #    attributes
     f.write('<BODY>\n') # Take the easy way for now...

     for node in stack:
          util2.writeHtmlNode(node,f)

     f.write('</BODY>\n')
     f.write('</HTML>\n')
     f.close()

def write_html(document, stream=sys.stdout):
     "Given a DOM document, write the HTML to stream."
     w = HtmlWriter(stream)
     w.write(document)

def writeHtmlNode(node, stream=sys.stdout):
     """HtmlWriter closes the stream, which is not always desirable.  This
     won't, but it is probably slower because it builds a big string."""
     l = HtmlLineariser()
     stream.write(l.linearise(node))
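
For comparison, here is a rough sketch of the kind of copy function I am
asking for.  It is written against xml.dom.minidom from the Python
standard library, not xml.dom.core, and relies on that module's
Document.importNode(), so treat it as an illustration of the idea only.
The name split_body and the chunk_size parameter are made up for the
example.

```python
from xml.dom import minidom

def split_body(source_doc, chunk_size):
    """Copy the element children of <body> into new documents,
    chunk_size elements per document.  The source tree is left intact."""
    body = source_doc.getElementsByTagName('body')[0]
    nodes = [n for n in body.childNodes if n.nodeType == n.ELEMENT_NODE]
    impl = minidom.getDOMImplementation()
    docs = []
    for start in range(0, len(nodes), chunk_size):
        doc = impl.createDocument(None, 'html', None)
        new_body = doc.createElement('body')
        # Carry over the attributes of the original body element.
        for name, value in body.attributes.items():
            new_body.setAttribute(name, value)
        doc.documentElement.appendChild(new_body)
        for node in nodes[start:start + chunk_size]:
            # importNode makes a deep copy owned by the new document.
            new_body.appendChild(doc.importNode(node, True))
        docs.append(doc)
    return docs
```

Each returned document owns its own copies of the nodes, so one big tree
becomes many small trees without going through plain text.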