[XML-SIG] Help with removeChild()

Mike Hammill mike@pdc.kth.se
Thu, 10 May 2001 14:25:41 +0200


Dear xml-sig,

I hope someone with a bit more experience can help me.  I'm trying to use 
xml.dom.minidom to clean up an XML file.  In brief, how does one walk through 
the DOM tree and remove certain children using recursion?  My attempt walks 
the tree, but some children are skipped.  I believe this is because when 
children are removed, the change is not reflected in the caller's list of 
children.  Here is a simplified version of the problem.

XML file:
<slideshow>
<a>
</a>
<b>
</b>
    <c>
    </c>
    <d>
    </d>
<e>
</e>
<f></f>
</slideshow>

I would like to get rid of any element that has no attributes and whose text 
content is just whitespace (spaces, tabs, or linefeeds).  I wrote a little 
tree walker that reduces the above to:

<?xml version="1.0" ?>
<slideshow><a/><b/><c/><d/><e/><f/></slideshow>

So far, so good.  When I apply the following code, however, the result is:
<?xml version="1.0" ?>
<slideshow><b/><d/><f/></slideshow>

That is, only elements a, c, and e are eliminated.  The code is:

def trim_dom_more(node):
    if node.hasChildNodes():
        for child in node.childNodes:
            trim_dom_more(child)
    else:
        if node.nodeType == node.ELEMENT_NODE:
            if (not node.hasAttributes()) and (not node.hasChildNodes()):
                node.parentNode.removeChild(node)

I think I understand that the problem is that node.childNodes gets evaluated 
and put on the stack, but then after the removeChild call, this stacked list 
is not re-evaluated, so not all children are iterated through.  But how to 
solve that?
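
One possible way around this, offered as a minimal sketch rather than a 
confirmed answer: in the standard library's xml.dom.minidom, childNodes is a 
live list, so removing a child while the for loop is running shifts the 
remaining children down by one and the loop steps over the next sibling.  
That would explain why b, d, and f survive.  Iterating over a copy of the 
list should avoid the skipping.  The function name trim_dom_fixed and the 
sample string below are made up for illustration.

```python
from xml.dom import minidom


def trim_dom_fixed(node):
    """Remove leaf elements that have no attributes (hypothetical helper)."""
    if node.hasChildNodes():
        # list(...) takes a snapshot, so removals made during the recursion
        # do not change the sequence this loop is iterating over.
        for child in list(node.childNodes):
            trim_dom_fixed(child)
    elif node.nodeType == node.ELEMENT_NODE and not node.hasAttributes():
        node.parentNode.removeChild(node)


# Sample input: the already whitespace-trimmed document from the message.
doc = minidom.parseString("<slideshow><a/><b/><c/><d/><e/><f/></slideshow>")
trim_dom_fixed(doc)
print(doc.documentElement.toxml())  # prints <slideshow/>
```

Like the original, this only removes elements that are already empty when 
they are visited, so a parent that becomes empty after its children are 
removed is left in place.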

Any advice welcome!
Thanks
Mike