Deepcopy on XML node in python2.2 problem

sag at hydrosphere.com
Mon Sep 8 15:38:38 EDT 2003


Pythonistas,

I have been using python 2.1 on a windows platform to develop some 
xml processing classes.  For some input xml, I want to take out parts 
of it, convert them to a text representation, and put the text back 
in as a single 'text' node.  I am using _xmlplus.dom.minidom.  

I am now moving the code to a Red Hat Linux machine with python 2.2, 
and I am getting an error that doesn't occur with the exact same code 
on the python 2.1 machine.  See below for output from the two 
platforms.  

Basically, I locate the base of the subnode tree that I want to 
convert and do a deep copy of it.  I then append that copy to a new 
xml object and get the string version of it back.  The problem seems 
to be in deepcopy.  On windows, it returns a copy of the subnode tree 
whose parent is a DOM element with the children I would expect from 
the copy.  On python2.2, the parent is a text node and the children 
are completely different.  If I don't use deepcopy, parts of the 
original tree are removed when I do an appendChild call on the new 
subnode xml object.  
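
For reference, the removal seen without deepcopy is ordinary DOM 
behaviour: a node has only one parent, so appendChild moves it out of 
the tree it came from.  Below is a minimal sketch of that, next to a 
copy made with the DOM's own cloneNode(True) in place of 
copy.deepcopy.  It assumes the standard-library xml.dom.minidom on a 
current Python 3, not the _xmlplus package used in this post, so it 
neither reproduces nor diagnoses the python 2.2 failure.  The helper 
names (move_children, copy_children, demo) are made up for 
illustration.

```python
from xml.dom.minidom import parseString

XML = "<PythonCall><arg><function>Aggregate</function></arg></PythonCall>"


def move_children(arg, target_root):
    # appendChild detaches each node from <arg>, so iterate over a snapshot.
    for node in list(arg.childNodes):
        target_root.appendChild(node)


def copy_children(arg, target_root):
    # cloneNode(True) returns a deep copy with no parent; <arg> is untouched.
    for node in arg.childNodes:
        target_root.appendChild(node.cloneNode(True))


def demo(transfer):
    # Build a fresh source tree and an empty <Sub/> target for each run.
    src = parseString(XML)
    arg = src.documentElement.firstChild
    sub = parseString("<Sub/>")
    transfer(arg, sub.documentElement)
    return src.documentElement.toxml(), sub.documentElement.toxml()


if __name__ == "__main__":
    print(demo(move_children))
    print(demo(copy_children))
```

With move_children the source <arg> ends up empty; with copy_children 
the source tree is unchanged and <Sub> holds an independent copy.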

I am a very intermediate python programmer and I don't know enough to 
try to chase down what's going on in deepcopy.  Does anyone who is a 
guru have any ideas on what might be causing this?  Is it a 
difference between windows/linux or between python2.1 and 2.2?  

My goal is to replace a subnode tree with a single text node that 
holds the whole subtree serialized as an xml text string.  
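
One way this goal could be sketched, again assuming the 
standard-library xml.dom.minidom on a current Python 3 rather than 
_xmlplus: serialize each child with toxml(), remove the children, and 
append one text node holding the result.  collapse_to_text is a 
hypothetical helper name, not part of minidom or of the classes 
described here.

```python
from xml.dom.minidom import parseString


def collapse_to_text(doc, element):
    """Replace element's children with one text node holding their XML."""
    # Serialize first, while the children are still attached.
    text = "".join(child.toxml() for child in element.childNodes)
    # removeChild mutates childNodes, so iterate over a snapshot.
    for child in list(element.childNodes):
        element.removeChild(child)
    element.appendChild(doc.createTextNode(text))
    return text


doc = parseString(
    "<PythonCall><function>TransformDS</function>"
    "<arg><function>Aggregate</function></arg></PythonCall>"
)
arg = doc.getElementsByTagName("arg")[0]
sub_text = collapse_to_text(doc, arg)
```

No copy is needed in this variant because the subtree is serialized 
in place.  When the document is written out again, the markup 
characters inside the new text node are escaped.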

See below for a somewhat abbreviated case that shows the difference.  
If I run this code on python 2.1 on windows, it runs.  If I run it on 
python 2.2 on linux, it fails.  

sue 

++++++++++++++++++++++++++++++++++++
Code example:
# test deepcopy using minidom xml
# the point of all this is to carve out subsections of an xml
# and convert it to text that can stand alone as xml
# this is a small part of a set of classes for special xml handling
from _xmlplus.dom.minidom import parseString
from _xmlplus.dom import minidom
import copy
argName = 'arg'
def _convert2Text(child):
    xmlText = "<Sub/>"
    temp = parseString(xmlText)
    root = temp.documentElement
    for subNode in child.childNodes:
        cp = copy.deepcopy(subNode)
        print 'copy', cp
        print 'parent of copy', cp.parentNode
        print 'parent children', cp.parentNode.childNodes
        root.appendChild(cp)
    subText = temp.toxml()
    subText = subText.encode("utf-8")
    return subText
def _parseNode(node):
    results = {}
    if node.nodeType == node.ELEMENT_NODE:
        name = str(node.nodeName)
        attr = node.getAttribute('t')
        if not node.hasChildNodes():
            results[name] = ''
        elif (len(node.childNodes) == 1 and
              node.childNodes[0].nodeType == node.TEXT_NODE):
            value = str(node.childNodes[0].nodeValue)
            results[name] = value, attr
        else:
            for n in node.childNodes:
                results.update( _parseNode(n))
    else:
        pass
    return results
# convert it to modified version
def getArgs():
    root = _xmlDOM.documentElement
    argList = []
    for child in root.childNodes:
        if child.nodeName == argName:
            nodeValues = _parseNode(child)
            if len(nodeValues) == 1:
                try:
                    argList.append(nodeValues[argName])
                except KeyError:
                    subText = _convert2Text(child)
                    argList.append(subText)
            else:
                subText = _convert2Text(child)
                argList.append(subText)
    return argList

if __name__ == '__main__':
    print '+' * 20
    fn = r"<function>Aggregate</function>"
    kwds = r'<keywords><timeStep>1</timeStep></keywords>'
    kwds = ""
    tfn = fn + kwds
    fn = r"<function>TransformDS</function>"
    args = r'<arg>%s</arg>' % tfn
    f = "<PythonCall>" + fn + args + "</PythonCall>"
    xmlText = "%s%s" % (r'<?xml version="1.0"?>', f)
    _xmlDOM = parseString(xmlText)
    print _xmlDOM.toxml()
    args = getArgs()
    print 'after convert'
    print args
# end code

=======================
Results:
Output from python 2.1 on windows
+++++++++++++++++++++++
<?xml version="1.0" ?>
<PythonCall><function>TransformDS</function><arg><function>Aggregate</function></arg></PythonCall>
copy <DOM Element: function at 20644036>
parent of copy <DOM Element: arg at 20002972>
parent children [<DOM Element: function at 20644036>]
after convert
['<?xml version="1.0" ?>\n<Sub><function>Aggregate</function></Sub>']

++++++++++++++++++++++++
Output from python 2.2 on red hat linux
++++++++++++++++++++++++
<?xml version="1.0" ?>
<PythonCall><function>TransformDS</function><arg><function>Aggregate</function></arg></PythonCall>
copy <DOM Element: function at 141868116>
parent of copy <DOM Element: arg at 142546132>
parent children [<DOM Text node "TransformD...">]
Traceback (most recent call last):
  File "/home/sag/testBad.py", line 23, in _convert2Text
    root.appendChild(cp)
  File "/usr/lib/python2.2/site-packages/_xmlplus/dom/minidom.py", line 171, in appendChild
    node.parentNode.removeChild(node)
  File "/usr/lib/python2.2/site-packages/_xmlplus/dom/minidom.py", line 212, in removeChild
    self.childNodes.remove(oldChild)
ValueError: list.remove(x): x not in list





