modifying XML documents

Nicola Paolucci durdn at yahoo.it.oops!.invalid
Sat May 3 11:05:38 EDT 2003


Hi Alessio,

On Sat, 03 May 2003 16:42:58 +0200, Alessio Pace <puccio_13 at yahoo.it>
wrote:

>Hi, 
>I use xml.minidom to parse XML documents, and it works fine. 
>I am now trying to actually modify the XML document that I read. 
>All I need, at the moment, is to make a substitution of a subtree rooted at
>one element of the xml document with a new subtree (whose element name and
>structure obviously has to reflect the dtd declarations as the element to
>be substituted). I tried simply to substitute the NodeList with the new
>one, but I saw that attribute is read-only and while I see method to create
>elements, I don't see anything to *remove* something.. :-( How should I do
>so?

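If the XML manipulation is not overly complex, have a look at a really
useful class called xml.sax.saxutils.XMLGenerator. It is a SAX content
handler that writes the events it receives back out as XML, so you can
subclass it and override only the events you want to change.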

Here is an example of using this class to do some really quick
renaming of tags. It is taken from Alex Martelli's book
'Python in a Nutshell', chapter 23.

import xml.sax, xml.sax.saxutils

def tagrenamer(infile, outfile, renaming_dict):
    base = xml.sax.saxutils.XMLGenerator

    class Renamer(base):
        def rename(self, name):
            return renaming_dict.get(name, name)
        def startElement(self, name, attrs):
            base.startElement(self, self.rename(name),
                              attrs)
        def endElement(self, name):
            base.endElement(self, self.rename(name))

    xml.sax.parse(infile, Renamer(outfile))
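
To make the calling convention concrete, here is a small usage sketch. It
repeats tagrenamer from above so that it runs on its own, and it assumes
Python 3 with in-memory streams standing in for real files; the sample
document and the names source, result and renamed are made up for
illustration.

```python
import io
import xml.sax
import xml.sax.saxutils

def tagrenamer(infile, outfile, renaming_dict):
    base = xml.sax.saxutils.XMLGenerator

    class Renamer(base):
        def rename(self, name):
            # Fall back to the original name when no renaming is given.
            return renaming_dict.get(name, name)
        def startElement(self, name, attrs):
            base.startElement(self, self.rename(name), attrs)
        def endElement(self, name):
            base.endElement(self, self.rename(name))

    xml.sax.parse(infile, Renamer(outfile))

# Hypothetical input: any file-like object (or filename) can be parsed.
source = io.BytesIO(b"<doc><old>text</old></doc>")
result = io.StringIO()
tagrenamer(source, result, {"old": "new"})
renamed = result.getvalue()
print(renamed)
```

Elements whose names are not in the dictionary pass through unchanged,
and XMLGenerator adds an XML declaration at the top of the output.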

Here is another example that adds some attributes to an XML document
(snippet extracted from full source, might not work right away):

import xml.sax, xml.sax.saxutils
import sys, os, random

base = xml.sax.saxutils.XMLGenerator

class CodingStandardsAdder(base):
    """Extends XMLGenerator to add a coding standard attribute to the
    xml report."""
    def __init__(self,outfile,root,module):
        base.__init__(self,outfile)
        self.entry = ''
        self.catch = 0
        self.root = root
        self.module = module
        self.currentEntry = []
        
    def checkCodingStandard(self, entry):
        """Actual function that performs a check on an entry in CVS."""
        # In this snippet a random verdict stands in for the real check.
        return random.choice(['true','false','untested'])
        
    def characters(self,content):
        if self.catch and not content.isspace():
            self.entry = content
            self.currentEntry.append(content)
        else:
            base.characters(self,content)
        
    def startElement(self, name, attrs):
        if name in ('changed','new','removed'): 
            self.catch = 1
            self.currentEntry.append(name)
            self.currentEntry.append(attrs)
        else:
            base.startElement(self, name, attrs)
        
    def endElement(self, name):
        if name in ('changed','new','removed'): 
            na = dict(self.currentEntry[1])
            na['codingstandards-compliant'] = self.checkCodingStandard(
                self.entry.strip())
            # Wrap the plain dict so the generator gets a SAX attributes object.
            newattr = xml.sax.xmlreader.AttributesImpl(na)
            base.startElement(self, self.currentEntry[0], newattr)
            base.characters(self,self.currentEntry[2])
            self.catch = 0
            self.currentEntry = []
        base.endElement(self, name)
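
The same approach can probably cover the substitution you describe. What
follows is only a minimal sketch, assuming Python 3: the class names
SubtreeReplacer and _Forwarder and the sample documents are hypothetical,
not taken from the book or from my own source. It swallows every event
inside the element to be replaced and, when that element closes, replays a
replacement fragment through the generator. Making sure the replacement
matches your DTD is still up to you.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLGenerator

class _Forwarder(ContentHandler):
    """Replays element and text events into an existing XMLGenerator,
    skipping startDocument so no second XML declaration is written."""
    def __init__(self, generator):
        ContentHandler.__init__(self)
        self.generator = generator
    def startElement(self, name, attrs):
        XMLGenerator.startElement(self.generator, name, attrs)
    def endElement(self, name):
        XMLGenerator.endElement(self.generator, name)
    def characters(self, content):
        XMLGenerator.characters(self.generator, content)

class SubtreeReplacer(XMLGenerator):
    """Replaces every subtree rooted at `target` with `replacement_xml`."""
    def __init__(self, outfile, target, replacement_xml):
        XMLGenerator.__init__(self, outfile)
        self.target = target
        self.replacement_xml = replacement_xml
        self.depth = 0  # greater than 0 while inside a dropped subtree

    def startElement(self, name, attrs):
        if self.depth or name == self.target:
            self.depth += 1          # swallow the element
        else:
            XMLGenerator.startElement(self, name, attrs)

    def characters(self, content):
        if not self.depth:
            XMLGenerator.characters(self, content)

    def endElement(self, name):
        if self.depth:
            self.depth -= 1
            if self.depth == 0:
                # The old subtree just closed: emit the new one in its place.
                xml.sax.parseString(self.replacement_xml, _Forwarder(self))
        else:
            XMLGenerator.endElement(self, name)

source = io.BytesIO(b"<doc><keep>a</keep><old><x>1</x></old></doc>")
result = io.StringIO()
xml.sax.parse(source, SubtreeReplacer(result, "old", b"<new><y>2</y></new>"))
replaced = result.getvalue()
print(replaced)
```

Counting depth instead of using a simple flag means that elements nested
inside the dropped subtree, including ones with the same name as the
target, are skipped correctly.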



Best regards,
	Nicola Paolucci

-- 
#Remove .oops!.invalid to email or feed to Python:
'Tmljb2xhIFBhb2x1Y2NpIDxuaWNrQG5vdGp1c3RjYy5jb20+'.decode('base64')




