modifying XML documents
Nicola Paolucci
durdn at yahoo.it.oops!.invalid
Sat May 3 11:05:38 EDT 2003
Hi Alessio,
On Sat, 03 May 2003 16:42:58 +0200, Alessio Pace <puccio_13 at yahoo.it>
wrote:
>Hi,
>I use xml.minidom to parse XML documents, and it works fine.
>I am now trying to actually modify the XML document that I read.
>All I need, at the moment, is to make a substitution of a subtree rooted at
>one element of the xml document with a new subtree (whose element name and
>structure obviously has to reflect the dtd declarations as the element to
>be substituted). I tried simply to substitute the NodeList with the new
>one, but I saw that attribute is read-only and while I see method to create
>elements, I don't see anything to *remove* something.. :-( How should I do
>so?
If the XML manipulation is not overly complex, have a look at a
really useful class called xml.sax.saxutils.XMLGenerator.
I copy here an example of using this class to do some really quick
renaming of tags. The example is taken from Alex Martelli's book
'Python in a Nutshell', chapter 23.
import xml.sax, xml.sax.saxutils

def tagrenamer(infile, outfile, renaming_dict):
    base = xml.sax.saxutils.XMLGenerator
    class Renamer(base):
        def rename(self, name):
            # fall back to the original name when no renaming is given
            return renaming_dict.get(name, name)
        def startElement(self, name, attrs):
            base.startElement(self, self.rename(name), attrs)
        def endElement(self, name):
            base.endElement(self, self.rename(name))
    xml.sax.parse(infile, Renamer(outfile))
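A quick way to try the renamer is to feed it in-memory streams instead of files. This is only a sketch: the sample document and the item -> entry mapping are made up for illustration, and it assumes a Python 3 interpreter (io.BytesIO for the input, io.StringIO for the output).

```python
import io
import xml.sax
import xml.sax.saxutils

def tagrenamer(infile, outfile, renaming_dict):
    base = xml.sax.saxutils.XMLGenerator
    class Renamer(base):
        def rename(self, name):
            return renaming_dict.get(name, name)
        def startElement(self, name, attrs):
            base.startElement(self, self.rename(name), attrs)
        def endElement(self, name):
            base.endElement(self, self.rename(name))
    xml.sax.parse(infile, Renamer(outfile))

# Hypothetical input document, kept in memory for the demo.
source = io.BytesIO(b"<doc><item>hi</item></doc>")
result = io.StringIO()

# Rename every <item> element to <entry>; other tags pass through unchanged.
tagrenamer(source, result, {"item": "entry"})
renamed = result.getvalue()
print(renamed)
```

The generator also writes an XML declaration at the top of the output, so the result is a complete document and not just the renamed fragment.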
Here is another example that adds some attributes to an XML document
(snippet extracted from full source, might not work right away):
import random
import xml.sax, xml.sax.saxutils, xml.sax.xmlreader

base = xml.sax.saxutils.XMLGenerator

class CodingStandardsAdder(base):
    """ Extends XMLGenerator to add a coding standard attribute to the
    xml report """
    def __init__(self, outfile, root, module):
        base.__init__(self, outfile)
        self.entry = ''
        self.catch = 0
        self.root = root
        self.module = module
        self.currentEntry = []
    def checkCodingStandard(self, entry):
        """ Actual function that performs a check on an entry in CVS
        (placeholder: returns a random result) """
        return random.choice(['true', 'false', 'untested'])
    def characters(self, content):
        if self.catch and not content.isspace():
            # hold back the text until the start tag has been written
            self.entry = content
            self.currentEntry.append(content)
        else:
            base.characters(self, content)
    def startElement(self, name, attrs):
        if name in ('changed', 'new', 'removed'):
            # delay output: the extra attribute depends on the element text
            self.catch = 1
            self.currentEntry.append(name)
            self.currentEntry.append(attrs)
        else:
            base.startElement(self, name, attrs)
    def endElement(self, name):
        if name in ('changed', 'new', 'removed'):
            na = dict(self.currentEntry[1])
            na['codingstandards-compliant'] = \
                self.checkCodingStandard(self.entry.strip())
            newattr = xml.sax.xmlreader.AttributesImpl(na)
            base.startElement(self, self.currentEntry[0], newattr)
            if len(self.currentEntry) > 2:
                # only present when the element had non-blank text
                base.characters(self, self.currentEntry[2])
            self.catch = 0
            self.currentEntry = []
        base.endElement(self, name)
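Coming back to the question itself: if you would rather stay with minidom, every node has replaceChild() and removeChild() methods that you call on the parent, so you do not need to assign to childNodes at all. A minimal sketch, assuming Python 3 and a made-up document (the element names old, new and keep are purely illustrative):

```python
from xml.dom import minidom

# Hypothetical document used only for this demo.
doc = minidom.parseString("<root><old><a/></old><keep/></root>")

# Build the replacement subtree with the document's factory methods.
new = doc.createElement("new")
new.appendChild(doc.createTextNode("replacement"))

# Swap the whole <old> subtree for <new>; the call goes through the parent.
old = doc.getElementsByTagName("old")[0]
old.parentNode.replaceChild(new, old)

# Plain removal works the same way.
keep = doc.getElementsByTagName("keep")[0]
keep.parentNode.removeChild(keep)

result = doc.documentElement.toxml()
print(result)
```

Note that minidom does not validate against the DTD, so it is up to you to make sure the new subtree matches the declarations.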
Best regards,
Nicola Paolucci
--
#Remove .oops!.invalid to email or feed to Python:
'Tmljb2xhIFBhb2x1Y2NpIDxuaWNrQG5vdGp1c3RjYy5jb20+'.decode('base64')