help regarding xml parser modules

Vinay Aiya aiya.vinay at gmail.com
Tue Mar 11 14:07:27 EDT 2008


Hello,

Can anyone help with an error in the following code?
I want to map each element name to the data between its start and end
tags, but I am unable to do so.

Here is the code I am trying. Please reply if I am not on the right
track.

import xml.sax.handler

class BookHandler(xml.sax.handler.ContentHandler):
    def __init__(self):
        self.inTitle1 = 0
        self.inTitle2 = 0
        self.mapping1 = {}
        self.mapping2 = {}

    def startElement(self, name, attributes="NULL"):
        #attributes="None"
        if name == "emph3":
            self.buffer1 = ""
            self.inTitle1 = 1

        #  self.id = attributes["None"]
        elif name == "year":
            self.buffer2 = ""
            self.inTitle2 = 1

        def characters(self,data):
            if self.inTitle1 == 1:
                self.buffer1 += data
            elif self.inTitle2 == 1:
                self.buffer2 += data

        def endElement(self,name):
            if name == "year":
                self.inTitle2 = 0
                self.mapping2[self.name] = self.buffer2
            elif name =="emph3":
                self.inTitle1 =0
                self.mapping1[self.name] = self.buffer1


This is an example of the XML file:

#<s>
#
#    <emph3>Jose Joaquin Avila</emph3>
#    <year>1929</year>
#
#    <emph3>Yiye Avila</emph3>
#    <year>1941</year>
#
#</s>



This is the main file:

import xml.sax
import try1
import pprint
parser = xml.sax.make_parser()
handler = try1.BookHandler()
parser.setContentHandler(handler)
parser.parse("tp.xml")
pprint.pprint(handler.mapping1)
pprint.pprint(handler.mapping2)
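
For comparison, here is a minimal sketch of how the handler might look once two things in the code above are changed. First, characters() and endElement() are indented inside startElement(), so they are only local functions; the parser calls the inherited do-nothing versions instead and both mappings stay empty. Second, self.name is never assigned, so the sketch uses the name argument that endElement() receives. Storing a list of values per element name is an assumption made here because emph3 and year repeat in the sample file, and a plain name-to-string dict would keep only the last value. The inline SAMPLE document and the use of xml.sax.parseString are likewise only there so the sketch runs without tp.xml.

```python
import xml.sax
import xml.sax.handler


class BookHandler(xml.sax.handler.ContentHandler):
    def __init__(self):
        xml.sax.handler.ContentHandler.__init__(self)
        self.inTitle1 = 0
        self.inTitle2 = 0
        self.buffer1 = ""
        self.buffer2 = ""
        # Assumed layout: element name -> list of text values, in document order.
        self.mapping1 = {}
        self.mapping2 = {}

    def startElement(self, name, attributes):
        # Reset the matching buffer when an element of interest opens.
        if name == "emph3":
            self.buffer1 = ""
            self.inTitle1 = 1
        elif name == "year":
            self.buffer2 = ""
            self.inTitle2 = 1

    # Defined at class level so the parser actually calls it.
    def characters(self, data):
        # May be called more than once per element, so accumulate.
        if self.inTitle1 == 1:
            self.buffer1 += data
        elif self.inTitle2 == 1:
            self.buffer2 += data

    # Defined at class level; uses the 'name' argument, not self.name.
    def endElement(self, name):
        if name == "year":
            self.inTitle2 = 0
            self.mapping2.setdefault(name, []).append(self.buffer2)
        elif name == "emph3":
            self.inTitle1 = 0
            self.mapping1.setdefault(name, []).append(self.buffer1)


# Inline copy of the sample document (stand-in for tp.xml).
SAMPLE = b"""<s>
    <emph3>Jose Joaquin Avila</emph3>
    <year>1929</year>
    <emph3>Yiye Avila</emph3>
    <year>1941</year>
</s>"""

handler = BookHandler()
xml.sax.parseString(SAMPLE, handler)
print(handler.mapping1)
print(handler.mapping2)
```

With the handler saved as try1.py, the main file above should work unchanged, since it only relies on handler.mapping1 and handler.mapping2.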