Expat XML Parser

Richard Boardman rpb at soton.ac.uk
Wed Nov 28 09:42:52 EST 2001


Hello,

I've only been using Python for about a week or so, having previously used
Tcl for all my scripting needs. However, on the recommendation of a friend,
I have forayed into the world of Python.

I am trying to use the Expat XML parser to extract some data for
processing - sounds simple enough - although I am having great difficulty
finding documentation for Expat and am consequently very stuck.

Here is the code I have, mostly gleaned from the one or two examples I
managed to find for Expat. After taking out some of the extraneous bits, my
problem reads thus:

# read in XML for processing

import xml.parsers.expat

filetoread = "xyz.xml"
sourcexml = open(filetoread, 'r')
xmlpp = sourcexml.read()

def s_el(name, attrs):
    print name, "(attributes -->", attrs, ")"
def e_el(name):
    print "   (element", name, "ends)"
def c_data(data):
    print "    `--->", data

prs = xml.parsers.expat.ParserCreate()
prs.StartElementHandler = s_el
prs.EndElementHandler = e_el
prs.CharacterDataHandler = c_data
prs.returns_unicode = 0
prs.Parse(xmlpp)

# end

The major problem is that I can't seem to return any of these values at
all - they will all print on the screen, but I can't actually *do* anything
with these values. I don't think it's anything to do with Expat; more my
lack of experience with this language. I can't find any documentation
explaining how Expat works.

What I'd like to do is have something that works thus:

readInXML
foreach element in XML
    if element = "abcdefg" {
        getCharacterData
        doStuff with characterData
    }

... so in the following ...

<data set="1">1 2 3 4</data>

... I could check that the element name is 'data', return the value of its
'set' attribute ('1') and then return the block of data in, say, a list.
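
To make the goal concrete, here is a rough sketch of the shape I have in
mind. It is a guess rather than a known-good recipe: it assumes the handlers
can be methods of an object that stores what it sees instead of printing it.
The names DataCollector and collect are made up for illustration and are not
part of Expat; only ParserCreate, Parse and the three handler attributes come
from the examples I found.

```python
import xml.parsers.expat


class DataCollector:
    """Hypothetical helper: remembers matching elements instead of printing."""

    def __init__(self, wanted):
        self.wanted = wanted    # element name to look for, e.g. "data"
        self.inside = False     # True while between the start and end tag
        self.attrs = None       # attributes of the element being read
        self.chunks = []        # pieces of character data seen so far
        self.results = []       # one (attributes, values) pair per element

    def start(self, name, attrs):
        if name == self.wanted:
            self.inside = True
            self.attrs = attrs
            self.chunks = []

    def chars(self, data):
        # The text of one element may arrive in several calls, so the
        # pieces are accumulated and joined when the element ends.
        if self.inside:
            self.chunks.append(data)

    def end(self, name):
        if name == self.wanted and self.inside:
            text = "".join(self.chunks)
            self.results.append((self.attrs, text.split()))
            self.inside = False


def collect(xmltext, wanted):
    """Parse xmltext and return the data gathered for every 'wanted' element."""
    collector = DataCollector(wanted)
    prs = xml.parsers.expat.ParserCreate()
    prs.StartElementHandler = collector.start
    prs.EndElementHandler = collector.end
    prs.CharacterDataHandler = collector.chars
    prs.Parse(xmltext, True)    # True marks this as the final chunk of input
    return collector.results


results = collect('<data set="1">1 2 3 4</data>', "data")
```

If this works the way I hope, results would hold one entry for the line
above: the attributes (set is '1') together with the list of the four
values as strings. The sketch does not try to cope with a 'data' element
nested inside another 'data' element.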

A few pointers/websites/ideas/pieces of code or whatever would be *greatly*
appreciated.

Thanks very much in advance.

--

Regards,

Richard Boardman
High Performance Computing Group
Level 3 Zepler Building (59)
University of Southampton
Southampton
SO17 1BJ

e-mail: rpb01r at ecs.soton.ac.uk




