Python and XML help
Tom Bryan
tbryan at python.net
Sat Jul 27 11:05:26 EDT 2002
Mathieu wrote:
> I'm new to Python and having some problems parsing the following XML :
Suggestion:
You'll get more responses if the code you post either runs or fails only
with the errors you're asking about. If you're hitting an exception,
include the Python traceback in your post. Often, we'll then be able to
answer your question without even running your code.
> <properties name="spam">
> <prop>
> <first>
> <item name="foo">
This document isn't well-formed XML: every element needs either a closing
tag or the empty-element form. It should be
<item name="foo"></item> or
<item name="foo" />
> <item name="foo2">
> </first>
>
> <second>
> <item name="bar">
> <item name="bar2">
> </second>
> </prop>
You also need a closing properties tag:
</properties>
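Putting those fixes together, a quick way to confirm the result is
well-formed is to hand it to a parser. Here's a rough sketch, assuming a
newer Python than 1.5, where xml.parsers.expat ships in the standard
library. The names WELL_FORMED, BROKEN and is_well_formed are mine, made up
for illustration, and BROKEN is a cut-down copy of your fragment.

```python
import xml.parsers.expat

# The posted document with every item closed and </properties> added.
WELL_FORMED = """<properties name="spam">
  <prop>
    <first>
      <item name="foo" />
      <item name="foo2" />
    </first>
    <second>
      <item name="bar" />
      <item name="bar2" />
    </second>
  </prop>
</properties>"""

# Cut-down copy of the original: unclosed items, no closing properties tag.
BROKEN = """<properties name="spam">
  <prop>
    <first>
      <item name="foo">
      <item name="foo2">
    </first>
  </prop>"""


def is_well_formed(text):
    """Return True if expat can parse text as a complete document."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True
```

Calling is_well_formed(WELL_FORMED) should succeed, while the BROKEN string
trips over the mismatched </first> tag.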
> if __name__ == "__main__":
> p=myparser()
I'm assuming you left out something like
data = open( "properties.xml" ).read()
> p.feed(data)
> p.close()
> I would like the script to write only the "name" variables that are
> within the "first" brackets and ignore the others. Please note that my
> code must stay compatible with Python 1.5 and that this example might
> not work or contain errors, because it is a simplified version of the
> code I have written.
I'm more familiar with SAX parsers in Java, but Python's xmllib.XMLParser
looks similar. Generally, I would say that you should keep a stack of the
element names that your parser has seen. Then you can peek at the stack to
see where you are. XMLParser seems to maintain such a stack itself, but I'm
not sure whether you're really supposed to use it. That is, perhaps it
isn't guaranteed to exist in future versions. Look at xmllib.py in your
Python distribution and see what the comments on XMLParser say.
Anyway, using the stack inherited from XMLParser, here's how I would do what
you're describing:
    def start_item(self, attrs):
        # Look for "item" elements only within "first" elements
        # and print the value of the "name" attribute
        if self.stack[-1][0] == 'item' and self.stack[-2][0] == 'first':
            print attrs['name']
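
If you'd rather not depend on XMLParser's internal stack, the same idea
works with a stack you maintain yourself. The sketch below is only an
illustration of that approach, not a drop-in for your code: it assumes a
newer Python with xml.sax in the standard library (so it is not 1.5
compatible), and FirstItemHandler, first_item_names and DOCUMENT are
hypothetical names of mine.

```python
import xml.sax


class FirstItemHandler(xml.sax.ContentHandler):
    """Collect the "name" attribute of each item directly inside first."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.stack = []  # names of the elements that are currently open
        self.names = []  # "name" values found under <first>

    def startElement(self, name, attrs):
        # Peek at the parent element before pushing the new one.
        if name == "item" and self.stack and self.stack[-1] == "first":
            self.names.append(attrs.get("name"))
        self.stack.append(name)

    def endElement(self, name):
        # Leaving an element: drop it from the stack.
        self.stack.pop()


# A well-formed version of the posted document (illustrative data).
DOCUMENT = b"""<properties name="spam">
  <prop>
    <first><item name="foo" /><item name="foo2" /></first>
    <second><item name="bar" /><item name="bar2" /></second>
  </prop>
</properties>"""


def first_item_names(data):
    """Parse data and return the item names found inside <first>."""
    handler = FirstItemHandler()
    xml.sax.parseString(data, handler)
    return handler.names


print(first_item_names(DOCUMENT))
```

Because the handler owns its stack, nothing here relies on undocumented
parser attributes; the items under "second" are skipped simply because
their parent on the stack isn't "first".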
---Tom