a few more questions on XML and python

Fri Jan 4 06:31:03 EST 2002

Rajarshi Guha <rxg218 at psu.edu> wrote:

> Where can I look up the methods available and the actual tree structure? I 
> tried reading the actual .py file but I got all mixed up - is there any 
> reference to the available methods?

  The standard library documentation, section 13.6 "xml.dom" describes
the library's DOM support and has links to the W3C DOM specification. 
In particular, 13.6.2 "Objects in the DOM" is probably what you're
looking for when you say "reference to the available methods":
http://python.org/doc/current/lib/node438.html
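
  If you would rather poke at the objects than read about them, plain
introspection with dir() gets you a rough method list for any node. The
sketch below is only an illustration: publicNames is a made-up helper,
not part of xml.dom, and the one-line document is a throwaway. It is
written to run on both older and current interpreters.

```python
import xml.dom.minidom as dom

# A tiny throwaway document; any well-formed XML would do here.
doc = dom.parseString('<book edition="1"><title>Example</title></book>')
root = doc.documentElement

def publicNames(obj):
    """Return the attribute and method names of obj that do not start with '_'."""
    return [name for name in dir(obj) if not name.startswith('_')]

# Hypothetical helper applied to the root element: one name per line.
names = publicNames(root)
print('\n'.join(names))
```

  The list mixes methods (getAttribute, getElementsByTagName, ...) with
plain attributes (childNodes, tagName, ...), so the library reference is
still the place to find out what each one does.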

> I do have another question - how can I traverse the tree and see what 
> available tags got read in?

------------------------------------------------------------------
try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO         # Python 3
import xml.dom.minidom as dom

theXML = """<?xml version="1.0"?>
<book edition="1">
    <title>The Fascist Menace</title>
    <authors>
        <author>
            <name_first>Josef</name_first>
            <name_last>Stalin</name_last>
            <email>unclejoe at kremlin.ru</email>
        </author>
        <author>
            <name_first>Alexei</name_first>
            <name_last>Voloshnikov</name_last>
            <email>just-another-tovarisch at kremlin.ru</email>
        </author>
    </authors>
    <description><![CDATA[Our fearless leader's electrifying call to
action.]]></description>
    <pages>1941</pages>
</book>
"""

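def printTagNames(node, indentationLevel=0):
    # Print this element's tag name, indented to show its depth in the tree.
    print(indentationLevel * ' ' + node.tagName)

    # Recurse into child elements only; childNodes also holds text, CDATA
    # and comment nodes, which have no tagName.
    for child in node.childNodes:
        if child.nodeType == dom.Node.ELEMENT_NODE:
            printTagNames(child, indentationLevel+4)

# minidom's parse() accepts a file-like object, hence the StringIO wrapper.
doc = dom.parse(StringIO(theXML))

print('--- TAG DUMP ---')
printTagNames(doc.documentElement)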
------------------------------------------------------------------
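
  If all you need is the set of tags that got read in, and not the
nesting, the DOM call getElementsByTagName('*') is a shorter route: the
special name '*' matches every element. A minimal sketch, assuming a
cut-down version of the document above; countTagNames and sampleXML are
names I made up for the illustration.

```python
import xml.dom.minidom as dom

# Cut-down sample document, used only for this illustration.
sampleXML = """<book edition="1">
    <title>Example</title>
    <authors>
        <author><name_last>First</name_last></author>
        <author><name_last>Second</name_last></author>
    </authors>
</book>"""

def countTagNames(doc):
    """Map each tag name in the document to the number of times it occurs."""
    counts = {}
    # Called on the Document, '*' matches every element, root included.
    for element in doc.getElementsByTagName('*'):
        counts[element.tagName] = counts.get(element.tagName, 0) + 1
    return counts

tagCounts = countTagNames(dom.parseString(sampleXML))
for name in sorted(tagCounts):
    print('%-12s %d' % (name, tagCounts[name]))
```

  The recursive version is still the one to use when the indentation
(that is, the parent/child structure) matters; this one only tells you
which tags exist and how often.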