XML Tree Discovery (script, tool, __?)

Fredrik Lundh fredrik at pythonware.com
Wed Oct 26 03:13:09 EDT 2005


eric.pederson at gmail.com wrote:

> just namespace + tag

here's an ElementTree-based example:

    # http://effbot.org/tag/elementtree
    import elementtree.ElementTree as ET

    FILE = "example.xml"

    path = ()
    path_map = {}

    for event, elem in ET.iterparse(FILE, events=("start", "end")):
        if event == "start":
            path = path + (elem.tag,)
        else:
            path_map.setdefault(path, []).append(elem)
            path = path[:-1]
            elem.clear() # won't need the contents any more

    for path in path_map:
        print "/".join(path), len(path_map[path])

given this document:

    <document>
        <chapter>
            <title>chapter 1</title>
        </chapter>
        <chapter>
            <title>chapter 2</title>
        </chapter>
    </document>

the above script prints

    document 1
    document/chapter 2
    document/chapter/title 2

the script uses universal names for tags that live in a namespace.  for
information on how to decipher such names, see:

    http://www.effbot.org/zone/element.htm#xml-namespaces

hope this helps!

</F>






More information about the Python-list mailing list