Parsing XML - Newbie help

Fredrik Lundh fredrik at pythonware.com
Sun May 22 13:22:00 EDT 2005


"rh0dium" wrote:

> I am relatively new to python and certainly new to XML parsing.  Can
> someone show me how to get the product text out of this?

didn't you ask the same question a few days ago?  did you read the
replies to that post?

assuming that your sample is correct, you need to process the almost-
but-not-quite-XML file before passing it to an XML processor.  here's an
ElementTree-based example that does that:

    # see http://effbot.org/zone/element-index.htm
    from elementtree import ElementTree as ET

    data = open(...).read() # or os.popen(...).read()

    # strip off bogus XML declaration
    import re
    m = re.match(r"<\?xml[^>]+>", data)
    if m:
        data = data[m.end():]

    # wrap nodes in container element
    data = "<doc>" + data + "</doc>"

    tree = ET.XML(data)

    processors = []
    for elem in tree.findall(".//node"):
        if elem.get("class") == "processor":
            processors.append(elem.findtext("product"))
    print len(processors)
    print processors

given your example, this prints:

    2
    ['AMD Opteron(tm) Processor 250', 'AMD Opteron(tm) Processor 250']
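
for readers on Python 3, where ElementTree ships in the standard library as
xml.etree.ElementTree, the same approach might look roughly like the sketch
below.  the SAMPLE string and the processor_products helper are hypothetical
stand-ins made up for illustration (the original poster's file isn't shown
here); only the strip-the-declaration and wrap-in-<doc> steps come from the
example above.

```python
import re
import xml.etree.ElementTree as ET

# hypothetical sample input: a leading XML declaration followed by several
# top-level <node> elements, so the text has no single root element
SAMPLE = """<?xml version="1.0"?>
<node id="cpu:0" class="processor">
  <product>Example CPU A</product>
</node>
<node id="memory" class="memory">
  <product>Example RAM</product>
</node>
<node id="cpu:1" class="processor">
  <product>Example CPU B</product>
</node>
"""

def processor_products(data):
    """Return the <product> text of every node with class="processor"."""
    # strip off the leading XML declaration, if there is one
    m = re.match(r"<\?xml[^>]+>", data)
    if m:
        data = data[m.end():]
    # wrap the top-level nodes in a single container element
    root = ET.fromstring("<doc>" + data + "</doc>")
    # iter() walks all descendants, so nested nodes are found as well
    return [elem.findtext("product")
            for elem in root.iter("node")
            if elem.get("class") == "processor"]

products = processor_products(SAMPLE)
print(len(products))
print(products)
```

the declaration has to go because it's only allowed at the very start of a
document, and it would no longer be there once the <doc> wrapper is added.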

</F>





