xml.dom.minidom

Paul Boddie paul at boddie.net
Tue May 21 10:44:02 EDT 2002


a2600 at hotmail.com (joe user) wrote in message news:<28c7ef7c.0205210010.4f363646 at posting.google.com>...
> 
> <?xml version='1.0'?>
> <mydoc version="1.2">
>  <title text="testdocument" />
>  <text>
>   <data>
>    Hello World
>   </data>
>  </text>
>  <footer>
>   <data>
>    Hello Footer
>   </data>
>  </footer>
> </mydoc>
> 
> 
> What I would like to do is to get the contents of the "<data>"-tag
> within the "<footer>"-tag.
> 
> If anyone has a minute over, could you please give me an example on
> how to proceed?

One of the "coolest" ways to proceed is actually to use XPath. Try the
following:

  # Your document has been parsed and is in the 'document' variable.
  # This seems to be the way to get an XPath "context".

  import xml.xpath
  context = xml.xpath.Context.Context(document)

  # Now, search for the text inside the 'data' element, itself inside
  # the 'footer' element.
  # (Note that we use some strange parameter notation here - I don't
  # know why, exactly, but it's necessary and strange errors can
  # occur if we get this wrong.)

  nodes = xml.xpath.Evaluate("//footer/data/text()", context=context)

  # What this did was to find all 'footer' elements in the document.
  # (The // means "find the following elements at all levels of
  # 'document'".) Then go inside each of these 'footer' elements to
  # find 'data' elements, and then inside those elements to find all
  # text nodes.
  # Note that all 'footer' elements without 'data' elements inside
  # them are excluded from the results.

  for node in nodes:
    print node.nodeValue

  # The above should print the values of all the nodes found.

I've been using XPath a lot recently, outside of its usual XSLT
domain, and it makes XML document processing so much more fun than
just using the DOM directly. Take a look at the W3C site
(http://www.w3.org) for more information on the XPath standard.

Paul



More information about the Python-list mailing list