[Tutor] Issues Parsing XML

Thu Mar 12 21:14:30 CET 2009

On Thu, Mar 12, 2009 at 08:47:24PM +0100, Stefan Behnel wrote:
> marc at marcd.org wrote:

[snip]
> 
> There is another "DOM Model" in the stdlib. It's called ElementTree and is
> generally a lot easier to use. For example, to find the text content of an
> element called "element_that_has_text_content" in a subtree below
> "some_element", you can do
> 
> 	print some_element.findtext(".//element_that_has_text_content")

And, if you install lxml, then you will be able to use full XPath,
which is more powerful than the limited path syntax that findtext()
supports in ElementTree.

Stefan did not tell you about that because he is a developer who
has helped give us lxml, and perhaps he is a bit modest.

There is a bit to learn in order to use the XPath capability in
lxml.  But, if you are doing any amount of XML processing in
Python, it's likely to be worth it.
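
Here is a rough sketch of the difference, not a definitive recipe.
The sample document and the element and attribute names in it
(library, book, year, title) are made up for illustration. The
lxml part only runs if lxml is installed, since it is a separate
package and not part of the stdlib.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this example.
SAMPLE = """\
<library>
  <book year="2005"><title>First</title></book>
  <book year="2009"><title>Second</title></book>
</library>
"""

root = ET.fromstring(SAMPLE)

# ElementTree: text content of the first <title> found anywhere below root.
first_title = root.findtext(".//title")

try:
    from lxml import etree  # third-party package, may not be installed
except ImportError:
    etree = None

recent_titles = None
if etree is not None:
    lxml_root = etree.fromstring(SAMPLE)
    # XPath: a numeric comparison on an attribute plus text() selection,
    # which the path syntax accepted by findtext() does not cover.
    recent_titles = lxml_root.xpath("//book[@year > 2006]/title/text()")

print(first_title)
print(recent_titles)
```

The xpath() call returns a list of matching text nodes, whereas
findtext() gives back only the text of the first match.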

You can learn about lxml here: http://codespeak.net/lxml/

- Dave

-- 
Dave Kuhlman
http://www.rexx.com/~dkuhlman