Simple elementtree question
Stefan Behnel
stefan.behnel-n05pAM at web.de
Thu Aug 30 15:28:30 EDT 2007
IamIan wrote:
> This is in Python 2.3.5. I've had success with elementtree and other
> RSS feeds, but I can't get it to work with this format:
>
> <?xml version="1.0"?><rdf:RDF
> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> xmlns:dc="http://purl.org/dc/elements/1.1/"
> xmlns:fr="http://ASPRSS.com/fr.html"
> xmlns:pa="http://ASPRSS.com/pa.html"
> xmlns="http://purl.org/rss/1.0/">
> <channel rdf:about="http://www.sample.com">
> <title>Example feed</title>
[...]
> </rdf:RDF>
>
> What I want to extract is the text in the title and link tags for each
> item (eg. <title>First story</title> and <link>http://www.sample.com/
> news/20000/news.htm</link>). Starting with the title, my test script
> is:
>
> import sys
> from urllib import urlopen
>
> import elementtree.ElementTree as ET
>
> news = urlopen("http://www.sample.com/rss/rss.xml")
> nTree = ET.parse(news)
> for item in nTree.getiterator("title"):
>     print item.text
>
> Whether I try this for title or link, nothing is printed.
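Your document uses namespaces. The root element declares
xmlns="http://purl.org/rss/1.0/" as the default namespace, so the tag you are
looking for is not the plain "title" without a namespace, but the qualified
tag "{http://purl.org/rss/1.0/}title". The same applies to "link".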
http://effbot.org/zone/element.htm#xml-namespaces
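
A minimal sketch of the qualified lookup, under a few assumptions: it uses
the standard library's xml.etree.ElementTree on Python 3 rather than the
standalone elementtree package, it parses an inline sample instead of
fetching the feed, and the <item> element in the sample is made up from the
description in the question. The names SAMPLE, RSS_NS and
item_titles_and_links are illustrative only. On Python 2.3 with the
standalone package, getiterator() would take the place of iter().

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the feed in the question; the <item>
# element is an assumption based on the poster's description.
SAMPLE = """<?xml version="1.0"?><rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://purl.org/rss/1.0/">
<channel rdf:about="http://www.sample.com">
<title>Example feed</title>
</channel>
<item rdf:about="http://www.sample.com/news/20000/news.htm">
<title>First story</title>
<link>http://www.sample.com/news/20000/news.htm</link>
</item>
</rdf:RDF>"""

# ElementTree spells a namespaced tag as "{namespace-uri}localname".
RSS_NS = "{http://purl.org/rss/1.0/}"


def item_titles_and_links(xml_text):
    """Return a list of (title, link) pairs, one per RSS 1.0 <item>."""
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter(RSS_NS + "item"):
        # findtext() looks at direct children, so the tag must be qualified too.
        title = item.findtext(RSS_NS + "title")
        link = item.findtext(RSS_NS + "link")
        results.append((title, link))
    return results


for title, link in item_titles_and_links(SAMPLE):
    print(title, link)
```

Searching for the unqualified "title" in the same sample matches nothing,
which is the behaviour described in the question.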
Stefan