Trouble writing to database: RSS-reader
Gabriel Genellina
gagsl-py2 at yahoo.com.ar
Mon Jan 21 17:25:01 EST 2008
On Mon, 21 Jan 2008 18:38:48 -0200, Arne <arne.k.h at gmail.com> wrote:
> On 21 Jan, 19:15, Bruno Desthuilliers <bruno.
> 42.desthuilli... at wtf.websiteburo.oops.com> wrote:
>
>> This should not prevent you from learning how to properly parse XML
>> (hint: with an XML parser). XML is *not* a line-oriented format, so you
>> just can't get nowhere trying to parse it this way.
>>
>> HTH
>
> Do you think i should use xml.dom.minidom for this? I've never used
> it, and I don't know how to use it, but I've heard it's useful.
>
> So, I shouldn't use this techinicke (probably wrong spelled) trying to
> parse XML? Should i rather use minidom?
>
> Thank you for answering, I've learnt a lot from both of you,
> Desthuilliers and Genellina! :)
>
Try ElementTree instead; there is an implementation included with Python
2.5, documentation at http://effbot.org/zone/element.htm and another
implementation available at http://codespeak.net/lxml/

import xml.etree.cElementTree as ET
import urllib2

rssurl = 'http://www.jabber.org/news/rss.xml'
rssdata = urllib2.urlopen(rssurl).read()
rssdata = rssdata.replace('&', '&amp;') # ouch!

tree = ET.fromstring(rssdata)
for item in tree.getiterator('item'):
    print item.find('link').text
    print item.find('title').text
    print item.find('description').text
    print

Note that this particular RSS feed is NOT a well-formed XML document - I
had to replace the & with &amp; to make the parser happy.
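
For readers on Python 3, here is a rough sketch of the same idea. It is not the code from this post: the sample feed, the regular expression, and the names escape_bare_ampersands and parse_items are all made up for illustration, and the feed is an inline string so nothing is downloaded. Unlike a blanket replace, it only escapes an & that does not already start an entity, so an existing &amp; is left alone.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample feed (an assumption, standing in for the downloaded data).
# The first title contains a bare '&', which makes the document ill-formed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>News & notes</title>
      <link>http://example.com/1?a=1&amp;b=2</link>
      <description>First item</description>
    </item>
    <item>
      <title>Second</title>
      <link>http://example.com/2</link>
      <description>Plain text</description>
    </item>
  </channel>
</rss>"""

# Match '&' only when it does not already begin a named entity (&amp;)
# or a numeric character reference (&#38; or &#x26;).
BARE_AMP = re.compile(r'&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)')


def escape_bare_ampersands(text):
    """Replace stray '&' characters with '&amp;', leaving entities untouched."""
    return BARE_AMP.sub('&amp;', text)


def parse_items(rssdata):
    """Return a list of (link, title, description) tuples, one per <item>."""
    tree = ET.fromstring(escape_bare_ampersands(rssdata))
    return [
        (item.findtext('link'), item.findtext('title'), item.findtext('description'))
        for item in tree.iter('item')
    ]


if __name__ == '__main__':
    for link, title, description in parse_items(SAMPLE_RSS):
        print(link)
        print(title)
        print(description)
        print()
```

This is still a workaround, not a fix: a bare & inside a CDATA section would be escaped too, so a feed that is well-formed to begin with is the better cure.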
--
Gabriel Genellina