Trouble writing to database: RSS-reader

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Mon Jan 21 17:25:01 EST 2008


On Mon, 21 Jan 2008 18:38:48 -0200, Arne <arne.k.h at gmail.com> wrote:

> On 21 Jan, 19:15, Bruno Desthuilliers <bruno.
> 42.desthuilli... at wtf.websiteburo.oops.com> wrote:
>
>> This should not prevent you from learning how to properly parse XML
>> (hint: with an XML parser). XML is *not* a line-oriented format, so you
>> just can't get anywhere trying to parse it this way.
>>
>> HTH
>
> Do you think I should use xml.dom.minidom for this? I've never used
> it, and I don't know how to use it, but I've heard it's useful.
>
> So, I shouldn't use this technique to
> parse XML? Should I use minidom instead?
>
> Thank you for answering, I've learnt a lot from both of you,
> Desthuilliers and Genellina! :)
>

Try ElementTree instead; there is an implementation included with Python
2.5, documentation at http://effbot.org/zone/element.htm, and another
implementation available at http://codespeak.net/lxml/

import xml.etree.cElementTree as ET
import urllib2

rssurl = 'http://www.jabber.org/news/rss.xml'
rssdata = urllib2.urlopen(rssurl).read()
# The feed contains bare ampersands; escape them so the parser accepts it.
rssdata = rssdata.replace('&', '&amp;') # ouch!

tree = ET.fromstring(rssdata)
# Walk every <item> element and print its link, title and description.
for item in tree.getiterator('item'):
    print item.find('link').text
    print item.find('title').text
    print item.find('description').text
    print

Note that this particular RSS feed is NOT a well-formed XML document: I
had to replace each & with &amp; to make the parser happy.
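
The blanket replace would also mangle ampersands that are already escaped
(&amp; would become &amp;amp;). A slightly more careful variant is sketched
below. It is only an illustration, written for Python 3 and run against an
inline sample instead of the live feed; the names BARE_AMP,
escape_bare_ampersands, parse_items and SAMPLE_RSS are made up for this
example and are not part of ElementTree.

```python
import re
import xml.etree.ElementTree as ET

# Matches an '&' that does NOT start an entity or character reference
# such as &amp; &#38; or &#x26; (hypothetical helper pattern).
BARE_AMP = re.compile(r'&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)')


def escape_bare_ampersands(text):
    """Escape only the ampersands that are not already part of a reference."""
    return BARE_AMP.sub('&amp;', text)


# Made-up sample standing in for a feed with unescaped ampersands.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item>
<title>News & updates</title>
<link>http://example.org/?a=1&b=2</link>
<description>Already escaped: &amp; and &lt;b&gt;</description>
</item>
</channel></rss>"""


def parse_items(rssdata):
    """Return a list of (link, title, description) tuples, one per <item>."""
    tree = ET.fromstring(escape_bare_ampersands(rssdata))
    return [(item.findtext('link'),
             item.findtext('title'),
             item.findtext('description'))
            for item in tree.iter('item')]


for link, title, description in parse_items(SAMPLE_RSS):
    print(link)
    print(title)
    print(description)
    print()
```

This is still a workaround, not a fix: named entities that XML does not
define (for example &nbsp;) are left untouched and the parser will still
reject them.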

-- 
Gabriel Genellina



