[Tutor] Parsing an XML document using ElementTree

Stefan Behnel stefan_ml at behnel.de
Tue May 24 12:35:55 CEST 2011


Sithembewena Lloyd Dube, 24.05.2011 11:59:
> I am trying to parse an XML feed and display the text of each child node
> without any success. My code in the python shell is as follows:
>
> >>> import urllib
> >>> from xml.etree import ElementTree as ET
>
> >>> content = urllib.urlopen('http://xml.matchbook.com/xmlfeed/feed?sport-id=&vendor=TEST&sport-name=&short-name=Po')
> >>> xml_content = ET.parse(content)
>
> I then check the xml_content object as follows:
>
> >>> xml_content
> <xml.etree.ElementTree.ElementTree instance at 0x01DC14B8>

Well, yes, it does return an XML document, but not the one you expect:

   >>> urllib.urlopen('URL see above').read()
   "<response>\r\n  <error-message>you must add 'accept-encoding' as
   'gzip,deflate' to the header of your request</error-message>\r
   \n</response>"

In other words, the server requires an 'Accept-Encoding: gzip,deflate'
header on the request and then answers with compressed data. Once you send
that header, you must decompress the response body before passing it to
ElementTree's parser. See the documentation on the gzip and urllib/urllib2
modules in the standard library.
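
The two steps could be sketched roughly as follows. This is only a minimal
sketch, not tested against the feed itself: the helper names decode_body and
fetch_xml are made up for illustration, the imports are written so that the
code runs on both Python 2 (urllib2) and Python 3 (urllib.request), and it
assumes the server labels its reply with a Content-Encoding header.

```python
import gzip
import io
import zlib
from xml.etree import ElementTree as ET

try:
    from urllib2 import Request, urlopen          # Python 2
except ImportError:
    from urllib.request import Request, urlopen   # Python 3


def decode_body(raw, content_encoding):
    """Undo the Content-Encoding that was applied to the response body.

    Hypothetical helper: 'raw' is the byte string read from the response,
    'content_encoding' is the value of its Content-Encoding header (or None).
    """
    encoding = (content_encoding or '').lower()
    if encoding == 'gzip':
        return gzip.GzipFile(fileobj=io.BytesIO(raw)).read()
    if encoding == 'deflate':
        try:
            return zlib.decompress(raw)                    # zlib-wrapped data
        except zlib.error:
            return zlib.decompress(raw, -zlib.MAX_WBITS)   # raw deflate data
    return raw  # not compressed, pass through unchanged


def fetch_xml(url):
    """Request 'url' with the header the server asks for and parse the reply.

    Hypothetical helper: returns the root Element of the parsed document.
    """
    request = Request(url, headers={'Accept-Encoding': 'gzip,deflate'})
    response = urlopen(request)
    try:
        raw = response.read()
        encoding = response.info().get('Content-Encoding')
    finally:
        response.close()
    return ET.fromstring(decode_body(raw, encoding))
```

With that in place, something like root = fetch_xml('URL see above')
followed by a loop over root (each child has .tag and .text attributes)
should give you the text of each child node, provided the feed really
returns the document you are after.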

Stefan


