xmltodict - TypeError: list indices must be integers, not str

Peter Otten __peter__ at web.de
Sat May 10 08:10:14 EDT 2014


flebber wrote:

> I am using xmltodict.
> 
> This is how I have accessed and loaded my file.
> 
> import xmltodict
> document = open("/home/sayth/Scripts/va_benefits/20140508GOSF0.xml", "r")
> read_doc = document.read()
> xml_doc = xmltodict.parse(read_doc)
> 
> The start of the file I am trying to get data out of is.
> 
> <meeting id="35483" barriertrial="0" venue="Gosford"
> date="2014-05-08T00:00:00" gearchanges="-1" stewardsreport="-1"
> gearlist="-1" racebook="0" postracestewards="0" meetingtype="TAB"
> rail="True" weather="Fine      " trackcondition="Dead      "
> nomsdeadline="2014-05-02T11:00:00" weightsdeadline="2014-05-05T16:00:00"
> acceptdeadline="2014-05-06T09:00:00" jockeydeadline="2014-05-06T12:00:00">
>   <club abbrevname="Gosford Race Club" code="49" associationclass="2"
>   website="http://" />
>   <race id="185273" number="1" nomnumber="7" division="0" name="GOSFORD
>   ROTARY MAIDEN HANDICAP" mediumname="MDN" shortname="MDN"
>   stage="Acceptances" distance="1600" minweight="55" raisedweight="0"
>   class="MDN       " age="~         " grade="0" weightcondition="HCP      
>   " trophy="0" owner="0" trainer="0" jockey="0" strapper="0"
>   totalprize="22000" first="12250" second="4250" third="2100"
>   fourth="1000" fifth="525" time="2014-05-08T12:30:00" bonustype="BX02    
>    " nomsfee="0" acceptfee="0" trackcondition="          " timingmethod=" 
>           " fastesttime="          " sectionaltime="          "
>   formavailable="0" racebookprize="Of $22000. First $12250, second $4250,
>   third $2100, fourth $1000, fifth $525, sixth $375, seventh $375, eighth
>   $375, ninth $375, tenth $375">
>     <condition line="1">
> 
> So thought I had it figured. Can access the elements of meeting and the
> elements of club such as by doing this.
> 
> In [5]: xml_doc['meeting']['club']['@abbrevname']
> Out[5]: u'Gosford Race Club'
> 
> However whenever I try and access race in the same manner I get errors.
> 
> In [11]: xml_doc['meeting']['club']['race']['@id']
> 
> ---------------------------------------------------------------------------
> KeyError                                  Traceback (most recent call last)
> <ipython-input-11-cce362d7e6fc> in <module>()
> ----> 1 xml_doc['meeting']['club']['race']['@id']
> 
> KeyError: 'race'
> 
> In [12]: xml_doc['meeting']['race']['@id']
> 
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> <ipython-input-12-c304e2b8f9be> in <module>()
> ----> 1 xml_doc['meeting']['race']['@id']
> 
> TypeError: list indices must be integers, not str
> 
> why is accessing race @id any different to the access of club @abbrevname
> and how do I get it for race?

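The KeyError is expected: in the XML you show, <race> is a child of 
<meeting>, not of <club>. As for the TypeError, my guess is that there are 
multiple races per meeting, so xmltodict puts them into a list under the 
"race" key, and you have to pick one by index: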

>>> doc = xmltodict.parse("""\
... <meeting>
...    <race id="first race">...</race>
...    <race id="second race">...</race>
... </meeting>
... """)
>>> type(doc["meeting"]["race"])
<class 'list'>
>>> doc["meeting"]["race"][0]["@id"]
'first race'
>>> doc["meeting"]["race"][1]["@id"]
'second race'

So 

xml_doc['meeting']['race'][0]['@id']

or

for race in xml_doc["meeting"]["race"]:
   print(race["@id"])

might work for you.
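
One caveat with the loop: as far as I know, xmltodict only builds a list when
an element occurs more than once, so a meeting with a single race would give
you a dict under "race" and the loop would iterate over its keys instead.
Below is a minimal sketch of one way to guard against that. The names as_list
and race_ids are made up for illustration and are not part of xmltodict, and
the plain dicts merely stand in for what xmltodict.parse() would return.

```python
def as_list(value):
    """Return lists unchanged, wrap a single item in a list, map None to []."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def race_ids(doc):
    """Collect the id attribute of every race, however many there are."""
    races = as_list(doc["meeting"].get("race"))
    return [race["@id"] for race in races]


# Hand-written stand-ins for parsed documents (assumed shapes, not real data).
one_race = {"meeting": {"race": {"@id": "185273"}}}
two_races = {"meeting": {"race": [{"@id": "first race"},
                                  {"@id": "second race"}]}}

print(race_ids(one_race))   # ['185273']
print(race_ids(two_races))  # ['first race', 'second race']
```

If your xmltodict version has the force_list argument to parse(), something
like xmltodict.parse(read_doc, force_list=("race",)) should make the helper
unnecessary, but I have not checked which release added it.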



