Newbie XML problem

jmp jeanmichel at sequans.com
Tue Dec 22 07:27:33 EST 2015


On 12/22/2015 05:29 AM, KP wrote:
>
>  From my first foray into XML with Python:
>
> I would like to retrieve this list from the XML upon searching for the 'config' with id attribute = 'B'
>
>
> config = {id: 1, canvas: (3840, 1024), comment: "a comment",
>               {id: 4, gate: 3, (0,    0, 1280, 1024)},
>               {id: 5, gate: 2, (1280, 0, 2560, 1024)},
>               {id: 6, gate: 1, (2560, 0, 3840, 1024)}}
>
> I have started to use this code - but this is beginning to feel very non-elegant; not the cool Python code I usually see...
>
> import xml.etree.ElementTree as ET
>
> tree   = ET.parse('master.xml')
> master = tree.getroot()
>
> for config in master:
>      if config.attrib['id'] == 'B':
>      ...

It much depends on

1/ the xml parser you use.
2/ the xml structure

1/ I'm happily using beautifulSoup. Using it is really simple and yield 
simple code.

2/ Whenever the code gets complicated is because the xml is not properly 
structured. For instance in you example, 'id' is an attribute of 
'config' nodes, that's fine, but for 'panel' nodes it's a child node.
There's no point using a node when only one 'id' can be specified. 
Filtering by attributes is much easier than by child nodes.

Anyway here's an example of using beautifulSoup:
python 2.7 (fix the print statement if you're using python3)

import bs4

xmlp = bs4.BeautifulSoup(open('test.xml', 'r'), 'xml')

# print all canvas
for cfg in xmlp.findAll('config'):
     print cfg.canvas.text

# find config B panel 6 coordinates
xmlp.find('config', id='B').find(lambda node: node.name=='panel' and 
node.id.text=='6').coordinates.text

# if panel id were attributes:
xmlp.find('config', id='B').find('panel', id='6').coordinates.text

If you can change the layout of the xml file it's better that you do, 
put every values as attribute whenever you can:


<config id="A" canvas="3840,1024">
   <comment> comments can span
      on multiple lines, you probably need a node
   </comment>
   <panel id="1" gate="6" coordinates="0,0,1280,124"/>
<config>


Properly structured xml will yield proper python code.

cheers,

JM









More information about the Python-list mailing list