Parsing XML Using XPATH for Python

John Gordon gordon at panix.com
Tue Dec 2 17:52:27 EST 2014


In <a1a70942-6740-4de5-b41e-57a71fb36910 at googlegroups.com> Uzoma Ojemeni <uojemeni at gmail.com> writes:

> I am new to Python - a few days old - and I would appreciate some help.

> I want write a python code to parse the below XML as below:-

> ServingCell----------NeighbourCell
> L41_NBR3347_1----------L41_NBR3347_2
> L41_NBR3347_1----------L41_NBR3347_3
> L41_NBR3347_1----------L41_NBR3349_1
> L41_NBR3347_1----------L41_NBREA2242_1

> <LteCell id="L41_NBR3347_1">
>  <attributes>
>   <absPatternInfoTdd><unset/></absPatternInfoTdd>
>   <additionalSpectrumEmission>1</additionalSpectrumEmission>
>   <additionalSpectrumEmissionList><unset/></additionalSpectrumEmissionList>
>   <LteSpeedDependentConf id="0">                   
>     <attributes>                                    
>      <tReselectionEutraSfHigh>lDot0</tReselectionEut
>      <tReselectionEutraSfMedium>lDot0</tReselectionE
>     </attributes>                                   
>    </LteSpeedDependentConf>                         
>    <LteNeighboringCellRelation id="L41_NBR3347_2">  
>     <attributes>                                    
>      <absPatternInfo><unset/></absPatternInfo>      
>    </LteNeighboringCellRelation>                    
>    <LteNeighboringCellRelation id="L41_NBR3347_3">  
>     <attributes>                                    
>      <absPatternInfo><unset/></absPatternInfo>      
>    </LteNeighboringCellRelation>                    
>    <LteNeighboringCellRelation id="L41_NBR3349_1">  
>     <attributes>                                    
>      <absPatternInfo><unset/></absPatternInfo>                            
>    </LteNeighboringCellRelation>                    
>    <LteNeighboringCellRelation id="L41_NBREA2242_1">
>     <attributes>                                    
>      <absPatternInfo><unset/></absPatternInfo>      
>      <absPatternInfoTdd><unset/></absPatternInfoTdd>

In plain English, it looks like you want to do this:

1. Print a header.
2. For each <LteCell> element:
   3. Find the child <attributes> element.
   4. For each child <LteNeighboringCellRelation> element:
      5. Print the "id" attributes of the <LteCell> element and the
         <LteNeighboringCellRelation> element.

Translated to Python, that would look something like this:

# import the xml library code
import xml.etree.ElementTree as ET

# load your XML file
tree = ET.parse('cells.xml')

# get the root element
root = tree.getroot()

# print a header (same separator as the requested output)
print("ServingCell----------NeighbourCell")

# find each <LteCell> element; iter() also matches the root element
# itself, in case <LteCell> is the top-level element of the file
for serving_cell in root.iter('LteCell'):

    # find the <attributes> child element
    attributes = serving_cell.find('attributes')
    if attributes is None:
        # skip a cell that has no <attributes> child
        continue

    # find each <LteNeighboringCellRelation> child element
    for neighbor in attributes.findall('LteNeighboringCellRelation'):

        # print the ids of the serving and neighbor cells
        print("%s----------%s" % (serving_cell.attrib['id'], neighbor.attrib['id']))
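
Since the subject line asks about XPath: ElementTree's find() and findall()
accept a limited subset of XPath, so the two lookups can also be written as
path expressions. Here is a minimal, self-contained sketch of that idea. The
inline sample document (wrapped in a made-up <snapshot> root element so that
it is well-formed) and the neighbor_pairs() helper name are my own
assumptions for illustration, not part of your file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a trimmed-down version of the posted XML, wrapped
# in an invented <snapshot> root element so that it is well-formed.
SAMPLE = """
<snapshot>
 <LteCell id="L41_NBR3347_1">
  <attributes>
   <additionalSpectrumEmission>1</additionalSpectrumEmission>
   <LteNeighboringCellRelation id="L41_NBR3347_2">
    <attributes><absPatternInfo><unset/></absPatternInfo></attributes>
   </LteNeighboringCellRelation>
   <LteNeighboringCellRelation id="L41_NBR3347_3">
    <attributes><absPatternInfo><unset/></absPatternInfo></attributes>
   </LteNeighboringCellRelation>
  </attributes>
 </LteCell>
</snapshot>
"""


def neighbor_pairs(root):
    """Return (serving id, neighbor id) tuples found below root."""
    pairs = []
    # './/LteCell' matches <LteCell> descendants at any depth below root
    for cell in root.findall('.//LteCell'):
        # relative path: the <attributes> child, then its relation children
        for nbr in cell.findall('./attributes/LteNeighboringCellRelation'):
            pairs.append((cell.get('id'), nbr.get('id')))
    return pairs


root = ET.fromstring(SAMPLE)
pairs = neighbor_pairs(root)

print("ServingCell----------NeighbourCell")
for serving, neighbor in pairs:
    print("%s----------%s" % (serving, neighbor))
```

Note that './/LteCell' only matches descendants, so it will not match the
root element itself. If <LteCell> turns out to be the top-level element of
your real file, root.iter('LteCell') is probably the safer choice.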

-- 
John Gordon         Imagine what it must be like for a real medical doctor to
gordon at panix.com    watch 'House', or a real serial killer to watch 'Dexter'.




