[Tutor] XML parsing
Asif Iqbal
vadud3 at gmail.com
Thu Mar 29 22:03:37 EDT 2018
On Thu, Mar 29, 2018 at 9:40 PM, Asif Iqbal <vadud3 at gmail.com> wrote:
>
>
> On Thu, Mar 29, 2018 at 3:41 PM, Peter Otten <__peter__ at web.de> wrote:
>
>> Asif Iqbal wrote:
>>
>> > On Thu, Mar 29, 2018 at 3:56 AM, Peter Otten <__peter__ at web.de> wrote:
>> >
>> >> Asif Iqbal wrote:
>> >>
>> >> > I am trying to extract all the *template-name*s, but no success yet
>> >> >
>> >> > Here is a sample xml file
>> >> >
>> >> > <collection xmlns:y="http://tail-f.com/ns/rest">
>> >> > <template-metadata xmlns="http://networks.com/nms">
>> >> > <template-name>ALLFLEX-BLOOMINGTON</template-name>
>> >> > <type>post-staging</type>
>> >> > <device-type>full-mesh</device-type>
>> >> > <provider-tenant>ALLFLEX</provider-tenant>
>> >> > <subscription xmlns="http://networks.com/nms">
>> >> > <solution-tier>advanced-plus</solution-tier>
>> >> > <bandwidth>1000</bandwidth>
>> >> > <is-analytics-enabled>true</is-analytics-enabled>
>> >> > <is-primary>true</is-primary>
>> >> > </subscription>
>> >> > ....
>> >> > </collection>
>> >> >
>> >> > with open('/tmp/template-metadata') as f:
>> >> >     import xml.etree.ElementTree as ET
>> >> >     root = ET.fromstring(f.read())
>> >> >
>> >> > print len(root)
>> >> > print root[0][0].text
>> >> > for l in root.findall('template-metadata'):
>> >> >     print l
>> >> >
>> >> >
>> >> > 392
>> >> > ALLFLEX-BLOOMINGTON
>> >> >
>> >> >
>> >> > It prints the length of the tree and the first element of the first
>> >> > child, but when I try to loop through to find all the
>> >> > 'template-name's it does not print anything.
>> >> >
>> >> > What am I doing wrong?
>> >>
>> >> You have to include the namespace:
>> >>
>> >> for l in root.findall('{http://networks.com/nms}template-metadata'):
>> >>
>> >
>> > How do I extract the 'template-name' ?
>>
>> I hoped you'd get the idea.
>>
>> > This is what I tried
>> >
>> > for l in root.findall('{http://networks.com/nms}template-metadata'):
>>
>> Rinse and repeat:
>>
>> > print l.find('template-name').text
>>
>> should be
>>
>> print l.find('{http://networks.com/nms}template-name').text
>>
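For collecting every template-name in one pass, Element.iter() accepts the same '{namespace}tag' form as find() and findall(). A minimal sketch, assuming a trimmed copy of the sample file; the second entry (EXAMPLE-SECOND) is made up for illustration, and print() is written with parentheses so it runs on Python 2 and 3:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample from this thread. The second
# template-metadata entry is hypothetical, added only so the loop
# has more than one match.
SAMPLE = '''
<collection xmlns:y="http://tail-f.com/ns/rest">
  <template-metadata xmlns="http://networks.com/nms">
    <template-name>ALLFLEX-BLOOMINGTON</template-name>
    <type>post-staging</type>
  </template-metadata>
  <template-metadata xmlns="http://networks.com/nms">
    <template-name>EXAMPLE-SECOND</template-name>
    <type>post-staging</type>
  </template-metadata>
</collection>'''

NMS = 'http://networks.com/nms'
root = ET.fromstring(SAMPLE)

# iter() walks the whole tree, so no nested findall()/find() is needed.
names = [el.text for el in root.iter('{%s}template-name' % NMS)]
print(names)
```

Unlike findall() on the root, iter() matches at any depth, so it would also pick up template-name elements nested further down if the real file has any.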
>> >
>> > I am following the doc
>> > https://docs.python.org/2/library/xml.etree.elementtree.html section
>> > 19.7.1.3 findall example
>> >
>> > I get this error: AttributeError: 'NoneType' object has no attribute
>> > 'text'. I do not understand why l.find('template-name') is None.
>>
>> Take the time to read
>>
>> https://docs.python.org/2/library/xml.etree.elementtree.html#parsing-xml-with-namespaces
>
>
> Thanks for the links and hints.
>
> I got it working now
>
> I used ns = { 'nms' : 'http://networks.com/nms' }
>
> And then l.find('nms:template-name', ns)
>
> I also want to extract the namespace and I see this gets me the namespace
>
> str(root[0]).split('{')[1].split('}')[0]
>
> Is there a better way to extract the namespace?
>
>
>
This worked:

ns = { 'nms' : root[0].tag.split('}')[0].split('{')[1] }
for l in root.findall('nms:template-metadata', ns):
    print l.find('nms:template-name', ns).text

Although I think manually creating the ns dictionary looks cleaner :-)
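
To compare the two approaches side by side, here is a minimal sketch. It assumes a trimmed copy of the sample file, and namespace_of() is a hypothetical helper name of my own, not part of ElementTree. It relies only on the fact that ElementTree stores qualified tags as '{uri}localname'. print() is written with parentheses so it runs on Python 2 and 3:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document from this thread.
SAMPLE = '''
<collection xmlns:y="http://tail-f.com/ns/rest">
  <template-metadata xmlns="http://networks.com/nms">
    <template-name>ALLFLEX-BLOOMINGTON</template-name>
    <type>post-staging</type>
  </template-metadata>
</collection>'''


def namespace_of(element):
    """Return the namespace URI of element, or '' if its tag has none.

    Hypothetical helper; not an ElementTree API.
    """
    tag = element.tag
    if tag.startswith('{'):
        # Qualified tags look like '{uri}localname'.
        return tag[1:].partition('}')[0]
    return ''


root = ET.fromstring(SAMPLE)

# Option 1: derive the prefix map from the first child of the root.
derived_ns = {'nms': namespace_of(root[0])}

# Option 2: spell the prefix map out by hand.
manual_ns = {'nms': 'http://networks.com/nms'}

names = [entry.find('nms:template-name', derived_ns).text
         for entry in root.findall('nms:template-metadata', derived_ns)]
print(names)
```

The helper avoids the IndexError that split('{')[1] raises on an element without a namespace, such as the root 'collection' element here. The hand-written dictionary still seems the clearer choice when the namespace is known in advance.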
>
>>
>>
>> > Here is complete code with output.
>> >
>> >
>> > import xml.etree.ElementTree as ET
>> >
>> > xmlfile='''
>> > <collection xmlns:y="http://tail-f.com/ns/rest">
>> > <template-metadata xmlns="http://networks.com/nms">
>> > <template-name>ALLFLEX-BLOOMINGTON</template-name>
>> > <type>post-staging</type>
>> > <device-type>full-mesh</device-type>
>> > <provider-tenant>ALLFLEX</provider-tenant>
>> > <subscription xmlns="http://networks.com/nms">
>> > <solution-tier>advanced-plus</solution-tier>
>> > <bandwidth>1000</bandwidth>
>> > <is-analytics-enabled>true</is-analytics-enabled>
>> > <is-primary>true</is-primary>
>> > </subscription></template-metadata></collection>'''
>> >
>> > root = ET.fromstring(xmlfile)
>> > print root.tag
>> > print root[0][0].text
>> > for l in root.findall('{http://networks.com/nms}template-metadata'):
>> >     print l.find('template-name').text
>> >
>> > collection
>> > ALLFLEX-BLOOMINGTON
>> >
>> >
>> > ---------------------------------------------------------------------------
>> > AttributeError                            Traceback (most recent call last)
>> > <ipython-input-18-73bd6770766a> in <module>()
>> >      19 print root[0][0].text
>> >      20 for l in root.findall('{http://networks.com/nms}template-metadata'):
>> > ---> 21     print l.find('template-name').text
>> >
>> > AttributeError: 'NoneType' object has no attribute 'text'
>>
>>
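
The traceback comes from find() returning None when nothing matches, and the unqualified name 'template-name' matches nothing because the element's real tag includes the namespace. A minimal sketch of the behaviour, assuming a trimmed copy of the sample; the '<not found>' default string is just an illustrative placeholder:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document from this thread.
SAMPLE = '''
<collection xmlns:y="http://tail-f.com/ns/rest">
  <template-metadata xmlns="http://networks.com/nms">
    <template-name>ALLFLEX-BLOOMINGTON</template-name>
  </template-metadata>
</collection>'''

NS = {'nms': 'http://networks.com/nms'}
root = ET.fromstring(SAMPLE)
entry = root.find('nms:template-metadata', NS)

# The stored tag carries the namespace in '{uri}localname' form.
print(entry[0].tag)

# Unqualified lookup: no match, so find() returns None and .text would fail.
unqualified = entry.find('template-name')

# Qualified lookup: matches the element.
qualified = entry.find('nms:template-name', NS)

# findtext() returns a default instead of None when nothing matches,
# which avoids the AttributeError if a missing element is acceptable.
missing = entry.findtext('template-name', default='<not found>')
found = entry.findtext('nms:template-name', default='<not found>',
                       namespaces=NS)
print(found)
```

Checking "is None" explicitly, or using findtext() with a default, only hides a failed match; the namespace-qualified name is still what makes the lookup succeed.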
>>
>
>
>
> --
> Asif Iqbal
> PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
>
>