Lamaizm... XML problem...

Stefan Behnel stefan.behnel-n05pAM at web.de
Tue Oct 9 09:11:03 EDT 2007


durumdara wrote:
> from xml.dom import minidom
[...]
>            t_props = t_comp.getElementsByTagName('prop')
>            for t_prop in t_props:
>                attrs = t_prop.attributes.keys()
>                print attrs
>                print t_prop.nodeName
>                print t_prop.nodeType
>                print [t_prop.nodeValue]
>                sys.exit()
> -------------------
> The result is:
> 
>>>>
> <xml.dom.minidom.Document instance at 0x0172A3C8>
> <DOM Element: properties at 0x1739378>
> [u'id', u'name']
> prop
> 1
> [None]
>>>>
> -------------------
> 
> The source file is:
> 
> <?xml version="1.0" encoding="windows-1250"?>
> <langfile>
> <properties>
> <form>
> <component>
> <prop id="aaaa" name ="bbbb">cccc</prop>
> </component>
> </form>
> </properties>
> <constants>
> </constants>
> </langfile>
> -------------------
> I can get the attrs, but I can't get nodeValue (cccc)... I got None for it.

The W3C DOM treats text as separate nodes, and the nodeValue of an element
node is always None. So you have to check the children of t_prop
(t_prop.childNodes) to find the text node and then read its nodeValue.
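
A minimal sketch of that lookup, written for Python 3 rather than the Python 2
of the quoted code. The sample string is a trimmed copy of the file quoted
above (without the encoding declaration, since it is parsed from a str), and
element_text is a hypothetical helper name, not part of minidom.

```python
from xml.dom import minidom

# Trimmed copy of the quoted source file; parsed from a str, so the
# encoding declaration is left out.
SOURCE = """<langfile>
<properties>
<form>
<component>
<prop id="aaaa" name="bbbb">cccc</prop>
</component>
</form>
</properties>
</langfile>"""


def element_text(element):
    """Join the values of the text nodes that are direct children of element.

    Hypothetical helper for illustration; minidom does not provide it.
    """
    return "".join(
        child.nodeValue
        for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )


doc = minidom.parseString(SOURCE)
prop = doc.getElementsByTagName("prop")[0]

# The element itself carries no value; the text sits in a child node.
prop_value = prop.nodeValue
prop_text = element_text(prop)
print(prop_value)
print(prop_text)
```

Joining all direct text children, instead of reading only firstChild, also
covers the case where the parser splits the text into several nodes.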

Alternatively, consider using an XML library that actually helps users in
working with XML, such as ElementTree or lxml.

http://codespeak.net/lxml
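
For comparison, a rough sketch of the same lookup with ElementTree as shipped
in the standard library (xml.etree.ElementTree), again in Python 3 and on a
trimmed copy of the quoted file. lxml.etree offers a largely compatible API,
but this example only assumes the standard library module.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the quoted source file; parsed from a str, so the
# encoding declaration is left out.
SOURCE = """<langfile>
<properties>
<form>
<component>
<prop id="aaaa" name="bbbb">cccc</prop>
</component>
</form>
</properties>
</langfile>"""

root = ET.fromstring(SOURCE)

# Attributes are read with get(), and the text content is available
# directly as the .text attribute of the element.
props = [(p.get("id"), p.get("name"), p.text) for p in root.iter("prop")]
for prop_id, prop_name, prop_text in props:
    print(prop_id, prop_name, prop_text)
```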

Stefan


