string manipulation.

Mark Tolonen metolone+gmane at gmail.com
Tue Jul 27 10:30:41 EDT 2010


"gerardob" <gberbeglia at gmail.com> wrote in message 
news:29276755.post at talk.nabble.com...
>
> I am trying to read an xml using minidom from python library xml.dom
>
> This is the xml file:
> ---------------------------------
> <rm_structure>
> <resources>
> <resource>
> AB
> <Capacity>100</Capacity>
> <NumberVirtualClasses>
> 2
> </NumberVirtualClasses>
> </resource>
> </resources>
> </rm_structure>
> ----------------------------------
> This is the python code:
> --------------------------------
> from xml.dom import minidom
> doc= minidom.parse("example.xml")
> resources_section = doc.getElementsByTagName('resources')
> list_resources = resources_section[0].getElementsByTagName('resource')
>
> for r in list_resources:
>     name = r.childNodes[0].nodeValue
>     print name
>     print len(name)
> ---------------------------------
> The problem is that the nodeValue stored in the variable 'name' is not
> "AB" (what I want) but a string of length 8 that seems to include
> the tabs and/or other things.
> How can I get the string "AB" without the other stuff?
> Thanks.

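Whitespace in XML is significant, so the first child of <resource> is a
text node holding 'AB' plus the newlines and indentation around it.  If
the file were: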

    <rm_structure>
    <resources>
    <resource>AB<Capacity>100</Capacity>
    <NumberVirtualClasses>2</NumberVirtualClasses>
    </resource>
    </resources>
    </rm_structure>

You would just read 'AB'.  If you don't control the XML file, then:

    print name.strip()

will remove leading and trailing whitespace.
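
As a rough, self-contained sketch of the above (written for Python 3, so
print is a function, and with the XML inlined through
minidom.parseString rather than read from example.xml), the raw text
node and its stripped form can be compared directly.  The helper name
resource_names is made up for illustration, and the inlined XML uses
plain newlines, so the raw length here will differ from the 8 reported
for the original file:

```python
from xml.dom import minidom

# Inlined copy of the XML from the question (assumption: newlines only,
# no tabs, so the raw text node is shorter than in the original file).
XML = """<rm_structure>
<resources>
<resource>
AB
<Capacity>100</Capacity>
<NumberVirtualClasses>
2
</NumberVirtualClasses>
</resource>
</resources>
</rm_structure>"""


def resource_names(xml_text):
    """Return the stripped leading text of every <resource> element."""
    doc = minidom.parseString(xml_text)
    names = []
    for resource in doc.getElementsByTagName("resource"):
        first = resource.firstChild
        # Only use the first child if it really is a text node.
        if first is not None and first.nodeType == first.TEXT_NODE:
            names.append(first.nodeValue.strip())
    return names


# The unstripped node value keeps the surrounding whitespace.
doc = minidom.parseString(XML)
raw = doc.getElementsByTagName("resource")[0].firstChild.nodeValue
print(repr(raw))
print(resource_names(XML))
```

Checking nodeType before reading nodeValue guards against a <resource>
whose first child is an element rather than text, in which case
nodeValue would be None and strip() would fail.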

-Mark




