get element text in DOM?

Manlio Perillo NOmanlio_perilloSPAM at libero.it
Sat Nov 13 04:36:37 EST 2004


On Wed, 10 Nov 2004 17:11:09 -0200, Juliano Freitas
<jubafre at atlas.ucpel.tche.br> wrote:

>How can i get the text between the <teste> tags??
>
>>>> xml = """<root><teste> texto </teste></root>"""
>>>> from xml.dom import minidom
>>>> document = minidom.parseString(xml)
>>>> document
><xml.dom.minidom.Document instance at 0x4181df0c>
>>>> minidom.getElementsByTagName('teste')
>
>>>> element = document.getElementsByTagName('teste')
>>>> element
>[<DOM Element: teste at 0x418e110c>]
>>>> element[0].nodeType
>1
>


Here is a useful function I have written:

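from xml import dom

def getText(node, recursive=False):
    """
    Get all the text associated with this node.
    With recursive == True, the text of all descendant nodes is included.
    With recursive == False, None is returned if the node has any
    child that is not a text or CDATA node.
    """
    L = []
    for n in node.childNodes:
        if n.nodeType in (dom.Node.TEXT_NODE, dom.Node.CDATA_SECTION_NODE):
            L.append(n.data)
        else:
            if not recursive:
                return None
            # Pass the flag on so nested elements are handled as well
            L.append(getText(n, recursive))

    return ''.join(L)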
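
To illustrate what recursive text collection does on an element with
mixed content, here is a minimal self-contained sketch. It is a variant,
not the function above: iter_text is a hypothetical helper name, and the
<p> document is made up for the example.

```python
from xml import dom
from xml.dom import minidom

def iter_text(node):
    """Yield the data of every text/CDATA node below node, in document order."""
    for child in node.childNodes:
        if child.nodeType in (dom.Node.TEXT_NODE, dom.Node.CDATA_SECTION_NODE):
            yield child.data
        else:
            # Descend into elements; nodes without children yield nothing.
            for chunk in iter_text(child):
                yield chunk

doc = minidom.parseString("<p>Hello <b>bold</b> world</p>")
mixed = ''.join(iter_text(doc.documentElement))
print(mixed)  # Hello bold world
```

A non-recursive read of the same <p> element would have to either skip
the <b> child or give up, which is why getText returns None in that case.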



>>> print(getText(element[0]))
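
For the document in the question, where <teste> holds a single text
node, the text can also be read straight from the first child. A minimal
sketch, assuming the element has exactly one child and that it is a text
node (the names doc, teste and text are only for illustration):

```python
from xml.dom import minidom

xml = """<root><teste> texto </teste></root>"""
doc = minidom.parseString(xml)

# getElementsByTagName is a method of the document (or of an element),
# not of the minidom module itself.
teste = doc.getElementsByTagName('teste')[0]

# Assumes the element has exactly one child and that it is a text node.
text = teste.firstChild.data
print(repr(text))          # ' texto '
print(repr(text.strip()))  # 'texto'
```

This shortcut breaks as soon as the element is empty (firstChild is
None) or starts with a child element, which is where a helper function
is the safer choice.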




Regards   Manlio Perillo


