Getting elements and text with lxml

Stefan Behnel stefan_ml at behnel.de
Sat May 17 11:17:29 EDT 2008


J. Pablo Fernández wrote:
> I have an XML file that starts with:
> 
> <vortaro>
> <art mrk="$Id: a.xml,v 1.10 2007/09/11 16:30:20 revo Exp $">
> <kap>
>   <ofc>*</ofc>-<rad>a</rad>
> </kap>
> 
> out of it, I'd like to extract something like (I'm just showing one
> structure, any structure as long as all data is there is fine):
> 
> [("ofc", "*"), "-", ("rad", "a")]

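    >>> from lxml import etree
    >>> root = etree.fromstring(xml)
    >>> l = []
    >>> for el in root.iter():    # or root.getiterator() in older versions
    ...     l.append((el.tag, el.text))   # tag name and the text inside the element
    ...     l.append(el.tail)             # text following the element, e.g. the "-"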

Or, if you only need the text and not the tag names, maybe this is enough:

    list(root.itertext())
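
For reference, here is a minimal sketch of how the structure asked for above could be built for the <kap> element. It assumes the excerpt is completed with its closing tags so that it parses. The helper name flatten() is made up for this illustration, and the fallback import is only there so the snippet also runs where lxml is not installed. Whitespace-only text is dropped, which may or may not be what you want for your data.

```python
try:
    from lxml import etree                  # the library discussed in this thread
except ImportError:
    import xml.etree.ElementTree as etree   # stdlib fallback offering the same calls

# Assumption: the excerpt from the question, with closing tags added
# so that it forms a complete, parseable document.
xml = """<vortaro>
<art mrk="$Id: a.xml,v 1.10 2007/09/11 16:30:20 revo Exp $">
<kap>
  <ofc>*</ofc>-<rad>a</rad>
</kap>
</art>
</vortaro>"""

root = etree.fromstring(xml)
kap = root.find(".//kap")


def flatten(parent):
    """Hypothetical helper: child (tag, text) pairs interleaved with
    the non-whitespace text found between the children."""
    items = []
    # Text before the first child lives in parent.text.
    if parent.text and parent.text.strip():
        items.append(parent.text.strip())
    for child in parent:
        items.append((child.tag, child.text))
        # Text after a child (up to the next element) lives in child.tail.
        if child.tail and child.tail.strip():
            items.append(child.tail.strip())
    return items


print(flatten(kap))
# [('ofc', '*'), '-', ('rad', 'a')]

# itertext() yields only the text pieces, in document order, without tag names.
print([t for t in kap.itertext() if t.strip()])
# ['*', '-', 'a']
```

The point to notice is that the "-" is not the text of any element: it is the .tail of <ofc>, which is why looking at .text alone misses it.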

Stefan


