converting text and spans to an ElementTree

Steven Bethard steven.bethard at gmail.com
Tue May 22 02:02:34 EDT 2007


I have some text and a list of Element objects with their (start, end)
character offsets into that text (end-exclusive, as in slicing), e.g.::

     >>> import xml.etree.ElementTree as etree
     >>> text = 'aaa aaa aaabbb bbbaaa'
     >>> spans = [
     ...     (etree.Element('a'), 0, 21),
     ...     (etree.Element('b'), 11, 18),
     ...     (etree.Element('c'), 18, 18),
     ... ]

I'd like to produce the corresponding ElementTree. So I want to write a 
get_tree() function that works like::

     >>> tree = get_tree(text, spans)
     >>> etree.tostring(tree)
     '<a>aaa aaa aaa<b>bbb bbb<c /></b>aaa</a>'

Perhaps I just need some more sleep, but I can't see an obvious way to 
do this. Any suggestions?
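
One possible approach, as a minimal sketch rather than a definitive answer: 
sort the spans so that enclosing spans come first, then walk them with a 
stack of currently open elements, putting each slice of text into either 
the open element's .text or the .tail of its last child. This assumes the 
spans nest properly (no partial overlaps) and that a single span encloses 
all the others. The helper names add_text and close are made up for 
illustration, and encoding='unicode' assumes Python 3.

```python
import xml.etree.ElementTree as etree


def get_tree(text, spans):
    """Nest the span elements and distribute `text` among them.

    Assumes properly nested spans and exactly one outermost span.
    Note that the Element objects passed in are modified in place.
    """
    # Outer spans first: start ascending, then end descending.
    ordered = sorted(spans, key=lambda span: (span[1], -span[2]))

    def add_text(elem, chunk):
        # Text goes in elem.text until elem has a child; after that it
        # belongs in the tail of elem's last child.
        if not chunk:
            return
        if len(elem):
            elem[-1].tail = (elem[-1].tail or '') + chunk
        else:
            elem.text = (elem.text or '') + chunk

    def close(entry):
        # Flush the text between the entry's cursor and its end offset.
        elem, _start, end, pos = entry
        add_text(elem, text[pos:end])

    root = None
    stack = []  # open spans, each stored as [elem, start, end, cursor]
    for elem, start, end in ordered:
        # Close every open span that does not contain the new one.
        while stack and not (stack[-1][1] <= start and end <= stack[-1][2]):
            close(stack.pop())
        if stack:
            parent = stack[-1]
            add_text(parent[0], text[parent[3]:start])
            parent[0].append(elem)
            parent[3] = end  # the child accounts for text up to its end
        elif root is None:
            root = elem
        else:
            raise ValueError('no single enclosing span')
        stack.append([elem, start, end, start])
    while stack:
        close(stack.pop())
    return root


if __name__ == '__main__':
    text = 'aaa aaa aaabbb bbbaaa'
    spans = [
        (etree.Element('a'), 0, 21),
        (etree.Element('b'), 11, 18),
        (etree.Element('c'), 18, 18),
    ]
    print(etree.tostring(get_tree(text, spans), encoding='unicode'))
```

With the containment test written as above, a zero-width span that sits 
exactly at the end of another span (like 'c' at 18 and 'b' ending at 18) 
is placed inside that span, which matches the desired output. Partial 
overlaps are not reliably detected, so they would need a separate check.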

Thanks,

STeVe


