converting text and spans to an ElementTree
Steven Bethard
steven.bethard at gmail.com
Tue May 22 14:34:56 EDT 2007
attn.steven.kuo at gmail.com wrote:
> On May 21, 11:02 pm, Steven Bethard <steven.beth... at gmail.com> wrote:
>> I have some text and a list of Element objects and their offsets, e.g.::
>>
>> >>> text = 'aaa aaa aaabbb bbbaaa'
>> >>> spans = [
>> ...     (etree.Element('a'), 0, 21),
>> ...     (etree.Element('b'), 11, 18),
>> ...     (etree.Element('c'), 18, 18),
>> ... ]
>>
>> I'd like to produce the corresponding ElementTree. So I want to write a
>> get_tree() function that works like::
>>
>> >>> tree = get_tree(text, spans)
>> >>> etree.tostring(tree)
>> '<a>aaa aaa aaa<b>bbb bbb<c /></b>aaa</a>'
>>
>> Perhaps I just need some more sleep, but I can't see an obvious way to
>> do this. Any suggestions?
>
> It seems you're looking to construct an Interval Tree:
>
> http://en.wikipedia.org/wiki/Interval_tree

No, I'm looking to construct an ElementTree from intervals. ;-) Could
you elaborate on how an Interval Tree would help me?
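
For reference, the behaviour described in the quoted question could be
sketched roughly as below. This is only an illustration under stated
assumptions, not a solution taken from this thread: it assumes etree is
xml.etree.ElementTree, that the spans nest properly (no partial overlaps)
under a single outermost span, and that an empty span sitting on a boundary
belongs inside the enclosing span that was opened before it. Only the
get_tree(text, spans) name and signature come from the question; the
internal names (ordered, nodes, stack) are made up for the example.

```python
from xml.etree import ElementTree as etree


def get_tree(text, spans):
    """Nest (element, start, end) spans and fill .text/.tail from text."""
    if not spans:
        raise ValueError("at least one span is required")
    # Outer spans first: sort by start offset, then by descending end offset.
    ordered = sorted(spans, key=lambda span: (span[1], -span[2]))
    nodes = []  # every span as [element, start, end, child_nodes]
    stack = []  # the currently open nodes, outermost first
    for elem, start, end in ordered:
        # Close every open span that ends before the new span does.
        while stack and stack[-1][2] < end:
            closed = stack.pop()
            if start < closed[2]:
                raise ValueError("spans partially overlap")
        if nodes and not stack:
            raise ValueError("spans do not share a single root")
        node = [elem, start, end, []]
        if stack:
            stack[-1][0].append(elem)   # attach to the innermost open span
            stack[-1][3].append(node)
        nodes.append(node)
        stack.append(node)
    # Text before the first child is .text; text after a child is its .tail.
    for elem, start, end, kids in nodes:
        pos, prev = start, None
        for kid, kid_start, kid_end, _ in kids:
            piece = text[pos:kid_start] or None
            if prev is None:
                elem.text = piece
            else:
                prev.tail = piece
            pos, prev = kid_end, kid
        piece = text[pos:end] or None
        if prev is None:
            elem.text = piece
        else:
            prev.tail = piece
    return nodes[0][0]


text = 'aaa aaa aaabbb bbbaaa'
spans = [
    (etree.Element('a'), 0, 21),
    (etree.Element('b'), 11, 18),
    (etree.Element('c'), 18, 18),
]
# encoding='unicode' asks tostring() for a str rather than bytes.
result = etree.tostring(get_tree(text, spans), encoding='unicode')
print(result)
```

Sorting by start and then by descending end means a simple stack is enough
to find each span's parent, so no interval tree is needed for properly
nested input. Empty spans on a shared boundary remain ambiguous; the sketch
simply places them in whichever enclosing span is still open.
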
STeVe