inserting bracketings into a string
Michael Loritsch
loritsch at gmail.com
Wed Nov 17 02:22:47 EST 2004
Steven Bethard <steven.bethard at gmail.com> wrote in message news:<mailman.6460.1100648300.5135.python-list at python.org>...
> I'm trying to insert some bracketings in a string based on a set of
> labels and associated start and end indices. For example, I'd like to
> do something like:
>
> >>> text = 'abcde fgh ijklmnop qrstu vw xyz'
> >>> spans = [('A', 0, 9), ('B', 6, 9), ('C', 25, 31)]
> >>> insert_bracketings(text, spans)
> '[A abcde [B fgh]] ijklmnop qrstu [C vw xyz]'
>
> My current implementation looks like:
>
> >>> def insert_bracketings(text, spans):
> ...     starts = [start for _, start, _ in spans]
> ...     ends = [end for _, _, end in spans]
> ...     indices = sorted(set(starts + ends))
> ...     splits = [(text[start:end], start, end)
> ...               for start, end in zip([None] + indices, indices + [None])]
> ...     start_map, end_map = {}, {}
> ...     for label, start, end in spans:
> ...         start_map.setdefault(start, []).append('[%s ' % label)
> ...         end_map.setdefault(end, []).append(']')
> ...     result = []
> ...     for string, start, end in splits:
> ...         if start in start_map:
> ...             result.extend(start_map[start])
> ...         result.append(string)
> ...         if end in end_map:
> ...             result.extend(end_map[end])
> ...     return ''.join(result)
> ...
>
> but it seems like there ought to be an easier way. Can anyone help me?
>
> Thanks in advance,
>
> Steve
Below is a more compact implementation that produces the same result
for your example. I'm not entirely sure it qualifies as 'better', but
I do believe it is easier to read.
def insert_brackets(text, spans):
    brackets = []
    for span in spans:
        brackets.append((span[1], ("".join(('[', span[0], " ")))))
        brackets.append((span[2], ']'))
    brackets.sort() #Note: (n, '[X ') < (n, ']')
    answer = []
    lastIndex = 0
    for bracket in brackets:
        if lastIndex == bracket[0]: #Repeated index
            answer.append(bracket[1])
        else: #Non repeated index
            answer.extend((text[lastIndex:bracket[0]], bracket[1]))
            lastIndex = bracket[0]
    answer.append(text[lastIndex:]) #Text after the last bracket
    return "".join(answer)
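One caveat with sorting (index, string) pairs: the order of brackets
at a shared index falls out of plain string comparison, so when one
span ends exactly where the next begins, the opener '[B ' sorts ahead
of the closer ']'. Here is a minimal sketch of a variant that makes the
tie-breaking explicit, assuming spans are non-empty (start < end) and
properly nested. The name insert_brackets_ordered and the layout of the
event tuples are hypothetical, not something from this thread.

```python
def insert_brackets_ordered(text, spans):
    """Hypothetical variant with explicit ordering at shared indices.

    Assumes every span has start < end and that spans nest properly.
    """
    events = []
    for label, start, end in spans:
        # Kind 0 (closer) sorts before kind 1 (opener) at the same index.
        # Among closers at one index, the span that started later
        # (the inner one) gets the smaller key and closes first.
        events.append((end, 0, -start, ']'))
        # Among openers at one index, the span that ends later
        # (the outer one) gets the smaller key and opens first.
        events.append((start, 1, -end, '[%s ' % label))
    events.sort()

    pieces = []
    last = 0
    for index, _kind, _key, marker in events:
        pieces.append(text[last:index])  # empty when the index repeats
        pieces.append(marker)
        last = index
    pieces.append(text[last:])  # text after the final bracket
    return ''.join(pieces)


text = 'abcde fgh ijklmnop qrstu vw xyz'
spans = [('A', 0, 9), ('B', 6, 9), ('C', 25, 31)]
print(insert_brackets_ordered(text, spans))
# prints: [A abcde [B fgh]] ijklmnop qrstu [C vw xyz]
```

With adjacent spans such as [('A', 0, 4), ('B', 4, 8)] over 'abcdefgh',
this should give '[A abcd][B efgh]'. Zero-length spans are not handled.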
Regards,
Michael Loritsch