finding indices in a sequence of parentheses
Steven Bethard
steven.bethard at gmail.com
Sun May 29 16:32:14 EDT 2005
I have a list of strings that looks something like:
lst = ['0', '0', '(*)', 'O', '(*', '*', '(*', '*))', '((*', '*)', '*)']
The parentheses in the labels indicate where an "annotation" starts and
ends. So for example, the label '(*)' at index 2 of the list means that
I have an annotation at (2, 2), and the labels '(*', '*', '(*', '*))' at
indices 4 through 7 mean that I have an annotation at (4, 7) and an
annotation at (6, 7).
I'd like to determine all indices at which I have an annotation. So for
the data above, I want the indices:
(2, 2), (4, 7), (6, 7), (8, 9) and (8, 10)
Here's what I'm doing now:
py> def indices(lst):
... stack = []
... for i, s in enumerate(lst):
... if s == 'O':
... continue
... stack.extend([i]*s.count('('))
... if '*' in s and not stack:
... raise Exception('No start for %r at %i' % (s, i))
... for _ in range(s.count(')')):
... try:
... yield stack.pop(), i
... except IndexError:
... raise Exception('No start for %r at %i' % (s, i))
... if stack:
... raise Exception('No ends for starts at %r' % stack)
...
py> list(indices(['0', '0', '(*)', 'O', '(*', '*', '(*', '*))', '((*',
'*)', '*)', '0']))
[(2, 2), (6, 7), (4, 7), (8, 9), (8, 10)]
I think that works right, but I'm not certain. So two questions:
(1) Can anyone see anything wrong with the code above? and
(2) Does anyone see an easier/clearer/simpler[1] way of doing this?
Thanks,
STeVe
[1] Yes, I know easier/clearer/simpler are subjective terms. It's okay,
I'm only looking for opinions here anyway. =)
More information about the Python-list
mailing list