inserting bracketings into a string

Andrew A. Cox Andrew at ferndale.net.nz
Tue Nov 16 19:49:11 EST 2004


Steve, here is my way of doing it (see code below). It is more or less
the same approach you were using, but a little cleaner (I hope).

#END_TYPE is the smaller value, so at a shared location ends sort before
#starts. Swap the two values to put starts before ends instead.
START_TYPE = 2
END_TYPE = 1

def insert_bracketings(initialText, spans):
	ans = ""
	parts = []
	for span in spans:
		#For every span there is a start part and an end part
		parts.append( (span[1], START_TYPE, span[0]) )
		parts.append( (span[2], END_TYPE, span[0]) )
	parts.sort() #List will be sorted by location, type, then letter
	lastlocation = 0
	for part in parts:
		addString = ""
		if part[1] == START_TYPE:
			addString = "[%s " % part[2]
		elif part[1] == END_TYPE:
			addString = "]"
		else:
			pass #Maybe raise error??
		#Could use a list here (maybe faster)
		ans += initialText[lastlocation:part[0]] + addString
		lastlocation = part[0]
	#Append whatever text follows the last bracket
	ans += initialText[lastlocation:]
	return ans

text = 'abcde fgh ijklmnop qrstu vw xyz'
print(insert_bracketings(text, [('A', 0, 9), ('B', 6, 9), ('C', 25, 31)]))
print('[A abcde [B fgh]] ijklmnop qrstu [C vw xyz]')
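
For what it's worth, the "could use a list here" comment in the function can be sketched roughly as follows. This is only an illustration of the same sort-based idea with the pieces collected in a list and joined once at the end; the name insert_bracketings_joined and the END/START constants are made up for this example, and I have not timed it against the string-concatenation version.

```python
# Sketch only: same sort-based approach, but the output pieces are
# gathered in a list and joined once. All names here are illustrative.
END, START = 0, 1  # END < START, so at a shared index "]" sorts before "[X "

def insert_bracketings_joined(text, spans):
    # Turn each (label, start, end) span into two sortable events.
    events = []
    for label, start, end in spans:
        events.append((start, START, label))
        events.append((end, END, label))
    events.sort()  # by index, then kind, then label

    pieces = []
    last = 0
    for index, kind, label in events:
        pieces.append(text[last:index])  # text since the previous event
        pieces.append("[%s " % label if kind == START else "]")
        last = index
    pieces.append(text[last:])  # text after the final bracket
    return "".join(pieces)

text = 'abcde fgh ijklmnop qrstu vw xyz'
spans = [('A', 0, 9), ('B', 6, 9), ('C', 25, 31)]
print(insert_bracketings_joined(text, spans))
```

Like the version above, this assumes the spans are properly nested or disjoint; it makes no attempt to detect spans that cross each other.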

-----Original Message-----
From: Steven Bethard [mailto:steven.bethard at gmail.com] 
Sent: Wednesday, 17 November 2004 12:38 p.m.
To: python-list at python.org
Subject: inserting bracketings into a string

I'm trying to insert some bracketings in a string based on a set of
labels and associated start and end indices.  For example, I'd like to
do something like:

>>> text = 'abcde fgh ijklmnop qrstu vw xyz'
>>> spans = [('A', 0, 9), ('B', 6, 9), ('C', 25, 31)]
>>> insert_bracketings(text, spans)
'[A abcde [B fgh]] ijklmnop qrstu [C vw xyz]'

My current implementation looks like:

>>> def insert_bracketings(text, spans):
... 	starts = [start for _, start, _ in spans]
... 	ends = [end for _, _, end in spans]
... 	indices = sorted(set(starts + ends))
... 	splits = [(text[start:end], start, end)
... 	          for start, end in zip([None] + indices, indices + [None])]
... 	start_map, end_map = {}, {}
... 	for label, start, end in spans:
... 		start_map.setdefault(start, []).append('[%s ' % label)
... 		end_map.setdefault(end, []).append(']')
... 	result = []	
... 	for string, start, end in splits:
... 		if start in start_map:
... 			result.extend(start_map[start])
... 		result.append(string)
... 		if end in end_map:
... 			result.extend(end_map[end])
... 	return ''.join(result)
... 

but it seems like there ought to be an easier way.  Can anyone help me?

Thanks in advance,

Steve
-- 
When you're being strangled, everything you do is anaerobic exercise!
        --- Adam Olshefsky