a RegEx puzzle
Kent Johnson
kent37 at tds.net
Fri Mar 11 13:06:46 EST 2005
Charles Hartman wrote:
> I'm still shaky on some of sre's syntax. Here's the task: I've got
> strings (never longer than about a dozen characters) that are guaranteed
> to be made only of characters 'x' and '/'. In each string I want to find
> the longest continuous stretch of pairs whose first character is 'x' and
> the second is either mark. So if marks = '/xx/xxx///', the "winning"
> stretch begins at position 2 and is 6 characters long ('x/xxx/'), which
> requires finding a second match that overlaps the first match (which is
> just 'xx' in position 1). (When there are multiple "winning" stretches,
> I might want to adjudicate among them, but that's a separate problem.) I
> hope this is clear enough.
>
> Here's the best I've come up with so far:
>
> pat = sre.compile('(x[x/])+')
> (longest, startlongest) = max([(fnd.end()-fnd.start(), fnd.start()) for
> i in range(len(marks))
> for fnd in pat.finditer(marks,i)])
It's pretty simple to put re.search() into a loop where subsequent searches start from the character
after where the previous one matched. Here is a solution that uses a general-purpose longest match
function:
import re
# RE solution
def longestMatch(rx, s):
''' Find the longest match for rx in s.
Returns (start, length) for the match or (None, None) if no match found.
'''
start = length = current = 0
while True:
m = rx.search(s, current)
if not m:
break
mStart, mEnd = m.span()
current = mStart + 1
if (mEnd - mStart) > length:
start = mStart
length = mEnd - mStart
if length:
return start, length
return None, None
pairsRe = re.compile(r'(x[x/])+')
for s in [ '/xx/xxx///', '//////xx//' ]:
print s, longestMatch(pairsRe, s)
More information about the Python-list
mailing list