How to get the "longest possible" match with Python's RE module?

Bryan Olson fakeaddress at nowhere.org
Tue Sep 12 21:29:02 EDT 2006


Licheng Fang wrote:
> Basically, the problem is this:
> 
> >>> p = re.compile("do|dolittle")
> >>> p.match("dolittle").group()
> 'do'
> 
> Python's NFA regexp engine tries only the first option, and happily
> rests on that. There's another example:
> 
> >>> p = re.compile("one(self)?(selfsufficient)?")
> >>> p.match("oneselfsufficient").group()
> 'oneself'
> 
> The Python regular expression engine doesn't exhaust all the
> possibilities, but in my application I hope to get the longest possible
> match, starting from a given point.
> 
> Is there a way to do this in Python?

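Yes. Here's a way, but it sucks real bad. It anchors the pattern at the
end of the text and keeps chopping off the last character until the
pattern matches, so it can run the regexp once per character: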


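     import re

     def longest_match(re_string, text):
         # Anchor at the very end so a match must consume all of text.
         # \Z is used instead of $, which also matches before a trailing newline.
         regexp = re.compile('(?:' + re_string + r')\Z')
         # Try the whole text first, then ever-shorter prefixes.
         while text:
             m = regexp.match(text)
             if m:
                 return m
             text = text[:-1]
         # text is now empty; the pattern may still match the empty string.
         return regexp.match(text)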
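
The same idea can be sketched without slicing the text, assuming Python 3.4 or later, where compiled patterns have a fullmatch() method that takes pos and endpos arguments. The name longest_prefix_match is made up for this illustration. It is not any faster in principle, since it still retries the regexp once per candidate end position:

```python
import re

def longest_prefix_match(pattern, text):
    """Return a match for the longest prefix of text that pattern matches, or None."""
    regexp = re.compile(pattern)
    # Walk the end position from len(text) down to 0; the first hit is the longest.
    for end in range(len(text), -1, -1):
        # fullmatch succeeds only if the pattern spans exactly text[0:end].
        m = regexp.fullmatch(text, 0, end)
        if m:
            return m
    return None

print(longest_prefix_match("do|dolittle", "dolittle").group())
# dolittle
print(longest_prefix_match("one(self)?(selfsufficient)?", "oneselfsufficient").group())
# oneselfsufficient
```

Because the whole candidate prefix has to match, the engine is forced to backtrack into the later alternatives instead of stopping at the first one that succeeds.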


-- 
--Bryan


