How to get the "longest possible" match with Python's RE module?

MonkeeSage MonkeeSage at gmail.com
Tue Sep 12 02:04:48 EDT 2006


Licheng Fang wrote:
> Oh, please do have a look at the second link I've posted. There's a
> table comparing the regexp engines. The engines you've tested probably
> all use an NFA implementation.

Sorry! *blush* I admit I skipped over your links. I'll have a look now.

BTW, just an idea that may or may not work. What about finding all
matches that meet the absolute baseline pattern, then taking the
longest of them...something like this maybe:

import re

def matcher(strings, pattern):
  out = ''
  reg = re.compile(pattern)
  for string in strings:
    found = reg.match(string)
    if found is None: # skip strings that don't match at all
      continue
    match = found.group()
    if len(match) >= len(out): # should this use > or >= ?
      out = match
  return out # empty is no matches, else longest match

p = ['dodad', 'dolittle', 'dodaday']
print(matcher(p, r'do.*'))
# dolittle
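
The same idea might also be turned around for a single string: instead of one pattern against many strings, try each alternative as its own pattern and keep the longest result. This is only a rough sketch; the name longest_match and the sample alternatives are made up for illustration, and it assumes the alternatives can be listed separately rather than buried inside one big pattern.

```python
import re

def longest_match(alternatives, string):
  # Try every alternative on its own and remember the longest matched text.
  best = ''
  for alt in alternatives:
    found = re.match(alt, string)
    if found is not None and len(found.group()) > len(best):
      best = found.group()
  return best # empty if nothing matched

# A plain alternation stops at the first alternative that succeeds
print(re.match(r'do|dolittle', 'dolittle').group())
# do

# Trying the alternatives one by one picks the longest
print(longest_match([r'do', r'dolittle'], 'dolittle'))
# dolittle
```

This costs one match attempt per alternative, so it probably only makes sense for a small number of them.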

Just a thought...

Regards,
Jordan



