Pattern matching with string and list

Michael Spencer mahs at telcopartners.com
Mon Dec 12 19:09:52 EST 2005


olaufr at gmail.com wrote:
> Hi,
> 
> I'd need to perform simple pattern matching within a string using a
> list of possible patterns. For example, I want to know if the substring
> starting at position n matches any of the string I have a list, as
> below:
> 
> sentence = "the color is $red"
> patterns = ["blue","red","yellow"]
> pos = sentence.find($)
> # here I need to find whether what's after 'pos' matches any of the
> strings of my 'patterns' list
> bmatch = ismatching( sentence[pos:], patterns)
> 
> Is an equivalent of this ismatching() function existing in some Python
> lib?
> 
> Thanks,
> 
> Olivier.
> 
As I think you define it, ismatching can be written as:

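  >>> import re
  >>> sentence = "the color is $red"
  >>> patterns = ["blue", "red", "yellow"]
  >>> pos = sentence.find("$")
  >>> def ismatching(sentence, patterns):
  ...     # \Z anchors at the end: the whole string must be one of the patterns
  ...     re_pattern = re.compile(r"(%s)\Z" % "|".join(patterns))
  ...     return bool(re_pattern.match(sentence))
  ...
  >>> ismatching(sentence[pos+1:], patterns)
  True
  >>> ismatching(sentence[pos+1:], ["green", "blue"])
  False
  >>>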
(For help with regular expressions, see: http://www.amk.ca/python/howto/regex/)
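
One caveat with joining the patterns with "|": each entry is then read as a
regular expression, so a pattern containing characters such as "+" or "." would
not be matched literally. A minimal sketch, assuming the patterns are meant as
plain strings, escapes them with re.escape first (the name ismatching_literal
is made up for this example):

```python
import re

def ismatching_literal(sentence, patterns):
    # re.escape makes each pattern match literally, even if it
    # contains regexp metacharacters such as "+" or "."
    alternatives = "|".join(re.escape(p) for p in patterns)
    re_pattern = re.compile(r"(?:%s)\Z" % alternatives)
    return bool(re_pattern.match(sentence))

print(ismatching_literal("c++", ["c++", "java"]))   # True
print(ismatching_literal("c", ["c++", "java"]))     # False
```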


Or you can ask the regexp engine to start looking at a position you specify:

  >>> def ismatching(sentence, patterns, startingpos = 0):
  ...     re_pattern = re.compile(r"(%s)\Z" % "|".join(patterns))
  ...     return bool(re_pattern.match(sentence, startingpos))
  ...
  >>> ismatching(sentence, patterns, pos+1)
  True
  >>>
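
If the patterns are plain strings and a prefix test is enough, a similar check
can be sketched without the re module: in current Python versions,
str.startswith accepts a tuple of candidates and an optional start position.
This rests on an assumption about what is wanted: unlike the \Z versions, it
does not require the candidate to run to the end of the sentence. The name
startswith_any is hypothetical.

```python
def startswith_any(sentence, patterns, startingpos=0):
    # str.startswith takes a tuple of prefixes and a start offset
    return sentence.startswith(tuple(patterns), startingpos)

sentence = "the color is $red"
patterns = ["blue", "red", "yellow"]
pos = sentence.find("$")

print(startswith_any(sentence, patterns, pos + 1))            # True
print(startswith_any(sentence, ["green", "blue"], pos + 1))   # False
```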


But you may be able to skip the separate step of determining pos by including
the "$" in the regexp, e.g.:

  >>> def matching(patterns, sentence):
  ...     re_pattern = re.compile(r"\$(%s)" % "|".join(patterns))
  ...     return bool(re_pattern.search(sentence))
  ...
  >>> matching(patterns, sentence)
  True
  >>> matching(["green", "blue"], sentence)
  False
  >>>

It might then be more generally useful to return the match object rather than
the boolean value. You can still use it in truth testing, since search returns
None when there is no match, and None evaluates to False:

  >>> def matching(patterns, sentence):
  ...     re_pattern = re.compile(r"\$(%s)" % "|".join(patterns))
  ...     return re_pattern.search(sentence)
  ...
  >>> if matching(patterns, sentence): print("Match")
  ...
  Match
  >>>
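
Returning the match object also lets you find out which pattern matched and
where. A small sketch along those lines, using the same matching function and
the example data from the question:

```python
import re

def matching(patterns, sentence):
    re_pattern = re.compile(r"\$(%s)" % "|".join(patterns))
    return re_pattern.search(sentence)

m = matching(["blue", "red", "yellow"], "the color is $red")
if m:
    print(m.group(1))   # the pattern that matched: red
    print(m.start())    # index of the "$": 13
```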


Finally, if you are going to be doing a lot of these, it would be faster to take
the pattern compilation out of the function and simply use the pre-compiled
regexp or, as below, its bound search method (this version also adds \Z, so the
match must come at the end of the sentence):

  >>> matching = re.compile(r"\$(%s)\Z" % "|".join(patterns)).search
  >>> matching(sentence)
  <_sre.SRE_Match object at 0x01847E60>
  >>> bool(_)
  True
  >>> bool(matching("the color is $red but there is more"))
  False
  >>> bool(matching("the color is $pink"))
  False
  >>> bool(matching("the $color is $red"))
  True
  >>>
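
To illustrate the reuse, here is a rough sketch that compiles once and then
filters several sentences with the bound method. The helper name make_matcher
and the sample sentences are invented for the example:

```python
import re

def make_matcher(patterns):
    # Compile once and hand back the bound search method for reuse
    return re.compile(r"\$(%s)\Z" % "|".join(patterns)).search

matching = make_matcher(["blue", "red", "yellow"])

sentences = [
    "the color is $red",
    "the color is $pink",
    "the $color is $blue",
]
# Keep only the sentences that end in "$" plus one of the patterns
hits = [s for s in sentences if matching(s)]
print(hits)
```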

HTH

Michael






