Pattern matching with string and list

Tom Anderson twic at urchin.earth.li
Mon Dec 12 19:42:30 EST 2005


On Mon, 12 Dec 2005 olaufr at gmail.com wrote:

> I'd need to perform simple pattern matching within a string using a list 
> of possible patterns. For example, I want to know if the substring 
> starting at position n matches any of the string I have a list, as 
> below:
>
> sentence = "the color is $red"
> patterns = ["blue","red","yellow"]
> pos = sentence.find($)

I assume that's a typo for "sentence.find('$')", rather than some new 
syntax i've not learned yet!

> # here I need to find whether what's after 'pos' matches any of the
> strings of my 'patterns' list
> bmatch = ismatching( sentence[pos:], patterns)
>
> Is an equivalent of this ismatching() function existing in some Python
> lib?

I don't think so, but it's not hard to write:

def ismatching(target, patterns):
    for pattern in patterns:
        if target.startswith(pattern):
            return True
    return False

You don't say what bmatch should be at the end of this, so i'm going with 
a boolean; it would be straightforward to return the pattern which 
matched, or the index of the pattern which matched in the pattern list, if 
that's what you want.
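As a rough sketch of those two variants (the names first_match and first_match_index are made up here, and None and -1 are just one possible choice for "nothing matched"):

```python
def first_match(target, patterns):
    """Return the first pattern that target starts with, or None."""
    for pattern in patterns:
        if target.startswith(pattern):
            return pattern
    return None


def first_match_index(target, patterns):
    """Return the index of the first pattern that target starts with, or -1."""
    for index, pattern in enumerate(patterns):
        if target.startswith(pattern):
            return index
    return -1


sentence = "the color is $red"
patterns = ["blue", "red", "yellow"]
pos = sentence.find("$")
# sentence[pos:] would still begin with the "$", so slice from pos + 1;
# find() returns -1 when there is no "$" at all.
colour = first_match(sentence[pos + 1:], patterns) if pos != -1 else None
```

Note that the slice starts one character past the '$'; otherwise the target begins with '$' and none of the patterns can match. On Python 2.5 and later, startswith also accepts a tuple of prefixes, so the plain boolean test can be written as target.startswith(tuple(patterns)).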

The tough guy way to do this would be with regular expressions (in the re 
module); you could do the find-the-$ and the match-a-pattern bit in one 
go:

import re
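# the outer parentheses keep the \$ attached to every alternative;
# without them, | binds loosest and a bare "red" or "yellow" would match too
patternsRe = re.compile(r"\$((blue)|(red)|(yellow))")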
bmatch = patternsRe.search(sentence)

At the end, bmatch is None if it didn't match, or an instance of re.Match 
(from which you can get details of the match) if it did.
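To make "details of the match" a little more concrete, here is a small sketch that pulls the matched text and its offsets out of the match object. It uses a pattern with the alternatives grouped explicitly after the '$'; the variable names are only illustrative.

```python
import re

sentence = "the color is $red"
patterns_re = re.compile(r"\$((blue)|(red)|(yellow))")

bmatch = patterns_re.search(sentence)
if bmatch is None:
    details = None
else:
    # group(0) is the whole matched text, including the "$";
    # start() and end() are its offsets within the sentence.
    details = (bmatch.group(0), bmatch.start(), bmatch.end())
```

Because search() returns None on failure, a plain "if bmatch:" also works as the boolean test.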

If i was doing this myself, i'd be a bit cleaner and use non-capturing 
groups:

patternsRe = re.compile(r"\$(?:(?:blue)|(?:red)|(?:yellow))")

And if i did want to capture the colour string, i'd do it like this:

patternsRe = re.compile(r"\$((?:blue)|(?:red)|(?:yellow))")
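With the single capturing group around the alternatives, the colour comes back from group(1). A minimal sketch, where find_colour is a hypothetical helper name and not anything from the re module:

```python
import re

patterns_re = re.compile(r"\$((?:blue)|(?:red)|(?:yellow))")


def find_colour(text):
    """Return the first known colour that directly follows a "$", or None."""
    match = patterns_re.search(text)
    if match is None:
        return None
    # Group 1 is the outer pair of parentheses, so the "$" is not included.
    return match.group(1)


colour = find_colour("the color is $red")
```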

If this all looks like utter gibberish, DON'T PANIC! Regular expressions 
are quite scary to begin with (and certainly not very regular-looking!), 
but they're actually quite simple, and often a very powerful tool for text 
processing (don't get carried away, though; regular expressions are a bit 
like absinthe, in that a little helps your creativity, but overindulgence 
makes you use perl).

In fact, we can tame the regular expressions quite neatly by writing a 
function which generates them:

def regularly_express_patterns(patterns):
    pattern_regexps = map(
        lambda pattern: "(?:%s)" % re.escape(pattern),
        patterns)
    regexp = r"\$(" + "|".join(pattern_regexps) + ")"
    return re.compile(regexp)

patternsRe = regularly_express_patterns(patterns)
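The re.escape call is what makes this safe for patterns containing characters that are special in regular expressions. A sketch of how that plays out, with the function restated as a list comprehension so the example stands alone; the patterns "c++" and "a.b" are made up purely to exercise the escaping:

```python
import re


def regularly_express_patterns(patterns):
    # Escape each pattern, join the alternatives with "|", and wrap them
    # in one capturing group that must directly follow a literal "$".
    alternatives = ["(?:%s)" % re.escape(pattern) for pattern in patterns]
    return re.compile(r"\$(" + "|".join(alternatives) + ")")


# Hypothetical patterns containing regex metacharacters ("+" and ".").
patterns_re = regularly_express_patterns(["blue", "c++", "a.b"])

match = patterns_re.search("the language is $c++")
language = match.group(1) if match else None
```

Without the escaping, "c++" would not even compile as a regular expression, and the "." in "a.b" would match any character rather than a literal dot.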

tom

-- 
limited to concepts that are meta, generic, abstract and philosophical --
IEEE SUO WG


