How can I exclude a word by using re?

Jeff Schwab jeffrey.schwab at rcn.com
Sun Aug 14 11:24:33 EDT 2005


could ildg wrote:
> In re, the "^" character can exclude a single character, but I want
> to exclude a whole word now. For example, I have a string "hi, how are
> you. hello", and I want to extract everything before the word "hello".
> I can't use ".*[^hello]" because "^" only excludes the single characters
> "h", "e", "l", or "o". Will somebody tell me how to do it? Thanks.

import re

def demonstrate(regex, text):
	# Search text for regex and report the whole match and the first group.
	pattern = re.compile(regex)
	match = pattern.search(text)

	print("  " + text)
	if match:
		print("    Matched  '%s'" % match.group(0))
		print("    Captured '%s'" % match.group(1))
	else:
		print("    Did not match")

# Option 1: Match it all, but capture only the part before "hello".  The
# (.*?) matches as few characters as possible, so this pattern ends
# before the first hello in "hello hello".

pattern = r"(.*?)hello"
print("Option 1: " + pattern)
demonstrate( pattern, "hi, how are you. hello" )

# Option 2: Don't even match the "hello", but make sure it's there.
# The first of these calls will match, but the second will not.  The
# (?=...) construct is a positive lookahead assertion.  Note that (.*) is
# greedy, so this pattern captures up to the last "hello" in the text.

pattern = r"(.*)(?=hello)"
print("\nOption 2: " + pattern)
demonstrate( pattern, "hi, how are you. hello" )
demonstrate( pattern, "hi, how are you. " )
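
The two options differ once "hello" appears more than once in the text. The sketch below compares them on a sample string of my own choosing (it is not from the original question), and adds two variants that are not in the post above: a lazy quantifier combined with the lookahead, and a negative lookahead (?!...) applied at every character, which is roughly the whole-word analogue of a [^...] class. Treat it as an illustration under those assumptions rather than the one right answer.

```python
import re

# Sample text chosen for illustration; it contains "hello" twice.
text = "hi hello, how are you. hello"

# Option 1: lazy (.*?) stops before the first "hello".
lazy = re.search(r"(.*?)hello", text).group(1)

# Option 2: greedy (.*) backtracks only as far as the last "hello".
greedy = re.search(r"(.*)(?=hello)", text).group(1)

# Variant: lazy quantifier plus lookahead stops before the first "hello"
# without consuming it.
lazy_lookahead = re.search(r"(.*?)(?=hello)", text).group(1)

# Variant: consume one character at a time, but only while "hello" does
# not start at the current position.
excluded = re.match(r"((?:(?!hello).)*)", text).group(1)

print(repr(lazy))            # 'hi '
print(repr(greedy))          # 'hi hello, how are you. '
print(repr(lazy_lookahead))  # 'hi '
print(repr(excluded))        # 'hi '
```

Unlike the first three patterns, the negative-lookahead variant still matches when "hello" is absent; it then captures the whole string (up to the first newline, since "." does not match newlines by default).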


