re

Russell Blau russblau at hotmail.com
Wed Jun 4 13:02:51 EDT 2008


"Diez B. Roggisch" <deets at nospam.web.de> wrote in message 
news:6anvi4F38ei08U1 at mid.uni-berlin.de...
> David C. Ullrich schrieb:
>> Say I want to replace 'disc' with 'disk', but only
>> when 'disc' is a complete word (don't want to change
>> 'discuss' to 'diskuss'.) The following seems almost
>> right:
>>
>>   [^a-zA-Z])disc[^a-zA-Z]
>>
>> The problem is that that doesn't match if 'disc' is at
>> the start or end of the string. Of course I could just
>> combine a few re's with |, but it seems like there should
>> (or might?) be a way to simply append a \A to the first
>> [^a-zA-Z] and a \Z to the second.
>
> Why not
>
> ($|[\w])disc(^|[^\w])
>
> I hope \w is really the literal for whitespace - might be something 
> different, see the docs.

No, \s is the escape for whitespace; \w matches "word" characters
(alphanumerics and _).
http://www.python.org/doc/current/lib/re-syntax.html
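
To make the difference concrete, here is a small sketch. The sample string
and the variable names are made up for illustration; only re.findall is from
the standard library.

```python
import re

# Hypothetical sample text: a space, a tab, and an underscore in one word.
sample = "a disc\tdrive_2"

# \w+ collects runs of word characters (letters, digits, underscore).
word_runs = re.findall(r"\w+", sample)

# \s+ collects runs of whitespace (here one space and one tab).
space_runs = re.findall(r"\s+", sample)

print(word_runs)   # ['a', 'disc', 'drive_2']
print(space_runs)  # [' ', '\t']
```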

But how about:

text = re.sub(r"\bdisc\b", "disk", text_to_be_changed)

\b is the word-boundary assertion. It consumes no characters; it matches the
empty string at the beginning or end of any "word" (where a word is any
sequence of \w characters, and \w is any alphanumeric character or _).
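
A quick way to check how \b behaves at the edges of a string and next to
punctuation is to run the substitution over a few inputs. The sample strings
below are made up for illustration:

```python
import re

# Hypothetical test inputs covering start/end of string, punctuation,
# a longer word containing "disc", and an underscore (which counts as \w).
samples = [
    "disc",
    "a disc drive",
    "discuss the disc",
    "compact-disc",
    "disc_1",
]

fixed = [re.sub(r"\bdisc\b", "disk", s) for s in samples]

for before, after in zip(samples, fixed):
    print(repr(before), "->", repr(after))
```

Since _ is a \w character, "disc_1" is left alone, which may or may not be
what you want.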

Note that this solution still doesn't catch "Disc" if it is capitalized.
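
If capitalized forms matter, one possible approach (a sketch, not the only
way) is to match case-insensitively and swap only the final letter, keeping
its case. The helper name disc_to_disk is hypothetical, and the flags keyword
of re.sub assumes a reasonably recent Python (2.7 or 3.1 and later).

```python
import re

def disc_to_disk(text):
    """Replace the whole word 'disc' with 'disk', preserving letter case."""
    def swap_last_letter(match):
        # group(1) is "dis" in whatever case it appeared; group(2) is the "c".
        k = "K" if match.group(2).isupper() else "k"
        return match.group(1) + k

    return re.sub(r"\b(dis)(c)\b", swap_last_letter, text,
                  flags=re.IGNORECASE)

print(disc_to_disk("Disc and disc and DISC, discuss"))
```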

Russ





