I can't understand re.sub

Jussi Piitulainen harvesting at is.invalid
Tue Dec 1 00:28:33 EST 2015


Erik writes:
> On 30/11/15 08:51, Jussi Piitulainen wrote:
[- -]
>> If you wish to,
>> say, replace "spam" in your foo with "REDACTED" but leave it intact in
>> "May the spammer be prosecuted", a regex might be attractive after all.
>
> But that's not what the OP said they wanted to do. They said
> everything was very fixed - they did not want a general purpose human
> language text processing solution ... ;)

Language processing is not what I had in mind here. I only meant that
the match should be delimited by some sort of word boundary, be it
punctuation, whitespace, or an end of the string:

   >>> import re
   >>> re.sub(r'\bspam\b', '****', 'spamalot spam')
   'spamalot ****'

That's not perfect either, but it's simple and might be somewhat
proportional to the problem.
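
To make "not perfect" a bit more concrete, here is a small sketch of
where \b does and does not see a boundary. The sample strings are made
up for illustration and are not from the original question:

```python
import re

pattern = re.compile(r'\bspam\b')

# Hypothetical inputs, chosen to show where \b draws its boundaries.
samples = [
    'spamalot spam',   # 'spamalot' is one word, so only the second 'spam' matches
    'spam, eggs',      # punctuation counts as a boundary
    'spam_filter',     # '_' is a word character: no boundary, no match
    'anti-spam',       # '-' is not a word character: boundary, so it matches
    'Spam',            # matching is case-sensitive unless re.IGNORECASE is given
]

results = [pattern.sub('****', s) for s in samples]
for before, after in zip(samples, results):
    print(f'{before!r} -> {after!r}')
```

Whether 'spam_filter' and 'anti-spam' are handled the right way round
depends entirely on what the data is supposed to mean, which is the
point of the caveat.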

A real solution should be aware of the actual structure of those lines,
assuming they follow some defined syntax.
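
As a rough sketch of what "aware of the structure" could look like,
suppose (purely as an assumption, since the actual line format was not
given here) that each line is a ';'-separated list of key=value items.
The function name redact_field and the field names are hypothetical:

```python
def redact_field(line, field, replacement='REDACTED', sep=';'):
    """Replace the value of `field` in a 'key=value;key=value' line.

    The line format is an assumption made for illustration only.
    """
    parts = []
    for item in line.split(sep):
        key, eq, value = item.partition('=')
        # Only touch items that really are key=value with the wanted key.
        if eq and key.strip() == field:
            item = f'{key}={replacement}'
        parts.append(item)
    return sep.join(parts)


line = 'user=spam;note=May the spammer be prosecuted'
redacted = redact_field(line, 'user')
print(redacted)
```

Here the replacement is driven by which field the text sits in, not by
what the text looks like, so "spammer" in the note is never a candidate
in the first place.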


