Regex to change multiple lines

Chris Angelico rosuav at gmail.com
Thu Sep 3 10:39:00 EDT 2020


On Fri, Sep 4, 2020 at 12:16 AM Termoregolato <waste at is.invalid> wrote:
>
> Hi. I have a file containing multiple lines, and I need to change every
> occurrence of a sequence between two markers, in this case "%%".
>
> -- original
> This is the %%text that i must modify%%, on a line, %%but also
> on the others%% that are following
>
> I need to change to
>
> -- desired
> This is the <del>text that i must modify</del>, on a line, <del>but also
> on the others</del> that are following
>
> I've tried this small piece of code
>
> rebarredtext = r'~~([^>]*)~~'

The trouble here is that the group in the middle is greedy: it takes as
much as it possibly can, even when that includes the marker, so a single
match runs from the first marker to the last one. (BTW, you're
inconsistent here as to whether the marker is "%%" or "~~".) I'm not
sure why you're excluding ">" from this group, but I presume that
that's intentional.

To solve this, you can make the asterisk "non-greedy" by adding a
question mark after it:

rebarredtext = r'~~([^>]*?)~~'

Now it matches as *little* as it possibly can.
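
To see the difference, here's a quick sketch comparing the two
patterns. The sample text is your example with "~~" swapped in as the
marker (that swap is my assumption, to match the regex), and the names
"greedy" and "lazy" are just for illustration.

```python
import re

# Sample text from the question, using "~~" as the marker (an assumption,
# since the original post mixes "%%" and "~~").
txt = ("This is the ~~text that i must modify~~, on a line, ~~but also\n"
       "on the others~~ that are following")

# Greedy: the group runs from the first marker to the last one.
greedy = re.findall(r'~~([^>]*)~~', txt)

# Non-greedy: the group stops at the nearest closing marker.
lazy = re.findall(r'~~([^>]*?)~~', txt)

print(len(greedy))  # 1
print(lazy)         # ['text that i must modify', 'but also\non the others']
```

Note that the negated character class [^>] matches newlines too, which
is why the second marked span is found even though it crosses a line
break.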

Additionally, you can get the regex engine to do the work of replacing, too.

re.sub(rebarredtext, r'<del>\1</del>', txt)

That'll do the whole loop and everything; it replaces every match of
the regex with the given replacement text, where "\1" takes whatever
was inside the parentheses.
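
Putting it together, something like this should work as a rough
sketch. The helper name mark_deleted and its marker parameter are
hypothetical, not anything from your code; I've kept the [^>]
restriction from your pattern and used re.escape so the marker can be
"%%", "~~" or anything else.

```python
import re

def mark_deleted(txt, marker="%%"):
    """Wrap every marker-delimited span in <del>...</del> (hypothetical helper)."""
    m = re.escape(marker)  # escape in case the marker has regex metacharacters
    # Non-greedy group, still excluding ">" as in the original pattern.
    pattern = m + r'([^>]*?)' + m
    # \1 in the replacement is whatever the parentheses captured.
    return re.sub(pattern, r'<del>\1</del>', txt)

original = ("This is the %%text that i must modify%%, on a line, %%but also\n"
            "on the others%% that are following")

print(mark_deleted(original))
```

Because re.sub returns a new string rather than changing txt in place,
you'll want to assign or print the result.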

Regular expressions are incredibly powerful, but unfortunately hard to
debug, so all you can do is test and tinker :)

ChrisA

