re.sub(): replace longest match instead of leftmost match?

Ian Kelly ian.g.kelly at gmail.com
Fri Dec 16 12:57:06 EST 2011


On Fri, Dec 16, 2011 at 10:36 AM, MRAB <python at mrabarnett.plus.com> wrote:
> On 16/12/2011 16:49, John Gordon wrote:
>>
>> According to the documentation on re.sub(), it replaces the leftmost
>> matching pattern.
>>
>> However, I want to replace the *longest* matching pattern, which is
>> not necessarily the leftmost match.  Any suggestions?
>>
>> I'm working with IPv6 CIDR strings, and I want to replace the longest
>> match of "(0000:|0000$)+" with ":".  But when I use re.sub() it replaces
>> the leftmost match, even if there is a longer match later in the string.
>>
>> I'm also looking for a regexp that will remove leading zeroes in each
>> four-digit group, but will leave a single zero if the group was all
>> zeroes.
>>
> How about this:
>
> result = re.sub(r"\b0+(\d)\b", r"\1", string)

Close.

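import re
# [0-9a-f]+ (rather than [1-9a-f]+|0) so groups with interior zeroes, e.g. "0a0b", are stripped too
pattern = r'\b0+([0-9a-f]+)\b'
result = re.sub(pattern, r'\1', string, flags=re.IGNORECASE)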
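
For illustration, here is a minimal sketch that covers both parts of the
question. It is only one possible approach, not a vetted IPv6 formatter.
The helper names (strip_leading_zeros, replace_longest) and the sample
address are made up for this example. The zero-stripping pattern uses a
character class that includes 0, so a group such as "0a0b" is handled as
well, and the longest-match helper relies on re.finditer() plus max()
instead of re.sub().

```python
import re

# Hypothetical helper names; the sample address below is invented for illustration.
LEADING_ZEROS = re.compile(r'\b0+([0-9a-f]+)\b', re.IGNORECASE)
ZERO_RUN = re.compile(r'(?:0000:|0000$)+')


def strip_leading_zeros(addr):
    """Drop leading zeroes in each group, keeping one zero for all-zero groups."""
    return LEADING_ZEROS.sub(r'\1', addr)


def replace_longest(pattern, repl, text):
    """Replace only the longest match of pattern (the leftmost one on ties)."""
    matches = list(pattern.finditer(text))
    if not matches:
        return text
    # max() returns the first maximal item, so ties go to the leftmost match.
    longest = max(matches, key=lambda m: m.end() - m.start())
    return text[:longest.start()] + repl + text[longest.end():]


addr = "2001:0000:0db8:0000:0000:0000:0a0b:0001"
compressed = replace_longest(ZERO_RUN, ":", addr)
print(compressed)                       # 2001:0000:0db8::0a0b:0001
print(strip_leading_zeros(compressed))  # 2001:0:db8::a0b:1
```

The longest run is replaced first, because once the leading zeroes are
stripped the "0000" groups no longer exist for the run pattern to find.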

Cheers,
Ian


