re.sub(): replace longest match instead of leftmost match?

Roy Smith roy at panix.com
Fri Dec 16 13:36:17 EST 2011


In article <jcfsrk$skh$1 at reader1.panix.com>,
 John Gordon <gordon at panix.com> wrote:

> I'm working with IPv6 CIDR strings, and I want to replace the longest
> match of "(0000:|0000$)+" with ":".  But when I use re.sub() it replaces
> the leftmost match, even if there is a longer match later in the string.
> 
> I'm also looking for a regexp that will remove leading zeroes in each
> four-digit group, but will leave a single zero if the group was all
> zeroes.

Having done quite a bit of IPv6 work, my opinion here is that you're 
trying to do The Wrong Thing.

What you want is an IPv6 class which represents an address in some 
canonical form.  It would have constructors which accept any of the 
RFC-2373 defined formats.  It would also have string formatting methods 
to convert the internal form into any of these formats.

Then, instead of attempting to regex your way directly from one string 
representation to another, you would do something like:

addr_string = "FEDC:BA98:7654:3210:FEDC:BA98:7654:321"
print(IPv6(addr_string).to_short_form())
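
A minimal sketch of such a class, assuming Python 3.3 or later, where the
standard library's ipaddress module already handles the parsing and
formatting. The class name IPv6 and to_short_form() come from the example
above; to_long_form() and the sample addresses are made up for
illustration, and this is one possible shape for the class, not a
definitive implementation.

```python
import ipaddress


class IPv6(object):
    """Hypothetical wrapper class; not part of any library."""

    def __init__(self, text):
        # ipaddress.IPv6Address parses the text form and keeps the
        # address internally as a 128-bit integer (the canonical form).
        # It raises ValueError (AddressValueError) on malformed input.
        self._addr = ipaddress.IPv6Address(text)

    def to_short_form(self):
        # Leading zeroes dropped from each group, and the longest run
        # of all-zero groups collapsed to "::".
        return self._addr.compressed

    def to_long_form(self):
        # All eight groups, each padded to four hex digits.
        return self._addr.exploded


# The longest run of zero groups is the one that gets collapsed,
# even though a shorter run appears further to the left.
addr_string = "2001:0db8:0000:0000:0001:0000:0000:0000"
print(IPv6(addr_string).to_short_form())   # 2001:db8:0:0:1::
print(IPv6("2001:db8::1").to_long_form())  # 2001:0db8:0000:0000:0000:0000:0000:0001
```

Because the string is parsed into a number first and only then formatted,
both of the original questions (collapse the longest run of zero groups,
strip leading zeroes but keep a single zero) fall out of the formatting
step without any regex.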


