regexp help

Paul McGuire ptmcg at austin.rr.com
Fri May 9 18:54:21 EDT 2008


On May 9, 5:19 pm, globalrev <skanem... at yahoo.se> wrote:
> i want to do a little string manipulation and im looking into regexps. i
> couldnt find out how to do:
> s = 'poprorinoncoce'
> re.sub('$o$', '$', s)
> should result in 'prince'
>
> $ is obv the wrong character to use but what i mean is the pattern is
> "consonant o consonant" and should be replaced by just "consonant".
> both consonants should be the same too.
> so mole would be mole
> mom would be m etc

import re
s = 'poprorinoncoce'
vowels = "aAeEiIoOuU"
cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
# one consonant, any vowel, then the same consonant again (\1)
encodeRe = re.compile(r"([%s])[%s]\1" % (cons, vowels))
print(encodeRe.sub(r"\1", s))

This is actually a little more general than what you asked for - it will
search for any consonant-vowel-same_consonant triple, and replace it with
the leading consonant. To meet your original request, change to:

import re
s = 'poprorinoncoce'
cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
# one consonant, a literal "o", then the same consonant again (\1)
encodeRe = re.compile(r"([%s])o\1" % cons)
print(encodeRe.sub(r"\1", s))

Both print "prince".
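
The two patterns only agree here because every vowel being dropped in
'poprorinoncoce' happens to be an "o". A rough sketch of where they
differ follows; the names collapse, any_vowel_re and o_only_re, and the
extra test word 'papa', are just made up for illustration:

```python
import re

CONS = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
VOWELS = "aAeEiIoOuU"

# consonant + any vowel + same consonant
any_vowel_re = re.compile(r"([%s])[%s]\1" % (CONS, VOWELS))
# consonant + literal "o" + same consonant
o_only_re = re.compile(r"([%s])o\1" % CONS)

def collapse(pattern, text):
    """Replace each matched triple with its leading consonant."""
    return pattern.sub(r"\1", text)

for word in ("poprorinoncoce", "mole", "mom", "papa"):
    print("%s -> %s / %s" % (word,
                             collapse(any_vowel_re, word),
                             collapse(o_only_re, word)))
```

Only 'papa' should come out differently: the any-vowel version collapses
'pap' to 'p', while the "o"-only version leaves the word alone. Note that
\1 has to match the same case, so 'Mom' is untouched by either pattern.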

-- Paul

(I have a pyparsing solution too, but I just used it to prototype
the solution, then converted it to regex.)


