regexp help
Paul McGuire
ptmcg at austin.rr.com
Fri May 9 18:54:21 EDT 2008
On May 9, 5:19 pm, globalrev <skanem... at yahoo.se> wrote:
> i want to a little stringmanipulationa nd im looking into regexps. i
> couldnt find out how to do:
> s = 'poprorinoncoce'
> re.sub('$o$', '$', s)
> should result in 'prince'
>
> $ is obv the wrng character to use bu what i mean the pattern is
> "consonant o consonant" and should be replace by just "consonant".
> both consonants should be the same too.
> so mole would be mole
> mom would be m etc
import re
s = 'poprorinoncoce'
vowels = "aAeEiIoOuU"
cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
# a consonant, then any vowel, then the same consonant again (\1)
encodeRe = re.compile(r"([%s])[%s]\1" % (cons, vowels))
print(encodeRe.sub(r"\1", s))
This is actually a little more general than you asked for - it will search
for any consonant-vowel-same_consonant triple, and replace it with the
leading consonant. To meet your original request, change to:
import re
s = 'poprorinoncoce'
cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
# a consonant, then a literal "o", then the same consonant again (\1)
encodeRe = re.compile(r"([%s])o\1" % cons)
print(encodeRe.sub(r"\1", s))
Both print "prince".
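To check the two patterns against the other words in your post, here is a minimal sketch that wraps them in one function. The names collapse, O_ONLY and ANY_VOWEL are made up for this illustration; only re.compile and the pattern's sub method come from the re module. It assumes plain ASCII letters and treats "y" as a consonant, as the character class above does.

```python
import re

CONS = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
VOWELS = "aAeEiIoOuU"

# consonant, literal "o", same consonant (the original request)
O_ONLY = re.compile(r"([%s])o\1" % CONS)
# consonant, any vowel, same consonant (the more general form)
ANY_VOWEL = re.compile(r"([%s])[%s]\1" % (CONS, VOWELS))

def collapse(text, pattern=O_ONLY):
    """Replace each matched triple with its leading consonant."""
    return pattern.sub(r"\1", text)

for word in ("poprorinoncoce", "mole", "mom"):
    print("%s -> %s" % (word, collapse(word)))
```

This gives "prince", "mole" and "m" for the three words. The patterns only differ on other vowels: collapse("pap") leaves "pap" alone, while collapse("pap", ANY_VOWEL) returns "p". Note that the backreference \1 is case-sensitive here, so "Mom" is left unchanged.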
-- Paul
(I have a pyparsing solution too, but I just used it to prototype
the solution, then converted it to regex.)