substitution
Wilbert Berendsen
wbsoft at xs4all.nl
Thu Jan 21 10:08:21 EST 2010
On Thursday 21 January 2010, MRAB wrote:
> For longest first you need:
>
> keys = sorted(mapping.keys(), key=len, reverse=True)
Oh yes, I cut/pasted the wrong line :-)
Just for clarity:
import re
mapping = {
    "foo" : "bar",
    "baz" : "quux",
    "quuux" : "foo"
}
# sort the keys, longest first, so 'aa' gets matched before 'a', because
# in Python regexps the first match (going from left to right) in a
# |-separated group is taken
keys = sorted(mapping.keys(), key=len, reverse=True)
rx = re.compile("|".join(keys))
repl = lambda x: mapping[x.group()]
s = "fooxxxbazyyyquuux"
rx.sub(repl, s)
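
To make the ordering comment above concrete, here is a small sketch. The two-key mapping `demo` and the helper `substitute` are made up for this illustration and are not part of the code above:

```python
import re

# Hypothetical mapping, used only to show the effect of key ordering.
demo = {"a": "1", "aa": "2"}

def substitute(keys, s):
    # Build the alternation in exactly the order given by `keys`.
    rx = re.compile("|".join(re.escape(k) for k in keys))
    return rx.sub(lambda m: demo[m.group()], s)

# Shortest first: 'a' is tried first and matches twice.
shortest_first = substitute(["a", "aa"], "aa")    # '11'

# Longest first: 'aa' is tried first and matches once.
longest_first = substitute(sorted(demo, key=len, reverse=True), "aa")    # '2'
```

With the short key listed first, the longer key can never match, which is why the keys are sorted by length in descending order.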
>> One thing remaining: if the replacement keys could contain non-alphanumeric
>> characters, they should be escaped using re.escape:
>> rx = re.compile("|".join(re.escape(key) for key in keys))
>>
>Strictly speaking, not all non-alphanumeric characters, but only the
>special ones.
True, although the re.escape function simply escapes all non-alphanumeric
characters :)
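
A short sketch of what goes wrong without escaping, assuming a made-up mapping whose keys contain the special characters '.' and '+' (the names `special`, `unescaped` and `escaped` exist only in this example):

```python
import re

# Hypothetical mapping whose keys contain regex metacharacters.
special = {"a.b": "X", "c+": "plus"}
keys = sorted(special, key=len, reverse=True)

# Without escaping, '.' matches any character and '+' is a quantifier.
unescaped = re.compile("|".join(keys))
# With re.escape every key is matched literally.
escaped = re.compile("|".join(re.escape(k) for k in keys))

s = "a.b axb c+"

# The unescaped pattern also matches 'axb' and only the 'c' of 'c+';
# looking those up in the mapping would raise a KeyError.
unescaped_matches = unescaped.findall(s)    # ['a.b', 'axb', 'c']

result = escaped.sub(lambda m: special[m.group()], s)    # 'X axb plus'
```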
And here is a factory function that returns a translator given a mapping. The
translator can be called to perform replacements in a string:
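import re

def translator(mapping):
    # longest keys first, so a longer key wins over a shorter one it starts with
    keys = sorted(mapping.keys(), key=len, reverse=True)
    # escape the keys so that special characters are matched literally
    rx = re.compile("|".join(re.escape(key) for key in keys))
    repl = lambda m: mapping[m.group()]
    return lambda s: rx.sub(repl, s)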
#Usage:
>>> t = translator(mapping)
>>> t('fooxxxbazyyyquuux')
'barxxxquuxyyyfoo'
With best regards,
Wilbert Berendsen
--
http://www.wilbertberendsen.nl/