multiple search and replace

Bengt Richter bokr at oz.net
Thu Mar 28 21:00:03 EST 2002


On 28 Mar 2002 04:53:02 GMT, bokr at oz.net (Bengt Richter) wrote:

>On Wed, 27 Mar 2002 18:13:48 -0800, "Emile van Sebille" <emile at fenx.com> wrote:
>
>>Trung Hoang
>>> Suppose I have a map (or two lists). Is it possible to make a sweep through
>>> a string to find all occurrences of keys in the map, then replace them all in
>>> one sweep?
>>>
>>
>>>>> d = {'one':'een','two':'twee','three':'drie','four':'vier'}
>>>>> s = "one two buckle my shoe three four je t'adore"
>>>>> ' '.join([d.get(w,w) for w in s.split()])
>>"een twee buckle my shoe drie vier je t'adore"
>>
>>
>Nice. Here's a variant that doesn't use delimiters or
>modify white space (key order could make a difference,
>depending on key overlap):
>
> >>> import re
> >>> d = {'one':'een','two':'twee','three':'drie','four':'vier'}
> >>> resp = re.compile('('+'|'.join(d.keys())+')')
> >>> s = "one two buckle my shoe three four je t'adore"
> >>> ''.join([d.get(w,w) for w in resp.split(s)])
> "een twee buckle my shoe drie vier je t'adore"
> >>> s = "one.two+buckle_my_shoe,three-four&je_t'adore"
> >>> ''.join([d.get(w,w) for w in resp.split(s)])
> "een.twee+buckle_my_shoe,drie-vier&je_t'adore"
>

Here is something to deal with the key order problem (i.e., when one
key is a substring of another). You have to sort the keys by length to
get maximal or minimal matching. This is for maximal:

 >>> def dirinterp(s,d):
 ...     import re
 ...     klist = [(-len(k),k) for k in d.keys()]
 ...     klist.sort()
 ...     resp = re.compile('('+'|'.join([k for l,k in klist])+')')
 ...     return ''.join([d.get(w,w) for w in resp.split(s)])
 ...
 >>> subdir = {"sub":"<was sub>","substring":"<was substring>"}
 >>> dirinterp("Will sub or substring be replaced?",subdir)
 'Will <was sub> or <was substring> be replaced?'
 >>> dirinterp("Will substring or sub be replaced?",subdir)
 'Will <was substring> or <was sub> be replaced?'

This is hardly tested at all, but I guess it could be handy.
No reason you couldn't have keys in the form '$xxx' or '$XXX$',
etc., either ;-) What else needs to be fixed? (Yes, think of a better name ;-)
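
One thing that probably does need fixing for keys like '$xxx': '$' is
a regular expression metacharacter, so the keys would have to be
escaped before being joined into a pattern. A possible sketch, using
re.escape and re.sub with a function argument instead of split/join
(the name dirinterp_escaped and the sample keys are only illustrative
assumptions):

```python
import re

def dirinterp_escaped(s, d):
    # Hypothetical variant: nothing to replace if there are no keys
    # (an empty pattern would otherwise match at every position).
    if not d:
        return s
    # Longest keys first, so the maximal key wins when keys overlap.
    keys = sorted(d, key=len, reverse=True)
    # re.escape makes '$' and other metacharacters match literally.
    resp = re.compile('|'.join(map(re.escape, keys)))
    # re.sub calls the function once per match; look up the replacement.
    return resp.sub(lambda m: d[m.group(0)], s)

macros = {"$who": "world", "$LIST$": "Python-list"}
result = dirinterp_escaped("Hello $who, welcome to $LIST$.", macros)
print(result)
```

This should print 'Hello world, welcome to Python-list.'; without the
escaping step the '$' in each key would be treated as an end-of-string
anchor and nothing would be replaced.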

Regards,
Bengt Richter


