Cleaning up a string

Peter Otten __peter__ at web.de
Tue Jul 24 16:17:58 EDT 2007


James Stroud wrote:

> I dashed off the following function to clean a string in a little
> program I wrote:
> 
> def cleanup(astr, changes):
>    for f,t in changes:
>      astr = astr.replace(f, t)
>    return astr
> 
> where changes would be a tuple, for example:
> 
> changes = (
>               ('%', '\%'),
>               ('$', '\$'),
>               ('-', '_')
>            )
> 
> 
> If these were single replacements (like the last), string.translate
> would be the way to go. As it is, however, the above seems fairly
> inefficient as it potentially creates a new string at each round. Does
> some function or library exist for these types of transformations that
> works more like string.translate or is the above the best one can hope
> to do without writing some C? I'm guessing that "if s in astr" type
> optimizations are already done in the replace() method, so that is not
> really what I'm getting after.

unicode.translate() supports this kind of replacement...

>>> u"a % b $ c-d".translate(dict((ord(a), unicode(b)) for a, b in changes))
u'a \\% b \\$ c_d'
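
For readers on Python 3, where str is already Unicode, the same idea should carry over to str.translate(). The following is only a sketch under that assumption; the name "table" is mine, and it relies on every key in changes being a single character:

```python
# Sketch for Python 3: str.translate() accepts a mapping from code points
# to replacement strings, which may be longer than one character.
changes = (
    ("%", "\\%"),
    ("$", "\\$"),
    ("-", "_"),
)

# str.maketrans() called with a dict converts single-character keys
# to their ordinals; multi-character keys would raise ValueError.
table = str.maketrans(dict(changes))

result = "a % b $ c-d".translate(table)
print(result)
```

The table is built once and can be reused for every string, so no intermediate string is created per replacement pair.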

and re.compile(...).sub() accepts a function:

>>> import re
>>> def replace(match, lookup=dict(changes)):
...     return lookup[match.group()]
...
>>> re.compile("([$%-])").sub(replace, "a % b $ c-d")
'a \\% b \\$ c_d'
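
The character class above is written by hand. If the keys are not known in advance, or some are longer than one character, the pattern can be derived from changes instead. A minimal sketch in Python 3 syntax, assuming changes is non-empty; make_cleaner and clean are hypothetical names, not part of any library:

```python
import re

changes = (
    ("%", "\\%"),
    ("$", "\\$"),
    ("-", "_"),
)

def make_cleaner(changes):
    """Return a function that applies all replacements in a single pass."""
    lookup = dict(changes)
    # Try longer keys first so a long key is not shadowed by a shorter
    # key that happens to be its prefix.
    keys = sorted(lookup, key=len, reverse=True)
    # re.escape() makes each key match literally ("$" is special in a regex).
    pattern = re.compile("|".join(re.escape(k) for k in keys))

    def clean(text):
        # The callable receives each match and returns its replacement.
        return pattern.sub(lambda m: lookup[m.group()], text)

    return clean

cleanup = make_cleaner(changes)
result = cleanup("a % b $ c-d")
print(result)
```

Unlike the loop over replace(), this scans the input once, so text inserted by one replacement is never rewritten by a later one.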

Peter


