[Python-ideas] string.replace should accept a list as a first argument

Steven D'Aprano steve at pearwood.info
Wed Oct 7 03:30:07 CEST 2015


On Tue, Oct 06, 2015 at 06:34:16PM +0200, M.-A. Lemburg wrote:
> On 06.10.2015 21:25, Emil Rosendahl Petersen wrote:
> > I think string.replace should be changed to accept a list as a first
> > argument.
> > 
> > That way, if I had this string:
> > 
> > "There are a lot of undesirable people in this filthy world"
> > 
> > Then I could do this, replace(['undesirable', 'filthy'], ''), in case
> > that's what I wanted to do.
> > 
> > Now, string.replace doesn't accept a list as its first argument, and
> > complains about implicit conversion.

[Emil]
> > Is there any great obstacle to just having the function loop over that
> > list, calling itself in case we get a list argument instead of a str?

Looping over each replacement item is the wrong solution. Consider what
happens when one of the search strings is a substring of the replacement
text, so that a later pass rewrites the output of an earlier one:

py> source = "I ate a chicken salad, and she had a ham sandwich."
py> for term in ["ham", "turkey", "chicken", "spam"]:
...     source = source.replace(term, "spam and cheese")
...
py> print(source)
I ate a spam and cheese and cheese salad, and she had a spam and cheese and cheese sandwich.


You need to be more careful: do the replacements in a single pass over
the original string, so that text which has already been replaced is
never scanned again.
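
A minimal sketch of one more careful approach, doing all the
replacements in a single pass with the re module. The names `terms`,
`pattern` and `result` are only illustrative, and this is one possible
way to do it rather than a definitive implementation:

```python
import re

source = "I ate a chicken salad, and she had a ham sandwich."
terms = ["ham", "turkey", "chicken", "spam"]

# Escape each term so it is matched literally, then join the terms
# into a single alternation.
pattern = "|".join(re.escape(term) for term in terms)

# A function is used as the replacement so that any backslashes in the
# replacement text would not be interpreted as escapes by re.sub.
result = re.sub(pattern, lambda match: "spam and cheese", source)
print(result)
# I ate a spam and cheese salad, and she had a spam and cheese sandwich.
```

Because the source is scanned only once, the "spam" inside the
replacement text is never itself replaced.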


> > Doesn't that seem like the more obvious behaviour? To me the results of
> > running the above code should be unsurprising, if this change was
> > implemented: "there are a lot of people in this world".

[MAL] 
> I think the "one obvious way" of doing a multi-replace is to
> use the re module, since implementing this efficiently is
> non-trivial.
> 
> String methods are meant to be basic (high performance)
> operations.

A similar issue was discussed last month, in the context of str.split 
rather than replace, and I talked about the pitfalls of using the re 
module:

https://mail.python.org/pipermail/python-ideas/2015-September/036586.html
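
As an illustration only, and not a summary of the linked message, here
is one pitfall of that general kind: passing a literal search string to
the re module without escaping it. The sample text and variable names
are made up for the example.

```python
import re

text = "Price: $5.00 (approx.)"

# Naive: "." is a regex metacharacter that matches any character
# (except a newline), so every character in the text is removed.
naive = re.sub(".", "", text)

# Escaped: re.escape makes the pattern match a literal "." only.
careful = re.sub(re.escape("."), "", text)

print(repr(naive))    # ''
print(repr(careful))  # 'Price: $500 (approx)'
```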


The implementation isn't hard, but it's just tricky enough that some 
people will get it wrong, and just useful enough that a helper function 
will be a good idea. The question is, should that helper function be a 
string method, in the standard library, or merely something that you add 
to your own projects?

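import re

def replace_all(source, old, new, count=None):
    # count=None means "replace every occurrence".
    if isinstance(old, str):
        return source.replace(old, new, -1 if count is None else count)
    elif isinstance(old, tuple):
        if count == 0:
            return source  # re.split would treat maxsplit=0 as "no limit"
        regex = '|'.join(re.escape(s) for s in old)
        return new.join(re.split(regex, source, maxsplit=count or 0))
    raise TypeError("old must be a str or a tuple of str")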
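
One detail that a helper like this may need to handle: regex
alternation takes the first alternative that matches, not the longest,
so when one search string is a prefix of another the order matters. The
sketch below assumes longest-match is the behaviour wanted; the name
`replace_longest_first` is hypothetical and it omits the count argument
for brevity.

```python
import re

def replace_longest_first(source, old, new):
    """Replace every string in the tuple `old` with `new` in one pass."""
    # Try longer search strings first so "hamster" wins over "ham".
    ordered = sorted(old, key=len, reverse=True)
    regex = "|".join(re.escape(s) for s in ordered)
    return new.join(re.split(regex, source))

text = "There are a lot of undesirable people in this filthy world"
cleaned = replace_longest_first(text, ("undesirable", "filthy"), "")
print(repr(cleaned))
# 'There are a lot of  people in this  world'

print(replace_longest_first("a ham and a hamster", ("ham", "hamster"), "X"))
# a X and a X
```

Note that the first result keeps the spaces on both sides of each
removed word, so it contains double spaces rather than the tidied
sentence the original proposal expected.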


-- 
Steve

