[Python-ideas] string.replace should accept a list as a first argument

MRAB python at mrabarnett.plus.com
Wed Oct 7 04:24:40 CEST 2015


On 2015-10-07 02:30, Steven D'Aprano wrote:
> On Tue, Oct 06, 2015 at 06:34:16PM +0200, M.-A. Lemburg wrote:
>> On 06.10.2015 21:25, Emil Rosendahl Petersen wrote:
>> > I think string.replace should be changed to accept a list as a first
>> > argument.
>> >
>> > That way, if I had this string:
>> >
>> > "There are a lot of undesirable people in this filthy world"
>> >
>> > Then I could do this, replace(['undesirable', 'filthy'], ''), in case
>> > that's what I wanted to do.
>> >
>> > Now, string.replace doesn't accept a list as its first argument, and
>> > complains about implicit conversion.
>
> [Emil]
>> > Is there any great obstacle to just having the function loop over that
>> > list, calling itself in case we get a list argument instead of a str?
>
> Looping over each replacement item is the wrong solution. Think of
> the result when one of the search strings is a substring of the
> replacement:
>
> py> source = "I ate a chicken salad, and she had a ham sandwich."
> py> for term in ["ham", "turkey", "chicken", "spam"]:
> ...     source = source.replace(term, "spam and cheese")
> ...
> py> print(source)
> I ate a spam and cheese and cheese salad, and she had a spam and cheese
> and cheese sandwich.
>
>
> You need to be a bit more careful about how to do the replacements.
>
>
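A minimal sketch of a single-pass alternative to the loop above, assuming only the standard re module. The name replace_once is made up for illustration and is not an existing str method or stdlib function. Because the source is scanned once, the replacement text is never rescanned for further matches.

```python
import re

def replace_once(source, terms, new):
    # Hypothetical helper, not part of str or the standard library.
    # Escape each term so regex metacharacters are matched literally.
    pattern = "|".join(re.escape(t) for t in terms)
    # A function replacement keeps backslashes in `new` from being
    # interpreted as group references.
    return re.sub(pattern, lambda m: new, source)

source = "I ate a chicken salad, and she had a ham sandwich."
result = replace_once(source, ["ham", "turkey", "chicken", "spam"],
                      "spam and cheese")
print(result)
# I ate a spam and cheese salad, and she had a spam and cheese sandwich.
```

The sketch assumes terms is non-empty; an empty list would produce an empty pattern that matches at every position.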
>> > Doesn't that seem like the more obvious behaviour? To me the results of
>> > running the above code should be unsurprising, if this change was
>> > implemented: "there are a lot of people in this world".
>
> [MAL]
>> I think the "one obvious way" of doing a multi-replace is to
>> use the re module, since implementing this efficiently is
>> non-trivial.
>>
>> String methods are meant to be basic (high performance)
>> operations.
>
> A similar issue was discussed last month, in the context of str.split
> rather than replace, and I talked about the pitfalls of using the re
> module:
>
> https://mail.python.org/pipermail/python-ideas/2015-September/036586.html
>
>
> The implementation isn't hard, but it's just tricky enough that some
> people will get it wrong, and just useful enough that a helper function
> will be a good idea. The question is, should that helper function be a
> string method, in the standard library, or merely something that you add
> to your own projects?
>
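> import re
>
> def replace_all(source, old, new, count=-1):
>      # As with str.replace, a negative count means "replace all".
>      if count == 0:
>          return source
>      if isinstance(old, str):
>          return source.replace(old, new, count)
>      elif isinstance(old, tuple):
>          regex = '|'.join(re.escape(s) for s in old)
>          # re.split treats maxsplit=0 as "no limit".
>          return new.join(re.split(regex, source, maxsplit=max(count, 0)))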
>
>
Again, using the regex module, you can split on a named list, without
having to worry about sorting or escaping the items. :-)
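
For comparison, here is a rough sketch of what the sorting and escaping amount to when only the standard re module is available. The helper name build_pattern and the sample terms are hypothetical. re tries alternatives left to right, so a shorter term listed first can shadow a longer one that begins with it, and an unescaped "." matches any character.

```python
import re

def build_pattern(terms):
    # Hypothetical helper: put the longest terms first and escape each
    # one so it is matched literally.
    ordered = sorted(terms, key=len, reverse=True)
    return re.compile("|".join(re.escape(t) for t in ordered))

terms = ["ham", "hamburger", "a.b"]
text = "hamburger, ham, a.b, axb"

naive = re.compile("|".join(terms))   # unsorted and unescaped
careful = build_pattern(terms)

print(naive.sub("X", text))    # Xburger, X, X, X
print(careful.sub("X", text))  # X, X, X, axb
```

The naive pattern stops at "ham" inside "hamburger" and also rewrites "axb"; the sorted and escaped pattern does neither.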
