Find and Replace Simplification

Devyn Collier Johnson devyncjohnson at gmail.com
Sat Jul 20 08:41:44 EDT 2013


On 07/20/2013 07:16 AM, Joshua Landau wrote:
> On 19 July 2013 18:29, Serhiy Storchaka <storchaka at gmail.com> wrote:
>> 19.07.13 19:22, Steven D'Aprano wrote:
>>
>>> I also expect that the string replace() method will be second fastest,
>>> and re.sub will be the slowest, by a very long way.
>>
>> The string replace() method is fastest (at least in Python 3.3+). See
>> implementation of html.escape() etc.
> def escape(s, quote=True):
>      if quote:
>          return s.translate(_escape_map_full)
>      return s.translate(_escape_map)
>
> I fail to see how this supports the assertion that str.replace() is
> faster. However, some quick timing shows that translate has a very
> high penalty for missing characters and is a tad slower anyway.
>
> Really, though, there should be no reason for .translate() to be
> slower than replace -- at worst it should just be "reduce(lambda s,
> ab: s.replace(*ab), mapping.items()¹, original_str)" and end up the
> *same* speed as iterated replace. But the fact that it doesn't have to
> re-build the string every replace means that theoretically it should
> be a lot faster.
>
> ¹ I realise this won't actually work for several reasons, and doesn't
> support things like passing in lists as mappings, but you could
> trivially support the important builtin types² and fall back to the
> original for others, where the pure-python __getitem__ is going to be
> the slowest part anyway.
>
> ² List, tuple, dict, str, bytes -- so basically just mappings and
> ordered iterables
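
For anyone following along, here is a minimal sketch of the reduce() idea quoted above, showing one of the reasons it is not a drop-in substitute for .translate(): iterated replaces feed into each other, while translate maps each character exactly once. The helper name iterated_replace and the two sample mappings are made up for illustration, and the example assumes Python 3.7+ so that dicts keep insertion order.

```python
from functools import reduce


def iterated_replace(text, mapping):
    """Apply each (old, new) pair in turn with str.replace()."""
    return reduce(lambda s, ab: s.replace(*ab), mapping.items(), text)


# Hypothetical mapping whose replacements do not overlap, provided "&"
# is handled first (dicts preserve insertion order in Python 3.7+).
escapes = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}
escaped_by_replace = iterated_replace("<a & b>", escapes)
escaped_by_translate = "<a & b>".translate(str.maketrans(escapes))

# Hypothetical mapping where the two approaches disagree: translate swaps
# the letters in one pass, but the second replace undoes the first.
swap = {"a": "b", "b": "a"}
swapped_by_translate = "abba".translate(str.maketrans(swap))  # "baab"
swapped_by_replace = iterated_replace("abba", swap)           # "aaaa"
```

So the two agree only when no replacement output can be matched by a later replacement.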
Thanks, Joshua Landau! str.replace() does appear to be best, so that is
the suggestion that I will implement.
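
A rough way to check the timings on your own machine is sketched below. It compares chained str.replace(), str.translate(), and re.sub() on the same input with timeit. The sample text, the three helper functions, and the repeat count are all assumptions chosen for illustration, and the numbers will vary with the Python version and the input.

```python
import re
import timeit

# Hypothetical sample input: a short snippet repeated many times.
TEXT = "<p>Tom & Jerry</p> " * 1000

ENTITIES = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}
TABLE = str.maketrans(ENTITIES)
PATTERN = re.compile("[&<>]")


def with_replace(s):
    # "&" must be replaced first so later output is not escaped twice.
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def with_translate(s):
    # One pass over the string using a prebuilt translation table.
    return s.translate(TABLE)


def with_re_sub(s):
    # The callback looks up the replacement for each matched character.
    return PATTERN.sub(lambda m: ENTITIES[m.group()], s)


def compare(number=200):
    """Print the total time each approach takes for `number` runs."""
    for func in (with_replace, with_translate, with_re_sub):
        seconds = timeit.timeit(lambda: func(TEXT), number=number)
        print(f"{func.__name__:15s} {seconds:.4f} s")


if __name__ == "__main__":
    compare()
```

All three functions return the same string for this input, so only the timing differs.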

Mahalo,

DCJ


