Find and Replace Simplification

Steven D'Aprano steve+comp.lang.python at pearwood.info
Fri Jul 19 12:22:19 EDT 2013


On Fri, 19 Jul 2013 09:22:48 -0400, Devyn Collier Johnson wrote:

> I have some code that I want to simplify. I know that a for-loop would
> work well, but can I make re.sub perform all of the below tasks at once,
> or can I write this in a way that is more efficient than using a
> for-loop?
> 
> DATA = re.sub(',', '', 'DATA')
> DATA = re.sub('\'', '', 'DATA')
> DATA = re.sub('(', '', 'DATA')
> DATA = re.sub(')', '', 'DATA')


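I don't think you intended to put DATA in quotes on the right-hand side.
That makes it literally the string 'DATA', which contains none of those
characters, so the replacements are no-ops (or would be, once the
parentheses are escaped: an unescaped '(' or ')' is not a valid regex and
makes re.sub raise an error), and you could simplify it to: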

DATA = 'DATA'

But that's probably not what you wanted.

My prediction is that this will be by far the most efficient way to do 
what you are trying to do:

py> DATA = "Hello, 'World'()"
py> DATA.translate(dict.fromkeys(ord(c) for c in ",'()"))
'Hello World'

That's in Python 3 -- in Python 2, using translate will still probably be 
the fastest, but you'll need to call it like this:

import string
DATA.translate(string.maketrans("", ""), ",'()")
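
In Python 3 the same deletion table can also be built with str.maketrans,
whose optional third argument lists the characters to delete. A minimal
sketch; the helper name strip_chars is made up here for illustration:

```python
def strip_chars(text, unwanted=",'()"):
    """Return text with every character in `unwanted` removed."""
    # The third argument of str.maketrans names characters to map to None,
    # which str.translate treats as "delete this character".
    table = str.maketrans("", "", unwanted)
    return text.translate(table)


print(strip_chars("Hello, 'World'()"))  # Hello World
```

If the same set of characters is stripped many times, building the table
once and reusing it avoids repeating that work on every call.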

I also expect that the string replace() method will be second fastest, 
and re.sub will be the slowest, by a very long way.
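
Predictions like this are easy to check rather than take on trust. The
following is a rough sketch for Python 3 using timeit; the function names,
the test string and the repeat counts are arbitrary choices for
illustration, and the actual numbers will vary by machine and Python
version:

```python
import re
import timeit

DATA = "Hello, 'World'()" * 1000
UNWANTED = ",'()"

# Build the translation table and compiled pattern once, outside the timing.
TABLE = str.maketrans("", "", UNWANTED)
PATTERN = re.compile("[,'()]")  # inside [...] the parentheses are literal


def with_translate(s):
    return s.translate(TABLE)


def with_replace(s):
    # One pass over the string per unwanted character.
    for c in UNWANTED:
        s = s.replace(c, "")
    return s


def with_re_sub(s):
    return PATTERN.sub("", s)


for func in (with_translate, with_replace, with_re_sub):
    # Take the best of a few repeats to reduce noise from other processes.
    best = min(timeit.repeat(lambda: func(DATA), number=100, repeat=3))
    print("%-15s %.4f s per 100 calls" % (func.__name__, best))
```

All three functions return the same string, so only the timings differ.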

As a general rule, you should avoid using regexes unless what you are
searching for is actually a pattern of some kind. If it's merely a literal
character or substring, standard string methods will probably be faster.
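
To make the distinction concrete, here is a small sketch; the sample string
and variable names are invented for illustration:

```python
import re

text = "price: 10 USD, 20 USD"

# Literal substring: a plain string method is enough.
no_units = text.replace(" USD", "")
print(no_units)  # price: 10, 20

# Genuine pattern (any run of digits): this is where a regex earns its keep.
masked = re.sub(r"\d+", "N", text)
print(masked)  # price: N USD, N USD

# If a literal must go through re anyway, escape it first, because
# characters such as '(' have a special meaning in a pattern.
no_parens = re.sub(re.escape("("), "", "f(x)")
print(no_parens)  # fx)
```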


Oh, and a tip for you:

- don't escape quotes unless you need to; use the other kind of quote
  instead.

s = '\''  # No, don't do this!
s = "'"  # Better!

and vice versa.




-- 
Steven


