How to escape strings for re.finditer?

Cameron Simpson cs at cskk.id.au
Mon Feb 27 18:54:43 EST 2023


On 28Feb2023 00:11, Jen Kris <jenkris at tutanota.com> wrote:
>When matching a string against a longer string, where both strings have spaces in them, we need to escape the spaces. 
>
>This works (no spaces):
>
>import re
>example = 'abcdefabcdefabcdefg'
>find_string = "abc"
>for match in re.finditer(find_string, example):
>    print(match.start(), match.end())
>
>That gives me the start and end character positions, which is what I want. 
>
>However, this does not work:
>
>import re
>example = re.escape('X - cty_degrees + 1 + qq')
>find_string = re.escape('cty_degrees + 1')
>for match in re.finditer(find_string, example):
>    print(match.start(), match.end())
>
>I’ve tried several other attempts based on my research, but still no 
>match. 

You need to print those strings out. You're escaping the _example_ 
string, which turns it into:

     X\ \-\ cty_degrees\ \+\ 1\ \+\ qq

because `re.escape` puts a backslash in front of every character that 
can be special in a regexp, which includes `+`, `-` and the space. But 
you don't want to mangle the string you're searching! After all, the 
text above does not contain the string `cty_degrees + 1`.
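A minimal sketch of the fix, assuming the goal is still the start and 
end of every literal occurrence: leave `example` alone and escape only 
`find_string`. The `spans` name is just for illustration, and the two 
`print` calls are there so you can see what is actually being searched 
and searched for.

```python
import re

# Leave the text being searched untouched.
example = 'X - cty_degrees + 1 + qq'
# Escape only the pattern, so `+` and the spaces are matched literally.
find_string = re.escape('cty_degrees + 1')

# Print both strings to see exactly what the regexp engine receives.
print(example)
print(find_string)

# Collect (start, end) for every match of the escaped pattern.
spans = [(match.start(), match.end())
         for match in re.finditer(find_string, example)]
for start, end in spans:
    print(start, end)
```

With the search text unescaped, this finds one match, at positions 4 
to 19.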

My secondary question is: if you're escaping the thing you're searching 
_for_, then you're effectively searching for a _fixed_ string, not a 
pattern/regexp. So why on earth are you using regexps to do your 
searching?

The `str` type has a `find(substring)` method, which returns the index 
of the first occurrence, or -1 if there is none. Just use that! It'll 
be faster and the code simpler!
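Since `find` only reports one occurrence per call, getting every 
start/end pair needs a small loop. Here is a rough sketch; the 
`find_all` helper is a made-up name, not something `str` provides, and 
it assumes you want non-overlapping occurrences, like `re.finditer` 
gives you.

```python
def find_all(text, substring):
    """Yield (start, end) for each non-overlapping occurrence of substring."""
    if not substring:
        return  # an empty substring would match at every position
    start = text.find(substring)
    while start != -1:
        end = start + len(substring)
        yield start, end
        # Resume the search just past the occurrence we found.
        start = text.find(substring, end)


example = 'X - cty_degrees + 1 + qq'
for start, end in find_all(example, 'cty_degrees + 1'):
    print(start, end)

for start, end in find_all('abcdefabcdefabcdefg', 'abc'):
    print(start, end)
```

No escaping is involved at all here, because `find` treats its argument 
as a plain string rather than a pattern.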

Cheers,
Cameron Simpson <cs at cskk.id.au>

