How to escape strings for re.finditer?

avi.e.gross at gmail.com avi.e.gross at gmail.com
Mon Feb 27 19:06:04 EST 2023


MRAB makes a valid point. The regular expression compiled is only done on the pattern you are looking for and it it contains anything that might be a command, such as an ^ at the start or [12] in middle, you want that converted so NONE OF THAT is one. It will be compiled to something that looks for an ^, including later in the string, and look for a real [ then a real 1 and a real 2 and a real ], not for one of the choices of 1 or 2. 

Your example was 'cty_degrees + 1' which can have a subtle bug introduced. The special character is "+" which means match greedily as many copies of the previous entity as possible. In this case, the previous entity was a single space. So the regular expression will match 'cty degrees' then match the single space it sees because it sees a space followed ny a plus  then not looking for a plus, hits a plus and fails. If your example is rewritten in whatever way re.escape uses, it might be 'cty_degrees \+ 1' and then it should work fine.

But converting what you are searching for just breaks that as the result will have a '\+" whish is being viewed as two unrelated symbols and the backslash breaks the match from going further.



-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of MRAB
Sent: Monday, February 27, 2023 6:46 PM
To: python-list at python.org
Subject: Re: How to escape strings for re.finditer?

On 2023-02-27 23:11, Jen Kris via Python-list wrote:
> When matching a string against a longer string, where both strings have spaces in them, we need to escape the spaces.
> 
> This works (no spaces):
> 
> import re
> example = 'abcdefabcdefabcdefg'
> find_string = "abc"
> for match in re.finditer(find_string, example):
>      print(match.start(), match.end())
> 
> That gives me the start and end character positions, which is what I want.
> 
> However, this does not work:
> 
> import re
> example = re.escape('X - cty_degrees + 1 + qq') find_string = 
> re.escape('cty_degrees + 1') for match in re.finditer(find_string, 
> example):
>      print(match.start(), match.end())
> 
> I’ve tried several other attempts based on my reseearch, but still no match.
> 
> I don’t have much experience with regex, so I hoped a reg-expert might help.
> 
You need to escape only the pattern, not the string you're searching.
--
https://mail.python.org/mailman/listinfo/python-list



More information about the Python-list mailing list