How to escape strings for re.finditer?

avi.e.gross at gmail.com avi.e.gross at gmail.com
Mon Feb 27 21:16:01 EST 2023


Jen,

Can you see what SOME OF US see as ASCII text? We can help you better if we get code that can be copied and run as-is.

What you sent is not terse. It is wrong. It will not run on any Python interpreter because you somehow lost a line break and an indent.

This is what you sent:

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
find_string = re.escape('abc_degree + 1') for match in re.finditer(find_string, example):
    print(match.start(), match.end())

This is the code indented properly:

import re

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
find_string = re.escape('abc_degree + 1')
for match in re.finditer(find_string, example):
    print(match.start(), match.end())

Of course I am sure you wrote and ran code more like the latter version but somewhere in your copy/paste process, ....

And, just for fun, since there is nothing wrong with your code, this minor change is terser:

>>> import re
>>> example = 'X - abc_degree + 1 + qq + abc_degree + 1'
>>> for match in re.finditer(re.escape('abc_degree + 1'), example):
...     print(match.start(), match.end())
...
4 18
26 40
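
For anyone wondering why the re.escape() call matters here, this is a small sketch using the same example string. Without escaping, the '+' in the search text is read as a quantifier ("one or more spaces") rather than a literal plus sign, so the pattern finds nothing. The variable names are mine, chosen for illustration.

```python
import re

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
needle = 'abc_degree + 1'

# Unescaped: ' +' means "one or more spaces", so the literal '+' never matches.
unescaped_hits = [m.span() for m in re.finditer(needle, example)]

# Escaped: every special character in needle is taken literally.
escaped_hits = [m.span() for m in re.finditer(re.escape(needle), example)]

print(unescaped_hits)  # []
print(escaped_hits)    # [(4, 18), (26, 40)]
```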

But note that once you use regular expressions in earnest, which is not your case here, a single pattern can match many different pieces of text. One example is matching any word repeated twice in any case, such as "and and" or "so so". Another is finding words that contain multiple doubled letters, as in the stereotypical "bookkeeper". In those cases you may want more than offsets: show the exact text that matched, or even a few characters before and/or after it for context.
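
Here is a rough sketch of what I mean, not a polished tool. The sample sentence, the helper names show_matches and words_with_doubled_letters, and the context width of 5 are all made up for illustration.

```python
import re

text = "It was so so late and and the bookkeeper left"

# A word, some whitespace, then the same word again; \1 refers back to group 1.
repeated_word = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)

def show_matches(pattern, s, context=5):
    """Return (start, end, matched text, matched text plus nearby context)."""
    results = []
    for m in pattern.finditer(s):
        lo = max(0, m.start() - context)
        hi = min(len(s), m.end() + context)
        results.append((m.start(), m.end(), m.group(), s[lo:hi]))
    return results

def words_with_doubled_letters(s, minimum=2):
    """Return (start, end, word) for words with at least `minimum` doubled letters."""
    found = []
    for m in re.finditer(r"\w+", s):
        pairs = re.findall(r"(\w)\1", m.group())  # one entry per doubled character
        if len(pairs) >= minimum:
            found.append((m.start(), m.end(), m.group()))
    return found

for row in show_matches(repeated_word, text):
    print(row)
print(words_with_doubled_letters(text))
```

Here match.group() gives the exact text that matched, which the offsets alone do not tell you when the pattern can match many different strings.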


-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of Jen Kris via Python-list
Sent: Monday, February 27, 2023 8:36 PM
To: Cameron Simpson <cs at cskk.id.au>
Cc: Python List <python-list at python.org>
Subject: Re: How to escape strings for re.finditer?


I haven't tested it either but it looks like it would work.  But for this case I prefer the relative simplicity of:

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
find_string = re.escape('abc_degree + 1') for match in re.finditer(find_string, example):
    print(match.start(), match.end())

4 18
26 40

I don't insist on terseness for its own sake, but it's cleaner this way.  

Jen


Feb 27, 2023, 16:55 by cs at cskk.id.au:

> On 28Feb2023 01:13, Jen Kris <jenkris at tutanota.com> wrote:
>
>> I went to the re module because the specified string may appear more than once in the string (in the code I'm writing).
>>
>
> Sure, but writing a `finditer` for plain `str` is pretty easy (untested):
>
>  pos = 0
>  while True:
>      found = s.find(substring, pos)
>      if found < 0:
>          break
>      start = found
>      end = found + len(substring)
>      ... do whatever with start and end ...
>      pos = end
>
> Many people go straight to the `re` module whenever they're looking for strings. It is often cryptic, error-prone overkill. Just something to keep in mind.
>
> Cheers,
> Cameron Simpson <cs at cskk.id.au>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
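
For comparison, the untested loop quoted above could be wrapped up as a generator along these lines. This is only a sketch: the name find_all is my invention, and the guard for an empty substring is an assumption I added, since str.find('', pos) keeps returning pos and the loop would otherwise never advance.

```python
def find_all(s, substring):
    """Yield (start, end) for each non-overlapping occurrence of substring in s."""
    if not substring:
        return  # assumption: treat an empty search string as "no matches"
    pos = 0
    while True:
        found = s.find(substring, pos)
        if found < 0:
            break
        end = found + len(substring)
        yield found, end
        pos = end  # resume searching after this occurrence

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
print(list(find_all(example, 'abc_degree + 1')))  # [(4, 18), (26, 40)]
```

No escaping is needed here because str.find treats its argument as plain text.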
