How to escape strings for re.finditer?

Thomas Passin list1 at tompassin.net
Tue Feb 28 16:26:04 EST 2023


On 2/28/2023 2:40 PM, David Raymond wrote:
> With a slight tweak to the simple loop code using .find() it becomes a third faster than the RE version though.
> 
> 
> def using_simple_loop2(key, text):
>      matches = []
>      keyLen = len(key)
>      start = 0
>      while (foundSpot := text.find(key, start)) > -1:
>          start = foundSpot + keyLen
>          matches.append((foundSpot, start))
>      return matches
> 
> 
> using_simple_loop: [0.1732664997689426, 0.1601669997908175, 0.15792609984055161, 0.1573973000049591, 0.15759290009737015]
> using_re_finditer: [0.003412699792534113, 0.0032823001965880394, 0.0033694999292492867, 0.003354900050908327, 0.0033336998894810677]
> using_simple_loop2: [0.00256159994751215, 0.0025471001863479614, 0.0025424999184906483, 0.0025831996463239193, 0.0025555999018251896]

On my system the difference is much bigger than that. With a long key, the 
.find() loop comes out roughly six times faster than the RE version:

KEY = '''it doesn't matter, but in other cases it will.'''

using_simple_loop2: [0.0004955999902449548, 0.0004844000213779509, 
0.0004862999776378274, 0.0004800999886356294, 0.0004792999825440347]

using_re_finditer: [0.002840900036972016, 0.0028330000350251794, 
0.002701299963518977, 0.0028105000383220613, 0.0029977999511174858]

Shorter keys show the smallest difference (about 1.4 times faster here):

KEY = 'in'

using_simple_loop2: [0.001983499969355762, 0.0019614999764598906, 
0.0019617999787442386, 0.002027600014116615, 0.0020669000223279]

using_re_finditer: [0.002787900040857494, 0.0027620999608188868, 
0.0027723999810405076, 0.002776700013782829, 0.002946800028439611]

Brilliant!

Python 3.10.9
Windows 10 AMD64 (build 10.0.19044) SP0
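
For anyone who wants to try this on their own machine, here is a minimal 
sketch of the kind of harness that produces timings like the ones above. 
The TEXT sample and the number/repeat values are assumptions of mine, not 
what was used earlier in the thread, and using_re_finditer is only a guess 
at the RE version, with the key escaped by re.escape() so that it matches 
literally.

```python
import re
import timeit


def using_re_finditer(key, text):
    # re.escape() neutralizes regex metacharacters so the key matches literally.
    pattern = re.escape(key)
    return [(m.start(), m.end()) for m in re.finditer(pattern, text)]


def using_simple_loop2(key, text):
    # Same loop as quoted above; assumes a non-empty key.
    matches = []
    key_len = len(key)
    start = 0
    while (found_spot := text.find(key, start)) > -1:
        start = found_spot + key_len
        matches.append((found_spot, start))
    return matches


KEY = '''it doesn't matter, but in other cases it will.'''
# Hypothetical sample text: filler with the key repeated many times.
TEXT = ("Some filler text. " + KEY + " More filler. ") * 200


def main():
    # Both approaches should report identical (start, end) spans.
    assert using_re_finditer(KEY, TEXT) == using_simple_loop2(KEY, TEXT)
    for func in (using_simple_loop2, using_re_finditer):
        # number and repeat are arbitrary choices for illustration.
        times = timeit.repeat(lambda: func(KEY, TEXT), number=100, repeat=5)
        print(f"{func.__name__}: {times}")


if __name__ == "__main__":
    main()
```

The absolute numbers will depend on the text, the key, and the machine, so 
only the ratio between the two functions is worth comparing.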


