Help with regular expression using findall and .*?

darrell dgallion1 at yahoo.com
Sat Sep 14 09:28:56 EDT 2002


Here's an example with backtracking turned off
>>> s="""a\nb\n1""" 
>>> re.findall("[^\n]+?\d", s) 
['a\nb\n1'] 

Which is not correct: [^\n] can never match a newline, so a match can never
span lines.
Whatever pattern precedes the '+' must be re-evaluated as the match moves
forward. sre handles this through recursion.
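
For comparison, here is a small sketch of what the stock re module does with the same pattern. The second sample string ("ab1\ncd22") and the variable names are made up for illustration. It shows that even a non-greedy '+?' has to retry: it takes one character, tests the rest of the pattern, and extends by one more character each time that fails.

```python
import re

s = "a\nb\n1"

# With normal backtracking there is no match at all: no digit directly
# follows a run of non-newline characters in this string.
no_match = re.findall(r"[^\n]+?\d", s)
print(no_match)

# Non-greedy still means "try short first, then extend on failure".
# Here 'a' alone fails (next char is 'b', not a digit), so the engine
# extends to 'ab' and then finds the digit.
lazy = re.findall(r"[^\n]+?\d", "ab1\ncd22")
print(lazy)
```

Each of those extensions is a point the engine has to be able to return to, which is why state is kept even for '.*?'.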

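import re
# Ten blocks, each a 20000-character body wrapped in macro/orcam markers.
s2=('macro\n'+'a'*20000+'\norcam\n')*10
# Split on either marker.
s2split=re.split("macro\n|\norcam\n",s2)
for r in s2split:
    print(r)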


This should also be fast.
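
Note that the split leaves empty strings wherever two markers sit next to each other. A minimal sketch of dropping them, using a shortened body so the result is easy to read (the names pieces and bodies are just for illustration):

```python
import re

# Same layout as s2 above, but with a 5-character body and 3 repeats.
small = ('macro\n' + 'a' * 5 + '\norcam\n') * 3

# Splitting on either marker yields the bodies plus empty strings at
# the start, at the end, and between adjacent markers.
pieces = re.split("macro\n|\norcam\n", small)

# Keep only the non-empty pieces, i.e. the macro bodies.
bodies = [p for p in pieces if p]
print(bodies)
```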
--Darrell


czrpb wrote:
> Harvey:
> 
> Great thanks!! And thanks for sticking to my question's requirements.
> <wink!>
> 
> Ok, this is what we thought around here. But what I do not understand is
> why any backtracking data is being kept? The '?' in '.*?' means it is
> non-greedy right? When would backtracking ever occur using '.*?'? What am
> I missing?
> 



