Negative regular expressions (searching for "i" not inside command)

Terry Reedy tjreedy at udel.edu
Thu Aug 28 17:45:05 EDT 2008



Bart Kastermans wrote:
> I have a file in which I am searching for the letter "i" (actually
> a bit more general than that, arbitrary regular expressions could
> occur) as long as it does not occur inside an expression that matches
> \\.+?\b (something started by a backslash and including the word that
> follows).

You should either make sure that the opposite, a match of \\.+?\b inside 
a match of your target re, cannot occur, or consider what you want to 
happen if it can.

> More concrete example, I have the string "\sin(i)" and I want to match
> the argument, but not the i in \sin.
> 
> Can this be achieved by combining the regular expressions?  I do not
> know the right terminology involved, therefore my searching on the
> Internet has not led to any results.
> 
> I can achieve something like this by searching for all i and then
> throwing away those i that are inside such expressions.
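
For reference, that two-step approach might look roughly like the sketch below. The function name and the literal target pattern 'i' are only illustrative assumptions. A target match is dropped only when it lies entirely inside a command match.

```python
import re

COMMAND = re.compile(r'\\.+?\b')  # backslash plus the word that follows
TARGET = re.compile(r'i')         # stand-in for an arbitrary target re

def filter_matches_outside_commands(text):
    """Return (start, end) spans of TARGET matches not inside a COMMAND match."""
    command_spans = [m.span() for m in COMMAND.finditer(text)]
    kept = []
    for m in TARGET.finditer(text):
        # Step 2: throw away target matches that fall within a command span.
        inside = any(start <= m.start() and m.end() <= end
                     for start, end in command_spans)
        if not inside:
            kept.append(m.span())
    return kept

print(filter_matches_outside_commands(r'\sin(i)'))  # [(5, 6)]
```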

If you do not need the original position in the text of each match, and 
you are not concerned about target matches encompassing splitter 
matches, you could switch the order of searching.

import re
for fragment in re.split(r'\\.+?\b', text):  # pattern first, then string
    matches = re.findall(target, fragment)   # search fragment for target
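
A slightly fuller sketch of the split-first idea, assuming a literal 'i' as the target and a made-up function name. It returns only the matched strings, because positions within a fragment no longer correspond to positions in the original text.

```python
import re

COMMAND = re.compile(r'\\.+?\b')  # splitter: backslash plus following word
TARGET = re.compile(r'i')         # stand-in for an arbitrary target re

def find_in_fragments(text):
    """Split the text on commands, then search each remaining fragment."""
    hits = []
    for fragment in COMMAND.split(text):
        hits.extend(TARGET.findall(fragment))
    return hits

print(find_in_fragments(r'\sin(i)'))  # ['i']
```

A target match that would have spanned a command cannot be found this way, since the command text is removed before the target search runs.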

> I am now just wondering if these two steps can be combined into one.

Perhaps search for the alternation of \\.+?\b and the target, with only
the target in a capturing group, so that a match counts only when that
group is set, but I will leave that to someone else.
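
An untested-in-anger sketch of that idea, assuming 'i' as the target and a hypothetical function name: the command alternative comes first and consumes the whole command, so the capturing group is set only when the target is found outside one.

```python
import re

# The first alternative swallows a command; group 1 is set only for the target.
COMBINED = re.compile(r'\\.+?\b|(i)')

def target_spans(text):
    """Return (start, end) spans of target matches outside commands."""
    return [m.span(1) for m in COMBINED.finditer(text)
            if m.group(1) is not None]

print(target_spans(r'\sin(i)'))  # [(5, 6)]
```

Unlike the split version, this keeps the original positions of the matches.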

tjr



