re.search slashes

Scott David Daniels scott.daniels at acm.org
Sat Feb 4 16:09:06 EST 2006


pyluke wrote:
> Scott David Daniels wrote:
>> pyluke wrote:
>>> I... want to find lines with ... "\[" but not instances of "\\["
>>
>> If you are parsing with regular expressions, you are running a marathon.
>> If you are doing regular expressions without raw strings, you are running
>> a marathon barefoot.
> I'm not sure what you mean by running a marathon.

I'm referring to this quote from: http://www.jwz.org/hacks/marginal.html
     "(Some people, when confronted with a problem, think ``I know, I'll
     use regular expressions.'' Now they have two problems.)"

> I do follow your statement on raw strings, but that doesn't seem
> to be the problem.

It is an issue of readability in your code, not the cause of the
behavior that you don't like.  In your particular case, everything is
doubly hard to read because both your patterns and your search targets
include backslashes.
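
To make the readability point concrete, here is a minimal sketch (assuming
Python 3 and the standard re module; the variable names are illustrative
and not from the original thread) that writes the same "backslash followed
by open bracket" pattern with and without a raw string:

```python
import re

# The same regular expression written two ways.
raw_pattern = r'\\\['       # raw string: regex "\\" (a backslash) then "\[" (a bracket)
plain_pattern = '\\\\\\['   # ordinary string: every backslash has to be doubled

# Both literals produce the identical four-character string: \ \ \ [
same_string = (raw_pattern == plain_pattern)

# A LaTeX-style search target, also written as a raw string.
target = r'\[ \nabla \cdot u = 0 \]'
match = re.search(raw_pattern, target)
```

Both spellings behave identically; the raw form simply has half as many
backslashes to count.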

> \[
>   \nabla \cdot u = 0
> \]
> 
> I don't want to find the following
> 
> \begin{tabular}{c c}
>   a & b \\[4pt]
>   1 & 2 \\[3pt]
> \end{tabular}
> 

How about:  r'(^|[^\\])\\\['
Which is:
     Find something beginning with either start-of-line or a
     non-backslash, followed (in either case) by a backslash
     and ending with an open square bracket.
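
As a rough illustration, the pattern can be tried against the sample lines
quoted earlier. This is only a sketch: it assumes the text is searched one
line at a time (without re.MULTILINE, "^" matches only at the start of the
string being searched), and the names "sample" and "hits" are made up for
the example.

```python
import re

pattern = re.compile(r'(^|[^\\])\\\[')

# The sample lines from the question, one string per line of LaTeX.
sample = [
    r'\[',
    r'  \nabla \cdot u = 0',
    r'\]',
    r'\begin{tabular}{c c}',
    r'  a & b \\[4pt]',
    r'  1 & 2 \\[3pt]',
    r'\end{tabular}',
]

# Keep only the lines containing "\[" that is not preceded by a backslash.
hits = [line for line in sample if pattern.search(line)]
```

Only the display-math opener should end up in "hits"; the two tabular rows
ending in "\\[4pt]" and "\\[3pt]" are skipped.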

Generally (for this example), I would say a good test set describing
your problem is that each of the following holds:

     re.compile(pattern).search(r'\[   ') is not None
     re.compile(pattern).search(r' \[ ') is not None
     re.compile(pattern).search(r'\\[   ') is None
     re.compile(pattern).search(r' \\[   ') is None
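
The four checks above can be collected into a small script. The sketch
below assumes Python 3. It also tries a negative-lookbehind spelling,
(?<!\\), which is an alternative added here for comparison and was not
suggested in the thread. The dictionary and list names are hypothetical.

```python
import re

candidates = {
    'alternation': r'(^|[^\\])\\\[',   # the pattern suggested above
    'lookbehind': r'(?<!\\)\\\[',      # alternative spelling, not from the thread
}

should_match = [r'\[   ', r' \[ ']
should_not_match = [r'\\[   ', r' \\[   ']

# For each candidate, record whether it passes the whole test set.
results = {}
for name, pattern in candidates.items():
    compiled = re.compile(pattern)
    results[name] = (
        all(compiled.search(text) is not None for text in should_match)
        and all(compiled.search(text) is None for text in should_not_match)
    )
```

One difference worth knowing: the alternation form consumes the character
before the backslash as part of the match, while the lookbehind form
matches only the backslash and bracket themselves.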

--Scott David Daniels
scott.daniels at acm.org


