re.search slashes

Xavier Morel xavier.morel at masklinn.net
Sat Feb 4 13:45:05 EST 2006


Scott David Daniels wrote:
> pyluke wrote:
>> I'm parsing a LaTeX document and want to find lines with equations delimited
>> by "\[" and "\]", but not other instances of "\[" like "a & b & c \\[5pt]".
>> So, in short, I want to match "\[" but not "\\[". I've tried:
>> check_eq = re.compile('(?!\%\s*)\\\\\[')
>> check_eq.search(line)
>> this works in finding the "\[" but also the "\\["
> 
> If you are parsing with regular expressions, you are running a marathon.
> If you are doing regular expressions without raw strings, you are running
> a marathon barefoot.
> 
> Notice:  len('(?!\%\s*)\\\\\[') == 13
>           len(r'(?!\%\s*)\\\\\[') == 15
> 
>> so I would think this would work
>> check_eq = re.compile('(?![\%\s*\\\\])\\\\\[')
>> check_eq.search(line)
>>
>> but it doesn't.  Any tips?
> Give us examples that should work and that should not (test cases),
> and the proper results of those tests.  Don't make people trying to
> help you guess about anything you know.
> 
> --Scott David Daniels
> scott.daniels at acm.org

To add to what Scott said, two pieces of advice:
1. Use Kodos. It's a regex debugger and an extremely fine tool for building
and testing your regular expressions.
2. Read the re module's documentation. Several times. In your case, read the
"negative lookbehind assertion" part, "(?<! ... )", several times, until
you understand how it may be of use to you.
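
As a minimal sketch of point 2, here is one way a negative lookbehind could
be applied to this problem. The pattern and the helper name
is_equation_start are my own illustration, not something from the posts
above, and the sketch assumes "\[" only needs to be rejected when a single
backslash sits directly in front of it. It makes no attempt to skip "%"
comments.

```python
import re

# Raw string, so every backslash below reaches the regex engine untouched.
# (?<!\\)  negative lookbehind: the previous character is not a backslash
# \\\[     a literal backslash followed by a literal "["
check_eq = re.compile(r'(?<!\\)\\\[')


def is_equation_start(line):
    """Return True if the line contains "\\[" not preceded by a backslash.

    Hypothetical helper, for illustration only.
    """
    return check_eq.search(line) is not None


samples = [
    r'\[ x^2 + y^2 = z^2 \]',       # display equation: should match
    r'a & b & c \\[5pt]',           # row break with spacing: should not match
    r'and so \[ E = mc^2 \] holds', # equation in mid-line: should match
]

for line in samples:
    print(is_equation_start(line), line)
```

The lookbehind consumes no characters, so the match itself still covers only
the two characters "\[". A sequence such as "\\\[" (a line break directly
followed by an equation) would be rejected by this sketch, so treat it as a
starting point rather than a full LaTeX tokenizer.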


