Regular expression that skips single line comments?

MRAB google at mrabarnett.plus.com
Mon Jan 19 11:29:44 EST 2009


martinjamesevans at gmail.com wrote:
> I am trying to parse a set of files that have a simple syntax using
> RE. I'm interested in counting '$' expansions in the files, with one
> minor consideration. A line becomes a comment if the first
> non-whitespace character is a semicolon.
> 
> e.g.  tests 1 and 2 should be ignored
> 
> sInput = """
> ; $1 test1
>     ; test2 $2
>     test3 ; $3 $3 $3
> test4
> $5 test5
>    $6
>   test7 $7 test7
> """
> 
> Required output:    ['$3', '$3', '$3', '$5', '$6', '$7']
> 
> 
> The following RE works fine but does not deal with the commented
> lines:
> 
> re.findall(r"(\$.)", sInput, re.I)
> 
> e.g. ['$1', '$2', '$3', '$3', '$3', '$5', '$6', '$7']
> 
> 
> My attempts at trying to use (?!;) type expressions keep failing.
> 
> I'm not convinced this is suitable for a single expression, so I have
> also attempted to first find-replace any commented lines out without
> much luck.
> 
> e.g. re.sub(r"^[\t ]*?;.*?$", r"", sInput, re.I+re.M)
> 
> 
> Any suggestions would be appreciated. Thanks
> 
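On the re.sub attempt: the fourth positional argument of re.sub is count, not flags, so re.I+re.M is taken as a replacement limit and "^" only matches at the very start of the string. A minimal sketch of the find-replace route, assuming a Python version whose re.sub accepts a flags keyword (sInput is the sample from the post; the names stripped and expansions are only illustrative):

```python
import re

sInput = """
; $1 test1
    ; test2 $2
    test3 ; $3 $3 $3
test4
$5 test5
   $6
  test7 $7 test7
"""

# Pass the flags by keyword: the fourth positional argument of re.sub is count.
# With re.M, "^" and "$" match at the start and end of every line.
stripped = re.sub(r"^[ \t]*;.*$", "", sInput, flags=re.M)

# Comment lines are now empty, so a plain search finds only the live expansions.
expansions = re.findall(r"\$.", stripped)
print(expansions)
```

Compiling the pattern with re.compile(pattern, re.M) and calling its sub method should work equally well.
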
You could use:

 >>> re.findall(r"^\s*;.*|(\$.)", sInput, re.M)
['', '', '$3', '$3', '$3', '$5', '$6', '$7']

and then ignore the empty strings.
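
For example, a short sketch of the filtering step (sInput is the sample from the original post; the names matches and expansions are only illustrative):

```python
import re

sInput = """
; $1 test1
    ; test2 $2
    test3 ; $3 $3 $3
test4
$5 test5
   $6
  test7 $7 test7
"""

# A comment line is consumed by the first alternative, which leaves group 1
# empty; a '$' expansion anywhere else is captured by the second alternative.
matches = re.findall(r"^\s*;.*|(\$.)", sInput, re.M)

# Drop the empty strings contributed by the comment lines.
expansions = [m for m in matches if m]
print(expansions)
```

The comment lines have to be matched rather than skipped, so that the "$" expansions inside them are consumed before the second alternative gets a chance to see them.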


