Regular expression that skips single-line comments?

martinjamesevans at gmail.com
Mon Jan 19 11:08:01 EST 2009


I am trying to parse a set of files that have a simple syntax using
regular expressions. I'm interested in counting '$' expansions in the
files, with one minor consideration: a line is a comment if its first
non-whitespace character is a semicolon.

e.g. the lines for test1 and test2 below should be ignored:

sInput = """
; $1 test1
    ; test2 $2
    test3 ; $3 $3 $3
test4
$5 test5
   $6
  test7 $7 test7
"""

Required output:    ['$3', '$3', '$3', '$5', '$6', '$7']


The following RE works fine but does not deal with the commented
lines:

re.findall(r"(\$.)", sInput, re.I)

e.g. ['$1', '$2', '$3', '$3', '$3', '$5', '$6', '$7']


My attempts at using negative-lookahead expressions such as (?!;) keep
failing.

I'm not convinced this can be done with a single expression, so I have
also tried first removing the commented lines with a find-and-replace,
without much luck.

e.g. re.sub(r"^[\t ]*?;.*?$", r"", sInput, re.I+re.M)
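One possible reason the substitution above has no effect: the fourth
positional argument of re.sub is count, not flags, so re.M is never
applied and ^ and $ only anchor at the very start and end of the string.
The following is a minimal sketch of the two-step approach under that
assumption. It passes the flag by keyword, which needs a Python version
whose re.sub accepts flags; on older versions an inline "(?m)" at the
start of the pattern should do the same job. The names stripped and
expansions are only illustrative.

```python
import re

sInput = """
; $1 test1
    ; test2 $2
    test3 ; $3 $3 $3
test4
$5 test5
   $6
  test7 $7 test7
"""

# Step 1: blank out every line whose first non-whitespace character is ';'.
# The flag is given by keyword because the fourth positional argument of
# re.sub is count, not flags.
stripped = re.sub(r"^[ \t]*;.*$", "", sInput, flags=re.M)

# Step 2: collect the '$' expansions from the remaining lines.
expansions = re.findall(r"\$.", stripped)

print(expansions)
# ['$3', '$3', '$3', '$5', '$6', '$7']
```

The test3 line survives because its semicolon is not the first
non-whitespace character, so the pattern cannot match from the start of
that line.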


Any suggestions would be appreciated. Thanks

Martin


