Regular expression that skips single line comments?
Steven D'Aprano
steven at REMOVE.THIS.cybersource.com.au
Mon Jan 19 16:24:50 EST 2009
On Mon, 19 Jan 2009 08:08:01 -0800, martinjamesevans wrote:
> I am trying to parse a set of files that have a simple syntax using RE.
> I'm interested in counting '$' expansions in the files, with one minor
> consideration. A line becomes a comment if the first non-white space
> character is a semicolon.
Since your data is line-based, surely the simplest, clearest and most
natural solution is to parse each line individually instead of trying to
process the entire input with a single RE?
import re

def extract_dollar_expansions(sInput):
    accumulator = []
    for line in sInput.split('\n'):
        line = line.lstrip()
        # Skip comment lines: first non-whitespace character is ';'.
        if line.startswith(';'):
            continue
        accumulator.extend(re.findall(r"(\$.)", line))
    return accumulator
(Aside: why are you doing a case-insensitive match for a non-letter? Are
there different upper- and lower-case dollar signs?)
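A quick check makes the point about the flag. This is only a sketch: the sample string is invented for illustration and is not taken from the original files.

```python
import re

# Hypothetical sample text, made up for this example.
sample = "copy $3 to $A and $b"

# The same pattern, with and without the case-insensitive flag.
plain = re.findall(r"(\$.)", sample)
flagged = re.findall(r"(\$.)", sample, re.IGNORECASE)

print(plain)
print(plain == flagged)
```

Neither '$' nor '.' is affected by case folding, so both calls return the same list.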
>>> extract_dollar_expansions(sInput)
['$3', '$3', '$3', '$5', '$6', '$7']
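For anyone who wants to try the line-by-line approach without the original files, here is a self-contained sketch. The sample text is hypothetical (it is not the data that produced the output above), and the sketch assumes str.splitlines() is acceptable in place of split('\n'), so that Windows-style line endings do not leave a stray '\r' for the '.' to match.

```python
import re

def extract_dollar_expansions(text):
    """Return every '$' plus following character found on non-comment lines."""
    found = []
    for line in text.splitlines():
        # A line is a comment if its first non-whitespace character is ';'.
        if line.lstrip().startswith(';'):
            continue
        found.extend(re.findall(r"\$.", line))
    return found

# Hypothetical sample input, made up for this example.
sample = "move $1 to $2\n   ; ignore $9 here\nadd $3 ; trailing $4\n"
print(extract_dollar_expansions(sample))
```

Note that the semicolon in the last line does not start a comment, because it is not the first non-whitespace character, so '$4' is still counted.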
--
Steven