Regular expression that skips single line comments?

Steven D'Aprano steven at REMOVE.THIS.cybersource.com.au
Mon Jan 19 16:24:50 EST 2009


On Mon, 19 Jan 2009 08:08:01 -0800, martinjamesevans wrote:

> I am trying to parse a set of files that have a simple syntax using RE.
> I'm interested in counting '$' expansions in the files, with one minor
> consideration. A line becomes a comment if the first non-white space
> character is a semicolon.

Since your data is line-based, surely the simplest, clearest and most 
natural solution is to parse each line individually instead of trying to 
process the entire input with a single RE?


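import re

def extract_dollar_expansions(sInput):
    accumulator = []
    for line in sInput.split('\n'):
        # A line is a comment if its first non-whitespace character is ';'.
        line = line.lstrip()
        if line.startswith(';'):
            continue
        # Collect each '$' together with the single character after it.
        accumulator.extend(re.findall(r"(\$.)", line))
    return accumulator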


(Aside: why are you doing a case-insensitive match for a non-letter? Are 
there different upper- and lower-case dollar signs?)



>>> extract_dollar_expansions(sInput)
['$3', '$3', '$3', '$5', '$6', '$7']
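For anyone who wants to try this without the original files, here is a self-contained sketch of the same line-by-line approach. The sample text is made up for illustration and is not the original poster's data; the function name simply mirrors the one above.

```python
import re

def extract_dollar_expansions(text):
    """Return every '$' plus the character after it, ignoring comment lines."""
    found = []
    for line in text.splitlines():
        # A line is a comment if its first non-whitespace character is ';'.
        if line.lstrip().startswith(';'):
            continue
        found.extend(re.findall(r"\$.", line))
    return found

# Hypothetical sample input (an assumption, not the poster's real data).
sample = """\
; comment mentioning $1 is skipped
move $3 to $5
    ; indented comment with $2
echo $7 ; text after a mid-line semicolon still counts, so $8 is kept
"""

print(extract_dollar_expansions(sample))
# ['$3', '$5', '$7', '$8']
```

Note that only a leading semicolon makes a line a comment here, which follows the rule stated in the question; a semicolon in the middle of a line does not hide anything after it.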




-- 
Steven


