simplest way to strip a comment from the end of a line?

eric eric at ericaro.net
Thu Dec 4 15:08:51 EST 2008


On Dec 4, 5:15 pm, eric <e... at ericaro.net> wrote:
> On Dec 4, 4:50 pm, Joe Strout <j... at strout.net> wrote:
>
> > I have lines in a config file which can end with a comment (delimited  
> > by # as in Python), but which may also contain string literals  
> > (delimited by double quotes).  A comment delimiter within a string  
> > literal doesn't count.  Is there any easy way to strip off such a  
> > comment, or do I need to use a loop to find each # and then count the  
> > quotation marks to its left?
>
> > Thanks,
> > - Joe
>
> Hi,
>
> if the string literals you want to skip contain no escaped quotes
> (i.e. no \" ), then a regexp like
>
> .*?(?:".*?".*?)*#(?P<comment> .*?)$
>
> (not tested)
> .*?  everything, but non-greedy
> ".*?" a string literal with no escaped quotes


Well, it works too:

import re

test ='''this is a test 1
this is a test 2 #with a comment
this is a '#gnarlier' test #with a comment
this is a "#gnarlier" test #with a comment
'''

splitter = re.compile(r'(?m)^(?P<data>.*?(".*?".*?)*)(?:#.*?)?$')

def com_strip(text):
    # findall returns one tuple per line; element 0 is the "data" group
    return [x[0] for x in splitter.findall(text)]

# raw implementation
for line in test.split('\n'):
    m = re.match(r'(?P<data>.*?(".*?".*?)*)(?:#.*?)?$', line)
    print line, '->', m.group("data")

# with a function
for line in com_strip(test):
    print line

And here is the console output:

this is a test 1 -> this is a test 1
this is a test 2 #with a comment -> this is a test 2
this is a '#gnarlier' test #with a comment  -> this is a '
this is a "#gnarlier" test #with a comment  -> this is a "#gnarlier" test
 ->
this is a test 1
this is a test 2
this is a '
this is a "#gnarlier" test
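
For comparison, the loop Joe mentions (walk the line and track whether we
are inside a string literal) can be sketched without a regexp. This is only
a sketch in Python 3, not code from the thread: the names strip_comment and
quotes are made up for illustration, and treating a backslash inside a
literal as an escape is an assumption about the config format. By default
only double quotes delimit literals, as in the original question.

```python
def strip_comment(line, quotes='"'):
    """Return line without a trailing '#' comment.

    A '#' inside a quoted literal does not start a comment.
    `quotes` is a hypothetical parameter listing the quote characters
    that delimit literals (default: double quotes only).
    """
    quote = None  # the quote character of the literal we are inside, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if quote:
            if ch == '\\':
                i += 1  # assumption: backslash escapes the next character
            elif ch == quote:
                quote = None  # the literal ends here
        elif ch in quotes:
            quote = ch  # a literal starts here
        elif ch == '#':
            # first '#' outside any literal: drop it and what follows
            return line[:i].rstrip()
        i += 1
    return line  # no comment found


for sample in ['this is a test 2 #with a comment',
               'this is a "#gnarlier" test #with a comment']:
    print(sample, '->', strip_comment(sample))
```

Passing quotes='"\'' makes single-quoted literals count as well, which is
the case the regexp above cuts short; the price is that a lone apostrophe
in the data would then be read as the start of a literal.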




